From isanbard at gmail.com Mon Feb 11 01:09:01 2008
From: isanbard at gmail.com (Bill Wendling)
Date: Sun, 10 Feb 2008 23:09:01 -0800
Subject: [llvm-commits] [llvm] r46922 - /llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
In-Reply-To: <241F94F0-D3FB-4D23-B552-6A5C9E66F868@apple.com>
References: <200802100810.m1A8AOS2007775@zion.cs.uiuc.edu>
 <241F94F0-D3FB-4D23-B552-6A5C9E66F868@apple.com>
Message-ID: <84E5BEBC-5D19-4941-B711-1E6DAD6196AE@gmail.com>

On Feb 10, 2008, at 7:46 PM, Evan Cheng wrote:

> What is this fixing?

The code as it was was a no-op. This makes it do what the comment says
it's supposed to do.

> I would expect SelectionDAG to constant
> evaluating it to a constant node?
>
Perhaps. But then why is it here at all then?

-bw

> On Feb 10, 2008, at 12:10 AM, Bill Wendling wrote:
>
>> Author: void
>> Date: Sun Feb 10 02:10:24 2008
>> New Revision: 46922
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=46922&view=rev
>> Log:
>> Return "(c1 + c2)" instead of yet another ADD node (which made this a
>> no-op).
>>
>> =====================================================================
>> =
>> --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)
>> +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Sun Feb 10
>> 02:10:24 2008
>> @@ -923,7 +923,7 @@
>>      return N1;
>>    // fold (add c1, c2) -> c1+c2
>>    if (N0C && N1C)
>> -    return DAG.getNode(ISD::ADD, VT, N0, N1);
>> +    return DAG.getConstant(N0C->getValue() + N1C->getValue(), VT);
>>    // canonicalize constant to RHS
>>    if (N0C && !N1C)
>>      return DAG.getNode(ISD::ADD, VT, N1, N0);
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

From sabre at nondot.org Mon Feb 11 01:21:09 2008
From: sabre at nondot.org (Chris Lattner)
Date: Mon, 11 Feb 2008 07:21:09 -0000
Subject: [llvm-commits] [llvm] r46951 - /llvm/trunk/docs/ReleaseNotes.html
Message-ID: <200802110721.m1B7LArr015184@zion.cs.uiuc.edu>

Author: lattner
Date: Mon Feb 11 01:21:08 2008
New Revision: 46951

URL: http://llvm.org/viewvc/llvm-project?rev=46951&view=rev
Log:
updates from Evan

Modified:
    llvm/trunk/docs/ReleaseNotes.html

Modified: llvm/trunk/docs/ReleaseNotes.html
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ReleaseNotes.html?rev=46951&r1=46950&r2=46951&view=diff
==============================================================================
--- llvm/trunk/docs/ReleaseNotes.html (original)
+++ llvm/trunk/docs/ReleaseNotes.html Mon Feb 11 01:21:08 2008
@@ -97,7 +97,8 @@
 LLVM 2.1, llvm-gcc 4.2 was beta). Since LLVM 2.1, the llvm-gcc 4.2 front-end
 has made leaps and bounds and is now at least as good as 4.0 in virtually every
 area, and is better in several areas (for example, exception handling
-correctness, support for Ada and Fortran). We strongly recommend that you
+correctness, support for Ada and Fortran, better ABI compatibility, etc). We
+strongly recommend that you
 migrate from llvm-gcc 4.0 to llvm-gcc 4.2 in this release cycle because
 LLVM 2.2 is the last release that will support llvm-gcc 4.0: LLVM 2.3
 will only support the llvm-gcc 4.2 front-end.
@@ -265,7 +266,8 @@ llvm-gcc by default, but can be accessed through 'opt'.LLVM 2.2 includes several major new capabilities:
Total unconfirmed: 17
+Total unconfirmed: 23
If your name is misspelled, or organization affiliation isn't correct, please email us, and we'll correct it.

From gohman at apple.com Mon Feb 11 17:45:15 2008
From: gohman at apple.com (Dan Gohman)
Date: Mon, 11 Feb 2008 23:45:15 -0000
Subject: [llvm-commits] [llvm] r46978 - /llvm/trunk/include/llvm/ADT/APInt.h
Message-ID: <200802112345.m1BNjFum020045@zion.cs.uiuc.edu>

Author: djg
Date: Mon Feb 11 17:45:14 2008
New Revision: 46978

URL: http://llvm.org/viewvc/llvm-project?rev=46978&view=rev
Log:
Correct the order of the arguments in the examples in the comments
for APInt::getBitsSet. And fix an off-by-one bug in "wrapping" mode.

Modified:
    llvm/trunk/include/llvm/ADT/APInt.h

Modified: llvm/trunk/include/llvm/ADT/APInt.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/APInt.h?rev=46978&r1=46977&r2=46978&view=diff
==============================================================================
--- llvm/trunk/include/llvm/ADT/APInt.h (original)
+++ llvm/trunk/include/llvm/ADT/APInt.h Mon Feb 11 17:45:14 2008
@@ -371,9 +371,9 @@
   /// Constructs an APInt value that has a contiguous range of bits set. The
   /// bits from loBit to hiBit will be set. All other bits will be zero. For
-  /// example, with parameters(32, 15, 0) you would get 0x0000FFFF. If hiBit is
+  /// example, with parameters(32, 0, 15) you would get 0x0000FFFF. If hiBit is
   /// less than loBit then the set bits "wrap". For example, with
-  /// parameters (32, 3, 28), you would get 0xF000000F.
+  /// parameters (32, 28, 3), you would get 0xF000000F.
   /// @param numBits the intended bit width of the result
   /// @param loBit the index of the lowest bit set.
   /// @param hiBit the index of the highest bit set.
@@ -384,7 +384,7 @@ assert(loBit < numBits && "loBit out of range"); if (hiBit < loBit) return getLowBitsSet(numBits, hiBit+1) | - getHighBitsSet(numBits, numBits-loBit+1); + getHighBitsSet(numBits, numBits-loBit); return getLowBitsSet(numBits, hiBit-loBit+1).shl(loBit); } From natebegeman at mac.com Mon Feb 11 17:47:56 2008 From: natebegeman at mac.com (Nate Begeman) Date: Mon, 11 Feb 2008 23:47:56 -0000 Subject: [llvm-commits] [llvm] r46979 - /llvm/trunk/lib/Target/IA64/README Message-ID: <200802112347.m1BNluHv020133@zion.cs.uiuc.edu> Author: sampo Date: Mon Feb 11 17:47:56 2008 New Revision: 46979 URL: http://llvm.org/viewvc/llvm-project?rev=46979&view=rev Log: Stuff noticed while grepping code Modified: llvm/trunk/lib/Target/IA64/README Modified: llvm/trunk/lib/Target/IA64/README URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/IA64/README?rev=46979&r1=46978&r2=46979&view=diff ============================================================================== --- llvm/trunk/lib/Target/IA64/README (original) +++ llvm/trunk/lib/Target/IA64/README Mon Feb 11 17:47:56 2008 @@ -1,59 +1,11 @@ -*** README for the LLVM IA64 Backend "Version 0.01" - March 18, 2005 -*** Quote for this version: - - "Kaori and Hitomi are naughty!!" - - -Congratulations, you have found: - -**************************************************************** -* @@@ @@@ @@@ @@@ @@@@@@@@@@ * -* @@@ @@@ @@@ @@@ @@@@@@@@@@@ * -* @@! @@! @@! @@@ @@! @@! @@! * -* !@! !@! !@! @!@ !@! !@! !@! * -* @!! @!! @!@ !@! @!! !!@ @!@ * -* !!! !!! !@! !!! !@! ! !@! * -* !!: !!: :!: !!: !!: !!: * -* :!: :!: ::!!:! :!: :!: * -* :: :::: :: :::: :::: ::: :: * -* : :: : : : :: : : : : : * -* * -* * -* @@@@@@ @@@ @@@ @@@ @@@@@@ @@@@@@ @@@ * -* @@@@@@@@ @@@@ @@@ @@@ @@@@@@@@ @@@@@@@ @@@@ * -* @@! @@@ @@!@!@@@ @@! @@! @@@ !@@ @@!@! * -* !@! @!@ !@!!@!@! !@! !@! @!@ !@! !@!!@! * -* @!@ !@! @!@ !!@! !!@ @!@!@!@! !!@@!@! @!! @!! * -* !@! !!! !@! !!! !!! !!!@!!!! @!!@!!!! !!! !@! * -* !!: !!! !!: !!! !!: !!: !!! !:! 
!:! :!!:!:!!: * -* :!: !:! :!: !:! :!: :!: !:! :!: !:! !:::!!::: * -* ::::: :: :: :: :: :: ::: :::: ::: ::: * -* : : : :: : : : : : :: : : ::: * -* * -**************************************************************** -* Bow down, bow down, before the power of IA64! Or be crushed, * -* be crushed, by its jolly registers of doom!! * -**************************************************************** - -DEVELOPMENT PLAN: - - _ you are 2005 maybe 2005 2006 2006 and - / here | | | beyond - v v v v | - v -CLEAN UP ADD INSTRUCTION ADD PLAY WITH -INSTRUCTION --> SCHEDULING AND --> JIT --> DYNAMIC --> FUTURE WORK -SELECTION BUNDLING SUPPORT REOPTIMIZATION - -DISCLAIMER AND PROMISE: - -As of the time of this release, you are probably better off using Intel C/C++ -or GCC. The performance of the code emitted right now is, in a word, -terrible. Check back in a few months - the story will be different then, -I guarantee it. - TODO: - + - Un-bitrot ISel + - Hook up If-Conversion a la ARM target + - Hook up all branch analysis functions + - Instruction scheduling + - Bundling + - Dynamic Optimization + - Testing and bugfixing - stop passing FP args in both FP *and* integer regs when not required - allocate low (nonstacked) registers more aggressively - clean up and thoroughly test the isel patterns. @@ -65,14 +17,11 @@ (we will avoid the mess that is: http://gcc.gnu.org/ml/gcc/2003-12/msg00832.html ) - instruction scheduling (hmmmm! ;) - - write truly inspirational documentation - - if-conversion (predicate database/knowledge? etc etc) - counted loop support - make integer + FP mul/div more clever (we have fixed pseudocode atm) - track and use comparison complements INFO: - - we are strictly LP64 here, no support for ILP32 on HP-UX. Linux users don't need to worry about this. - i have instruction scheduling/bundling pseudocode, that really works @@ -80,7 +29,6 @@ so, before you go write your own, send me an email! 
KNOWN DEFECTS AT THE CURRENT TIME: - - C++ vtables contain naked function pointers, not function descriptors, which is bad. see http://llvm.cs.uiuc.edu/bugs/show_bug.cgi?id=406 - varargs are broken @@ -90,17 +38,11 @@ these will probably be fixed soon. ACKNOWLEDGEMENTS: - - Chris Lattner (x100) - Other LLVM developers ("hey, that looks familiar") CONTACT: - - You can email me at duraid at octopus.com.au. If you find a small bug, just email me. If you find a big bug, please file a bug report in bugzilla! http://llvm.cs.uiuc.edu is your one stop shop for all things LLVM. - - - - From tonic at nondot.org Mon Feb 11 20:42:55 2008 From: tonic at nondot.org (Tanya Lattner) Date: Tue, 12 Feb 2008 02:42:55 -0000 Subject: [llvm-commits] [llvm] r46981 - /llvm/trunk/docs/GettingStarted.html Message-ID: <200802120242.m1C2gtbr025036@zion.cs.uiuc.edu> Author: tbrethou Date: Mon Feb 11 20:42:55 2008 New Revision: 46981 URL: http://llvm.org/viewvc/llvm-project?rev=46981&view=rev Log: Add 2.2 release tag. Modified: llvm/trunk/docs/GettingStarted.html Modified: llvm/trunk/docs/GettingStarted.html URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/GettingStarted.html?rev=46981&r1=46980&r2=46981&view=diff ============================================================================== --- llvm/trunk/docs/GettingStarted.html (original) +++ llvm/trunk/docs/GettingStarted.html Mon Feb 11 20:42:55 2008 @@ -710,6 +710,7 @@ subdirectories of the 'tags' directory:
Written by the LLVM Team
+
This document contains the release notes for the LLVM compiler -infrastructure, release 2.1. Here we describe the status of LLVM, including +infrastructure, release 2.2. Here we describe the status of LLVM, including major improvements from the previous release and any known problems. All LLVM releases may be downloaded from the LLVM releases web site.
@@ -58,122 +59,110 @@This is the twelfth public release of the LLVM Compiler Infrastructure. -It includes many features and refinements from LLVM 2.0.
+This is the thirteenth public release of the LLVM Compiler Infrastructure. +It includes many features and refinements from LLVM 2.1.
LLVM 2.1 brings two new beta C front-ends. First, a new version of llvm-gcc -based on GCC 4.2, innovatively called "llvm-gcc-4.2". This promises to bring -FORTRAN and Ada support to LLVM as well as features like atomic builtins and -OpenMP. None of these actually work yet, but don't let that stop you checking -it out!
- -Second, LLVM now includes its own native C and Objective-C front-end (C++ is -in progress, but is not very far along) code named "clang". This front-end has a number of great -features, primarily aimed at source-level analysis and speeding up compile-time. -At this point though, the LLVM Code Generator component is still very early in -development, so it's mostly useful for people looking to build source-level -analysis tools or source-to-source translators.
+This is the last LLVM release to support llvm-gcc 4.0, llvm-upgrade, and +llvmc in its current form. llvm-gcc 4.0 has been replaced with llvm-gcc 4.2. +llvm-upgrade is useful for upgrading llvm 1.9 files to llvm 2.x syntax, but you +can always use an old release to do this. llvmc is currently mostly useless in +llvm 2.2, and will be redesigned or removed in llvm 2.3.
Some of the most noticeable feature improvements this release have been in the -optimizer, speeding it up and making it more aggressive. For example:
- -LLVM 2.2 fully supports both the llvm-gcc 4.0 and llvm-gcc 4.2 front-ends (in +LLVM 2.1, llvm-gcc 4.2 was beta). Since LLVM 2.1, the llvm-gcc 4.2 front-end +has made leaps and bounds and is now at least as good as 4.0 in virtually every +area, and is better in several areas (for example, exception handling +correctness, support for Ada and Fortran, better ABI compatibility, etc). We +strongly recommend that you +migrate from llvm-gcc 4.0 to llvm-gcc 4.2 in this release cycle because +LLVM 2.2 is the last release that will support llvm-gcc 4.0: LLVM 2.3 +will only support the llvm-gcc 4.2 front-end.
+ +The clang project is an effort to build +a set of new 'llvm native' front-end technologies for the LLVM optimizer +and code generator. Currently, its C and Objective-C support is maturing +nicely, and it has advanced source-to-source analysis and transformation +capabilities. If you are interested in building source-level tools for C and +Objective-C (and eventually C++), you should take a look. However, note that +clang is not an official part of the LLVM 2.2 release. If you are interested in +this project, please see its web site.
One of the main focuses of this release was performance tuning and bug - fixing. In addition to these, several new major changes occurred:
+LLVM 2.2 includes several major new capabilities:
We put a significant amount of work into the code generator infrastructure, +which allows us to implement more aggressive algorithms and make it run +faster:
+ +New features include: -
+ +In addition to a huge array of bug fixes and minor performance tweaks, the +LLVM 2.2 optimizers support a few major enhancements:
New features include: +
New target-specific features include:
llvm-gcc4 does not currently support Link-Time -Optimization on most platforms "out-of-the-box". Please inquire on the +
llvm-gcc does not currently support Link-Time +Optimization on most platforms "out-of-the-box". Please inquire on the llvmdev mailing list if you are interested.
"long double" is silently transformed by the front-end into "double". There -is no support for floating point data types of any size other than 32 and 64 -bits.
llvm-gcc does not support __builtin_apply yet. See Constructing Calls: Dispatching a call to another function.
Alias Analysis (aka Pointer Analysis) is a class of techniques which attempt + to determine whether or not two pointers ever can point to the same object in + memory. There are many different algorithms for alias analysis and many + different ways of classifying them: flow-sensitive vs flow-insensitive, + context-sensitive vs context-insensitive, field-sensitive vs field-insensitive, + unification-based vs subset-based, etc. Traditionally, alias analyses respond + to a query with a Must, May, or No alias response, + indicating that two pointers always point to the same object, might point to the + same object, or are known to never point to the same object.
+ +The LLVM AliasAnalysis + class is the primary interface used by clients and implementations of alias + analyses in the LLVM system. This class is the common interface between clients + of alias analysis information and the implementations providing it, and is + designed to support a wide range of implementations and clients (but currently + all clients are assumed to be flow-insensitive). In addition to simple alias + analysis information, this class exposes Mod/Ref information from those + implementations which can provide it, allowing for powerful analyses and + transformations to work well together.
+ +This document contains information necessary to successfully implement this + interface, use it, and to test both sides. It also explains some of the finer + points about what exactly results mean. If you feel that something is unclear + or should be added, please let me + know.
+ +The AliasAnalysis + class defines the interface that the various alias analysis implementations + should support. This class exports two important enums: AliasResult + and ModRefResult which represent the result of an alias query or a + mod/ref query, respectively.
+ +The AliasAnalysis interface exposes information about memory, + represented in several different ways. In particular, memory objects are + represented as a starting address and size, and function calls are represented + as the actual call or invoke instructions that performs the + call. The AliasAnalysis interface also exposes some helper methods + which allow you to get mod/ref information for arbitrary instructions.
+ +Most importantly, the AliasAnalysis class provides several methods + which are used to query whether or not two memory objects alias, whether + function calls can modify or read a memory object, etc. For all of these + queries, memory objects are represented as a pair of their starting address (a + symbolic LLVM Value*) and a static size.
+ +Representing memory objects as a starting address and a size is critically + important for correct Alias Analyses. For example, consider this (silly, but + possible) C code:
+ +
+ int i;
+ char C[2];
+ char A[10];
+ /* ... */
+ for (i = 0; i != 10; ++i) {
+ C[0] = A[i]; /* One byte store */
+ C[1] = A[9-i]; /* One byte store */
+ }
+
+ In this case, the basicaa pass will disambiguate the stores to + C[0] and C[1] because they are accesses to two distinct + locations one byte apart, and the accesses are each one byte. In this case, the + LICM pass can use store motion to remove the stores from the loop. In + contrast, the following code:
+ +
+ int i;
+ char C[2];
+ char A[10];
+ /* ... */
+ for (i = 0; i != 10; ++i) {
+ ((short*)C)[0] = A[i]; /* Two byte store! */
+ C[1] = A[9-i]; /* One byte store */
+ }
+
+ In this case, the two stores to C do alias each other, because the access to + the &C[0] element is a two byte access. If size information wasn't + available in the query, even the first case would have to conservatively assume + that the accesses alias.
+ +An Alias Analysis implementation can return one of three responses: + MustAlias, MayAlias, and NoAlias. The No and May alias results are obvious: if + the two pointers can never equal each other, return NoAlias, if they might, + return MayAlias.
+ +The MustAlias response is trickier though. In LLVM, the Must Alias response + may only be returned if the two memory objects are guaranteed to always start at + exactly the same location. If two memory objects overlap, but do not start at + the same location, return MayAlias.
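The role of access sizes and the MustAlias rule above can be sketched with a small stand-alone model. This is only an illustration under stated assumptions: `Access` and `classify` are hypothetical names, both accesses are assumed to fall inside the same object at known byte offsets, and sizes are assumed to be nonzero. It is not the LLVM AliasAnalysis interface.

```cpp
#include <cstddef>

enum AliasResult { NoAlias, MayAlias, MustAlias };

// Hypothetical model of a memory access: a byte offset into one object
// plus a static size, mirroring the (address, size) pairs used in queries.
struct Access {
  std::size_t Offset; // offset of the first byte accessed
  std::size_t Size;   // number of bytes accessed (assumed nonzero)
};

// MustAlias only when both accesses start at the same location.
// Overlapping accesses with different starts are MayAlias.
// Accesses whose byte ranges are disjoint are NoAlias.
inline AliasResult classify(const Access &A, const Access &B) {
  if (A.Offset == B.Offset)
    return MustAlias;
  bool Overlap = A.Offset < B.Offset + B.Size && B.Offset < A.Offset + A.Size;
  return Overlap ? MayAlias : NoAlias;
}
```

With this model, the one-byte stores to C[0] and C[1] are `classify(Access{0, 1}, Access{1, 1})`, which yields NoAlias, while the two-byte store at offset 0 against the one-byte store at offset 1 yields MayAlias, because the ranges overlap but do not start at the same location.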
+ +The getModRefInfo methods return information about whether the + execution of an instruction can read or modify a memory location. Mod/Ref + information is always conservative: if an instruction might read or write + a location, ModRef is returned.
+ +The AliasAnalysis class also provides a getModRefInfo + method for testing dependencies between function calls. This method takes two + call sites (CS1 & CS2), returns NoModRef if the two calls refer to disjoint + memory locations, Ref if CS1 reads memory written by CS2, Mod if CS1 writes to + memory read or written by CS2, or ModRef if CS1 might read or write memory + accessed by CS2. Note that this relation is not commutative. Clients that use + this method should be predicated on the hasNoModRefInfoForCalls() + method, which indicates whether or not an analysis can provide mod/ref + information for function call pairs (most can not). If this predicate is false, + the client shouldn't waste analysis time querying the getModRefInfo + method many times.
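One way to picture the four mod/ref results is as two independent bits, one for "may read" and one for "may write". The sketch below assumes that encoding for illustration (the numeric values and the `merge` helper are assumptions of this example, not something stated in the text above). Combining two results with a bitwise OR can only add possibilities, which is what keeps the combined answer conservative.

```cpp
// Assumed bit encoding for this sketch: bit 0 = may read, bit 1 = may write.
enum ModRefResult { NoModRef = 0, Ref = 1, Mod = 2, ModRef = 3 };

// Hypothetical helper: conservatively combine two results, for example when
// an instruction might behave in either of two ways.
inline ModRefResult merge(ModRefResult A, ModRefResult B) {
  return static_cast<ModRefResult>(A | B);
}
```

For example, merging Ref with Mod gives ModRef, and merging NoModRef with Ref leaves Ref unchanged.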
+ ++ Several other tidbits of information are often collected by various alias + analysis implementations and can be put to good use by various clients. +
The getMustAliases method returns all values that are known to + must-alias a pointer. This information can be provided in some cases for + important objects like the null pointer and global values. Knowing that a + pointer always points to a particular function allows indirect calls to be + turned into direct calls, for example.
+ +The pointsToConstantMemory method returns true if and only if the + analysis can prove that the pointer only points to unchanging memory locations + (functions, constant global variables, and the null pointer). This information + can be used to refine mod/ref information: it is impossible for an unchanging + memory location to be modified.
+ +These methods are used to provide very simple mod/ref information for + function calls. The doesNotAccessMemory method returns true for a + function if the analysis can prove that the function never reads or writes to + memory, or if the function only reads from constant memory. Functions with this + property are side-effect free and only depend on their input arguments, allowing + them to be eliminated if they form common subexpressions or be hoisted out of + loops. Many common functions behave this way (e.g., sin and + cos) but many others do not (e.g., acos, which modifies the + errno variable).
The onlyReadsMemory method returns true for a function if analysis + can prove that (at most) the function only reads from non-volatile memory. + Functions with this property are side-effect free, only depending on their input + arguments and the state of memory when they are called. This property allows + calls to these functions to be eliminated and moved around, as long as there is + no store instruction that changes the contents of memory. Note that all + functions that satisfy the doesNotAccessMemory method also satisfy + onlyReadsMemory.
Writing a new alias analysis implementation for LLVM is quite + straight-forward. There are already several implementations that you can use + for examples, and the following information should help fill in any details. + For examples, take a look at the various alias analysis + implementations included with LLVM.
The first step is to determine what type of LLVM pass you need to use for your Alias + Analysis. As is the case with most other analyses and transformations, the + answer should be fairly obvious from what type of problem you are trying to + solve:
+ +In addition to the pass that you subclass, you should also inherit from the + AliasAnalysis interface, of course, and use the + RegisterAnalysisGroup template to register as an implementation of + AliasAnalysis.
Your subclass of AliasAnalysis is required to invoke two methods on + the AliasAnalysis base class: getAnalysisUsage and + InitializeAliasAnalysis. In particular, your implementation of + getAnalysisUsage should explicitly call into the + AliasAnalysis::getAnalysisUsage method in addition to + declaring any pass dependencies your pass has. Thus you should have something + like this:
+ +
+ void getAnalysisUsage(AnalysisUsage &AU) const {
+ AliasAnalysis::getAnalysisUsage(AU);
+ // declare your dependencies here.
+ }
+
+ Additionally, you must invoke the InitializeAliasAnalysis method + from your analysis run method (run for a Pass, + runOnFunction for a FunctionPass, or InitializePass + for an ImmutablePass). For example (as part of a Pass):
+ +
+ bool run(Module &M) {
+ InitializeAliasAnalysis(this);
+ // Perform analysis here...
+ return false;
+ }
+
+ All of the AliasAnalysis + virtual methods default to providing chaining to another + alias analysis implementation, which ends up returning conservatively correct + information (returning "May" Alias and "Mod/Ref" for alias and mod/ref queries + respectively). Depending on the capabilities of the analysis you are + implementing, you just override the interfaces you can improve.
+ +With only two special exceptions (the basicaa and no-aa + passes) every alias analysis pass chains to another alias analysis + implementation (for example, the user can specify "-basicaa -ds-aa + -anders-aa -licm" to get the maximum benefit from the three alias + analyses). The alias analysis class automatically takes care of most of this + for methods that you don't override. For methods that you do override, in code + paths that return a conservative MayAlias or Mod/Ref result, simply return + whatever the superclass computes. For example:
+ +
+ AliasAnalysis::AliasResult alias(const Value *V1, unsigned V1Size,
+ const Value *V2, unsigned V2Size) {
+ if (...)
+ return NoAlias;
+ ...
+
+ // Couldn't determine a must or no-alias result.
+ return AliasAnalysis::alias(V1, V1Size, V2, V2Size);
+ }
+
+ In addition to analysis queries, you must make sure to unconditionally pass + LLVM update notification methods to the superclass as + well if you override them, which allows all alias analyses in a chain to be + updated.
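The chaining behavior described above can be modeled with a few small classes. This is a minimal sketch, not the LLVM classes: `ChainedAnalysis`, `SamePointerAnalysis`, and `ZeroSizeAnalysis` are hypothetical names, and the two "facts" they know are toy rules chosen only to show how a query falls through the chain until some analysis answers it or the end of the chain returns the conservative MayAlias.

```cpp
#include <cstddef>

enum AliasResult { NoAlias, MayAlias, MustAlias };

// Hypothetical stand-in for an analysis in a chain.
class ChainedAnalysis {
  ChainedAnalysis *Next; // next analysis in the chain, or null at the end
public:
  explicit ChainedAnalysis(ChainedAnalysis *N = nullptr) : Next(N) {}
  virtual ~ChainedAnalysis() = default;

  // Default behavior: defer to the next analysis. The end of the chain
  // gives the conservative answer.
  virtual AliasResult alias(const void *P1, std::size_t S1,
                            const void *P2, std::size_t S2) {
    return Next ? Next->alias(P1, S1, P2, S2) : MayAlias;
  }
};

// Toy rule: identical pointers start at the same location.
class SamePointerAnalysis : public ChainedAnalysis {
public:
  explicit SamePointerAnalysis(ChainedAnalysis *N = nullptr)
      : ChainedAnalysis(N) {}
  AliasResult alias(const void *P1, std::size_t S1,
                    const void *P2, std::size_t S2) override {
    if (P1 == P2)
      return MustAlias;
    // Could not decide: return whatever the rest of the chain computes.
    return ChainedAnalysis::alias(P1, S1, P2, S2);
  }
};

// Toy rule: an access of zero bytes touches no memory.
class ZeroSizeAnalysis : public ChainedAnalysis {
public:
  explicit ZeroSizeAnalysis(ChainedAnalysis *N = nullptr)
      : ChainedAnalysis(N) {}
  AliasResult alias(const void *P1, std::size_t S1,
                    const void *P2, std::size_t S2) override {
    if (S1 == 0 || S2 == 0)
      return NoAlias;
    return ChainedAnalysis::alias(P1, S1, P2, S2);
  }
};
```

Constructing `ZeroSizeAnalysis Outer(&Inner)` with a `SamePointerAnalysis Inner` gives a two-element chain: `Outer` answers the queries it can, and every other query is forwarded to `Inner` and then to the conservative default.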
+ ++ Alias analysis information is initially computed for a static snapshot of the + program, but clients will use this information to make transformations to the + code. All but the most trivial forms of alias analysis will need to have their + analysis results updated to reflect the changes made by these transformations. +
+ ++ The AliasAnalysis interface exposes two methods which are used to + communicate program changes from the clients to the analysis implementations. + Various alias analysis implementations should use these methods to ensure that + their internal data structures are kept up-to-date as the program changes (for + example, when an instruction is deleted), and clients of alias analysis must be + sure to call these interfaces appropriately. +
+From the LLVM perspective, the only thing you need to do to provide an + efficient alias analysis is to make sure that alias analysis queries are + serviced quickly. The actual calculation of the alias analysis results (the + "run" method) is only performed once, but many (perhaps duplicate) queries may + be performed. Because of this, try to move as much computation to the run + method as possible (within reason).
+ +There are several different ways to use alias analysis results. In order of + preference, these are...
+ +The load-vn pass uses alias analysis to provide value numbering + information for load instructions and pointer values. If your analysis + or transformation can be modeled in a form that uses value numbering + information, you don't have to do anything special to handle load instructions: + just use the load-vn pass, which uses alias analysis.
+ +Many transformations need information about alias sets that are active + in some scope, rather than information about pairwise aliasing. The AliasSetTracker class + is used to efficiently build these Alias Sets from the pairwise alias analysis + information provided by the AliasAnalysis interface.
+ +First you initialize the AliasSetTracker by using the "add" methods + to add information about various potentially aliasing instructions in the scope + you are interested in. Once all of the alias sets are completed, your pass + should simply iterate through the constructed alias sets, using the + AliasSetTracker begin()/end() methods.
+ +The AliasSets formed by the AliasSetTracker are guaranteed + to be disjoint, calculate mod/ref information and volatility for the set, and + keep track of whether or not all of the pointers in the set are Must aliases. + The AliasSetTracker also makes sure that sets are properly folded due to call + instructions, and can provide a list of pointers in each set.
+ +As an example user of this, the Loop + Invariant Code Motion pass uses AliasSetTrackers to calculate alias + sets for each loop nest. If an AliasSet in a loop is not modified, + then all load instructions from that set may be hoisted out of the loop. If any + alias sets are stored to and are must alias sets, then the stores may be + sunk to outside of the loop, promoting the memory location to a register for the + duration of the loop nest. Both of these transformations only apply if the + pointer argument is loop-invariant.
+ +The AliasSetTracker class is implemented to be as efficient as possible. It + uses the union-find algorithm to efficiently merge AliasSets when a pointer is + inserted into the AliasSetTracker that aliases multiple sets. The primary data + structure is a hash table mapping pointers to the AliasSet they are in.
The AliasSetTracker class must maintain a list of all of the LLVM Value*'s + that are in each AliasSet. Since the hash table already has entries for each + LLVM Value* of interest, the AliasSets thread the linked list through these + hash-table nodes to avoid having to allocate memory unnecessarily, and to make + merging alias sets extremely efficient (the linked list merge is constant time). +
+ +You shouldn't need to understand these details if you are just a client of + the AliasSetTracker, but if you look at the code, hopefully this brief + description will help make sense of why things are designed the way they + are.
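For readers unfamiliar with union-find, the merging step mentioned above can be sketched as follows. This is a generic disjoint-set structure over small integer ids, written under the assumption that each pointer has already been mapped to an id; the class name `DisjointAliasSets` and its methods are hypothetical and do not reflect the actual AliasSetTracker code.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical disjoint-set structure over pointer ids 0..N-1.
class DisjointAliasSets {
  std::vector<std::size_t> Parent; // Parent[X] == X marks a set representative
public:
  explicit DisjointAliasSets(std::size_t N) : Parent(N) {
    // Every id starts out in its own singleton set.
    std::iota(Parent.begin(), Parent.end(), std::size_t{0});
  }

  // Find the representative of X's set, shortening the path as we go.
  std::size_t find(std::size_t X) {
    while (Parent[X] != X) {
      Parent[X] = Parent[Parent[X]]; // path halving
      X = Parent[X];
    }
    return X;
  }

  // Merge the sets containing A and B, as happens when a newly added
  // pointer is found to alias members of both.
  void merge(std::size_t A, std::size_t B) { Parent[find(A)] = find(B); }

  bool sameSet(std::size_t A, std::size_t B) { return find(A) == find(B); }
};
```

Starting from four singleton sets, merging 0 with 1 and 2 with 3 leaves two sets; a later discovery that 1 aliases 2 folds them into one, so 0 and 3 end up in the same set without any element being visited individually.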
If neither of these utility classes is what your pass needs, you should use + the interfaces exposed by the AliasAnalysis class directly. Try to use + the higher-level methods when possible (e.g., use mod/ref information instead of + the alias method directly if possible) to get the + best precision and efficiency.
If you're going to be working with the LLVM alias analysis infrastructure, + you should know what clients and implementations of alias analysis are + available. In particular, if you are implementing an alias analysis, you should + be aware of the clients that are useful + for monitoring and evaluating different implementations.
+ +This section lists the various implementations of the AliasAnalysis + interface. With the exception of the -no-aa and + -basicaa implementations, all of these chain to other alias analysis implementations.
+ +The -no-aa pass is just like what it sounds: an alias analysis that + never returns any useful information. This pass can be useful if you think that + alias analysis is doing something wrong and are trying to narrow down a + problem.
+ +The -basicaa pass is the default LLVM alias analysis. It is an + aggressive local analysis that "knows" many important facts:
+ +This pass implements a simple context-sensitive mod/ref and alias analysis + for internal global variables that don't "have their address taken". If a + global does not have its address taken, the pass knows that no pointers alias + the global. This pass also keeps track of functions that it knows never access + memory or never read memory. This allows certain optimizations (e.g. GCSE) to + eliminate call instructions entirely. +
+ +The real power of this pass is that it provides context-sensitive mod/ref + information for call instructions. This allows the optimizer to know that + calls to a function do not clobber or read the value of the global, allowing + loads and stores to be eliminated.
Note that this pass is somewhat limited in its scope (it only supports + non-address-taken globals), but it is a very quick analysis.
The -anders-aa pass implements the well-known "Andersen's algorithm" + for interprocedural alias analysis. This algorithm is a subset-based, + flow-insensitive, context-insensitive, and field-insensitive alias analysis that + is widely believed to be fairly precise. Unfortunately, this algorithm is also + O(N^3). The LLVM implementation currently does not implement any of + the refinements (such as "online cycle elimination" or "offline variable + substitution") to improve its efficiency, so it can be quite slow in common + cases. +
+ +The -steens-aa pass implements a variation on the well-known + "Steensgaard's algorithm" for interprocedural alias analysis. Steensgaard's + algorithm is a unification-based, flow-insensitive, context-insensitive, and + field-insensitive alias analysis that is also very scalable (effectively linear + time).
+ +The LLVM -steens-aa pass implements a "speculatively + field-sensitive" version of Steensgaard's algorithm using the Data + Structure Analysis framework. This gives it substantially more precision than + the standard algorithm while maintaining excellent analysis scalability.
+ +Note that -steens-aa is available in the optional "poolalloc" + module; it is not part of the LLVM core.
+ +The -ds-aa pass implements the full Data Structure Analysis + algorithm. Data Structure Analysis is a modular unification-based, + flow-insensitive, context-sensitive, and speculatively + field-sensitive alias analysis that is also quite scalable, usually at + O(n*log(n)).
+ +This algorithm is capable of responding to a full variety of alias analysis + queries, and can provide context-sensitive mod/ref information as well. The + only major facility not implemented so far is support for must-alias + information.
+ +Note that -ds-aa is available in the optional "poolalloc" + module; it is not part of the LLVM core.
+ +The -adce pass, which implements Aggressive Dead Code Elimination, + uses the AliasAnalysis interface to delete calls to functions that do + not have side-effects and are not used.
+ +The -licm pass implements various Loop Invariant Code Motion related + transformations. It uses the AliasAnalysis interface for several + different transformations:
+ ++ The -argpromotion pass promotes by-reference arguments to be passed in + by-value instead. In particular, if a pointer argument is only loaded from, it + passes in the loaded value instead of the address to the function. This pass + uses alias information to make sure that the value loaded from the argument + pointer is not modified between the entry of the function and any load of the + pointer.
+The -load-vn pass uses alias analysis to "value + number" loads and pointer values, which is used by the GCSE pass to + eliminate instructions. The -load-vn pass relies on alias information + and must-alias information. This combination of passes can make the following + transformations:
+ +These passes are useful for evaluating the various alias analysis + implementations. You can use them with commands like 'opt -anders-aa -ds-aa + -aa-eval foo.bc -disable-output -stats'.
+ +The -print-alias-sets pass is exposed as part of the + opt tool to print out the Alias Sets formed by the AliasSetTracker class. This is useful if you're using + the AliasSetTracker class. To use it, use something like:
+ ++ % opt -ds-aa -print-alias-sets -disable-output ++
The -count-aa pass is useful to see how many queries a particular + pass is making and what responses are returned by the alias analysis. As an + example,
+ ++ % opt -basicaa -count-aa -ds-aa -count-aa -licm ++
will print out how many queries the -licm pass makes of the -ds-aa pass + (and what responses are returned), and how many queries the -ds-aa pass makes + of the -basicaa pass. This can be useful + when debugging a transformation or an alias analysis implementation.
+ +The -aa-eval pass simply iterates through all pairs of pointers in a + function and asks an alias analysis whether or not the pointers alias. This + gives an indication of the precision of the alias analysis. Statistics are + printed indicating the percent of no/may/must aliases found (a more precise + algorithm will have a lower number of may aliases).
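The all-pairs evaluation described above can be sketched in a few lines. This is only an illustration of the idea, not the -aa-eval implementation: `AliasResult`, `AliasStats`, and `evaluateAllPairs` are hypothetical names, and the alias oracle is passed in as a plain callable rather than a real AliasAnalysis object.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical three-way answer an alias oracle can give for a pointer pair.
enum class AliasResult { NoAlias, MayAlias, MustAlias };

// Tally of the answers seen so far.
struct AliasStats {
  unsigned no = 0, may = 0, must = 0;
  unsigned total() const { return no + may + must; }
  // Percentage of "may alias" answers; lower suggests a more precise oracle.
  double percentMay() const { return total() ? 100.0 * may / total() : 0.0; }
};

// Ask the oracle about every unordered pair of pointers and count the answers.
template <typename Ptr, typename Oracle>
AliasStats evaluateAllPairs(const std::vector<Ptr> &pointers, Oracle alias) {
  AliasStats stats;
  for (std::size_t i = 0; i < pointers.size(); ++i) {
    for (std::size_t j = i + 1; j < pointers.size(); ++j) {
      switch (alias(pointers[i], pointers[j])) {
      case AliasResult::NoAlias:   ++stats.no;   break;
      case AliasResult::MayAlias:  ++stats.may;  break;
      case AliasResult::MustAlias: ++stats.must; break;
      }
    }
  }
  // Sanity check: n pointers yield n*(n-1)/2 unordered pairs.
  assert(stats.total() == pointers.size() * (pointers.size() - 1) / 2);
  return stats;
}
```

Running the same pointer set through two oracles and comparing `percentMay()` gives a rough precision comparison in the spirit of the statistics described above.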
+ +If you're just looking to be a client of alias analysis information, consider + using the Memory Dependence Analysis interface instead. MemDep is a lazy, + caching layer on top of alias analysis that is able to answer the question of + what preceding memory operations a given instruction depends on, either at an + intra- or inter-block level. Because of its laziness and caching + policy, using MemDep can be a significant performance win over accessing alias + analysis directly.
+ +This document describes the LLVM bitstream file format and the encoding of + the LLVM IR into it.
+ ++ What is commonly known as the LLVM bitcode file format (also, sometimes + anachronistically known as bytecode) is actually two things: a bitstream container format + and an encoding of LLVM IR into the container format.
+ ++ The bitstream format is an abstract encoding of structured data, very + similar to XML in some ways. Like XML, bitstream files contain tags, and nested + structures, and you can parse the file without having to understand the tags. + Unlike XML, the bitstream format is a binary encoding, and unlike XML it + provides a mechanism for the file to self-describe "abbreviations", which are + effectively size optimizations for the content.
+ +This document first describes the LLVM bitstream format, then describes the + record structure used by LLVM IR files. +
+ ++ The bitstream format is literally a stream of bits, with a very simple + structure. This structure consists of the following concepts: +
+ +Note that the llvm-bcanalyzer tool can be + used to dump and inspect arbitrary bitstreams, which is very useful for + understanding the encoding.
+ +The first two bytes of a bitcode file are 'BC' (0x42, 0x43). + The second two bytes are an application-specific magic number. Generic + bitcode tools can look at only the first two bytes to verify the file is + bitcode, while application-specific programs will want to look at all four.
+ ++ A bitstream literally consists of a stream of bits, which are read in order + starting with the least significant bit of each byte. The stream is made up of a + number of primitive values that encode a stream of unsigned integer values. + These + integers are encoded in two ways: either as Fixed + Width Integers or as Variable Width + Integers. +
+ +Fixed-width integer values have their low bits emitted directly to the file. + For example, a 3-bit integer value encodes 1 as 001. Fixed width integers + are used when there are a well-known number of options for a field. For + example, boolean values are usually encoded with a 1-bit wide integer. +
+ +Variable-width integer (VBR) values encode values of arbitrary size, + optimizing for the case where the values are small. Given a 4-bit VBR field, + any 3-bit value (0 through 7) is encoded directly, with the high bit set to + zero. Values larger than N-1 bits emit their bits in a series of N-1 bit + chunks, where all but the last set the high bit.
+ +For example, the value 27 (0x1B) is encoded as 1011 0011 when emitted as a + vbr4 value. The first set of four bits indicates the value 3 (011) with a + continuation piece (indicated by a high bit of 1). The next word indicates a + value of 24 (011 << 3) with no continuation. The sum (3+24) yields the value + 27. +
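A minimal sketch of the VBR scheme just described, assuming chunks are produced low-order first as in the vbr4 example. The function names `encodeVBR` and `decodeVBR` are illustrative and are not taken from the LLVM sources.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Split `value` into width-bit VBR chunks, low-order chunk first.
// Each chunk carries width-1 payload bits; the high bit means "more follows".
std::vector<uint32_t> encodeVBR(uint64_t value, unsigned width) {
  assert(width >= 2 && width <= 32 && "need at least one payload bit");
  const uint64_t highBit = 1ull << (width - 1);
  const uint64_t payloadMask = highBit - 1;
  std::vector<uint32_t> chunks;
  while (value > payloadMask) {
    chunks.push_back(static_cast<uint32_t>((value & payloadMask) | highBit));
    value >>= (width - 1);
  }
  chunks.push_back(static_cast<uint32_t>(value)); // last chunk: high bit clear
  return chunks;
}

// Reassemble a value from chunks produced by encodeVBR with the same width.
uint64_t decodeVBR(const std::vector<uint32_t> &chunks, unsigned width) {
  assert(width >= 2 && width <= 32);
  const uint64_t highBit = 1ull << (width - 1);
  uint64_t value = 0;
  unsigned shift = 0;
  for (uint32_t chunk : chunks) {
    assert(shift < 64 && "too many chunks for a 64-bit value");
    value |= (static_cast<uint64_t>(chunk) & (highBit - 1)) << shift;
    shift += width - 1;
    if ((chunk & highBit) == 0)
      break; // no continuation bit: this was the final chunk
  }
  return value;
}
```

With width 4, `encodeVBR(27, 4)` yields the chunks 1011 and 0011, matching the worked example above.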
+ +6-bit characters encode common characters into a fixed 6-bit field. They + represent the following characters with the following 6-bit values:
+ +This encoding is only suitable for encoding characters and strings that + consist only of the above characters. It is completely incapable of encoding + characters not in the set.
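The table of 6-bit values is not reproduced in this copy of the text. The sketch below assumes the usual char6 mapping ('a'..'z' to 0..25, 'A'..'Z' to 26..51, '0'..'9' to 52..61, '.' to 62, '_' to 63) and an ASCII execution character set; treat both as assumptions. The function names are illustrative.

```cpp
#include <cassert>
#include <string>

// Return the 6-bit code for `c`, or -1 if `c` is outside the char6 set.
// Assumed mapping: a-z, A-Z, 0-9, '.', '_' in that order.
int encodeChar6(char c) {
  if (c >= 'a' && c <= 'z') return c - 'a';
  if (c >= 'A' && c <= 'Z') return c - 'A' + 26;
  if (c >= '0' && c <= '9') return c - '0' + 52;
  if (c == '.') return 62;
  if (c == '_') return 63;
  return -1;
}

// Inverse of encodeChar6 for any valid 6-bit value.
char decodeChar6(unsigned value) {
  assert(value < 64 && "char6 values occupy exactly 6 bits");
  static const char table[] =
      "abcdefghijklmnopqrstuvwxyz"
      "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
      "0123456789._";
  return table[value];
}

// A string can use char6 only if every character is in the set.
bool isChar6String(const std::string &s) {
  for (char c : s)
    if (encodeChar6(c) < 0)
      return false;
  return true;
}
```

A writer would call something like `isChar6String` before choosing this encoding, and fall back to a wider encoding for strings such as "i386-linux" that contain a character ('-') outside the set.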
+ +Occasionally, it is useful to emit zero bits until the bitstream is a + multiple of 32 bits. This ensures that the bit position in the stream can be + represented as a multiple of 32-bit words.
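The amount of padding is simple arithmetic on the current bit position. A small sketch, with illustrative function names:

```cpp
#include <cassert>
#include <cstdint>

// Number of zero bits to emit so that bitPos becomes a multiple of 32.
uint64_t paddingBitsTo32(uint64_t bitPos) {
  return (32 - bitPos % 32) % 32;
}

// Bit position after padding out to the next 32-bit boundary.
uint64_t alignTo32(uint64_t bitPos) {
  const uint64_t aligned = bitPos + paddingBitsTo32(bitPos);
  assert(aligned % 32 == 0);
  return aligned;
}
```

A position that is already aligned needs no padding, so `alignTo32(64)` stays at 64, while `alignTo32(37)` advances to 64.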
+ ++ A bitstream is a sequential series of Blocks and + Data Records. Both of these start with an + abbreviation ID encoded as a fixed-bitwidth field. The width is specified by + the current block, as described below. The value of the abbreviation ID + specifies either a builtin ID (which have special meanings, defined below) or + one of the abbreviation IDs defined by the stream itself. +
+ ++ The set of builtin abbrev IDs is: +
+ +Abbreviation IDs 4 and above are defined by the stream itself, and specify + an abbreviated record encoding.
+ ++ Blocks in a bitstream denote nested regions of the stream, and are identified by + a content-specific id number (for example, LLVM IR uses an ID of 12 to represent + function bodies). Block IDs 0-7 are reserved for standard blocks + whose meaning is defined by Bitcode; block IDs 8 and greater are + application specific. Nested blocks capture the hierarchical structure of the data + encoded in them, and various properties are associated with blocks as the file is + parsed. Block definitions allow the reader to efficiently skip blocks + in constant time if the reader wants a summary of blocks, or if it wants to + efficiently skip data it does not understand. The LLVM IR reader uses this + mechanism to skip function bodies, lazily reading them on demand. +
+ ++ When reading and encoding the stream, several properties are maintained for the + block. In particular, each block maintains: +
+ +As sub blocks are entered, these properties are saved and the new sub-block + has its own set of abbreviations, and its own abbrev id width. When a sub-block + is popped, the saved values are restored.
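The save-and-restore behaviour can be pictured as a stack of per-block state. The sketch below is a hypothetical model, not the reader's actual data structure: `BlockScope` and `ScopeStack` are made-up names, abbreviations are stood in for by strings, and abbreviations preloaded from a BLOCKINFO block are left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// State tracked for the block currently being read (illustrative fields).
struct BlockScope {
  unsigned abbrevIdWidth;           // width of abbreviation IDs in this block
  std::vector<std::string> abbrevs; // abbreviations defined in this block
};

class ScopeStack {
  BlockScope current;
  std::vector<BlockScope> saved; // enclosing blocks, innermost last

public:
  explicit ScopeStack(unsigned topLevelWidth) : current{topLevelWidth, {}} {}

  BlockScope &scope() { return current; }

  // Entering a sub-block saves the current state and starts a fresh one
  // with its own abbrev id width and an empty abbreviation list.
  void enterSubblock(unsigned newAbbrevIdWidth) {
    saved.push_back(current);
    current = BlockScope{newAbbrevIdWidth, {}};
  }

  // Leaving a block restores the state of the enclosing block.
  void exitBlock() {
    assert(!saved.empty() && "no enclosing block to return to");
    current = saved.back();
    saved.pop_back();
  }
};
```

After `enterSubblock(3)` followed by `exitBlock()`, the width and abbreviation list seen by the caller are those of the enclosing block again.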
+ +[ENTER_SUBBLOCK, blockid (vbr8), newabbrevlen (vbr4), + <align32bits>, blocklen (32 bits)]
+ ++ The ENTER_SUBBLOCK abbreviation ID specifies the start of a new block record. + The blockid value is encoded as an 8-bit VBR identifier, and indicates + the type of block being entered (which can be a standard + block or an application-specific block). The + newabbrevlen value is a 4-bit VBR which specifies the + abbrev id width for the sub-block. The blocklen is a 32-bit aligned + value that specifies the size of the subblock, in 32-bit words. This value + allows the reader to skip over the entire block in one jump. +
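To make the "skip in one jump" point concrete, here is a minimal sketch of a bit cursor that reads the fields of a sub-block header and then skips the body. It assumes the ENTER_SUBBLOCK abbreviation ID has already been consumed; `BitCursor`, `SubblockHeader`, and `readSubblockHeader` are hypothetical names rather than LLVM classes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reads bits starting with the least significant bit of each byte.
class BitCursor {
  const std::vector<uint8_t> &bytes;
  uint64_t bitPos = 0;

public:
  explicit BitCursor(const std::vector<uint8_t> &b) : bytes(b) {}
  uint64_t position() const { return bitPos; }

  uint64_t readFixed(unsigned width) {
    assert(width <= 64);
    uint64_t value = 0;
    for (unsigned i = 0; i < width; ++i, ++bitPos) {
      assert(bitPos / 8 < bytes.size() && "read past end of stream");
      const uint64_t bit = (bytes[bitPos / 8] >> (bitPos % 8)) & 1u;
      value |= bit << i;
    }
    return value;
  }

  uint64_t readVBR(unsigned width) {
    assert(width >= 2 && width <= 32);
    const uint64_t highBit = 1ull << (width - 1);
    uint64_t value = 0;
    unsigned shift = 0;
    for (;;) {
      const uint64_t chunk = readFixed(width);
      value |= (chunk & (highBit - 1)) << shift;
      if ((chunk & highBit) == 0)
        return value;
      shift += width - 1;
      assert(shift < 64 && "VBR too long for 64 bits");
    }
  }

  void alignTo32() { bitPos = (bitPos + 31) / 32 * 32; }
  void skipWords(uint64_t words) { bitPos += words * 32; }
};

struct SubblockHeader {
  uint64_t blockID, abbrevIdWidth, lengthInWords;
};

// Reads blockid (vbr8), newabbrevlen (vbr4), padding, then blocklen (32 bits).
SubblockHeader readSubblockHeader(BitCursor &cursor) {
  SubblockHeader header;
  header.blockID = cursor.readVBR(8);
  header.abbrevIdWidth = cursor.readVBR(4);
  cursor.alignTo32();
  header.lengthInWords = cursor.readFixed(32);
  return header;
}
```

A reader that does not care about the block calls `cursor.skipWords(header.lengthInWords)` right after the header and lands on the first bit following the block, without decoding any of its contents.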
+ +[END_BLOCK, <align32bits>]
+ ++ The END_BLOCK abbreviation ID specifies the end of the current block record. + Its end is aligned to 32-bits to ensure that the size of the block is an even + multiple of 32-bits.
+ ++ Data records consist of a record code and a number of (up to) 64-bit integer + values. The interpretation of the code and values is application specific and + there are multiple different ways to encode a record (with an unabbrev record + or with an abbreviation). In the LLVM IR format, for example, there is a record + which encodes the target triple of a module. The code is MODULE_CODE_TRIPLE, + and the values of the record are the ascii codes for the characters in the + string.
+ +[UNABBREV_RECORD, code (vbr6), numops (vbr6), + op0 (vbr6), op1 (vbr6), ...]
+ +An UNABBREV_RECORD provides a default fallback encoding, which is both + completely general and also extremely inefficient. It can describe an arbitrary + record, by emitting the code and operands as vbrs.
+ +For example, emitting an LLVM IR target triple as an unabbreviated record + requires emitting the UNABBREV_RECORD abbrevid, a vbr6 for the + MODULE_CODE_TRIPLE code, a vbr6 for the length of the string (which is equal to + the number of operands), and a vbr6 for each character. Since there are no + letters with value less than 32, each letter would need to be emitted as at + least a two-part VBR, which means that each letter would require at least 12 + bits. This is not an efficient encoding, but it is fully general.
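The cost estimate above can be checked with a few lines of arithmetic. This sketch counts the bits of an unabbreviated record whose operands are the characters of a string. The function names are illustrative, and the record code is passed in by the caller rather than assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Bits occupied by `value` when emitted as a VBR with the given chunk width.
unsigned vbrBits(uint64_t value, unsigned width) {
  assert(width >= 2 && width <= 32);
  const uint64_t payloadMax = (1ull << (width - 1)) - 1;
  unsigned chunks = 1;
  while (value > payloadMax) {
    value >>= (width - 1);
    ++chunks;
  }
  return chunks * width;
}

// Bits for: abbrev ID, then code, numops, and one operand per character,
// each emitted as vbr6. abbrevIdWidth is the current block's abbrev id width.
unsigned unabbrevStringRecordBits(unsigned abbrevIdWidth, uint64_t code,
                                  const std::string &text) {
  unsigned bits = abbrevIdWidth + vbrBits(code, 6) + vbrBits(text.size(), 6);
  for (unsigned char ch : text)
    bits += vbrBits(ch, 6); // ASCII letters exceed 31, so 12 bits each
  return bits;
}
```

For a four-letter string with a small record code and an abbrev id width of 3, this comes to 3 + 6 + 6 + 4*12 = 63 bits.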
+ +[<abbrevid>, fields...]
+ +An abbreviated record is an abbreviation id followed by a set of fields that + are encoded according to the abbreviation + definition. This allows records to be encoded significantly more densely + than records encoded with the UNABBREV_RECORD + type, and allows the abbreviation types to be specified in the stream itself, + which allows the files to be completely self describing. The actual encoding + of abbreviations is defined below. +
+ ++ Abbreviations are an important form of compression for bitstreams. The idea is + to specify a dense encoding for a class of records once, then use that encoding + to emit many records. It takes space to emit the encoding into the file, but + the space is recouped (hopefully plus some) when the records that use it are + emitted. +
+ ++ Abbreviations can be determined dynamically per client, per file. Since the + abbreviations are stored in the bitstream itself, different streams of the same + format can contain different sets of abbreviations if the specific stream does + not need it. As a concrete example, LLVM IR files usually emit an abbreviation + for binary operators. If a specific LLVM module contained no or few binary + operators, the abbreviation does not need to be emitted. +
+[DEFINE_ABBREV, numabbrevops (vbr5), abbrevop0, abbrevop1, + ...]
+ +A DEFINE_ABBREV record adds an abbreviation to the list of currently + defined abbreviations in the scope of this block. This definition only + exists inside this immediate block -- it is not visible in subblocks or + enclosing blocks. + Abbreviations are implicitly assigned IDs + sequentially starting from 4 (the first application-defined abbreviation ID). + Any abbreviations defined in a BLOCKINFO record receive IDs first, in order, + followed by any abbreviations defined within the block itself. + Abbreviated data records reference this ID to indicate what abbreviation + they are invoking.
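The ID assignment rule can be sketched as a simple counter. In this illustration abbreviations are identified by made-up string labels; `assignAbbrevIDs` is a hypothetical helper, not part of any LLVM API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Assign abbreviation IDs for one block: BLOCKINFO-provided abbreviations
// first, then abbreviations defined inside the block, counting up from 4.
std::map<std::string, unsigned>
assignAbbrevIDs(const std::vector<std::string> &fromBlockInfo,
                const std::vector<std::string> &definedInBlock) {
  const unsigned firstApplicationID = 4; // IDs below this are builtin
  unsigned next = firstApplicationID;
  std::map<std::string, unsigned> ids;
  for (const std::string &name : fromBlockInfo) {
    assert(ids.count(name) == 0 && "labels must be unique in this sketch");
    ids[name] = next++;
  }
  for (const std::string &name : definedInBlock) {
    assert(ids.count(name) == 0 && "labels must be unique in this sketch");
    ids[name] = next++;
  }
  return ids;
}
```

With two abbreviations coming from BLOCKINFO and one defined locally, the local one receives ID 6.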
+ +An abbreviation definition consists of the DEFINE_ABBREV abbrevid followed + by a VBR that specifies the number of abbrev operands, then the abbrev + operands themselves. Abbreviation operands come in three forms. They all start + with a single bit that indicates whether the abbrev operand is a literal operand + (when the bit is 1) or an encoding operand (when the bit is 0).
+ +The possible operand encodings are:
+ +For example, target triples in LLVM modules are encoded as a record of the + form [TRIPLE, 'a', 'b', 'c', 'd']. Consider if the bitstream emitted + the following abbrev entry:
+ +When emitting a record with this abbreviation, the above entry would be + emitted as:
+ +[4 (abbrevwidth), 2 (4 bits), 4 (vbr6), + 0 (6 bits), 1 (6 bits), 2 (6 bits), 3 (6 bits)]
+ +These values are:
+ +With this abbreviation, the triple is emitted with only 37 bits (assuming an + abbrev id width of 3). Without the abbreviation, significantly more space would + be required to emit the target triple. Also, since the TRIPLE value is not + emitted as a literal in the abbreviation, the abbreviation can also be used for + any other string value. +
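The 37-bit figure follows from adding up the field widths in the emitted record: the abbrev id, a 4-bit record code, a vbr6 length, and one 6-bit value per character. A small sketch of that sum, assuming the length fits in a single vbr6 chunk; the function name is illustrative.

```cpp
#include <cassert>
#include <cstddef>

// Bits used by the abbreviated string record described above.
unsigned abbreviatedStringRecordBits(unsigned abbrevIdWidth,
                                     std::size_t numChars) {
  assert(numChars < 32 && "sketch assumes the length fits in one vbr6 chunk");
  const unsigned codeBits = 4;   // record code, fixed 4-bit field
  const unsigned lengthBits = 6; // array length, one vbr6 chunk
  const unsigned charBits = 6;   // each character, one 6-bit value
  return abbrevIdWidth + codeBits + lengthBits +
         charBits * static_cast<unsigned>(numChars);
}
```

For four characters and an abbrev id width of 3 this gives 3 + 4 + 6 + 24 = 37.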
+ ++ In addition to the basic block structure and record encodings, the bitstream + also defines specific builtin block types. These block types specify how the + stream is to be decoded or other metadata. In the future, new standard blocks + may be added. Block IDs 0-7 are reserved for standard blocks. +
+ +The BLOCKINFO block allows the description of metadata for other blocks. The + currently specified records are:
+ ++ The SETBID record indicates which block ID is being described. SETBID + records can occur multiple times throughout the block to change which + block ID is being described. There must be a SETBID record prior to + any other records. +
+ ++ Standard DEFINE_ABBREV records can occur inside BLOCKINFO blocks, but unlike + their occurrence in normal blocks, the abbreviation is defined for blocks + matching the block ID we are describing, not the BLOCKINFO block itself. + The abbreviations defined in BLOCKINFO blocks receive abbreviation ids + as described in DEFINE_ABBREV. +
+ ++ Note that although the data in BLOCKINFO blocks is described as "metadata," the + abbreviations they contain are essential for parsing records from the + corresponding blocks. It is not safe to skip them. +
+ +LLVM IR is encoded into a bitstream by defining blocks and records. It uses + blocks for things like constant pools, functions, symbol tables, etc. It uses + records for things like instructions, global variable descriptors, type + descriptions, etc. This document does not describe the set of abbreviations + that the writer uses, as these are fully self-described in the file, and the + reader is not allowed to build in any knowledge of this.
+ ++ The magic number for LLVM IR files is: +
+ +[0x0 (4 bits), 0xC (4 bits), 0xE (4 bits), 0xD (4 bits)]
+ +When combined with the bitcode magic number and viewed as bytes, this is "BC 0xC0DE".
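Because bits fill each byte starting at the least significant bit, the four 4-bit values pack into the bytes 0xC0 and 0xDE. The sketch below shows that packing together with a magic-number check; the function names are illustrative and not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack 4-bit values into bytes, low nibble first, as the bitstream does.
std::vector<uint8_t> packNibbles(const std::vector<uint8_t> &nibbles) {
  assert(nibbles.size() % 2 == 0 && "need whole bytes");
  std::vector<uint8_t> bytes;
  for (std::size_t i = 0; i < nibbles.size(); i += 2) {
    assert(nibbles[i] < 16 && nibbles[i + 1] < 16);
    bytes.push_back(static_cast<uint8_t>(nibbles[i] | (nibbles[i + 1] << 4)));
  }
  return bytes;
}

// Generic check: the first two bytes are 'B' and 'C'.
bool hasBitcodeMagic(const std::vector<uint8_t> &file) {
  return file.size() >= 2 && file[0] == 'B' && file[1] == 'C';
}

// Application-specific check for LLVM IR: all four bytes must match.
bool hasLLVMIRMagic(const std::vector<uint8_t> &file) {
  return file.size() >= 4 && hasBitcodeMagic(file) &&
         file[2] == 0xC0 && file[3] == 0xDE;
}
```

A generic bitstream tool would stop at `hasBitcodeMagic`, while an LLVM IR reader would require `hasLLVMIRMagic`.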
+ ++ Variable Width Integers are an efficient way to + encode arbitrary sized unsigned values, but are an extremely inefficient way to + encode signed values (as signed values are otherwise treated as maximally large + unsigned values).
+ +As such, signed vbr values of a specific width are emitted as follows:
+ +With this encoding, small positive and small negative values can both be + emitted efficiently.
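The list of rules is not reproduced in this copy of the text. The sketch below assumes the scheme works by shifting the magnitude left by one bit and using the low bit as a sign flag, which is consistent with small values of either sign staying small. Treat that mapping as an assumption. The function names are illustrative, and the most negative 64-bit value is excluded for simplicity.

```cpp
#include <cassert>
#include <cstdint>

// Map a signed value to the unsigned payload that would then be emitted
// as an ordinary VBR: magnitude shifted left by one, low bit = sign.
uint64_t toSignedVBRPayload(int64_t value) {
  assert(value != INT64_MIN && "magnitude would not fit in this sketch");
  if (value >= 0)
    return static_cast<uint64_t>(value) << 1;
  // Negate in unsigned arithmetic to get the magnitude without overflow.
  const uint64_t magnitude = ~static_cast<uint64_t>(value) + 1;
  return (magnitude << 1) | 1;
}

// Inverse mapping: the low bit selects the sign, the rest is the magnitude.
int64_t fromSignedVBRPayload(uint64_t payload) {
  const int64_t magnitude = static_cast<int64_t>(payload >> 1);
  return (payload & 1) ? -magnitude : magnitude;
}
```

Under this mapping -3 becomes 7 and 3 becomes 6, so both fit in a single vbr4 chunk instead of -3 being treated as a huge unsigned number.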
+ ++ LLVM IR is defined with the following blocks: +
+ ++
+ +bugpoint narrows down the source of problems in LLVM tools and + passes. It can be used to debug three types of failures: optimizer crashes, + miscompilations by optimizers, or bad native code generation (including problems + in the static and JIT compilers). It aims to reduce large test cases to small, + useful ones. For example, if opt crashes while optimizing a + file, it will identify the optimization (or combination of optimizations) that + causes the crash, and reduce the file down to a small example which triggers the + crash.
+ +For detailed case scenarios, such as debugging opt, + llvm-ld, or one of the LLVM code generators, see the How To Submit a Bug Report document.
+ +bugpoint is designed to be a useful tool without requiring any + hooks into the LLVM infrastructure at all. It works with any and all LLVM + passes and code generators, and does not need to "know" how they work. Because + of this, it may appear to do stupid things or miss obvious + simplifications. bugpoint is also designed to trade off programmer + time for computer time in the compiler-debugging process; consequently, it may + take a long period of (unattended) time to reduce a test case, but we feel it + is still worth it. Note that bugpoint is generally very quick unless + debugging a miscompilation where each test of the program (which requires + executing it) takes a long time.
+ +bugpoint reads each .bc or .ll file specified on + the command line and links them together into a single module, called the test + program. If any LLVM passes are specified on the command line, it runs these + passes on the test program. If any of the passes crash, or if they produce + malformed output (which causes the verifier to abort), bugpoint starts + the crash debugger.
+ +Otherwise, if the -output option was not specified, + bugpoint runs the test program with the C backend (which is assumed to + generate good code) to generate a reference output. Once bugpoint has + a reference output for the test program, it tries executing it with the + selected code generator. If the selected code generator crashes, + bugpoint starts the crash debugger on the + code generator. Otherwise, if the resulting output differs from the reference + output, it assumes the difference resulted from a code generator failure, and + starts the code generator debugger.
+ +Finally, if the output of the selected code generator matches the reference + output, bugpoint runs the test program after all of the LLVM passes + have been applied to it. If its output differs from the reference output, it + assumes the difference resulted from a failure in one of the LLVM passes, and + enters the miscompilation debugger. + Otherwise, there is no problem bugpoint can debug.
+ +If an optimizer or code generator crashes, bugpoint will try as hard + as it can to reduce the list of passes (for optimizer crashes) and the size of + the test program. First, bugpoint figures out which combination of + optimizer passes triggers the bug. This is useful when debugging a problem + exposed by opt, for example, because it runs over 38 passes.
+ +Next, bugpoint tries removing functions from the test program, to + reduce its size. Usually it is able to reduce a test program to a single + function, when debugging intraprocedural optimizations. Once the number of + functions has been reduced, it attempts to delete various edges in the control + flow graph, to reduce the size of the function as much as possible. Finally, + bugpoint deletes any individual LLVM instructions whose absence does + not eliminate the failure. At the end, bugpoint should tell you what + passes crash, give you a bitcode file, and give you instructions on how to + reproduce the failure with opt or llc.
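The general idea of shrinking a list (of passes, functions, or instructions) while a failure still reproduces can be sketched as a greedy loop. This is not bugpoint's actual algorithm, only a simple stand-in under that assumption: `reduceList` is a hypothetical helper, and `stillFails` is whatever test the caller supplies.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Greedily drop items one at a time, keeping a removal only if the
// failure still reproduces without that item.
template <typename T, typename Pred>
std::vector<T> reduceList(std::vector<T> items, Pred stillFails) {
  assert(stillFails(items) && "the full list must reproduce the failure");
  std::size_t i = 0;
  while (i < items.size()) {
    std::vector<T> candidate = items;
    candidate.erase(candidate.begin() + static_cast<std::ptrdiff_t>(i));
    if (stillFails(candidate))
      items = std::move(candidate); // item i was not needed; retry same index
    else
      ++i;                          // item i is required; keep it
  }
  return items;
}
```

Each call to `stillFails` stands for an expensive step, such as re-running the optimizer on a candidate, which is why this kind of reduction trades computer time for programmer time.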
+ +The code generator debugger attempts to narrow down the amount of code that + is being miscompiled by the selected code generator. To do this, it takes the + test program and partitions it into two pieces: one piece which it compiles + with the C backend (into a shared object), and one piece which it runs with + either the JIT or the static LLC compiler. It uses several techniques to + reduce the amount of code pushed through the LLVM code generator, to reduce the + potential scope of the problem. After it is finished, it emits two bitcode + files (called "test" [to be compiled with the code generator] and "safe" [to be + compiled with the C backend], respectively), and instructions for reproducing + the problem. The code generator debugger assumes that the C backend produces + good code.
+ +The miscompilation debugger works similarly to the code generator debugger. + It works by splitting the test program into two pieces, running the + optimizations specified on one piece, linking the two pieces back together, and + then executing the result. It attempts to narrow down the list of passes to + the one (or few) which are causing the miscompilation, then reduce the portion + of the test program which is being miscompiled. The miscompilation debugger + assumes that the selected code generator is working properly.
+ ++ +
bugpoint can generate a lot of output and run for a long period + of time. It is often useful to capture the output of the program to file. + For example, in the C shell, you can run:
+ +bugpoint ... |& tee bugpoint.log
+to get a copy of bugpoint's output in the file + bugpoint.log, as well as on your terminal.
+ +This section describes how to acquire and build llvm-gcc4, which is based on + the GCC 4.0.1 front-end. This front-end supports C, C++, Objective-C, and + Objective-C++. Note that the instructions for building this front-end are + completely different (and much easier!) than those for building llvm-gcc3 in + the past.
+ +Retrieve the appropriate llvm-gcc4-x.y.source.tar.gz archive from the + llvm web site.
+ +It is also possible to download the sources of the llvm-gcc4 front end + from a read-only mirror using subversion. To check out the code the + first time use:
+ ++ svn co http://llvm.org/svn/llvm-project/llvm-gcc-4.0/trunk dst-directory ++
After that, the code can be updated in the destination directory + using:
+ +svn update+
The mirror is brought up to date every evening.
+ The LLVM GCC frontend is licensed to you under the GNU General Public License + and the GNU Lesser General Public License. Please see the files COPYING and + COPYING.LIB for more details. +
+ ++ More information is available in the FAQ. +
+Warning: This is a work in progress.
+The LLVM target-independent code generator is a framework that provides a + suite of reusable components for translating the LLVM internal representation to + the machine code for a specified target—either in assembly form (suitable + for a static compiler) or in binary machine code format (usable for a JIT + compiler). The LLVM target-independent code generator consists of five main + components:
+ ++ Depending on which part of the code generator you are interested in working on, + different pieces of this will be useful to you. In any case, you should be + familiar with the target description and machine code representation classes. If you want to add + a backend for a new target, you will need to implement the + target description classes for your new target and understand the LLVM code representation. If you are interested in + implementing a new code generation algorithm, it + should only depend on the target-description and machine code representation + classes, ensuring that it is portable. +
+ +The two pieces of the LLVM code generator are the high-level interface to the + code generator and the set of reusable components that can be used to build + target-specific backends. The two most important interfaces (TargetMachine and TargetData) are the only ones that are + required to be defined for a backend to fit into the LLVM system, but the others + must be defined if the reusable code generator components are going to be + used.
+ +This design has two important implications. The first is that LLVM can + support completely non-traditional code generation targets. For example, the C + backend does not require register allocation, instruction selection, or any of + the other standard components provided by the system. As such, it only + implements these two interfaces, and does its own thing. Another example of a + code generator like this is a (purely hypothetical) backend that converts LLVM + to the GCC RTL form and uses GCC to emit machine code for a target.
+ +This design also implies that it is possible to design and + implement radically different code generators in the LLVM system that do not + make use of any of the built-in components. Doing so is not recommended at all, + but could be required for radically different targets that do not fit into the + LLVM machine description model: FPGAs for example.
+ +The LLVM target-independent code generator is designed to support efficient and + quality code generation for standard register-based microprocessors. Code + generation in this model is divided into the following stages:
+ +The code generator is based on the assumption that the instruction selector + will use an optimal pattern matching selector to create high-quality sequences of + native instructions. Alternative code generator designs based on pattern + expansion and aggressive iterative peephole optimization are much slower. This + design permits efficient compilation (important for JIT environments) and + aggressive optimization (used when generating code offline) by allowing + components of varying levels of sophistication to be used for any step of + compilation.
+ +In addition to these stages, target implementations can insert arbitrary + target-specific passes into the flow. For example, the X86 target uses a + special pass to handle the 80x87 floating point stack architecture. Other + targets with unusual requirements can be supported with custom passes as + needed.
+ +The target description classes require a detailed description of the target + architecture. These target descriptions often have a large amount of common + information (e.g., an add instruction is almost identical to a + sub instruction). + In order to allow the maximum amount of commonality to be factored out, the LLVM + code generator uses the TableGen tool to + describe big chunks of the target machine, which allows the use of + domain-specific and target-specific abstractions to reduce the amount of + repetition.
+ +As LLVM continues to be developed and refined, we plan to move more and more + of the target description to the .td form. Doing so gives us a + number of advantages. The most important is that it makes it easier to port + LLVM because it reduces the amount of C++ code that has to be written, and the + surface area of the code generator that needs to be understood before someone + can get something working. Second, it makes it easier to change things. In + particular, if tables and other things are all emitted by tblgen, we + only need a change in one place (tblgen) to update all of the targets + to a new interface.
+ +The LLVM target description classes (located in the + include/llvm/Target directory) provide an abstract description of the + target machine independent of any particular client. These classes are + designed to capture the abstract properties of the target (such as the + instructions and registers it has), and do not incorporate any particular pieces + of code generation algorithms.
+ +All of the target description classes (except the TargetData class) are designed to be subclassed by + the concrete target implementation, and have virtual methods implemented. To + get to these implementations, the TargetMachine class provides accessors that + should be implemented by the target.
+ +The TargetMachine class provides virtual methods that are used to + access the target-specific implementations of the various target description + classes via the get*Info methods (getInstrInfo, + getRegisterInfo, getFrameInfo, etc.). This class is + designed to be specialized by + a concrete target implementation (e.g., X86TargetMachine) which + implements the various virtual methods. The only required target description + class is the TargetData class, but if the + code generator components are to be used, the other interfaces should be + implemented as well.
+ +The TargetData class is the only required target description class, + and it is the only class that is not extensible (you cannot derive a new + class from it). TargetData specifies information about how the target + lays out memory for structures, the alignment requirements for various data + types, the size of pointers in the target, and whether the target is + little-endian or big-endian.
+ +The TargetLowering class is used by SelectionDAG based instruction + selectors primarily to describe how LLVM code should be lowered to SelectionDAG + operations. Among other things, this class indicates:
+ +The MRegisterInfo class (which will eventually be renamed to + TargetRegisterInfo) is used to describe the register file of the + target and any interactions between the registers.
+ +Registers are represented in the code generator by + unsigned integers. Physical registers (those that actually exist in the target + description) are unique small numbers, and virtual registers are generally + large. Note that register #0 is reserved as a flag value.
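One way to picture this numbering scheme is a single threshold separating the two ranges. The constants and helpers below are hypothetical and chosen only for illustration; the real boundary is defined by the code generator, not by this sketch.

```cpp
#include <cassert>

// Hypothetical constants: 0 is the reserved flag value, and everything at
// or above kFirstVirtualRegister is treated as a virtual register.
const unsigned kNoRegister = 0;
const unsigned kFirstVirtualRegister = 1024; // assumed threshold

bool isPhysicalRegister(unsigned reg) {
  return reg != kNoRegister && reg < kFirstVirtualRegister;
}

bool isVirtualRegister(unsigned reg) {
  return reg >= kFirstVirtualRegister;
}

// Dense index of a virtual register, usable as an array subscript.
unsigned virtualRegisterIndex(unsigned reg) {
  assert(isVirtualRegister(reg) && "not a virtual register");
  return reg - kFirstVirtualRegister;
}
```

Keeping both kinds in one unsigned integer lets an operand hold either a physical or a virtual register without a separate tag.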
+ +Each register in the processor description has an associated + TargetRegisterDesc entry, which provides a textual name for the + register (used for assembly output and debugging dumps) and a set of aliases + (used to indicate whether one register overlaps with another). +
+ +In addition to the per-register description, the MRegisterInfo class + exposes a set of processor specific register classes (instances of the + TargetRegisterClass class). Each register class contains sets of + registers that have the same properties (for example, they are all 32-bit + integer registers). Each SSA virtual register created by the instruction + selector has an associated register class. When the register allocator runs, it + replaces virtual registers with a physical register in the set.
+ ++ The target-specific implementations of these classes are auto-generated from a TableGen description of the register file. +
+ +The TargetInstrInfo class is used to describe the machine + instructions supported by the target. It is essentially an array of + TargetInstrDescriptor objects, each of which describes one + instruction the target supports. Descriptors define things like the mnemonic + for the opcode, the number of operands, the list of implicit register uses + and defs, whether the instruction has certain target-independent properties + (accesses memory, is commutable, etc), and holds any target-specific + flags.
+The TargetFrameInfo class is used to provide information about the + stack frame layout of the target. It holds the direction of stack growth, + the known stack alignment on entry to each function, and the offset to the + local area. The offset to the local area is the offset from the stack + pointer on function entry to the first location where function data (local + variables, spill locations) can be stored.
+The TargetSubtarget class is used to provide information about the + specific chip set being targeted. A sub-target informs code generation of + which instructions are supported, instruction latencies and instruction + execution itinerary; i.e., which processing units are used, in what order, and + for how long.
+The TargetJITInfo class exposes an abstract interface used by the + Just-In-Time code generator to perform target-specific activities, such as + emitting stubs. If a TargetMachine supports JIT code generation, it + should provide one of these objects through the getJITInfo + method.
+At the high-level, LLVM code is translated to a machine specific + representation formed out of + MachineFunction, + MachineBasicBlock, and MachineInstr instances + (defined in include/llvm/CodeGen). This representation is completely + target agnostic, representing instructions in their most abstract form: an + opcode and a series of operands. This representation is designed to support + both an SSA representation for machine code, as well as a register allocated, + non-SSA form.
+ +Target machine instructions are represented as instances of the + MachineInstr class. This class is an extremely abstract way of + representing machine instructions. In particular, it only keeps track of + an opcode number and a set of operands.
+ +The opcode number is a simple unsigned integer that only has meaning to a + specific backend. All of the instructions for a target should be defined in + the *InstrInfo.td file for the target. The opcode enum values + are auto-generated from this description. The MachineInstr class does + not have any information about how to interpret the instruction (i.e., what the + semantics of the instruction are); for that you must refer to the + TargetInstrInfo class.
+ +The operands of a machine instruction can be of several different types: + a register reference, a constant integer, a basic block reference, etc. In + addition, a machine operand should be marked as a def or a use of the value + (though only registers are allowed to be defs).
+ +By convention, the LLVM code generator orders instruction operands so that + all register definitions come before the register uses, even on architectures + that are normally printed in other orders. For example, the SPARC add + instruction: "add %i1, %i2, %i3" adds the "%i1" and "%i2" registers + and stores the result into the "%i3" register. In the LLVM code generator, + the operands should be stored as "%i3, %i1, %i2": with the destination + first.
+ +Keeping destination (definition) operands at the beginning of the operand + list has several advantages. In particular, the debugging printer will print + the instruction like this:
+ ++ %i3 = add %i1, %i2 ++
Also if the first operand is a def, it is easier to create instructions whose only def is the first + operand.
+ +Machine instructions are created by using the BuildMI functions, + located in the include/llvm/CodeGen/MachineInstrBuilder.h file. The + BuildMI functions make it easy to build arbitrary machine + instructions. Usage of the BuildMI functions looks like this:
+ ++ // Create a 'DestReg = mov 42' (rendered in X86 assembly as 'mov DestReg, 42') + // instruction. The '1' specifies how many operands will be added. + MachineInstr *MI = BuildMI(X86::MOV32ri, 1, DestReg).addImm(42); + + // Create the same instr, but insert it at the end of a basic block. + MachineBasicBlock &MBB = ... + BuildMI(MBB, X86::MOV32ri, 1, DestReg).addImm(42); + + // Create the same instr, but insert it before a specified iterator point. + MachineBasicBlock::iterator MBBI = ... + BuildMI(MBB, MBBI, X86::MOV32ri, 1, DestReg).addImm(42); + + // Create a 'cmp Reg, 0' instruction, no destination reg. + MI = BuildMI(X86::CMP32ri, 2).addReg(Reg).addImm(0); + // Create an 'sahf' instruction which takes no operands and stores nothing. + MI = BuildMI(X86::SAHF, 0); + + // Create a self looping branch instruction. + BuildMI(MBB, X86::JNE, 1).addMBB(&MBB); ++
The key thing to remember with the BuildMI functions is that you + have to specify the number of operands that the machine instruction will take. + This allows for efficient memory allocation. Note also that + operands default to being uses of values, not definitions. If you need to add a + definition operand (other than the optional destination register), you must + explicitly mark it as such:
+ ++ MI.addReg(Reg, MachineOperand::Def); ++
One important issue that the code generator needs to be aware of is the + presence of fixed registers. In particular, there are often places in the + instruction stream where the register allocator must arrange for a + particular value to be in a particular register. This can occur due to + limitations of the instruction set (e.g., the X86 can only do a 32-bit divide + with the EAX/EDX registers), or external factors like calling + conventions. In any case, the instruction selector should emit code that + copies a virtual register into or out of a physical register when needed.
+ +For example, consider this simple LLVM example:
+ +
+ int %test(int %X, int %Y) {
+ %Z = div int %X, %Y
+ ret int %Z
+ }
+
+ The X86 instruction selector produces this machine code for the div + and ret (use + "llc X.bc -march=x86 -print-machineinstrs" to get this):
+ ++ ;; Start of div + %EAX = mov %reg1024 ;; Copy X (in reg1024) into EAX + %reg1027 = sar %reg1024, 31 + %EDX = mov %reg1027 ;; Sign extend X into EDX + idiv %reg1025 ;; Divide by Y (in reg1025) + %reg1026 = mov %EAX ;; Read the result (Z) out of EAX + + ;; Start of ret + %EAX = mov %reg1026 ;; 32-bit return value goes in EAX + ret ++
By the end of code generation, the register allocator has coalesced + the registers and deleted the resultant identity moves producing the + following code:
+ ++ ;; X is in EAX, Y is in ECX + mov %EAX, %EDX + sar %EDX, 31 + idiv %ECX + ret ++
This approach is extremely general (if it can handle the X86 architecture, + it can handle anything!) and allows all of the target specific + knowledge about the instruction stream to be isolated in the instruction + selector. Note that physical registers should have a short lifetime for good + code generation, and all physical registers are assumed dead on entry to and + exit from basic blocks (before register allocation). Thus, if you need a value + to be live across basic block boundaries, it must live in a virtual + register.
+ +MachineInstr's are initially selected in SSA-form, and + are maintained in SSA-form until register allocation happens. For the most + part, this is trivially simple since LLVM is already in SSA form; LLVM PHI nodes + become machine code PHI nodes, and virtual registers are only allowed to have a + single definition.
+ +After register allocation, machine code is no longer in SSA-form because there + are no virtual registers left in the code.
+ +The MachineBasicBlock class contains a list of machine instructions + (MachineInstr instances). It roughly + corresponds to the LLVM code input to the instruction selector, but there can be + a one-to-many mapping (i.e. one LLVM basic block can map to multiple machine + basic blocks). The MachineBasicBlock class has a + "getBasicBlock" method, which returns the LLVM basic block that it + comes from.
+ +The MachineFunction class contains a list of machine basic blocks + (MachineBasicBlock instances). It + corresponds one-to-one with the LLVM function input to the instruction selector. + In addition to a list of basic blocks, the MachineFunction contains + a MachineConstantPool, a MachineFrameInfo, a + MachineFunctionInfo, and a MachineRegisterInfo. See + include/llvm/CodeGen/MachineFunction.h for more information.
+ +This section documents the phases described in the high-level design of the code generator. It + explains how they work and some of the rationale behind their design.
+ ++ Instruction Selection is the process of translating LLVM code presented to the + code generator into target-specific machine instructions. There are several + well-known ways to do this in the literature. LLVM uses a SelectionDAG based + instruction selector. +
+ +Portions of the DAG instruction selector are generated from the target + description (*.td) files. Our goal is for the entire instruction + selector to be generated from these .td files, though currently + there are still things that require custom C++ code.
+The SelectionDAG provides an abstraction for code representation in a way + that is amenable to instruction selection using automatic techniques + (e.g. dynamic-programming based optimal pattern matching selectors). It is also + well-suited to other phases of code generation; in particular, + instruction scheduling (SelectionDAG's are very close to scheduling DAGs + post-selection). Additionally, the SelectionDAG provides a host representation + where a large variety of very-low-level (but target-independent) + optimizations may be + performed; ones which require extensive information about the instructions + efficiently supported by the target.
+ +The SelectionDAG is a Directed-Acyclic-Graph whose nodes are instances of the + SDNode class. The primary payload of the SDNode is its + operation code (Opcode) that indicates what operation the node performs and + the operands to the operation. + The various operation node types are described at the top of the + include/llvm/CodeGen/SelectionDAGNodes.h file.
+ +Although most operations define a single value, each node in the graph may + define multiple values. For example, a combined div/rem operation will define + both the quotient and the remainder. Many other situations require multiple + values as well. Each node also has some number of operands, which are edges + to the node defining the used value. Because nodes may define multiple values, + edges are represented by instances of the SDOperand class, which is + a <SDNode, unsigned> pair, indicating the node and result + value being used, respectively. Each value produced by an SDNode has + an associated MVT::ValueType indicating what type the value is.
+ +SelectionDAGs contain two different kinds of values: those that represent + data flow and those that represent control flow dependencies. Data values are + simple edges with an integer or floating point value type. Control edges are + represented as "chain" edges which are of type MVT::Other. These edges + provide an ordering between nodes that have side effects (such as + loads, stores, calls, returns, etc). All nodes that have side effects should + take a token chain as input and produce a new one as output. By convention, + token chain inputs are always operand #0, and chain results are always the last + value produced by an operation.
+ +A SelectionDAG has designated "Entry" and "Root" nodes. The Entry node is + always a marker node with an Opcode of ISD::EntryToken. The Root node + is the final side-effecting node in the token chain. For example, in a single + basic block function it would be the return node.
+ +One important concept for SelectionDAGs is the notion of a "legal" vs. + "illegal" DAG. A legal DAG for a target is one that only uses supported + operations and supported types. On a 32-bit PowerPC, for example, a DAG with + a value of type i1, i8, i16, or i64 would be illegal, as would a DAG that uses a + SREM or UREM operation. The + legalize phase is responsible for turning + an illegal DAG into a legal DAG.
+ +SelectionDAG-based instruction selection consists of the following steps:
+ +After all of these steps are complete, the SelectionDAG is destroyed and the + rest of the code generation passes are run.
+ +One great way to visualize what is going on here is to take advantage of a + few LLC command line options. In particular, the -view-isel-dags + option pops up a window with the SelectionDAG input to the Select phase for all + of the code compiled (if you only get errors printed to the console while using + this, you probably need to configure + your system to add support for it). The -view-sched-dags option + views the SelectionDAG output from the Select phase and input to the Scheduler + phase. The -view-sunit-dags option views the ScheduleDAG, which is + based on the final SelectionDAG, with nodes that must be scheduled as a unit + bundled together into a single node, and with immediate operands and other + nodes that aren't relevant for scheduling omitted. +
+ +The initial SelectionDAG is naïvely peephole expanded from the LLVM + input by the SelectionDAGLowering class in the + lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp file. The intent of this + pass is to expose as many low-level, target-specific details to the SelectionDAG + as possible. This pass is mostly hard-coded (e.g. an LLVM add turns + into an SDNode add while a getelementptr is expanded into the + obvious arithmetic). This pass requires target-specific hooks to lower calls, + returns, varargs, etc. For these features, the + TargetLowering interface is used.
+ +The Legalize phase is in charge of converting a DAG to only use the types and + operations that are natively supported by the target. This involves two major + tasks:
+ +Convert values of unsupported types to values of supported types.
+There are two main ways of doing this: converting small types to + larger types ("promoting"), and breaking up large integer types + into smaller ones ("expanding"). For example, a target might require + that all f32 values are promoted to f64 and that all i1/i8/i16 values + are promoted to i32. The same target might require that all i64 values + be expanded into i32 values. These changes can insert sign and zero + extensions as needed to make sure that the final code has the same + behavior as the input.
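As a rough illustration of the two strategies above, the following self-contained C++ sketch models what promotion and expansion compute. It does not use the legalizer's actual API; the names `split`, `join`, `expandedAdd`, `promoteSigned`, and `promoteUnsigned` are hypothetical helpers invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical model: an i64 value "expanded" into two i32 halves (lo, hi).
using Halves = std::pair<uint32_t, uint32_t>;

// Split a 64-bit value into its low and high 32-bit halves.
Halves split(uint64_t v) {
  return {static_cast<uint32_t>(v), static_cast<uint32_t>(v >> 32)};
}

// Reassemble a 64-bit value from its halves.
uint64_t join(Halves h) {
  return (static_cast<uint64_t>(h.second) << 32) | h.first;
}

// One i64 add expressed only with i32 operations: add the low halves,
// then propagate the carry into the sum of the high halves.
Halves expandedAdd(Halves a, Halves b) {
  uint32_t lo = a.first + b.first;          // wraps modulo 2^32
  uint32_t carry = lo < a.first ? 1u : 0u;  // carry out of the low half
  uint32_t hi = a.second + b.second + carry;
  return {lo, hi};
}

// "Promotion" of an i8 to i32: the inserted sign or zero extension keeps
// the promoted value consistent with the original narrow value.
int32_t promoteSigned(int8_t v) { return static_cast<int32_t>(v); }
uint32_t promoteUnsigned(uint8_t v) { return static_cast<uint32_t>(v); }
```

The carry computation is what makes the expanded form behave like the original wide add; a real target would typically use an add-with-carry instruction pair instead of the explicit comparison.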
+A target implementation tells the legalizer which types are supported + (and which register class to use for them) by calling the + addRegisterClass method in its TargetLowering constructor.
+Eliminate operations that are not supported by the target.
+Targets often have weird constraints, such as not supporting every + operation on every supported datatype (e.g. X86 does not support byte + conditional moves and PowerPC does not support sign-extending loads from + a 16-bit memory location). Legalize takes care of this by open-coding + another sequence of operations to emulate the operation ("expansion"), by + promoting one type to a larger type that supports the operation + ("promotion"), or by using a target-specific hook to implement the + legalization ("custom").
+A target implementation tells the legalizer which operations are not + supported (and which of the above three actions to take) by calling the + setOperationAction method in its TargetLowering + constructor.
+Prior to the existence of the Legalize pass, we required that every target + selector supported and handled every + operator and type even if they are not natively supported. The introduction of + the Legalize phase allows all of the canonicalization patterns to be shared + across targets, and makes it very easy to optimize the canonicalized code + because it is still in the form of a DAG.
+ +The SelectionDAG optimization phase is run twice for code generation: once + immediately after the DAG is built and once after legalization. The first run + of the pass allows the initial code to be cleaned up (e.g. performing + optimizations that depend on knowing that the operators have restricted type + inputs). The second run of the pass cleans up the messy code generated by the + Legalize pass, which allows Legalize to be very simple (it can focus on making + code legal instead of focusing on generating good and legal code).
+ +One important class of optimizations performed is optimizing inserted sign + and zero extension instructions. We currently use ad-hoc techniques, but could + move to more rigorous techniques in the future. Here are some good papers on + the subject:
+ +
+ "Widening
+ integer arithmetic"
+ Kevin Redwine and Norman Ramsey
+ International Conference on Compiler Construction (CC) 2004
+
+ "Effective
+ sign extension elimination"
+ Motohiro Kawahito, Hideaki Komatsu, and Toshio Nakatani
+ Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design
+ and Implementation.
+
The Select phase is the bulk of the target-specific code for instruction + selection. This phase takes a legal SelectionDAG as input, pattern matches the + instructions supported by the target to this DAG, and produces a new DAG of + target code. For example, consider the following LLVM fragment:
+ ++ %t1 = add float %W, %X + %t2 = mul float %t1, %Y + %t3 = add float %t2, %Z ++
This LLVM code corresponds to a SelectionDAG that looks basically like + this:
+ ++ (fadd:f32 (fmul:f32 (fadd:f32 W, X), Y), Z) ++
If a target supports floating point multiply-and-add (FMA) operations, one + of the adds can be merged with the multiply. On the PowerPC, for example, the + output of the instruction selector might look like this DAG:
+ ++ (FMADDS (FADDS W, X), Y, Z) ++
The FMADDS instruction is a ternary instruction that multiplies its + first two operands and adds the third (as single-precision floating-point + numbers). The FADDS instruction is a simple binary single-precision + add instruction. To perform this pattern match, the PowerPC backend includes + the following instruction definitions:
+ ++ def FMADDS : AForm_1<59, 29, + (ops F4RC:$FRT, F4RC:$FRA, F4RC:$FRC, F4RC:$FRB), + "fmadds $FRT, $FRA, $FRC, $FRB", + [(set F4RC:$FRT, (fadd (fmul F4RC:$FRA, F4RC:$FRC), + F4RC:$FRB))]>; + def FADDS : AForm_2<59, 21, + (ops F4RC:$FRT, F4RC:$FRA, F4RC:$FRB), + "fadds $FRT, $FRA, $FRB", + [(set F4RC:$FRT, (fadd F4RC:$FRA, F4RC:$FRB))]>; ++
The portion of the instruction definition in bold indicates the pattern used + to match the instruction. The DAG operators (like fmul/fadd) + are defined in the lib/Target/TargetSelectionDAG.td file. + "F4RC" is the register class of the input and result values.
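To make the pattern match concrete, here is a toy C++ sketch of the rewrite described above. It is not the TableGen-generated matcher and does not use `SDNode`; `Node`, `leaf`, `node`, `selectNode`, and `toString` are hypothetical names for this illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node;
using NodePtr = std::shared_ptr<Node>;

// Toy expression node (an assumption for this sketch, not the real SDNode).
struct Node {
  std::string op;             // "fadd", "fmul", a target opcode, or a leaf name
  std::vector<NodePtr> kids;  // operands; empty for leaves
};

NodePtr leaf(const std::string &name) {
  return std::make_shared<Node>(Node{name, {}});
}

NodePtr node(const std::string &op, std::vector<NodePtr> kids) {
  return std::make_shared<Node>(Node{op, std::move(kids)});
}

// Select target opcodes: (fadd (fmul a, b), c) becomes (FMADDS a, b, c);
// any other fadd becomes FADDS. Operands are selected recursively.
NodePtr selectNode(const NodePtr &n) {
  if (n->kids.empty())
    return n;
  if (n->op == "fadd") {
    assert(n->kids.size() == 2 && "fadd is binary");
    const NodePtr &lhs = n->kids[0];
    if (lhs->op == "fmul" && lhs->kids.size() == 2)
      return node("FMADDS", {selectNode(lhs->kids[0]), selectNode(lhs->kids[1]),
                             selectNode(n->kids[1])});
    return node("FADDS", {selectNode(n->kids[0]), selectNode(n->kids[1])});
  }
  std::vector<NodePtr> kids;
  for (const NodePtr &k : n->kids)
    kids.push_back(selectNode(k));
  return node(n->op, std::move(kids));
}

// Print a node in the parenthesized notation used in the text.
std::string toString(const NodePtr &n) {
  if (n->kids.empty())
    return n->op;
  std::string s = "(" + n->op;
  for (std::size_t i = 0; i < n->kids.size(); ++i)
    s += (i == 0 ? " " : ", ") + toString(n->kids[i]);
  return s + ")";
}
```

Applied to the DAG `(fadd (fmul (fadd W, X), Y), Z)`, this sketch produces the same shape as the PowerPC output shown earlier. The sketch only checks the left operand of the add for a multiply; a real selector would also consider the commuted form.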
+ +
The TableGen DAG instruction selector generator reads the instruction + patterns in the .td file and automatically builds parts of the pattern + matching code for your target. It has the following strengths:
+ ++ // Arbitrary immediate support. Implement in terms of LIS/ORI. + def : Pat<(i32 imm:$imm), + (ORI (LIS (HI16 imm:$imm)), (LO16 imm:$imm))>; ++
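The arithmetic behind the LIS/ORI pattern can be checked with a small C++ model. This is only a sketch of the 32-bit semantics: the functions below are stand-ins named after the pattern's operators, not LLVM or PowerPC APIs, and they assume a 32-bit register.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the HI16/LO16 transforms named in the pattern.
uint32_t HI16(uint32_t imm) { return imm >> 16; }
uint32_t LO16(uint32_t imm) { return imm & 0xFFFFu; }

// Model of "load immediate shifted": place a 16-bit value in the upper half.
uint32_t LIS(uint32_t hi) {
  assert(hi <= 0xFFFFu && "LIS takes a 16-bit immediate");
  return hi << 16;
}

// Model of "or immediate": OR a zero-extended 16-bit value into a register.
uint32_t ORI(uint32_t reg, uint32_t lo) {
  assert(lo <= 0xFFFFu && "ORI takes a 16-bit immediate");
  return reg | lo;
}

// Materialize an arbitrary 32-bit immediate as the pattern describes.
uint32_t materialize(uint32_t imm) { return ORI(LIS(HI16(imm)), LO16(imm)); }
```

Because ORI zero-extends its immediate, the low half never disturbs the bits written by LIS, so the two instructions reconstruct any 32-bit constant.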
While it has many strengths, the system currently has some limitations, + primarily because it is a work in progress and is not yet finished:
+ +Despite these limitations, the instruction selector generator is still quite + useful for most of the binary and logical operations in typical instruction + sets. If you run into any problems or can't figure out how to do something, + please let Chris know!
+ +The scheduling phase takes the DAG of target instructions from the selection + phase and assigns an order. The scheduler can pick an order depending on + various constraints of the machines (e.g. order for minimal register pressure or + try to cover instruction latencies). Once an order is established, the DAG is + converted to a list of MachineInstrs and + the SelectionDAG is destroyed.
+ +Note that this phase is logically separate from the instruction selection + phase, but is tied to it closely in the code because it operates on + SelectionDAGs.
+ +To Be Written
Live Intervals are the ranges (intervals) where a variable is live. + They are used by some register allocator passes to + determine if two or more virtual registers which require the same physical + register are live at the same point in the program (i.e., they conflict). When + this situation occurs, one virtual register must be spilled.
+ +The first step in determining the live intervals of variables is to + calculate the set of registers that are immediately dead after the + instruction (i.e., the instruction calculates the value, but it is + never used) and the set of registers that are used by the instruction, + but are never used after the instruction (i.e., they are killed). Live + variable information is computed for each virtual register and + register allocatable physical register in the function. This + is done in a very efficient manner because it uses SSA to sparsely + compute lifetime information for virtual registers (which are in SSA + form) and only has to track physical registers within a block. Before + register allocation, LLVM can assume that physical registers are only + live within a single basic block. This allows it to do a single, + local analysis to resolve physical register lifetimes within each + basic block. If a physical register is not register allocatable (e.g., + a stack pointer or condition codes), it is not tracked.
+ +Physical registers may be live in to or out of a function. Live in values + are typically arguments in registers. Live out values are typically return + values in registers. Live in values are marked as such, and are given a dummy + "defining" instruction during live intervals analysis. If the last basic block + of a function is a return, then it's marked as using all live out + values in the function.
+ +PHI nodes need to be handled specially, because the calculation + of the live variable information from a depth first traversal of the CFG of + the function won't guarantee that a virtual register used by the PHI + node is defined before it's used. When a PHI node is encountered, only + the definition is handled, because the uses will be handled in other basic + blocks.
+ +For each PHI node of the current basic block, we simulate an + assignment at the end of the current basic block and traverse the successor + basic blocks. If a successor basic block has a PHI node and one of + the PHI node's operands is coming from the current basic block, + then the variable is marked as alive within the current basic block + and all of its predecessor basic blocks, until the basic block with the + defining instruction is encountered.
+ +We now have the information available to perform the live intervals analysis + and build the live intervals themselves. We start off by numbering the basic + blocks and machine instructions. We then handle the "live-in" values. These + are in physical registers, so the physical register is assumed to be killed by + the end of the basic block. Live intervals for virtual registers are computed + for some ordering of the machine instructions [1, N]. A live interval + is an interval [i, j), where 1 <= i <= j < N, for which a + variable is live.
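Since a live interval is a half-open range [i, j) over instruction numbers, the conflict test between two virtual registers reduces to an interval overlap check. The sketch below assumes a simplified `LiveInterval` struct with a single range; it is a hypothetical model, not the class in LLVM.

```cpp
#include <cassert>

// Simplified model: one half-open range [start, end) of instruction numbers.
struct LiveInterval {
  unsigned start;
  unsigned end;
};

// Two intervals conflict when some instruction number lies in both.
// Empty intervals (start == end) cover no instruction and never conflict.
bool overlaps(const LiveInterval &a, const LiveInterval &b) {
  assert(a.start <= a.end && b.start <= b.end && "malformed interval");
  if (a.start == a.end || b.start == b.end)
    return false;
  return a.start < b.end && b.start < a.end;
}
```

Because the range is half-open, an interval ending at instruction 4 does not conflict with one starting at 4, so a value that dies at an instruction can share a register with the value that instruction defines.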
+ +More to come...
+ +The Register Allocation problem consists in mapping a program + Pv, that can use an unbounded number of virtual + registers, to a program Pp that contains a finite + (possibly small) number of physical registers. Each target architecture has + a different number of physical registers. If the number of physical + registers is not enough to accommodate all the virtual registers, some of + them will have to be mapped into memory. These virtuals are called + spilled virtuals.
+ +In LLVM, physical registers are denoted by integer numbers that + normally range from 1 to 1023. To see how this numbering is defined + for a particular architecture, you can read the + GenRegisterNames.inc file for that architecture. For + instance, by inspecting + lib/Target/X86/X86GenRegisterNames.inc we see that the 32-bit + register EAX is denoted by 15, and the MMX register + MM0 is mapped to 48.
+ +Some architectures contain registers that share the same physical + location. A notable example is the X86 platform. For instance, in the + X86 architecture, the registers EAX, AX and + AL share the first eight bits. These physical registers are + marked as aliased in LLVM. Given a particular architecture, you + can check which registers are aliased by inspecting its + RegisterInfo.td file. Moreover, the method + MRegisterInfo::getAliasSet(p_reg) returns an array containing + all the physical registers aliased to the register p_reg.
+ +Physical registers, in LLVM, are grouped in Register Classes. + Elements in the same register class are functionally equivalent, and can + be interchangeably used. Each virtual register can only be mapped to + physical registers of a particular class. For instance, in the X86 + architecture, some virtuals can only be allocated to 8 bit registers. + A register class is described by TargetRegisterClass objects. + To discover if a virtual register is compatible with a given physical, + this code can be used: +
+ +
+ bool RegMapping_Fer::compatible_class(MachineFunction &mf,
+ unsigned v_reg,
+ unsigned p_reg) {
+ assert(MRegisterInfo::isPhysicalRegister(p_reg) &&
+ "Target register must be physical");
+ const TargetRegisterClass *trc = mf.getRegInfo().getRegClass(v_reg);
+ return trc->contains(p_reg);
+ }
+
+ Sometimes, mostly for debugging purposes, it is useful to change + the number of physical registers available in the target + architecture. This must be done statically, inside the + TargetRegisterInfo.td file. Just grep for + RegisterClass, the last parameter of which is a list of + registers. Just commenting some out is one simple way to avoid them + being used. A more polite way is to explicitly exclude some registers + from the allocation order. See the definition of the + GR register class in + lib/Target/IA64/IA64RegisterInfo.td for an example of this + (e.g., numReservedRegs registers are hidden.)
+ +Virtual registers are also denoted by integer numbers. Contrary to + physical registers, different virtual registers never share the same + number. The smallest virtual register is normally assigned the number + 1024. This may change, so, in order to know which is the first virtual + register, you should access + MRegisterInfo::FirstVirtualRegister. Any register whose + number is greater than or equal to + MRegisterInfo::FirstVirtualRegister is considered a virtual + register. Whereas physical registers are statically defined in a + TargetRegisterInfo.td file and cannot be created by the + application developer, that is not the case with virtual registers. + In order to create new virtual registers, use the method + MachineRegisterInfo::createVirtualRegister(). This method will return a + virtual register with the highest code. +
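The numbering convention above can be modeled in a few lines of C++. This is a sketch under stated assumptions: the constant 1024 is the typical value quoted in the text (real code should read `MRegisterInfo::FirstVirtualRegister`), register number 0 is treated as "no register", and `VirtRegCounter` is a hypothetical stand-in for the bookkeeping behind `createVirtualRegister()`.

```cpp
#include <cassert>

// Assumed value from the text; real code should use
// MRegisterInfo::FirstVirtualRegister rather than hard-coding it.
constexpr unsigned FirstVirtualRegister = 1024;

// Physical registers occupy 1 .. FirstVirtualRegister-1 in this model;
// 0 is assumed to mean "no register".
bool isPhysicalRegister(unsigned reg) {
  return reg != 0 && reg < FirstVirtualRegister;
}

// Everything at or above the threshold is a virtual register.
bool isVirtualRegister(unsigned reg) { return reg >= FirstVirtualRegister; }

// Hypothetical counter: each new virtual register gets the next number,
// so the most recently created register always has the highest code.
class VirtRegCounter {
  unsigned next = FirstVirtualRegister;

public:
  unsigned createVirtualRegister() {
    assert(next >= FirstVirtualRegister && "register counter wrapped");
    return next++;
  }
};
```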
+ +Before register allocation, the operands of an instruction are + mostly virtual registers, although physical registers may also be + used. In order to check if a given machine operand is a register, use + the boolean function MachineOperand::isRegister(). To obtain + the integer code of a register, use + MachineOperand::getReg(). An instruction may define or use a + register. For instance, ADD reg:1026 := reg:1025 reg:1024 + defines the register 1026, and uses registers 1025 and 1024. Given a + register operand, the method MachineOperand::isUse() informs + if that register is being used by the instruction. The method + MachineOperand::isDef() informs if that register is being + defined.
+ +We will call physical registers present in the LLVM bitcode before + register allocation pre-colored registers. Pre-colored + registers are used in many different situations, for instance, to pass + parameters of function calls, and to store results of particular + instructions. There are two types of pre-colored registers: the ones + implicitly defined, and those explicitly + defined. Explicitly defined registers are normal operands, and can be + accessed with MachineInstr::getOperand(int)::getReg(). In + order to check which registers are implicitly defined by an + instruction, use the + TargetInstrInfo::get(opcode)::ImplicitDefs, where + opcode is the opcode of the target instruction. One important + difference between explicit and implicit physical registers is that + the latter are defined statically for each instruction, whereas the + former may vary depending on the program being compiled. For example, + an instruction that represents a function call will always implicitly + define or use the same set of physical registers. To read the + registers implicitly used by an instruction, use + TargetInstrInfo::get(opcode)::ImplicitUses. Pre-colored + registers impose constraints on any register allocation algorithm. The + register allocator must make sure that none of them is + overwritten by the values of virtual registers while still alive.
+ +There are two ways to map virtual registers to physical registers (or to + memory slots). The first way, that we will call direct mapping, + is based on the use of methods of the classes MRegisterInfo, + and MachineOperand. The second way, that we will call + indirect mapping, relies on the VirtRegMap class in + order to insert loads and stores sending and getting values to and from + memory.
+ +The direct mapping provides more flexibility to the developer of + the register allocator; however, it is more error prone, and demands + more implementation work. Basically, the programmer will have to + specify where load and store instructions should be inserted in the + target function being compiled in order to get and store values in + memory. To assign a physical register to a virtual register present in + a given operand, use MachineOperand::setReg(p_reg). To insert + a store instruction, use + MRegisterInfo::storeRegToStackSlot(...), and to insert a load + instruction, use MRegisterInfo::loadRegFromStackSlot.
+ +The indirect mapping shields the application developer from the + complexities of inserting load and store instructions. In order to map + a virtual register to a physical one, use + VirtRegMap::assignVirt2Phys(vreg, preg). In order to map a + certain virtual register to memory, use + VirtRegMap::assignVirt2StackSlot(vreg). This method will + return the stack slot where vreg's value will be located. If + it is necessary to map another virtual register to the same stack + slot, use VirtRegMap::assignVirt2StackSlot(vreg, + stack_location). One important point to consider when using the + indirect mapping, is that even if a virtual register is mapped to + memory, it still needs to be mapped to a physical register. This + physical register is the location where the virtual register is + supposed to be found before being stored or after being reloaded.
+ +If the indirect strategy is used, after all the virtual registers + have been mapped to physical registers or stack slots, it is necessary + to use a spiller object to place load and store instructions in the + code. Every virtual that has been mapped to a stack slot will be + stored to memory after being defined and will be loaded before being + used. The implementation of the spiller tries to recycle load/store + instructions, avoiding unnecessary instructions. For an example of how + to invoke the spiller, see + RegAllocLinearScan::runOnMachineFunction in + lib/CodeGen/RegAllocLinearScan.cpp.
+ +With very rare exceptions (e.g., function calls), the LLVM machine + code instructions are three address instructions. That is, each + instruction is expected to define at most one register, and to use at + most two registers. However, some architectures use two address + instructions. In this case, the defined register is also one of the + used registers. For instance, an instruction such as ADD %EAX, + %EBX, in X86 is actually equivalent to %EAX = %EAX + + %EBX.
+ +In order to produce correct code, LLVM must convert three address + instructions that represent two address instructions into true two + address instructions. LLVM provides the pass + TwoAddressInstructionPass for this specific purpose. It must + be run before register allocation takes place. After its execution, + the resulting code may no longer be in SSA form. This happens, for + instance, in situations where an instruction such as %a = ADD %b + %c is converted to two instructions such as:
+ ++ %a = MOVE %b + %a = ADD %a %c ++
Notice that, internally, the second instruction is represented as + ADD %a[def/use] %c. I.e., the register operand %a is + both used and defined by the instruction.
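The rewrite of a three-address add of %b and %c into a copy followed by a tied add can be sketched as follows. This toy C++ model is not `TwoAddressInstructionPass`; `Instr` and `toTwoAddress` are hypothetical names, and the sketch deliberately ignores the case where the destination is also the second source.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy three-address instruction "dst = op src1 src2" (not MachineInstr).
struct Instr {
  std::string op, dst, src1, src2;  // src2 is empty for MOVE
};

// Rewrite "dst = OP src1 src2" so that dst is tied to the first source:
// first copy src1 into dst (unless they already match), then apply OP in place.
std::vector<Instr> toTwoAddress(const Instr &in) {
  assert(in.op != "MOVE" && "copies need no rewriting in this sketch");
  assert(in.dst != in.src2 && "sketch does not handle dst == src2");
  std::vector<Instr> out;
  if (in.dst != in.src1)
    out.push_back({"MOVE", in.dst, in.src1, ""});
  out.push_back({in.op, in.dst, in.dst, in.src2});
  return out;
}
```

After the rewrite the destination register has two definitions, which is why the result is no longer in SSA form.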
+ +An important transformation that happens during register allocation is called + the SSA Deconstruction Phase. The SSA form simplifies many + analyses that are performed on the control flow graph of + programs. However, traditional instruction sets do not implement + PHI instructions. Thus, in order to generate executable code, compilers + must replace PHI instructions with other instructions that preserve their + semantics.
+ +There are many ways in which PHI instructions can safely be removed + from the target code. The most traditional PHI deconstruction + algorithm replaces PHI instructions with copy instructions. That is + the strategy adopted by LLVM. The SSA deconstruction algorithm is + implemented in lib/CodeGen/PHIElimination.cpp. In order to + invoke this pass, the identifier PHIEliminationID must be + marked as required in the code of the register allocator.
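The copy-based strategy can be illustrated with a toy C++ model. This is not the code in PHIElimination.cpp; `Phi`, `Blocks`, and `eliminatePhi` are hypothetical names, instructions are plain strings, and the sketch appends copies at the very end of each predecessor, ignoring terminators and critical edges that a real pass must handle.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy PHI: dst = phi [value, predecessor], [value, predecessor], ...
struct Phi {
  std::string dst;
  std::vector<std::pair<std::string, std::string>> incoming;  // (value, pred)
};

// Toy function body: a list of instruction strings per basic block name.
using Blocks = std::map<std::string, std::vector<std::string>>;

// Replace one PHI by a copy into its destination at the end of each
// predecessor block, so the value arrives in dst along every incoming edge.
void eliminatePhi(const Phi &phi, Blocks &blocks) {
  for (const auto &in : phi.incoming) {
    assert(blocks.count(in.second) && "predecessor block must exist");
    blocks[in.second].push_back(phi.dst + " = COPY " + in.first);
  }
}
```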
+ +Instruction folding is an optimization performed during + register allocation that removes unnecessary copy instructions. For + instance, a sequence of instructions such as:
+ ++ %EBX = LOAD %mem_address + %EAX = COPY %EBX ++
can be safely substituted by the single instruction: + +
+ %EAX = LOAD %mem_address ++
Instructions can be folded with the + MRegisterInfo::foldMemoryOperand(...) method. Care must be + taken when folding instructions; a folded instruction can be quite + different from the original instruction. See + LiveIntervals::addIntervalsForSpills in + lib/CodeGen/LiveIntervalAnalysis.cpp for an example of its use.
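The LOAD/COPY example above can be mimicked on a toy instruction list. The sketch below is not how MRegisterInfo::foldMemoryOperand works; `MI` and `foldLoadCopy` are hypothetical names, and the sketch assumes (without checking) that the loaded register has no other uses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical machine instruction: Def = Op Src.
struct MI {
  std::string Op, Def, Src;
};

// Fold "X = LOAD m; Y = COPY X" into "Y = LOAD m".
// Assumes X is not used anywhere else (not verified here).
std::vector<MI> foldLoadCopy(const std::vector<MI> &In) {
  std::vector<MI> Out;
  for (std::size_t i = 0; i < In.size(); ++i) {
    if (i + 1 < In.size() && In[i].Op == "LOAD" &&
        In[i + 1].Op == "COPY" && In[i + 1].Src == In[i].Def) {
      Out.push_back({"LOAD", In[i + 1].Def, In[i].Src});
      ++i; // the copy has been absorbed into the load
    } else {
      Out.push_back(In[i]);
    }
  }
  return Out;
}
```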
+ +The LLVM infrastructure provides the application developer with + three different register allocators:
+ +The type of register allocator used in llc can be chosen with the + command line option -regalloc=...:
+ ++ $ llc -f -regalloc=simple file.bc -o sp.s; + $ llc -f -regalloc=local file.bc -o lc.s; + $ llc -f -regalloc=linearscan file.bc -o ln.s; ++
To Be Written
To Be Written
To Be Written
To Be Written
For the JIT or .o file writer
+This section of the document explains features or design decisions that + are specific to the code generator for a particular target.
+ +The X86 code generator lives in the lib/Target/X86 directory. This + code generator is capable of targeting a variety of x86-32 and x86-64 + processors, and includes support for ISA extensions such as MMX and SSE. +
+ +The following are the known target triples that are supported by the X86 + backend. This is not an exhaustive list, and it would be useful to add those + that people test.
+ +The following target-specific calling conventions are known to the backend:
+ +The x86 has a very flexible way of accessing memory. It is capable of + forming memory addresses of the following expression directly in integer + instructions (which use ModR/M addressing):
+ ++ Base + [1,2,4,8] * IndexReg + Disp32 ++
In order to represent this, LLVM tracks no less than 4 operands for each + memory operand of this form. This means that the "load" form of 'mov' + has the following MachineOperands in this order:
+ ++ Index: 0 | 1 2 3 4 + Meaning: DestReg, | BaseReg, Scale, IndexReg, Displacement + OperandTy: VirtReg, | VirtReg, UnsImm, VirtReg, SignExtImm ++ +
Stores, and all other instructions, treat the four memory operands in the + same way and in the same order.
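As a worked illustration of the address expression, the following sketch evaluates Base + Scale * IndexReg + Disp32 for a struct that mirrors the four memory operands. The `X86MemOperand` struct and `effectiveAddress` function are hypothetical and hold register values rather than register numbers; they are not LLVM's MachineOperand representation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the four memory operands, holding the values
// that the registers contain at run time.
struct X86MemOperand {
  std::uint32_t Base;  // contents of BaseReg
  std::uint32_t Scale; // 1, 2, 4 or 8
  std::uint32_t Index; // contents of IndexReg
  std::int32_t Disp;   // sign-extended displacement
};

// Computes Base + Scale * Index + Disp with 32-bit wraparound.
std::uint32_t effectiveAddress(const X86MemOperand &M) {
  assert((M.Scale == 1 || M.Scale == 2 || M.Scale == 4 || M.Scale == 8) &&
         "Scale must be 1, 2, 4 or 8!");
  return M.Base + M.Scale * M.Index + static_cast<std::uint32_t>(M.Disp);
}
```

For instance, a base of 0x1000, a scale of 4, an index of 3 and a displacement of -8 address 0x1000 + 12 - 8 = 0x1004.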
+ +An instruction name consists of the base name, a default operand size, and a + character per operand with an optional special size. For example:
+ +
+ ADD8rr -> add, 8-bit register, 8-bit register
+ IMUL16rmi -> imul, 16-bit register, 16-bit memory, 16-bit immediate
+ IMUL16rmi8 -> imul, 16-bit register, 16-bit memory, 8-bit immediate
+ MOVSX32rm16 -> movsx, 32-bit register, 16-bit memory
+
The PowerPC code generator lives in the lib/Target/PowerPC directory. The + code generation is retargetable to several variations or subtargets of + the PowerPC ISA, including ppc32, ppc64 and altivec. +
+LLVM follows the AIX PowerPC ABI, with two deviations. First, LLVM uses PC + relative (PIC) or static addressing for accessing global values, so no TOC (r2) + is used. Second, r31 is used as a frame pointer to allow dynamic growth of a + stack frame. LLVM takes advantage of having no TOC to provide space to save + the frame pointer in the PowerPC linkage area of the caller frame. Other + details of the PowerPC ABI can be found at PowerPC ABI. Note: This link describes the 32 bit ABI. The + 64 bit ABI is similar except that space for GPRs is 8 bytes wide (not 4) and r13 is + reserved for system use.
+The size of a PowerPC frame is usually fixed for the duration of a + function’s invocation. Since the frame is fixed size, all references into + the frame can be accessed via fixed offsets from the stack pointer. The + exception to this is when dynamic alloca or variable sized arrays are present; + in that case a base pointer (r31) is used as a proxy for the stack pointer and the stack + pointer is free to grow or shrink. A base pointer is also used if llvm-gcc is + not passed the -fomit-frame-pointer flag. The stack pointer is always aligned to + 16 bytes, so that space allocated for altivec vectors will be properly + aligned.
+An invocation frame is laid out as follows (low memory at top):
+| Linkage |
+
| Parameter area |
+
| Dynamic area |
+
| Locals area |
+
| Saved registers area |
+
| Previous Frame |
+
The linkage area is used by a callee to save special registers prior + to allocating its own frame. Only three entries are relevant to LLVM. The + first entry is the previous stack pointer (sp), aka link. This allows probing + tools like gdb or exception handlers to quickly scan the frames in the stack. A + function epilog can also use the link to pop the frame from the stack. The + third entry in the linkage area is used to save the return address from the lr + register. Finally, as mentioned above, the last entry is used to save the + previous frame pointer (r31). The entries in the linkage area are the size of a + GPR, thus the linkage area is 24 bytes long in 32 bit mode and 48 bytes in 64 + bit mode.
+32 bit linkage area
+| 0 | +Saved SP (r1) | +
| 4 | +Saved CR | +
| 8 | +Saved LR | +
| 12 | +Reserved | +
| 16 | +Reserved | +
| 20 | +Saved FP (r31) | +
64 bit linkage area
+| 0 | +Saved SP (r1) | +
| 8 | +Saved CR | +
| 16 | +Saved LR | +
| 24 | +Reserved | +
| 32 | +Reserved | +
| 40 | +Saved FP (r31) | +
The parameter area is used to store arguments being passed to a callee + function. Following the PowerPC ABI, the first few arguments are actually + passed in registers, with the space in the parameter area unused. However, if + there are not enough registers or the callee is a thunk or vararg function, + these register arguments can be spilled into the parameter area. Thus, the + parameter area must be large enough to store all the parameters for the largest + call sequence made by the caller. The size must also be minimally large enough + to spill registers r3-r10. This allows callees blind to the call signature, + such as thunks and vararg functions, enough space to cache the argument + registers. Therefore, the parameter area is minimally 32 bytes (64 bytes in 64 + bit mode). Also note that since the parameter area is a fixed offset from the + top of the frame, a callee can access its spilt arguments using fixed + offsets from the stack pointer (or base pointer).
+Combining the information about the linkage area, the parameter area and alignment, a + stack frame is minimally 64 bytes in 32 bit mode and 128 bytes in 64 bit + mode.
+The dynamic area starts out as size zero. If a function uses dynamic + alloca then space is added to the stack, the linkage and parameter areas are + shifted to top of stack, and the new space is available immediately below the + linkage and parameter areas. The cost of shifting the linkage and parameter + areas is minor since only the link value needs to be copied. The link value can + be easily fetched by adding the original frame size to the base pointer. Note + that allocations in the dynamic space need to observe 16 byte alignment.
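The sizes and offsets quoted for the linkage and parameter areas all follow from the GPR size (4 bytes in 32 bit mode, 8 bytes in 64 bit mode). The helper functions below are hypothetical names used only to show that arithmetic; they are not part of the PowerPC backend.

```cpp
#include <cassert>

// The linkage area has six GPR-sized entries.
unsigned linkageAreaSize(unsigned GPRSize) {
  assert((GPRSize == 4 || GPRSize == 8) && "GPR must be 4 or 8 bytes!");
  return 6 * GPRSize;
}

// Saved LR is the third entry; saved FP (r31) is the last entry.
unsigned savedLROffset(unsigned GPRSize) { return 2 * GPRSize; }
unsigned savedFPOffset(unsigned GPRSize) { return 5 * GPRSize; }

// The parameter area must at least hold r3-r10, i.e. eight GPRs.
unsigned minParameterAreaSize(unsigned GPRSize) { return 8 * GPRSize; }
```

With a GPR size of 4 these give 24, 8, 20 and 32 bytes, and with a GPR size of 8 they give 48, 16, 40 and 64 bytes, matching the tables and the paragraph above.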
+The locals area is where the llvm compiler reserves space for local + variables.
+The saved registers area is where the llvm compiler spills callee saved + registers on entry to the callee.
+The llvm prolog and epilog are the same as described in the PowerPC ABI, with + the following exceptions. Callee saved registers are spilled after the frame is + created. This allows the llvm epilog/prolog support to be common with other + targets. The base pointer callee saved register r31 is saved in the TOC slot of + linkage area. This simplifies allocation of space for the base pointer and + makes it convenient to locate programmatically and during debugging.
+TODO - More to come.
+This document attempts to describe a few coding standards that are being used + in the LLVM source tree. Although no coding standards should be regarded as + absolute requirements to be followed in all instances, coding standards can be + useful.
+ +This document intentionally does not prescribe fixed standards for religious + issues such as brace placement and space usage. For issues like this, follow + the golden rule:
+ ++ + + ++ +
The ultimate goal of these guidelines is to increase readability and + maintainability of our common source base. If you have suggestions for topics to + be included, please mail them to Chris.
+ +Comments are one critical part of readability and maintainability. Everyone + knows they should comment, so should you. Although we all should probably + comment our code more than we do, there are a few very critical places that + documentation is very useful:
+ + File Headers + +Every source file should have a header on it that describes the basic + purpose of the file. If a file does not have a header, it should not be + checked into Subversion. Most source trees will probably have a standard + file header format. The standard format for the LLVM source tree looks like + this:
+ ++ //===-- llvm/Instruction.h - Instruction class definition -------*- C++ -*-===// + // + // The LLVM Compiler Infrastructure + // + // This file is distributed under the University of Illinois Open Source + // License. See LICENSE.TXT for details. + // + //===----------------------------------------------------------------------===// + // + // This file contains the declaration of the Instruction class, which is the + // base class for all of the VM instructions. + // + //===----------------------------------------------------------------------===// ++
A few things to note about this particular format: The "-*- C++ + -*-" string on the first line is there to tell Emacs that the source file + is a C++ file, not a C file (Emacs assumes .h files are C files by default). + Note that this tag is not necessary in .cpp files. The name of the file is also + on the first line, along with a very short description of the purpose of the + file. This is important when printing out code and flipping through lots of + pages.
+ +The next section in the file is a concise note that defines the license + that the file is released under. This makes it perfectly clear what terms the + source code can be distributed under and should not be modified in any way.
+ +The main body of the description does not have to be very long in most cases. + Here it's only two lines. If an algorithm is being implemented or something + tricky is going on, a reference to the paper where it is published should be + included, as well as any notes or "gotchas" in the code to watch out for.
+ + Class overviews + +Classes are one fundamental part of a good object oriented design. As such, + a class definition should have a comment block that explains what the class is + used for... if it's not obvious. If it's so completely obvious your grandma + could figure it out, it's probably safe to leave it out. Naming classes + something sane goes a long ways towards avoiding writing documentation.
+ + + Method information + +Methods defined in a class (as well as any global functions) should also be + documented properly. A quick note about what it does and a description of the + borderline behaviour is all that is necessary here (unless something + particularly tricky or insidious is going on). The hope is that people can + figure out how to use your interfaces without reading the code itself... that is + the goal metric.
+ +Good things to talk about here are what happens when something unexpected + happens: does the method return null? Abort? Format your hard disk?
+ +In general, prefer C++ style (//) comments. They take less space, + require less typing, don't have nesting problems, etc. There are a few cases + when it is useful to use C style (/* */) comments however:
+ +To comment out a large block of code, use #if 0 and #endif. + These nest properly and are better behaved in general than C style comments.
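A small sketch of why #if 0 is the better tool: the disabled region below contains both a C style comment and a nested #if 0 / #endif pair, which would be awkward to wrap in /* */. The function name `computeAnswer` is made up for the example.

```cpp
#include <cassert>

int computeAnswer() {
  int Result = 40;
#if 0
  /* This whole region is disabled. It can contain C style comments
     and further #if / #endif pairs without confusing the compiler. */
#if 0
  Result = -1;
#endif
  Result += 100;
#endif
  Result += 2;
  return Result;
}
```

Only the two statements outside the outer #if 0 are compiled, so the function returns 42.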
+ +Immediately after the header file comment (and + include guards if working on a header file), the minimal list of #includes required by the + file should be listed. We prefer these #includes to be listed in this + order:
+ +... and each category should be sorted by name.
+ +The "Main Module Header" file applies to .cpp file + which implement an interface defined by a .h file. This #include + should always be included first regardless of where it lives on the file + system. By including a header file first in the .cpp files that implement the + interfaces, we ensure that the header does not have any hidden dependencies + which are not explicitly #included in the header, but should be. It is also a + form of documentation in the .cpp file to indicate where the interfaces it + implements are defined.
+ +Write your code to fit within 80 columns of text. This helps those of us who + like to print out code and look at your code in an xterm without resizing + it.
+ +In all cases, prefer spaces to tabs in source files. People have different + preferred indentation levels, and different styles of indentation that they + like... this is fine. What isn't is that different editors/viewers expand tabs + out to different tab stops. This can cause your code to look completely + unreadable, and it is not worth dealing with.
+ +As always, follow the Golden Rule above: follow the + style of existing code if you are modifying and extending it. If you like four + spaces of indentation, DO NOT do that in the middle of a chunk of code + with two spaces of indentation. Also, do not reindent a whole source file: it + makes for incredible diffs that are absolutely worthless.
+ +Okay, your first year of programming you were told that indentation is + important. If you didn't believe and internalize this then, now is the time. + Just do it.
+ +If your code has compiler warnings in it, something is wrong: you aren't + casting values correctly, you have "questionable" constructs in your code, or + you are doing something legitimately wrong. Compiler warnings can cover up + legitimate errors in output and make dealing with a translation unit + difficult.
+ +It is not possible to prevent all warnings from all compilers, nor is it + desirable. Instead, pick a standard compiler (like gcc) that provides + a good thorough set of warnings, and stick to them. At least in the case of + gcc, it is possible to work around any spurious errors by changing the + syntax of the code slightly. For example, a warning that annoys me occurs when + I write code like this:
+ +
+ if (V = getValue()) {
+ ...
+ }
+
+ gcc will warn me that I probably want to use the == + operator, and that I probably mistyped it. In most cases, I haven't, and I + really don't want the spurious errors. To fix this particular problem, I + rewrite the code like this:
+ +
+ if ((V = getValue())) {
+ ...
+ }
+
+ ...which shuts gcc up. Any gcc warning that annoys you can + be fixed by massaging the code appropriately.
+ +These are the gcc warnings that I prefer to enable: -Wall + -Winline -W -Wwrite-strings -Wno-unused
+ +In almost all cases, it is possible and within reason to write completely + portable code. If there are cases where it isn't possible to write portable + code, isolate it behind a well defined (and well documented) interface.
+ +In practice, this means that you shouldn't assume much about the host + compiler, including its support for "high tech" features like partial + specialization of templates. In fact, Visual C++ 6 could be an important target + for our work in the future, and we don't want to have to rewrite all of our code + to support it.
+ +In C++, the class and struct keywords can be used almost + interchangeably. The only difference is when they are used to declare a class: + class makes all members private by default while struct makes + all members public by default.
+ +Unfortunately, not all compilers follow the rules and some will generate + different symbols based on whether class or struct was used to + declare the symbol. This can lead to problems at link time.
+ +So, the rule for LLVM is to always use the class keyword, unless + all members are public, in which case struct is allowed.
+ +C++ doesn't do too well in the modularity department. There is no real + encapsulation or data hiding (unless you use expensive protocol classes), but it + is what we have to work with. When you write a public header file (in the LLVM + source tree, they live in the top level "include" directory), you are defining a + module of functionality.
+ +Ideally, modules should be completely independent of each other, and their + header files should only include the absolute minimum number of headers + possible. A module is not just a class, a function, or a namespace: it's a collection + of these that defines an interface. This interface may be several + functions, classes or data structures, but the important issue is how they work + together.
+ +In general, a module should be implemented with one or more .cpp + files. Each of these .cpp files should include the header that defines + their interface first. This ensures that all of the dependences of the module + header have been properly added to the module header itself, and are not + implicit. System headers should be included after user headers for a + translation unit.
+ +#include hurts compile time performance. Don't do it unless you + have to, especially in header files.
+ +But wait, sometimes you need to have the definition of a class to use it, or + to inherit from it. In these cases go ahead and #include that header + file. Be aware however that there are many cases where you don't need to have + the full definition of a class. If you are using a pointer or reference to a + class, you don't need the header file. If you are simply returning a class + instance from a prototyped function or method, you don't need it. In fact, for + most cases, you simply don't need the definition of a class... and not + #include'ing speeds up compilation.
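The following single-file sketch shows the idea, with comments marking where the header/implementation split would fall in a real project. `Widget` and `WidgetUser` are hypothetical classes invented for the illustration.

```cpp
#include <cassert>

// --- what WidgetUser.h would contain ---
// A forward declaration is enough for pointers, references and
// prototypes, so Widget.h does not need to be #included here.
class Widget;

class WidgetUser {
  Widget *W; // pointer to an incomplete type is fine
public:
  explicit WidgetUser(Widget *w) : W(w) {}
  Widget *get() const { return W; }
  int weight() const; // needs the full definition, so defined below
};

// --- what WidgetUser.cpp would see after #including Widget.h ---
class Widget {
public:
  int Weight;
  explicit Widget(int w) : Weight(w) {}
};

int WidgetUser::weight() const { return W->Weight; }
```

Only the member function that actually touches Widget's members needs the full class definition, and that function lives in the implementation file.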
+ +It is easy to go overboard on this recommendation, however. You + must include all of the header files that you are using -- you can + include them either directly + or indirectly (through another header file). To make sure that you don't + accidentally forget to include a header file in your module header, make sure to + include your module header first in the implementation file (as mentioned + above). This way there won't be any hidden dependencies that you'll find out + about later...
+ +Many modules have a complex implementation that causes them to use more than + one implementation (.cpp) file. It is often tempting to put the + internal communication interface (helper classes, extra functions, etc) in the + public module header file. Don't do this.
+ +If you really need to do something like this, put a private header file in + the same directory as the source files, and include it locally. This ensures + that your private interface remains private and undisturbed by outsiders.
+ +Note however, that it's okay to put extra implementation methods in a public + class itself... just make them private (or protected), and all is well.
+ +The use of #include <iostream> in library files is + hereby forbidden. The primary reason for doing this is to + support clients using LLVM libraries as part of larger systems. In particular, + we statically link LLVM into some dynamic libraries. Even if LLVM isn't used, + the static c'tors are run whenever an application that uses the dynamic + library starts up. There are two problems with this:
+ +| Old Way | +New Way | +
|---|---|
#include <iostream> |
+ #include "llvm/Support/Streams.h" |
+
DEBUG(std::cerr << ...); + DEBUG(dump(std::cerr)); |
+ DOUT << ...; + DEBUG(dump(DOUT)); |
+
std::cerr << "Hello world\n"; |
+ llvm::cerr << "Hello world\n"; |
+
std::cout << "Hello world\n"; |
+ llvm::cout << "Hello world\n"; |
+
std::cin >> Var; |
+ llvm::cin >> Var; |
+
std::ostream |
+ llvm::OStream |
+
std::istream |
+ llvm::IStream |
+
std::stringstream |
+ llvm::StringStream |
+
void print(std::ostream &Out); + // ... + print(std::cerr); |
+ void print(llvm::OStream Out);1 + // ... + print(llvm::cerr);+ + |
1llvm::OStream is a light-weight class so it should never + be passed by reference. This is important because in some configurations, + DOUT is an rvalue.
+Use the "assert" function to its fullest. Check all of your + preconditions and assumptions; you never know when a bug (not necessarily even + yours) might be caught early by an assertion, which reduces debugging time + dramatically. The "<cassert>" header file is probably already + included by the header files you are using, so it doesn't cost anything to use + it.
+ +To further assist with debugging, make sure to put some kind of error message + in the assertion statement (which is printed if the assertion is tripped). This + helps the poor debugger make sense of why an assertion is being made and + enforced, and hopefully what to do about it. Here is one complete example:
+ +
+ inline Value *getOperand(unsigned i) {
+ assert(i < Operands.size() && "getOperand() out of range!");
+ return Operands[i];
+ }
+
+ Here are some examples:
+ ++ assert(Ty->isPointerType() && "Can't allocate a non pointer type!"); + + assert((Opcode == Shl || Opcode == Shr) && "ShiftInst Opcode invalid!"); + + assert(idx < getNumSuccessors() && "Successor # out of range!"); + + assert(V1.getType() == V2.getType() && "Constant types must be identical!"); + + assert(isa<PHINode>(Succ->front()) && "Only works on PHId BBs!"); ++
You get the idea...
+ +In LLVM, we prefer to explicitly prefix all identifiers from the standard + namespace with an "std::" prefix, rather than rely on + "using namespace std;".
+ +In header files, adding a 'using namespace XXX' directive pollutes + the namespace of any source file that includes the header. This is clearly a + bad thing.
+ +In implementation files (e.g. .cpp files), the rule is more of a stylistic + rule, but is still important. Basically, using explicit namespace prefixes + makes the code clearer, because it is immediately obvious what facilities + are being used and where they are coming from, and more portable, because + namespace clashes cannot occur between LLVM code and other namespaces. The + portability rule is important because different standard library implementations + expose different symbols (potentially ones they shouldn't), and future revisions + to the C++ standard will add more symbols to the std namespace. As + such, we never use 'using namespace std;' in LLVM.
+ +The exception to the general rule (i.e. it's not an exception for + the std namespace) is for implementation files. For example, all of + the code in the LLVM project implements code that lives in the 'llvm' namespace. + As such, it is ok, and actually clearer, for the .cpp files to have a 'using + namespace llvm' directive at their top, after the #includes. The + general form of this rule is that any .cpp file that implements code in any + namespace may use that namespace (and its parents'), but should not use any + others.
+ +If a class is defined in a header file and has a v-table (either it has + virtual methods or it derives from classes with virtual methods), it must + always have at least one out-of-line virtual method in the class. Without + this, the compiler will copy the vtable and RTTI into every .o file that + #includes the header, bloating .o file sizes and increasing link times. +
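A minimal sketch of the rule follows, written as one file with comments marking the header and implementation parts. The class `Shape` and the method name `anchor` are assumptions chosen for the example, not names required by LLVM.

```cpp
#include <cassert>

// --- what Shape.h would contain ---
class Shape {
public:
  virtual ~Shape() {}
  virtual int sides() const { return 0; }

private:
  // Declared here but defined out-of-line in the .cpp file.
  virtual void anchor();
};

// --- what Shape.cpp would contain ---
// Having one virtual method defined out-of-line gives the compiler a
// single object file in which to emit the vtable and RTTI for Shape.
void Shape::anchor() {}
```

The out-of-line method does not need to do anything; its only job is to pin the vtable to one translation unit.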
+ +Hard fast rule: Preincrement (++X) may be no slower than + postincrement (X++) and could very well be a lot faster than it. Use + preincrementation whenever possible.
+ +The semantics of postincrement include making a copy of the value being + incremented, returning it, and then preincrementing the "work value". For + primitive types, this isn't a big deal... but for iterators, it can be a huge + issue (for example, some iterators contain stack and set objects in them... + copying an iterator could invoke the copy ctors of these as well). In general, + get in the habit of always using preincrement, and you won't have a problem.
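For example, a typical iterator loop written with preincrement looks like the sketch below. The function `sum` is a made-up helper used only to show the loop shape.

```cpp
#include <cassert>
#include <list>

int sum(const std::list<int> &L) {
  int Total = 0;
  // ++I advances the iterator in place. I++ would have to make a copy
  // of the iterator first, only for that copy to be thrown away.
  for (std::list<int>::const_iterator I = L.begin(), E = L.end(); I != E; ++I)
    Total += *I;
  return Total;
}
```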
+ +The std::endl modifier, when used with iostreams outputs a newline + to the output stream specified. In addition to doing this, however, it also + flushes the output stream. In other words, these are equivalent:
+ ++ std::cout << std::endl; + std::cout << '\n' << std::flush; ++
Most of the time, you probably have no reason to flush the output stream, so + it's better to use a literal '\n'.
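A short sketch of the recommended pattern: write '\n' for each line and flush once at the end, if at all. The function `printLines` is hypothetical; it takes a std::ostream so that it can be exercised with a string stream and without <iostream>.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

void printLines(std::ostream &OS, int N) {
  for (int i = 0; i != N; ++i)
    OS << "line " << i << '\n'; // newline only, no flush per line
  OS << std::flush;             // a single flush once everything is written
}
```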
+ +A lot of these comments and recommendations have been culled from other + sources. Two particularly important books for our work are:
+ +If you get some free time, and you haven't read them: do so, you might learn + something.
+ +This document describes the CommandLine argument processing library. It will + show you how to use it, and what it can do. The CommandLine library uses a + declarative approach to specifying the command line options that your program + takes. By default, these options declarations implicitly hold the value parsed + for the option declared (of course this can be + changed).
+ +Although there are a lot of command line argument parsing libraries + out there in many different languages, none of them fit well with what I needed. + By looking at the features and problems of other libraries, I designed the + CommandLine library to have the following features:
+ +This document will hopefully let you jump in and start using CommandLine in + your utility quickly and painlessly. Additionally it should be a simple + reference manual to figure out how stuff works. If it is failing in some area + (or you want an extension to the library), nag the author, Chris Lattner.
+ +This section of the manual runs through a simple CommandLine'ification of a + basic compiler tool. This is intended to show you how to jump into using the + CommandLine library in your own program, and show you some of the cool things it + can do.
+ +To start out, you need to include the CommandLine header file into your + program:
+ ++ #include "llvm/Support/CommandLine.h" +
Additionally, you need to add this as the first line of your main + program:
+ +
+ int main(int argc, char **argv) {
+ cl::ParseCommandLineOptions(argc, argv);
+ ...
+ }
+ ... which actually parses the arguments and fills in the variable + declarations.
+ +Now that you are ready to support command line arguments, we need to tell the + system which ones we want, and what type of arguments they are. The CommandLine + library uses a declarative syntax to model command line arguments with the + global variable declarations that capture the parsed values. This means that + for every command line option that you would like to support, there should be a + global variable declaration to capture the result. For example, in a compiler, + we would like to support the Unix-standard '-o <filename>' option + to specify where to put the output. With the CommandLine library, this is + represented like this:
+ + ++ cl::opt<string> OutputFilename("o", cl::desc("Specify output filename"), cl::value_desc("filename")); +
This declares a global variable "OutputFilename" that is used to + capture the result of the "o" argument (first parameter). We specify + that this is a simple scalar option by using the "cl::opt" template (as opposed to the "cl::list" template), and tell the CommandLine library + that the data type that we are parsing is a string.
+ +The second and third parameters (which are optional) are used to specify what + to output for the "--help" option. In this case, we get a line that + looks like this:
+ ++ USAGE: compiler [options] + + OPTIONS: + -help - display available options (--help-hidden for more) + -o <filename> - Specify output filename +
Because we specified that the command line option should parse using the + string data type, the variable declared is automatically usable as a + real string in all contexts that a normal C++ string object may be used. For + example:
+ ++ ... + ofstream Output(OutputFilename.c_str()); + if (Output.good()) ... + ... +
There are many different options that you can use to customize the command + line option handling library, but the above example shows the general interface + to these options. The options can be specified in any order, and are specified + with helper functions like cl::desc(...), so + there are no positional dependencies to remember. The available options are + discussed in detail in the Reference Guide.
+ +Continuing the example, we would like to have our compiler take an input + filename as well as an output filename, but we do not want the input filename to + be specified with a hyphen (ie, not -filename.c). To support this + style of argument, the CommandLine library allows for positional arguments to be specified for the program. + These positional arguments are filled with command line parameters that are not + in option form. We use this feature like this:
+ ++ cl::opt<string> InputFilename(cl::Positional, cl::desc("<input file>"), cl::init("-")); +
This declaration indicates that the first positional argument should be + treated as the input filename. Here we use the cl::init option to specify an initial value for the + command line option, which is used if the option is not specified (if you do not + specify a cl::init modifier for an option, then + the default constructor for the data type is used to initialize the value). + Command line options default to being optional, so if we would like to require + that the user always specify an input filename, we would add the cl::Required flag, and we could eliminate the + cl::init modifier, like this:
+ ++ cl::opt<string> InputFilename(cl::Positional, cl::desc("<input file>"), cl::Required); +
Again, the CommandLine library does not require the options to be specified + in any particular order, so the above declaration is equivalent to:
+ ++ cl::opt<string> InputFilename(cl::Positional, cl::Required, cl::desc("<input file>")); +
By simply adding the cl::Required flag, + the CommandLine library will automatically issue an error if the argument is not + specified, which shifts all of the command line option verification code out of + your application into the library. This is just one example of how using flags + can alter the default behaviour of the library, on a per-option basis. By + adding one of the declarations above, the --help option synopsis is now + extended to:
+ ++ USAGE: compiler [options] <input file> + + OPTIONS: + -help - display available options (--help-hidden for more) + -o <filename> - Specify output filename +
... indicating that an input filename is expected.
+ +In addition to input and output filenames, we would like the compiler example + to support three boolean flags: "-f" to force overwriting of the output + file, "--quiet" to enable quiet mode, and "-q" for backwards + compatibility with some of our users. We can support these by declaring options + of boolean type like this:
+ ++ cl::opt<bool> Force ("f", cl::desc("Overwrite output files")); + cl::opt<bool> Quiet ("quiet", cl::desc("Don't print informational messages")); + cl::opt<bool> Quiet2("q", cl::desc("Don't print informational messages"), cl::Hidden); +
This does what you would expect: it declares three boolean variables + ("Force", "Quiet", and "Quiet2") to recognize these + options. Note that the "-q" option is specified with the "cl::Hidden" flag. This modifier prevents it + from being shown by the standard "--help" output (note that it is still + shown in the "--help-hidden" output).
+ +The CommandLine library uses a different parser + for different data types. For example, in the string case, the argument passed + to the option is copied literally into the content of the string variable. We + obviously cannot do that in the boolean case, so a smarter + parser is needed. The boolean parser accepts either no value (in which case + it assigns the value of true to the variable) or one of the values + "true" or "false", allowing any of the + following inputs:
+ +
+ compiler -f          # No value, 'Force' == true
+ compiler -f=true     # Value specified, 'Force' == true
+ compiler -f=TRUE     # Value specified, 'Force' == true
+ compiler -f=FALSE    # Value specified, 'Force' == false
+
... you get the idea. The bool parser just turns + the string values into boolean values, and rejects things like 'compiler + -f=foo'. Similarly, the float, double, and int parsers work + like you would expect, using the 'strtol' and 'strtod' C + library calls to parse the string value into the specified data type.
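The behaviour described above can be sketched without the library. The function below is a hypothetical stand-in, not the CommandLine implementation: the name parseBoolValue is invented for illustration, and matching the value case-insensitively is a simplifying assumption. It follows the convention used for parsers later in this document and returns true on error.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for a boolean option parser (not LLVM code).
// Returns true on error, false on success.
bool parseBoolValue(const std::string &Arg, bool &Value) {
  // Assumption: compare case-insensitively so "true" and "TRUE" both match.
  std::string Lower(Arg);
  std::transform(Lower.begin(), Lower.end(), Lower.begin(),
                 [](unsigned char C) { return static_cast<char>(std::tolower(C)); });

  if (Lower.empty() || Lower == "true") {  // "-f" or "-f=true"
    Value = true;
    return false;
  }
  if (Lower == "false") {                  // "-f=false"
    Value = false;
    return false;
  }
  return true;                             // anything else, e.g. "-f=foo"
}
```

An empty string models the case where the flag is given with no value at all.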
+ +With the declarations above, "compiler --help" emits this:
+ +
+ USAGE: compiler [options] <input file>
+
+ OPTIONS:
+   -f     - Overwrite output files
+   -o     - Override output filename
+   -quiet - Don't print informational messages
+   -help  - display available options (--help-hidden for more)
+
and "compiler --help-hidden" prints this:
+ +
+ USAGE: compiler [options] <input file>
+
+ OPTIONS:
+   -f     - Overwrite output files
+   -o     - Override output filename
+   -q     - Don't print informational messages
+   -quiet - Don't print informational messages
+   -help  - display available options (--help-hidden for more)
+
This brief example has shown you how to use the 'cl::opt' class to parse simple scalar command line + arguments. In addition to simple scalar arguments, the CommandLine library also + provides primitives to support CommandLine option aliases, + and lists of options.
+ +So far, the example works well, except for the fact that we need to check the + quiet condition like this now:
+ ++ ... + if (!Quiet && !Quiet2) printInformationalMessage(...); + ... +
... which is a real pain! Instead of defining two values for the same + condition, we can use the "cl::alias" class to make the "-q" + option an alias for the "-quiet" option, instead of providing + a value itself:
+ +
+ cl::opt<bool> Force ("f", cl::desc("Overwrite output files"));
+ cl::opt<bool> Quiet ("quiet", cl::desc("Don't print informational messages"));
+ cl::alias     QuietA("q", cl::desc("Alias for -quiet"), cl::aliasopt(Quiet));
+
The third line (which is the only one we modified from above) defines a + "-q" alias that updates the "Quiet" variable (as specified by + the cl::aliasopt modifier) whenever it is + specified. Because aliases do not hold state, the only thing the program has to + query is the Quiet variable now. Another nice feature of aliases is + that they automatically hide themselves from the -help output + (although, again, they are still visible in the --help-hidden + output).
+ +Now the application code can simply use:
+ ++ ... + if (!Quiet) printInformationalMessage(...); + ... +
... which is much nicer! The "cl::alias" + can be used to specify an alternative name for any variable type, and has many + uses.
+ +So far we have seen how the CommandLine library handles builtin types like + std::string, bool and int, but how does it handle + things it doesn't know about, like enums or 'int*'s?
+ +The answer is that it uses a table-driven generic parser (unless you specify + your own parser, as described in the Extension + Guide). This parser maps literal strings to whatever type is required, and + requires you to tell it what this mapping should be.
+ +Let's say that we would like to add four optimization levels to our + optimizer, using the standard flags "-g", "-O1", + "-O2", and "-O3". We could easily implement this with boolean + options like above, but there are several problems with this strategy:
+ +To cope with these problems, we can use an enum value, and have the + CommandLine library fill it in with the appropriate level directly, which is + used like this:
+ +
+ enum OptLevel {
+   g, O1, O2, O3
+ };
+
+ cl::opt<OptLevel> OptimizationLevel(cl::desc("Choose optimization level:"),
+   cl::values(
+     clEnumVal(g , "No optimizations, enable debugging"),
+     clEnumVal(O1, "Enable trivial optimizations"),
+     clEnumVal(O2, "Enable default optimizations"),
+     clEnumVal(O3, "Enable expensive optimizations"),
+     clEnumValEnd));
+
+ ...
+   if (OptimizationLevel >= O2) doPartialRedundancyElimination(...);
+ ...
+ This declaration defines a variable "OptimizationLevel" of the + "OptLevel" enum type. This variable can be assigned any of the values + that are listed in the declaration (note that the declaration list must be + terminated with the "clEnumValEnd" argument!). The CommandLine + library enforces + that the user can only specify one of the options, and it ensures that only valid + enum values can be specified. The "clEnumVal" macros ensure that the + command line arguments match the enum values. With this option added, our + help output now is:
+ +
+ USAGE: compiler [options] <input file>
+
+ OPTIONS:
+   Choose optimization level:
+     -g          - No optimizations, enable debugging
+     -O1         - Enable trivial optimizations
+     -O2         - Enable default optimizations
+     -O3         - Enable expensive optimizations
+   -f            - Overwrite output files
+   -help         - display available options (--help-hidden for more)
+   -o <filename> - Specify output filename
+   -quiet        - Don't print informational messages
+
In this case, it is sort of awkward that flag names correspond directly to + enum names, because we probably don't want an enum definition named "g" + in our program. Because of this, we can alternatively write this example like + this:
+ +
+ enum OptLevel {
+   Debug, O1, O2, O3
+ };
+
+ cl::opt<OptLevel> OptimizationLevel(cl::desc("Choose optimization level:"),
+   cl::values(
+     clEnumValN(Debug, "g", "No optimizations, enable debugging"),
+     clEnumVal(O1 , "Enable trivial optimizations"),
+     clEnumVal(O2 , "Enable default optimizations"),
+     clEnumVal(O3 , "Enable expensive optimizations"),
+     clEnumValEnd));
+
+ ...
+   if (OptimizationLevel == Debug) outputDebugInfo(...);
+ ...
+ By using the "clEnumValN" macro instead of "clEnumVal", we + can directly specify the name that the flag should get. In general a direct + mapping is nice, but sometimes you can't or don't want to preserve the mapping, + which is when you would use it.
+ +Another useful argument form is a named alternative style. We shall use this + style in our compiler to specify different debug levels that can be used. + Instead of each debug level being its own switch, we want to support the + following options, of which only one can be specified at a time: + "--debug_level=none", "--debug_level=quick", + "--debug_level=detailed". To do this, we use the exact same format as + our optimization level flags, but we also specify an option name. For this + case, the code looks like this:
+ +
+ enum DebugLev {
+   nodebuginfo, quick, detailed
+ };
+
+ // Enable Debug Options to be specified on the command line
+ cl::opt<DebugLev> DebugLevel("debug_level", cl::desc("Set the debugging level:"),
+   cl::values(
+     clEnumValN(nodebuginfo, "none", "disable debug information"),
+     clEnumVal(quick, "enable quick debug information"),
+     clEnumVal(detailed, "enable detailed debug information"),
+     clEnumValEnd));
+ This definition defines an enumerated command line variable of type "enum + DebugLev", which works exactly the same way as before. The difference here + is just the interface exposed to the user of your program and the help output by + the "--help" option:
+ +
+ USAGE: compiler [options] <input file>
+
+ OPTIONS:
+   Choose optimization level:
+     -g          - No optimizations, enable debugging
+     -O1         - Enable trivial optimizations
+     -O2         - Enable default optimizations
+     -O3         - Enable expensive optimizations
+   -debug_level  - Set the debugging level:
+     =none       - disable debug information
+     =quick      - enable quick debug information
+     =detailed   - enable detailed debug information
+   -f            - Overwrite output files
+   -help         - display available options (--help-hidden for more)
+   -o <filename> - Specify output filename
+   -quiet        - Don't print informational messages
+
Again, the only structural difference between the debug level declaration and + the optimization level declaration is that the debug level declaration includes + an option name ("debug_level"), which automatically changes how the + library processes the argument. The CommandLine library supports both forms so + that you can choose the form most appropriate for your application.
+ +Now that we have the standard run-of-the-mill argument types out of the way, + let's get a little wild and crazy. Let's say that we want our optimizer to accept + a list of optimizations to perform, allowing duplicates. For example, we + might want to run: "compiler -dce -constprop -inline -dce -strip". In + this case, the order of the arguments and the number of appearances is very + important. This is what the "cl::list" + template is for. First, start by defining an enum of the optimizations that you + would like to perform:
+ +
+ enum Opts {
+ // 'inline' is a C++ keyword, so name it 'inlining'
+ dce, constprop, inlining, strip
+ };
+ Then define your "cl::list" variable:
+ +
+ cl::list<Opts> OptimizationList(cl::desc("Available Optimizations:"),
+   cl::values(
+     clEnumVal(dce , "Dead Code Elimination"),
+     clEnumVal(constprop , "Constant Propagation"),
+     clEnumValN(inlining, "inline", "Procedure Integration"),
+     clEnumVal(strip , "Strip Symbols"),
+     clEnumValEnd));
+
This defines a variable that is conceptually of the type + "std::vector<enum Opts>". Thus, you can access it with standard + vector methods:
+ +
+ for (unsigned i = 0; i != OptimizationList.size(); ++i)
+   switch (OptimizationList[i])
+     ...
+
... to iterate through the list of options specified.
+ +Note that the "cl::list" template is + completely general and may be used with any data types or other arguments that + you can use with the "cl::opt" template. One + especially useful way to use a list is to capture all of the positional + arguments together if there may be more than one specified. In the case of a + linker, for example, the linker takes several '.o' files, and needs to + capture them into a list. This is naturally specified as:
+ ++ ... + cl::list<std::string> InputFilenames(cl::Positional, cl::desc("<Input files>"), cl::OneOrMore); + ... +
This variable works just like a "vector<string>" object. As + such, accessing the list is simple, just like above. In this example, we used + the cl::OneOrMore modifier to inform the + CommandLine library that it is an error if the user does not specify any + .o files on our command line. Again, this just reduces the amount of + checking we have to do.
+ +Instead of collecting sets of options in a list, it is also possible to + gather information for enum values in a bit vector. The representation used by + the cl::bits class is an unsigned + integer. An enum value is represented by a 0/1 in the enum's ordinal value bit + position: 1 indicates that the enum was specified, 0 that it was not. As each + specified value is parsed, the resulting enum's bit is set in the option's bit + vector:
+ ++ bits |= 1 << (unsigned)enum; +
Options that are specified multiple times are redundant. Any instances after + the first are discarded.
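As a rough illustration of this representation, the sketch below models the bit vector with a small struct. OptBits and its members are hypothetical names introduced here; this is not the cl::bits implementation, only the bit arithmetic it is described as performing.

```cpp
#include <cassert>

enum Opts { dce, constprop, inlining, strip };

// Hypothetical model of the bit vector described above (not library code).
struct OptBits {
  unsigned Bits = 0;

  // Set the bit at the enum's ordinal position; repeating a value is harmless.
  void set(Opts O) { Bits |= 1u << static_cast<unsigned>(O); }

  // Test whether the bit for this enum value has been set.
  bool isSet(Opts O) const {
    return (Bits & (1u << static_cast<unsigned>(O))) != 0;
  }
};
```

Because setting a bit twice leaves the vector unchanged, the order of options and the number of times each appears are both lost, which is the main difference from the list form.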
+ +Reworking the above list example, we could replace + cl::list with cl::bits:
+ +
+ cl::bits<Opts> OptimizationBits(cl::desc("Available Optimizations:"),
+   cl::values(
+     clEnumVal(dce , "Dead Code Elimination"),
+     clEnumVal(constprop , "Constant Propagation"),
+     clEnumValN(inlining, "inline", "Procedure Integration"),
+     clEnumVal(strip , "Strip Symbols"),
+     clEnumValEnd));
+
To test whether constprop was specified, we can use the + cl::bits::isSet function:
+ +
+ if (OptimizationBits.isSet(constprop)) {
+ ...
+ }
+ It's also possible to get the raw bit vector using the + cl::bits::getBits function:
+ ++ unsigned bits = OptimizationBits.getBits(); +
Finally, if external storage is used, then the location specified must be of + type unsigned. In all other ways a cl::bits option is equivalent to a cl::list option.
+ +As our program grows and becomes more mature, we may decide to put summary + information about what it does into the help output. The help output is styled + to look similar to a Unix man page, providing concise information about + a program. Unix man pages, however, often have a description about what + the program does. To add this to your CommandLine program, simply pass a third + argument to the cl::ParseCommandLineOptions + call in main. This additional argument is then printed as the overview + information for your program, allowing you to include any additional information + that you want. For example:
+ +
+ int main(int argc, char **argv) {
+ cl::ParseCommandLineOptions(argc, argv, " CommandLine compiler example\n\n"
+ " This program blah blah blah...\n");
+ ...
+ }
+ would yield the help output:
+ +
+ OVERVIEW: CommandLine compiler example
+
+   This program blah blah blah...
+
+ USAGE: compiler [options] <input file>
+
+ OPTIONS:
+   ...
+   -help         - display available options (--help-hidden for more)
+   -o <filename> - Specify output filename
+
Now that you know the basics of how to use the CommandLine library, this + section will give you the detailed information you need to tune how command line + options work, as well as information on more "advanced" command line option + processing capabilities.
+ +Positional arguments are those arguments that are not named, and are not + specified with a hyphen. Positional arguments should be used when an option is + specified by its position alone. For example, the standard Unix grep + tool takes a regular expression argument, and an optional filename to search + through (which defaults to standard input if a filename is not specified). + Using the CommandLine library, this would be specified as:
+ +
+ cl::opt<string> Regex   (cl::Positional, cl::desc("<regular expression>"), cl::Required);
+ cl::opt<string> Filename(cl::Positional, cl::desc("<input file>"), cl::init("-"));
+
Given these two option declarations, the --help output for our grep + replacement would look like this:
+ ++ USAGE: spiffygrep [options] <regular expression> <input file> + + OPTIONS: + -help - display available options (--help-hidden for more) +
... and the resultant program could be used just like the standard + grep tool.
+ +Positional arguments are sorted by their order of construction. This means + that command line options will be ordered according to how they are listed in a + .cpp file, but will not have an ordering defined if the positional arguments + are defined in multiple .cpp files. The fix for this problem is simply to + define all of your positional arguments in one .cpp file.
+ +Sometimes you may want to specify a value to your positional argument that + starts with a hyphen (for example, searching for '-foo' in a file). At + first, you will have trouble doing this, because it will try to find an argument + named '-foo', and will fail (and single quotes will not save you). + Note that the system grep has the same problem:
+ +
+ $ spiffygrep '-foo' test.txt
+ Unknown command line argument '-foo'. Try: 'spiffygrep --help'
+
+ $ grep '-foo' test.txt
+ grep: illegal option -- f
+ grep: illegal option -- o
+ grep: illegal option -- o
+ Usage: grep -hblcnsviw pattern file . . .
+
The solution for this problem is the same for both your tool and the system + version: use the '--' marker. When the user specifies '--' on + the command line, it is telling the program that all options after the + '--' should be treated as positional arguments, not options. Thus, we + can use it like this:
+ ++ $ spiffygrep -- -foo test.txt + ...output... +
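The effect of the '--' marker can be sketched independently of the library. The code below is a simplified model, not the CommandLine parser: SplitArgs and splitArgs are hypothetical names, and treating every argument that starts with a hyphen (other than a lone "-") as an option is an assumption made for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result type: arguments sorted into options and positionals.
struct SplitArgs {
  std::vector<std::string> Options;
  std::vector<std::string> Positional;
};

// Simplified model: once "--" is seen, every later argument is positional,
// even if it starts with a hyphen.
SplitArgs splitArgs(const std::vector<std::string> &Args) {
  SplitArgs Result;
  bool OnlyPositional = false;
  for (const std::string &Arg : Args) {
    if (!OnlyPositional && Arg == "--") {
      OnlyPositional = true;  // the marker itself is consumed
      continue;
    }
    // Assumption: a lone "-" counts as a positional (e.g. standard input).
    if (!OnlyPositional && Arg.size() > 1 && Arg[0] == '-')
      Result.Options.push_back(Arg);
    else
      Result.Positional.push_back(Arg);
  }
  return Result;
}
```

With this model, the arguments "-- -foo test.txt" yield two positional values and no options, while "-foo test.txt" without the marker treats "-foo" as an option name.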
Sometimes an option can affect or modify the meaning of another option. For + example, consider gcc's -x LANG option. This tells + gcc to ignore the suffix of subsequent positional arguments and force + the file to be interpreted as if it contained source code in language + LANG. In order to handle this properly, you need to know the + absolute position of each argument, especially those in lists, so their + interaction(s) can be applied correctly. This is also useful for options like + -llibname which is actually a positional argument that starts with + a dash.
+So, generally, the problem is that you have two cl::list variables + that interact in some way. To ensure the correct interaction, you can use the + cl::list::getPosition(optnum) method. This method returns the + absolute position (as found on the command line) of the optnum + item in the cl::list.
+The idiom for usage is like this:
+ +
+ static cl::list<std::string> Files(cl::Positional, cl::OneOrMore);
+ static cl::list<std::string> Libraries("l", cl::ZeroOrMore);
+
+ int main(int argc, char**argv) {
+   // ...
+   std::vector<std::string>::iterator fileIt = Files.begin();
+   std::vector<std::string>::iterator libIt = Libraries.begin();
+   unsigned libPos = 0, filePos = 0;
+   while ( 1 ) {
+     if ( libIt != Libraries.end() )
+       libPos = Libraries.getPosition( libIt - Libraries.begin() );
+     else
+       libPos = 0;
+     if ( fileIt != Files.end() )
+       filePos = Files.getPosition( fileIt - Files.begin() );
+     else
+       filePos = 0;
+
+     if ( filePos != 0 && (libPos == 0 || filePos < libPos) ) {
+       // Source File Is next
+       ++fileIt;
+     }
+     else if ( libPos != 0 && (filePos == 0 || libPos < filePos) ) {
+       // Library is next
+       ++libIt;
+     }
+     else
+       break; // we're done with the list
+   }
+ }
+ Note that, for compatibility reasons, the cl::opt also supports an + unsigned getPosition() option that will provide the absolute position + of that option. You can apply the same approach as above with a + cl::opt and a cl::list option as you can with two lists.
+The cl::ConsumeAfter formatting option is + used to construct programs that use "interpreter style" option processing. With + this style of option processing, all arguments specified after the last + positional argument are treated as special interpreter arguments that are not + interpreted by the command line argument processor.
+ +As a concrete example, let's say we are developing a replacement for the + standard Unix Bourne shell (/bin/sh). To run /bin/sh, first + you specify options to the shell itself (like -x which turns on trace + output), then you specify the name of the script to run, then you specify + arguments to the script. These arguments to the script are parsed by the Bourne + shell command line option processor, but are not interpreted as options to the + shell itself. Using the CommandLine library, we would specify this as:
+ +
+ cl::opt<string> Script(cl::Positional, cl::desc("<input script>"), cl::init("-"));
+ cl::list<string> Argv(cl::ConsumeAfter, cl::desc("<program arguments>..."));
+ cl::opt<bool> Trace("x", cl::desc("Enable trace output"));
+
which automatically provides the help output:
+ +
+ USAGE: spiffysh [options] <input script> <program arguments>...
+
+ OPTIONS:
+   -help - display available options (--help-hidden for more)
+   -x    - Enable trace output
+
At runtime, if we run our new shell replacement as `spiffysh -x test.sh + -a -x -y bar', the Trace variable will be set to true, the + Script variable will be set to "test.sh", and the + Argv list will contain ["-a", "-x", "-y", "bar"], because they + were specified after the last positional argument (which is the script + name).
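The split described above can be modelled in a few lines. This is a hand-written sketch under simplifying assumptions, not how the library implements cl::ConsumeAfter: ShellArgs and parseShellArgs are hypothetical names, "-x" is the only shell option recognized, and any other leading argument is taken to be the script name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical holder for the three values from the spiffysh example.
struct ShellArgs {
  bool Trace = false;
  std::string Script = "-";        // mirrors the cl::init("-") default
  std::vector<std::string> Argv;   // everything after the script name
};

// Simplified model of "interpreter style" processing: options are only
// recognized before the first positional argument; everything after it
// is collected verbatim.
ShellArgs parseShellArgs(const std::vector<std::string> &Args) {
  ShellArgs Result;
  std::size_t I = 0;
  for (; I < Args.size(); ++I) {
    if (Args[I] == "-x") {         // assumption: the only known shell option
      Result.Trace = true;
      continue;
    }
    Result.Script = Args[I];       // first positional argument
    ++I;
    break;
  }
  for (; I < Args.size(); ++I)     // not interpreted, just consumed
    Result.Argv.push_back(Args[I]);
  return Result;
}
```

Note that the second "-x" in the example command line ends up in the argument list rather than toggling the trace flag, because it appears after the script name.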
+ +There are several limitations to when cl::ConsumeAfter options can + be specified. For example, only one cl::ConsumeAfter can be specified + per program, there must be at least one positional + argument specified, there must not be any cl::list + positional arguments, and the cl::ConsumeAfter option should be a cl::list option.
+ +By default, all command line options automatically hold the value that they + parse from the command line. This is very convenient in the common case, + especially when combined with the ability to define command line options in the + files that use them. This is called the internal storage model.
+ +Sometimes, however, it is nice to separate the command line option processing + code from the storage of the value parsed. For example, let's say that we have a + '-debug' option that we would like to use to enable debug information + across the entire body of our program. In this case, the boolean value + controlling the debug code should be globally accessible (in a header file, for + example) yet the command line option processing code should not be exposed to + all of these clients (requiring lots of .cpp files to #include + CommandLine.h).
+ +To do this, set up your .h file with your option, like this for example:
+ +
+ // DebugFlag.h - Get access to the '-debug' command line option
+ //
+
+ // DebugFlag - This boolean is set to true if the '-debug' command line option
+ // is specified. This should probably not be referenced directly, instead, use
+ // the DEBUG macro below.
+ //
+ extern bool DebugFlag;
+
+ // DEBUG macro - This macro should be used by code to emit debug information.
+ // If the '-debug' option is specified on the command line, and if this is a
+ // debug build, then the code specified as the option to the macro will be
+ // executed. Otherwise it will not be.
+ #ifdef NDEBUG
+ #define DEBUG(X)
+ #else
+ #define DEBUG(X) do { if (DebugFlag) { X; } } while (0)
+ #endif
+ +
This allows clients to blissfully use the DEBUG() macro, or the + DebugFlag explicitly if they want to. Now we just need to be able to + set the DebugFlag boolean when the option is set. To do this, we pass + an additional argument to our command line argument processor, and we specify + where to fill in with the cl::location + attribute:
+ +
+ bool DebugFlag;                  // the actual value
+ static cl::opt<bool, true>       // The parser
+ Debug("debug", cl::desc("Enable debug output"), cl::Hidden, cl::location(DebugFlag));
+ +
In the above example, we specify "true" as the second argument to + the cl::opt template, indicating that the + template should not maintain a copy of the value itself. In addition to this, + we specify the cl::location attribute, so + that DebugFlag is automatically set.
+ +This section describes the basic attributes that you can specify on + options.
+ ++ cl::opt<bool> Quiet("quiet"); ++ +
Option modifiers are the flags and expressions that you pass into the + constructors for cl::opt and cl::list. These modifiers give you the ability to + tweak how options are parsed and how --help output is generated to fit + your application well.
+ +These options fall into five main categories:
+ +It is not possible to specify two options from the same category (you'll get + a runtime error) to a single option, except for options in the miscellaneous + category. The CommandLine library specifies defaults for all of these settings + that are the most useful in practice and the most common, which means that you + usually shouldn't have to worry about these.
+ +The cl::NotHidden, cl::Hidden, and + cl::ReallyHidden modifiers are used to control whether or not an option + appears in the --help and --help-hidden output for the + compiled program:
+ +This group of options is used to control how many times an option is allowed + (or required) to be specified on the command line of your program. Specifying a + value for this setting allows the CommandLine library to do error checking for + you.
+ +The allowed values for this option group are:
+ +If an option is not specified, then the value of the option is equal to the + value specified by the cl::init attribute. If + the cl::init attribute is not specified, the + option value is initialized with the default constructor for the data type.
+ +If an option is specified multiple times for an option of the cl::opt class, only the last value will be + retained.
+ +This group of options is used to control whether or not the option allows a + value to be present. In the case of the CommandLine library, a value is either + specified with an equal sign (e.g. '-index-depth=17') or as a trailing + string (e.g. '-o a.out').
+ +The allowed values for this option group are:
+ +In general, the default values for this option group work just like you would + want them to. As mentioned above, you can specify the cl::ValueDisallowed modifier to a boolean + argument to restrict your command line parser. These options are mostly useful + when extending the library.
+ +The formatting option group is used to specify that the command line option + has special abilities and is otherwise different from other command line + arguments. As usual, you can only specify one of these arguments at most.
+ +The CommandLine library does not restrict how you use the cl::Prefix or cl::Grouping modifiers, but it is possible to + specify ambiguous argument settings. Thus, it is possible to have multiple + letter options that are prefix or grouping options, and they will still work as + designed.
+ +To do this, the CommandLine library uses a greedy algorithm to parse the + input option into (potentially multiple) prefix and grouping options. The + strategy basically looks like this:
+The miscellaneous option modifiers are the only flags where you can specify + more than one flag from the set: they are not mutually exclusive. These flags + specify boolean properties that modify the option.
+ +So far, these are the only two miscellaneous option modifiers.
+ +Despite all of the built-in flexibility, the CommandLine option library + really only consists of one function (cl::ParseCommandLineOptions) + and three main classes: cl::opt, cl::list, and cl::alias. This section describes these three + classes in detail.
+ +The cl::ParseCommandLineOptions function is designed to be called + directly from main, and is used to fill in the values of all of the + command line option variables once argc and argv are + available.
+ +The cl::ParseCommandLineOptions function requires two parameters + (argc and argv), but may also take an optional third parameter + which holds additional extra text to emit when the + --help option is invoked.
+ +The cl::ParseEnvironmentOptions function has mostly the same effects + as cl::ParseCommandLineOptions, + except that it is designed to take values for options from an environment + variable, for those cases in which reading the command line is not convenient or + desired. It fills in the values of all the command line option variables just + like cl::ParseCommandLineOptions + does.
+ +It takes three parameters: the name of the program (since argv may + not be available, it can't just look in argv[0]), the name of the + environment variable to examine, and the optional + additional extra text to emit when the + --help option is invoked.
+ +cl::ParseEnvironmentOptions will break the environment + variable's value up into words and then process them using + cl::ParseCommandLineOptions. + Note: Currently cl::ParseEnvironmentOptions does not support + quoting, so an environment variable containing -option "foo bar" will + be parsed as three words, -option, "foo, and bar", + which is different from what you would get from the shell with the same + input.
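The consequence of splitting without quoting support is easy to reproduce. The helper below is a hypothetical illustration (splitIntoWords is not a library function, and splitting on whitespace is an assumption about what "words" means here); it only shows why a quoted value ends up as separate words.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: break a string into whitespace-separated words,
// with no special handling of quote characters.
std::vector<std::string> splitIntoWords(const std::string &Value) {
  std::istringstream In(Value);
  std::vector<std::string> Words;
  std::string Word;
  while (In >> Word)
    Words.push_back(Word);
  return Words;
}
```

Applied to the value -option "foo bar", this produces the three words listed above, with the quote characters left attached to "foo and bar".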
+ +The cl::SetVersionPrinter function is designed to be called + directly from main and before + cl::ParseCommandLineOptions. Its use is optional. It simply arranges + for a function to be called in response to the --version option instead + of having the CommandLine library print out the usual version string + for LLVM. This is useful for programs that are not part of LLVM but wish to use + the CommandLine facilities. Such programs should just define a small + function that takes no arguments and returns void and that prints out + whatever version information is appropriate for the program. Pass the address + of that function to cl::SetVersionPrinter to arrange for it to be + called when the --version option is given by the user.
+ +The cl::opt class is the class used to represent scalar command line + options, and is the one used most of the time. It is a templated class which + can take up to three arguments (all except for the first have default values + though):
+ +
+ namespace cl {
+ template <class DataType, bool ExternalStorage = false,
+ class ParserClass = parser<DataType> >
+ class opt;
+ }
+ The first template argument specifies what underlying data type the command + line argument is, and is used to select a default parser implementation. The + second template argument is used to specify whether the option should contain + the storage for the option (the default) or whether external storage should be + used to contain the value parsed for the option (see Internal + vs External Storage for more information).
+ +The third template argument specifies which parser to use. The default value + selects an instantiation of the parser class based on the underlying + data type of the option. In general, this default works well for most + applications, so this option is only used when using a custom parser.
+ +The cl::list class is the class used to represent a list of command + line options. It too is a templated class which can take up to three + arguments:
+ +
+ namespace cl {
+ template <class DataType, class Storage = bool,
+ class ParserClass = parser<DataType> >
+ class list;
+ }
+ This class works exactly the same as the cl::opt class, except that the second argument is + the type of the external storage, not a boolean value. For this class, + the marker type 'bool' is used to indicate that internal storage should + be used.
+ +The cl::bits class is the class used to represent a list of command + line options in the form of a bit vector. It is also a templated class which + can take up to three arguments:
+ +
+ namespace cl {
+ template <class DataType, class Storage = bool,
+ class ParserClass = parser<DataType> >
+ class bits;
+ }
+ This class works exactly the same as the cl::list class, except that the second argument + must be of type unsigned if external storage is used.
+ +The cl::alias class is a nontemplated class that is used to form + aliases for other arguments.
+ +
+ namespace cl {
+ class alias;
+ }
+ The cl::aliasopt attribute should be + used to specify which option this is an alias for. Alias arguments default to + being Hidden, and use the aliased option's parser to do + the conversion from string to data.
+ +The cl::extrahelp class is a nontemplated class that allows extra + help text to be printed out for the --help option.
+ +
+ namespace cl {
+ struct extrahelp;
+ }
+ To use the extrahelp, simply construct one with a const char* + parameter to the constructor. The text passed to the constructor will be printed + at the bottom of the help message, verbatim. Note that multiple + cl::extrahelp instances can be used, but this practice is discouraged. If + your tool needs to print additional help information, put all that help into a + single cl::extrahelp instance.
+For example:
+
+ cl::extrahelp("\nADDITIONAL HELP:\n\n This is the extra help\n");
+ Parsers control how the string value taken from the command line is + translated into a typed value, suitable for use in a C++ program. By default, + the CommandLine library uses an instance of parser<type> if the + command line option specifies that it uses values of type 'type'. + Because of this, custom option processing is specified with specializations of + the 'parser' class.
+ +The CommandLine library provides the following builtin parser + specializations, which are sufficient for most applications. It can, however, + also be extended to work with new data types and new ways of interpreting the + same data. See the Writing a Custom Parser section for more + details on this type of library extension.
+ +Although the CommandLine library has a lot of functionality built into it + already (as discussed previously), one of its true strengths lies in its + extensibility. This section discusses how the CommandLine library works under + the covers and illustrates how to do some simple, common extensions.
+ +One of the simplest and most common extensions is the use of a custom parser. + As discussed previously, parsers are the portion + of the CommandLine library that turns string input from the user into a + particular parsed data type, validating the input in the process.
+ +There are two ways to use a new parser:
+ +Specialize the cl::parser template for + your custom data type.
+ +
This approach has the advantage that users of your custom data type will + automatically use your custom parser whenever they define an option with a value + type of your data type. The disadvantage of this approach is that it doesn't + work if your fundamental data type is something that is already supported.
+ +Write an independent class, using it explicitly from options that need + it.
+ +This approach works well in situations where you would like to parse an + option using special syntax for a not-very-special data-type. The drawback of + this approach is that users of your parser have to be aware that they are using + your parser instead of the builtin ones.
+ +To guide the discussion, we will discuss a custom parser that accepts file + sizes, specified with an optional unit after the numeric size. For example, we + would like to parse "102kb", "41M", "1G" into the appropriate integer value. In + this case, the underlying data type we want to parse into is + 'unsigned'. We choose approach #2 above because we don't want to make + this the default for all unsigned options.
+ +To start out, we declare our new FileSizeParser class:
+ +
+ struct FileSizeParser : public cl::basic_parser<unsigned> {
+ // parse - Return true on error.
+ bool parse(cl::Option &O, const char *ArgName, const std::string &ArgValue,
+ unsigned &Val);
+ };
+ Our new class inherits from the cl::basic_parser template class to + fill in the default, boilerplate code for us. We give it the data type that + we parse into, the last argument to the parse method, so that clients of + our custom parser know what object type to pass in to the parse method. (Here we + declare that we parse into 'unsigned' variables.)
+ +For most purposes, the only method that must be implemented in a custom + parser is the parse method. The parse method is called + whenever the option is invoked, passing in the option itself, the option name, + the string to parse, and a reference to a return value. If the string to parse + is not well-formed, the parser should output an error message and return true. + Otherwise it should return false and set 'Val' to the parsed value. In + our example, we implement parse as:
+ +
+ bool FileSizeParser::parse(cl::Option &O, const char *ArgName,
+ const std::string &Arg, unsigned &Val) {
+ const char *ArgStart = Arg.c_str();
+ char *End;
+
+ // Parse integer part, leaving 'End' pointing to the first non-integer char
+ Val = (unsigned)strtol(ArgStart, &End, 0);
+
+ while (1) {
+ switch (*End++) {
+ case 0: return false; // No error
+ case 'i': // Ignore the 'i' in KiB if people use that
+ case 'b': case 'B': // Ignore B suffix
+ break;
+
+ case 'g': case 'G': Val *= 1024*1024*1024; break;
+ case 'm': case 'M': Val *= 1024*1024; break;
+ case 'k': case 'K': Val *= 1024; break;
+
+ default:
+ // Print an error message if unrecognized character!
+ return O.error(": '" + Arg + "' value invalid for file size argument!");
+ }
+ }
+ }
+ This function implements a very simple parser for the kinds of strings we are + interested in. Although it has some holes (it allows "123KKK" for + example), it is good enough for this example. Note that we use the option + itself to print out the error message (the error method always returns + true) in order to get a nice error message (shown below). Now that we have our + parser class, we can use it like this:
+
+ static cl::opt<unsigned, false, FileSizeParser>
+ MFS("max-file-size", cl::desc("Maximum file size to accept"),
+     cl::value_desc("size"));
+
Which adds this to the output of our program:
+
+ OPTIONS:
+   -help                 - display available options (--help-hidden for more)
+   ...
+   -max-file-size=<size> - Maximum file size to accept
+
And we can test that our parser works correctly now (the test program just + prints out the max-file-size argument value):
+
+ $ ./test
+ MFS: 0
+ $ ./test -max-file-size=123MB
+ MFS: 128974848
+ $ ./test -max-file-size=3G
+ MFS: 3221225472
+ $ ./test -max-file-size=dog
+ -max-file-size option: 'dog' value invalid for file size argument!
+
It looks like it works. The error message that we get is nice and helpful, + and we seem to accept reasonable file sizes. This wraps up the "custom parser" + tutorial.
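The holes mentioned above (such as accepting "123KKK") can be closed by validating the suffix explicitly. The following is a minimal standalone sketch, not part of the CommandLine library: the name parseFileSize is hypothetical, it parses into a 64-bit value rather than 'unsigned', and it reads the number in base 10 only, which differs from the strtol(..., 0) call in the tutorial. It keeps the same convention as parse (return true on error).

```cpp
#include <cctype>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <string>

// Hypothetical helper (not an LLVM API). Returns true on error, false on
// success, mirroring the convention of the parse() method in the tutorial.
bool parseFileSize(const std::string &Arg, std::uint64_t &Val) {
  // Require a leading digit so that "", "dog", "-5" and " 5" are rejected.
  if (Arg.empty() || !std::isdigit(static_cast<unsigned char>(Arg[0])))
    return true;

  const char *Start = Arg.c_str();
  char *End = nullptr;
  errno = 0;
  // Assumption: decimal input only (the tutorial code uses base 0).
  unsigned long long N = std::strtoull(Start, &End, 10);
  if (End == Start || errno == ERANGE)
    return true;

  // At most one unit letter is accepted.
  unsigned Shift = 0;
  switch (*End) {
  case 'k': case 'K': Shift = 10; ++End; break;
  case 'm': case 'M': Shift = 20; ++End; break;
  case 'g': case 'G': Shift = 30; ++End; break;
  default: break;
  }

  // Optional 'i' (as in KiB) only after a unit, then an optional B suffix.
  if (Shift != 0 && *End == 'i')
    ++End;
  if (*End == 'b' || *End == 'B')
    ++End;

  // Anything left over (for example the extra "KK" in "123KKK") is an error.
  if (*End != '\0')
    return true;

  // Reject values that would overflow 64 bits once scaled.
  if (N > (std::numeric_limits<std::uint64_t>::max() >> Shift))
    return true;

  Val = static_cast<std::uint64_t>(N) << Shift;
  return false;
}
```

A FileSizeParser::parse implementation could call such a helper and report failures through O.error(...) as in the tutorial code; that wiring is left out here so the sketch compiles without LLVM headers.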
+ +Several of the LLVM libraries define static cl::opt instances that + will automatically be included in any program that links with that library. + This is a feature. However, sometimes it is necessary to know the value of the + command line option outside of the library. In these cases the library does or + should provide an external storage location that is accessible to users of the + library. Examples of this include the llvm::DebugFlag exported by the + lib/Support/Debug.cpp file and the llvm::TimePassesIsEnabled + flag exported by the lib/VMCore/Pass.cpp file.
+ +TODO: complete this section
+ +TODO: fill in this section
+ +NOTE: This document is a work in progress!
+This document describes the requirements, design, and configuration of the + LLVM compiler driver, llvmc. The compiler driver knows about LLVM's + tool set and can be configured to know about a variety of compilers for + source languages. It uses this knowledge to execute the tools necessary + to accomplish general compilation, optimization, and linking tasks. The main + purpose of llvmc is to provide a simple and consistent interface to + all compilation tasks. This reduces the burden on the end user who can just + learn to use llvmc instead of the entire LLVM tool set and all the + source language compilers compatible with LLVM.
+The llvmc tool is a configurable compiler + driver. As such, it isn't a compiler, optimizer, + or a linker itself but it drives (invokes) other software that performs those + tasks. If you are familiar with the GNU Compiler Collection's gcc + tool, llvmc is very similar.
+The following introductory sections will help you understand why this tool + is necessary and what it does.
+llvmc was invented to make compilation of user programs with + LLVM-based tools easier. To accomplish this, llvmc strives to:
+Additionally, llvmc makes it easier to write a compiler for use + with LLVM, because it:
+At a high level, llvmc operation is very simple. The basic action + taken by llvmc is to simply invoke some tool or set of tools to fill + the user's request for compilation. Every execution of llvmc takes the + following sequence of steps:
+llvmc's operation must be simple, regular and predictable. + Developers need to be able to rely on it to take a consistent approach to + compilation. For example, the invocation:
+
+ llvmc -O2 x.c y.c z.c -o xyz
+ must produce exactly the same results as:
+
+ llvmc -O2 x.c -o x.o
+ llvmc -O2 y.c -o y.o
+ llvmc -O2 z.c -o z.o
+ llvmc -O2 x.o y.o z.o -o xyz
+
To accomplish this, llvmc uses a very simple goal oriented + procedure to do its work. The overall goal is to produce a functioning + executable. To accomplish this, llvmc always attempts to execute a + series of compilation phases in the same sequence. + However, the user's options to llvmc can cause the sequence of phases + to start in the middle or finish early.
+llvmc breaks every compilation task into the following five + distinct phases:
+The following table shows the inputs, outputs, and command line options + applicable to each phase.
| Phase | Inputs | Outputs | Options |
|---|---|---|---|
| Preprocessing | | | |
| Translation | | | |
| Optimization | | | |
| Linking | | | |
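The idea that the phases always run in the same order, while the user's options may start the sequence in the middle or stop it early, can be sketched as below. This is only an illustration under stated assumptions: the enum, the function name phasesToRun, and the way the first and last phase are passed in are all hypothetical, and the "Assembly" phase is inferred from the assembler configuration items rather than taken from the table above.

```cpp
#include <vector>

// The phases in their fixed execution order. "Assembly" is an assumption
// inferred from the assembler.* configuration items.
enum Phase { Preprocessing, Translation, Optimization, Assembly, Linking };

// Hypothetical helper: returns the phases that would run, in order, when the
// user's options make compilation start at First and finish at Last.
// An empty result means the requested range is not valid.
std::vector<Phase> phasesToRun(Phase First, Phase Last) {
  std::vector<Phase> Result;
  for (int P = First; P <= Last; ++P)
    Result.push_back(static_cast<Phase>(P));
  return Result;
}
```

Under this sketch, a full build corresponds to phasesToRun(Preprocessing, Linking), while stopping after translation corresponds to phasesToRun(Preprocessing, Translation).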
An action, with regard to llvmc, is a basic operation that it takes + in order to fulfill the user's request. Each phase of compilation will invoke + zero or more actions in order to accomplish that phase.
+Actions come in two forms:
+This section of the document describes the configuration files used by + llvmc. Configuration information is relatively static for a + given release of LLVM and a compiler tool. However, the details may + change from release to release of either. Users are encouraged to simply use + the various options of the llvmc command and ignore the configuration + of the tool. These configuration files are for compiler writers and LLVM + developers. Those wishing to simply use llvmc don't need to understand + this section but it may be instructive on how the tool works.
+llvmc is highly configurable both on the command line and in + configuration files. The options it understands are generic, consistent and + simple by design. Furthermore, the llvmc options apply to the + compilation of any LLVM enabled programming language. To be enabled as a + supported source language compiler, a compiler writer must provide a + configuration file that tells llvmc how to invoke the compiler + and what its capabilities are. The purpose of the configuration files then + is to allow compiler writers to specify to llvmc how the compiler + should be invoked. Users may alter the compiler's + llvmc configuration, but are advised not to.
+ +Because llvmc just invokes other programs, it must deal with the + available command line options for those programs regardless of whether they + were written for LLVM or not. Furthermore, not all compiler tools will + have the same capabilities. Some compiler tools will simply generate LLVM assembly + code, others will be able to generate fully optimized bitcode. In general, + llvmc doesn't make any assumptions about the capabilities or command + line options of a sub-tool. It simply uses the details found in the + configuration files and leaves it to the compiler writer to specify the + configuration correctly.
+ +This approach means that new compiler tools can be up and working very + quickly. As a first cut, a tool can simply compile its source to raw + (unoptimized) bitcode or LLVM assembly and llvmc can be configured + to pick up the slack (translate LLVM assembly to bitcode, optimize the + bitcode, generate native assembly, link, etc.). In fact, a compiler tool + need not use any LLVM libraries, and it could be written in any language + (instead of C++). The configuration data will allow the full range of + optimization, assembly, and linking capabilities that LLVM provides to be added + to these kinds of tools. Enabling the rapid development of front-ends is one + of the primary goals of llvmc.
+ +As a compiler tool matures, it may utilize the LLVM libraries and tools + to more efficiently produce optimized bitcode directly in a single compilation + and optimization program. In these cases, multiple tools would not be needed + and the configuration data for the compiler would change.
+ +Configuring llvmc to the needs and capabilities of a source language + compiler is relatively straight-forward. A compiler writer must provide a + definition of what to do for each of the five compilation phases for each of + the optimization levels. The specification consists simply of prototypical + command lines into which llvmc can substitute command line + arguments and file names. Note that any given phase can be completely blank if + the source language's compiler combines multiple phases into a single program. + For example, quite often pre-processing, translation, and optimization are + combined into a single program. The specification for such a compiler would have + blank entries for pre-processing and translation but a full command line for + optimization.
+Each configuration file provides the details for a single source language + that is to be compiled. This configuration information tells llvmc + how to invoke the language's pre-processor, translator, optimizer, assembler + and linker. Note that a given source language needn't provide all these tools + as many of them exist in llvm currently.
+llvmc always looks for files of a specific name. It uses the
+ first file with the name it is looking for by searching directories in the
+ following order:
+
The first file found in this search will be used. Other files with the + same name will be ignored even if they exist in one of the subsequent search + locations.
+In the directories searched, each configuration file is given a specific + name to foster faster lookup (so llvmc doesn't have to do directory searches). + The name of a given language specific configuration file is simply the same + as the suffix used to identify files containing source in that language. + For example, a configuration file for C++ source might be named + cpp, C, or cxx. For languages that support multiple + file suffixes, multiple (probably identical) files (or symbolic links) will + need to be provided.
+Which configuration files are read depends on the command line options and + the suffixes of the file names provided on llvmc's command line. Note + that the -x LANGUAGE option alters the language that llvmc + uses for the subsequent files on the command line. Only the configuration + files actually needed to complete llvmc's task are read. Other + language specific files will be ignored.
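The lookup described above (name the configuration file after the source file's suffix, then take the first match in an ordered list of directories) can be sketched as follows. Both function names are hypothetical, the directory list is supplied by the caller because the actual search order is not reproduced here, and the existence check is passed in as a callback so the sketch stays independent of any particular file system API.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: the configuration file name is the suffix of the
// source file name, e.g. "main.cpp" -> "cpp". Returns "" if there is none.
std::string configNameFor(const std::string &FileName) {
  std::string::size_type Slash = FileName.find_last_of('/');
  std::string::size_type Dot = FileName.rfind('.');
  // A dot that belongs to a directory component does not count as a suffix.
  if (Dot == std::string::npos ||
      (Slash != std::string::npos && Dot < Slash))
    return "";
  return FileName.substr(Dot + 1);
}

// Hypothetical helper: returns the first "<dir>/<name>" for which Exists
// reports true, searching Dirs in order; later matches are ignored.
// Returns "" when no directory contains the file.
std::string findConfig(const std::vector<std::string> &Dirs,
                       const std::string &Name,
                       const std::function<bool(const std::string &)> &Exists) {
  for (const std::string &Dir : Dirs) {
    std::string Path = Dir + "/" + Name;
    if (Exists(Path))
      return Path;
  }
  return "";
}
```

Passing the existence check as a parameter is a design choice made for this sketch only; a real driver would presumably query the file system directly.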
+The syntax of the configuration files is very simple and somewhat + compatible with Java's property files. Here are the syntax rules:
+The table below provides definitions of the allowed configuration items + that may appear in a configuration file. Every item has a default value and + does not need to appear in the configuration file. Missing items will have the + default value. Each identifier may appear as all lower case, first letter + capitalized or all upper case.
| Name | Value Type | Description | Default |
|---|---|---|---|
| LLVMC ITEMS | | | |
| version | string | Provides the version string for the contents of this configuration file. What is accepted as a legal configuration file will change over time and this item tells llvmc which version should be expected. | b |
| LANG ITEMS | | | |
| lang.name | string | Provides the common name for a language definition. For example "C++", "Pascal", "FORTRAN", etc. | blank |
| lang.opt1 | string | Specifies the parameters to give the optimizer when -O1 is specified on the llvmc command line. | -simplifycfg -instcombine -mem2reg |
| lang.opt2 | string | Specifies the parameters to give the optimizer when -O2 is specified on the llvmc command line. | TBD |
| lang.opt3 | string | Specifies the parameters to give the optimizer when -O3 is specified on the llvmc command line. | TBD |
| lang.opt4 | string | Specifies the parameters to give the optimizer when -O4 is specified on the llvmc command line. | TBD |
| lang.opt5 | string | Specifies the parameters to give the optimizer when -O5 is specified on the llvmc command line. | TBD |
| PREPROCESSOR ITEMS | | | |
| preprocessor.command | command | This provides the command prototype that will be used to run the preprocessor. This is generally only used with the -E option. | <blank> |
| preprocessor.required | boolean | This item specifies whether the pre-processing phase is required by the language. If the value is true, then the preprocessor.command value must not be blank. With this option, llvmc will always run the preprocessor as it assumes that the translation and optimization phases don't know how to pre-process their input. | false |
| TRANSLATOR ITEMS | | | |
| translator.command | command | This provides the command prototype that will be used to run the translator. Valid substitutions are %in% for the input file and %out% for the output file. | <blank> |
| translator.output | bitcode or assembly | This item specifies the kind of output the language's translator generates. | bitcode |
| translator.preprocesses | boolean | Indicates that the translator also preprocesses. If this is true, then llvmc will skip the pre-processing phase whenever the final phase is not pre-processing. | false |
| OPTIMIZER ITEMS | | | |
| optimizer.command | command | This provides the command prototype that will be used to run the optimizer. Valid substitutions are %in% for the input file and %out% for the output file. | <blank> |
| optimizer.output | bitcode or assembly | This item specifies the kind of output the language's optimizer generates. Valid values are "assembly" and "bitcode". | bitcode |
| optimizer.preprocesses | boolean | Indicates that the optimizer also preprocesses. If this is true, then llvmc will skip the pre-processing phase whenever the final phase is optimization or later. | false |
| optimizer.translates | boolean | Indicates that the optimizer also translates. If this is true, then llvmc will skip the translation phase whenever the final phase is optimization or later. | false |
| ASSEMBLER ITEMS | | | |
| assembler.command | command | This provides the command prototype that will be used to run the assembler. Valid substitutions are %in% for the input file and %out% for the output file. | <blank> |
On any configuration item that ends in command, you must + specify substitution tokens. Substitution tokens begin and end with a percent + sign (%) and are replaced by the corresponding text. Any substitution + token may be given on any command line but some are more useful than + others. In particular each command should have both an %in% + and an %out% substitution. The table below provides definitions of + each of the allowed substitution tokens.
| Substitution Token | Replacement Description |
|---|---|
| %args% | Replaced with all the tool-specific arguments given to llvmc via the -T set of options. This just allows you to place these arguments in the correct place on the command line. If the %args% option does not appear on your command line, then you are explicitly disallowing the -T option for your tool. |
| %force% | Replaced with the -f option if it was specified on the llvmc command line. This is intended to tell the compiler tool to force the overwrite of output files. |
| %in% | Replaced with the full path of the input file. You needn't worry about the cascading of file names. llvmc will create temporary files and ensure that the output of one phase is the input to the next phase. |
| %opt% | Replaced with the optimization options for the tool. If the tool understands the -O options then that will be passed. Otherwise, the lang.optN series of configuration items will specify which arguments are to be given. |
| %out% | Replaced with the full path of the output file. Note that this is not necessarily the output file specified with the -o option on llvmc's command line. It might be a temporary file that will be passed to a subsequent phase's input. |
| %stats% | If your command accepts the -stats option, use this substitution token. If the user requested -stats from the llvmc command line then this token will be replaced with -stats, otherwise it will be ignored. |
| %target% | Replaced with the name of the target "machine" for which code should be generated. The value used here is taken from the llvmc option -march. |
| %time% | If your command accepts the -time-passes option, use this substitution token. If the user requested -time-passes from the llvmc command line then this token will be replaced with -time-passes, otherwise it will be ignored. |
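Token substitution as described in the table can be sketched as a simple scan for %name% pairs. This is an illustrative sketch only, not llvmc's implementation: the function name substitute is hypothetical, the replacement text for each token is supplied by the caller in a map, and leaving unknown tokens untouched is an assumption made for this example.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: replaces each %name% in Command with Tokens[name].
// Tokens that are not in the map are copied through unchanged (assumption).
std::string substitute(const std::string &Command,
                       const std::map<std::string, std::string> &Tokens) {
  std::string Out;
  std::size_t Pos = 0;
  while (Pos < Command.size()) {
    std::size_t Open = Command.find('%', Pos);
    std::size_t Close = (Open == std::string::npos)
                            ? std::string::npos
                            : Command.find('%', Open + 1);
    if (Close == std::string::npos) {
      // No complete %...% pair remains: copy the rest verbatim.
      Out.append(Command, Pos, std::string::npos);
      break;
    }
    // Copy the text before the token.
    Out.append(Command, Pos, Open - Pos);
    std::string Name = Command.substr(Open + 1, Close - Open - 1);
    std::map<std::string, std::string>::const_iterator It = Tokens.find(Name);
    if (It != Tokens.end()) {
      Out += It->second;  // known token: emit its replacement
      Pos = Close + 1;
    } else {
      Out += '%';         // unknown token: keep the '%' and rescan after it
      Pos = Open + 1;
    }
  }
  return Out;
}
```

For instance, a token such as %stats% would map to "-stats" when the user requested it and to the empty string otherwise, which matches the "otherwise it will be ignored" wording in the table.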
Since an example is always instructive, here's how the Stacker language + configuration file looks.
+
+ # Stacker Configuration File For llvmc
+
+ ##########################################################
+ # Language definitions
+ ##########################################################
+ lang.name=Stacker
+ lang.opt1=-simplifycfg -instcombine -mem2reg
+ lang.opt2=-simplifycfg -instcombine -mem2reg -load-vn \
+     -gcse -dse -scalarrepl -sccp
+ lang.opt3=-simplifycfg -instcombine -mem2reg -load-vn \
+     -gcse -dse -scalarrepl -sccp -branch-combine -adce \
+     -globaldce -inline -licm
+ lang.opt4=-simplifycfg -instcombine -mem2reg -load-vn \
+     -gcse -dse -scalarrepl -sccp -ipconstprop \
+     -branch-combine -adce -globaldce -inline -licm
+ lang.opt5=-simplifycfg -instcombine -mem2reg -load-vn \
+     -gcse -dse -scalarrepl -sccp -ipconstprop \
+     -branch-combine -adce -globaldce -inline -licm \
+     -block-placement
+
+ ##########################################################
+ # Pre-processor definitions
+ ##########################################################
+
+ # Stacker doesn't have a preprocessor but the following
+ # allows the -E option to be supported
+ preprocessor.command=cp %in% %out%
+ preprocessor.required=false
+
+ ##########################################################
+ # Translator definitions
+ ##########################################################
+
+ # To compile stacker source, we just run the stacker
+ # compiler with a default stack size of 2048 entries.
+ translator.command=stkrc -s 2048 %in% -o %out% %time% \
+     %stats% %force% %args%
+
+ # stkrc doesn't preprocess but we set this to true so
+ # that we don't run the cp command by default.
+ translator.preprocesses=true
+
+ # The translator is required to run.
+ translator.required=true
+
+ # stkrc doesn't handle the -On options
+ translator.output=bitcode
+
+ ##########################################################
+ # Optimizer definitions
+ ##########################################################
+
+ # For optimization, we use the LLVM "opt" program
+ optimizer.command=opt %in% -o %out% %opt% %time% %stats% \
+     %force% %args%
+
+ optimizer.required = true
+
+ # opt doesn't translate
+ optimizer.translates = no
+
+ # opt doesn't preprocess
+ optimizer.preprocesses=no
+
+ # opt produces bitcode
+ optimizer.output = bc
+
+ ##########################################################
+ # Assembler definitions
+ ##########################################################
+ assembler.command=llc %in% -o %out% %target% %time% %stats%
+
This document uses precise terms in reference to the various artifacts and + concepts related to compilation. The terms used throughout this document are + defined below.
+Note: This document is a work-in-progress. Additions and clarifications + are welcome.
+