From pmeredit at uiuc.edu  Sun Feb  1 00:47:11 2009
From: pmeredit at uiuc.edu (Patrick Meredith)
Date: Sun, 1 Feb 2009 00:47:11 -0600
Subject: [LLVMdev] Performance vs other VMs
In-Reply-To:
References: <200901302056.44693.jon@ffconsultancy.com>
Message-ID: <5750E468-FB59-4A1C-8715-6D19275887AC@uiuc.edu>

Here is a run of scimark2 with verbose GC enabled. You'll see that there
are two garbage collection cycles for a total of around .003 seconds of
time. It should also be noted that these GCs happened before the timer
starts running. There is almost no dynamic memory allocation in this code.
Modern garbage collectors are also very efficient (sometimes better than
hand deallocation).

java -verbose:gc jnt/scimark2/commandline
[GC 511K->202K(1984K), 0.0018845 secs]
[GC 714K->415K(1984K), 0.0015513 secs]

SciMark 2.0a

Composite Score: 327.3062235870194
FFT (1024): 127.42845375506063
SOR (100x100): 677.3128255261597
Monte Carlo : 29.4337095721763
Sparse matmult (N=1000, nz=5000): 300.2107071278524
LU (100x100): 502.14542195384803

java.vendor: Apple Inc.
java.version: 1.5.0_16
os.arch: i386
os.name: Mac OS X
os.version: 10.5.6

On Jan 31, 2009, at 11:25 PM, Ramón García wrote:

> This is not a quite fair comparison. Other virtual machines must be
> doing garbage collection, while LLVM, as it is using C code, it is
> taking advantage of memory allocation by hand.
>
> On Fri, Jan 30, 2009 at 9:56 PM, Jon Harrop wrote:
>>
>> The release of a new code generator in Mono 2.2 prompted me to benchmark
>> the performance of various VMs using the SciMark2 benchmark on an 8x
>> 2.1GHz 64-bit Opteron and I have published the results here:
>>
>> http://flyingfrogblog.blogspot.com/2009/01/mono-22.html
>>
>> The LLVM results were generated using llvm-gcc 4.2.1 on the C version of
>> SciMark2 with the following command-line options:
>>
>> llvm-gcc -Wall -lm -O2 -funroll-loops *.c -o scimark2
>>
>> Mono was up to 12x slower than LLVM before and is now only 2.2x slower on
>> average. Interestingly, the JVM scores slightly higher than LLVM on this
>> benchmark on average and beats LLVM on two of the five individual tests.
>>
>> The individual scores are particularly enlightening. Specifically:
>>
>> . LLVM outperforms all other VMs by a significant margin on FFT, Monte
>>   Carlo and sparse matrix multiply.
>>
>> . LLVM is beaten by the JVM on successive over-relaxation (SOR) and LU
>>   decomposition.
>>
>> In the context of the SOR test, I suspect the JVM is using alias
>> information to perform optimizations that LLVM and llvm-gcc probably do
>> not do.
>>
>> I am not sure what causes the performance discrepancy on LU. Perhaps the
>> JVM is generating SSE instructions. Does llvm-gcc generate SSE
>> instructions under any circumstances?
>>
>> --
>> Dr Jon Harrop, Flying Frog Consultancy Ltd.
>> http://www.ffconsultancy.com/?e
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From jon at ffconsultancy.com  Sun Feb  1 07:38:24 2009
From: jon at ffconsultancy.com (Jon Harrop)
Date: Sun, 1 Feb 2009 13:38:24 +0000
Subject: [LLVMdev] GEPping GEPs and first-class structs
Message-ID: <200902011338.25003.jon@ffconsultancy.com>

As I understand it, first-class structs will allow structs to be passed as
function arguments and returned as results (i.e. multiple return values)
instead of passing pointers to structs.

However, the GEP instruction only handles pointer types. So I do not
understand how you will be able to extract the fields of a struct when it is
received as a value type. Will the GEP instruction be altered so that it can
be applied to structs directly?

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

From jon at ffconsultancy.com  Sun Feb  1 10:01:20 2009
From: jon at ffconsultancy.com (Jon Harrop)
Date: Sun, 1 Feb 2009 16:01:20 +0000
Subject: [LLVMdev] Aliasing (was Performance vs other VMs)
In-Reply-To:
References: <200901302056.44693.jon@ffconsultancy.com>
Message-ID: <200902011601.20343.jon@ffconsultancy.com>

On Sunday 01 February 2009 05:25:40 Ramón García wrote:
> This is not a quite fair comparison. Other virtual machines must be
> doing garbage collection, while LLVM, as it is using C code, it is
> taking advantage of memory allocation by hand.

That is an insignificant advantage in this particular case (SciMark2) because
the memory for each test is preallocated and not part of the measurement and
the heap and stack are both tiny during the computations so there is little
to traverse.

I am interested in the comparative results for LLVM because I consider it to
represent how fast my LLVM-based VM might be compared to other garbage
collected VMs. However, LLVM has a serious disadvantage compared to the other
VMs here because it does not have aliasing assurances. For example, it does
not know about array aliasing, e.g. that the subarrays in the successive
over-relaxation test cannot overlap.

The LLVM 2.1 release notes say that llvm-gcc got alias analysis and understood
the "restrict" keyword but when I add it to the C code for SciMark2 it makes
no difference. Can anyone else get this to work?

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

From nipun2512 at gmail.com  Sun Feb  1 13:45:32 2009
From: nipun2512 at gmail.com (Nipun Arora)
Date: Sun, 1 Feb 2009 14:45:32 -0500
Subject: [LLVMdev] Optimized code analysis problems
In-Reply-To:
References:
Message-ID:

Hi Eli,

Well, I think a way to hack it might be better for my purposes. Can you
suggest any ways of getting started on that, and where?
Essentially I'm developing an IDE and need to extract the dependency graphs
while retaining the actual function names rather than them being converted
to llvm.* names.
If I go for the other option you suggested, I'd have to do a one-to-one
mapping of all possible optimized function calls that could be made from
different libraries imported by the user.

Thanks
Nipun

On Sat, Jan 31, 2009 at 4:48 PM, Eli Friedman wrote:

> On Sat, Jan 31, 2009 at 1:14 PM, Nipun Arora wrote:
> > Hii,
> >
> > Thanks for the response, yes I couldn't find any way to extract the names
> > through any of the passes.
> > Where could I potentially insert a hack so that any function call to
> > intrinsic functions or library functions can be retrieved?
> > Could you gimme any ideas for the start?
>
> Basically, there is no mapping from the llvm.* names to the _mm_*
> names; the transformation is lossy.
>
> You have a couple options here: one is to manipulate the source to let
> you see the _mm_ names, and the other is to catch the _mm_ names
> before the inliner runs.
>
> Manipulating the source isn't actually very hard, although it's a
> non-trivial amount of work; basically, you create your own xmmintrin.h
> that doesn't have inline implementations, and mess with the include
> paths so the compiler picks your version rather than the builtin
> version. That way, once you transform to IL, the _mm_ calls will stay
> as _mm_ calls.
>
> If you're using the standard headers, the _mm_ functions are defined as
> inline functions, so at least in trunk LLVM builds, they exist in the
> IL at some point. They're gone by the time llvm-gcc outputs the IL,
> though, because the inliner unconditionally inlines them. So to get
> the _mm_ names, you'll have to hack the llvm-gcc source to disable the
> inlining pass.
>
> -Eli

From lists at grahamwakefield.net  Sun Feb  1 14:42:44 2009
From: lists at grahamwakefield.net (Graham Wakefield)
Date: Sun, 1 Feb 2009 12:42:44 -0800
Subject: [LLVMdev] OCaml Journal article: Building a Virtual Machine with LLVM
In-Reply-To: <200901251516.22805.jon@ffconsultancy.com>
References: <200901251516.22805.jon@ffconsultancy.com>
Message-ID: <3FACAF52-7D9B-4B55-9C5E-82C39634F2FB@grahamwakefield.net>

I'd love to read this article, but I can't justify paying to register.
Will it become a 'freely available' article at any point soon?

Thanks

On Jan 25, 2009, at 7:16 AM, Jon Harrop wrote:

>
> Following on from the success of our previous OCaml Journal articles
> covering LLVM, we have begun a series dedicated to the design and
> implementation of high-level languages using LLVM. In particular, these
> new articles are more pragmatic in nature and go beyond describing working
> compilers to also discuss testing, debugging and the performance of
> LLVM-based compilers.
>
> The first article in this new series has just been published:
>
> http://ocamlnews.blogspot.com/2009/01/building-virtual-machine-using-llvm.html
>
> This article describes a basic design for a High-Level Virtual Machine
> (HLVM) and walks through a core implementation written in OCaml that can
> JIT compile functions from a simple language (with unit, bool, int, float
> and array types) to optimized native code and execute them.
>
> Future articles in this series will add tail calls, tuples, algebraic
> datatypes, run-time types, reflection, accurate garbage collection,
> dynamically-loaded libraries and FFI to C, first-class functions and many
> more features.
>
> When our HLVM reaches a more advanced stage, with sum types and garbage
> collection, we shall release it as open source software and encourage
> others to build interoperable language implementations upon it, i.e.
> creating a common language run-time.
>
> Many thanks,
> --
> Dr Jon Harrop, Flying Frog Consultancy Ltd.
> http://www.ffconsultancy.com/?e

From clattner at apple.com  Mon Feb  2 00:10:26 2009
From: clattner at apple.com (Chris Lattner)
Date: Sun, 1 Feb 2009 22:10:26 -0800
Subject: [LLVMdev] -msse3 can degrade performance
In-Reply-To: <200901310543.29803.jon@ffconsultancy.com>
References: <200901310143.30492.jon@ffconsultancy.com> <200901310543.29803.jon@ffconsultancy.com>
Message-ID: <928B9BEC-092B-4E8F-BB2C-A5EEFC5A5873@apple.com>

On Jan 30, 2009, at 9:43 PM, Jon Harrop wrote:

> On Saturday 31 January 2009 03:42:04 Eli Friedman wrote:
>> On Fri, Jan 30, 2009 at 5:43 PM, Jon Harrop wrote:
>>> I just remembered an anomalous result that I stumbled upon whilst
>>> tweaking the command-line options to llvm-gcc. Specifically, the
>>> -msse3 flag
>>
>> The -msse3 flag? Does the -msse2 flag have a similar effect?
>
> Yes:

Hi Jon,

I'm seeing exactly identical .s files with -msse2 and -msse3 on the
scimark version I have.
Can you please send the output of:

llvm-gcc -O3 MonteCarlo.c -S -msse2 -o MonteCarlo.2.s
llvm-gcc -O3 MonteCarlo.c -S -msse3 -o MonteCarlo.3.s
llvm-gcc -O3 MonteCarlo.c -S -msse2 -o MonteCarlo.2.ll -emit-llvm
llvm-gcc -O3 MonteCarlo.c -S -msse3 -o MonteCarlo.3.ll -emit-llvm

Thanks,

-Chris

>
>
> $ llvm-gcc -Wall -lm -O3 -msse2 *.c -o scimark2
> $ ./scimark2
> Composite Score:         525.99
> FFT            Mflops:   538.35    (N=1024)
> SOR            Mflops:   472.29    (100 x 100)
> MonteCarlo:    Mflops:   120.92
> Sparse matmult Mflops:   585.14    (N=1000, nz=5000)
> LU             Mflops:   913.27    (M=100, N=100)
>
> But -msse does not:
>
> $ llvm-gcc -Wall -lm -O3 -msse *.c -o scimark2
> $ ./scimark2
> Composite Score:         540.08
> FFT            Mflops:   535.04    (N=1024)
> SOR            Mflops:   469.99    (100 x 100)
> MonteCarlo:    Mflops:   197.38
> Sparse matmult Mflops:   587.77    (N=1000, nz=5000)
> LU             Mflops:   910.22    (M=100, N=100)
>
> That was x64 and I get similar results for x86.
>
> Is there some kind of contention between the integer and SSE
> registers?
>
> --
> Dr Jon Harrop, Flying Frog Consultancy Ltd.
> http://www.ffconsultancy.com/?e

From clattner at apple.com  Mon Feb  2 01:06:03 2009
From: clattner at apple.com (Chris Lattner)
Date: Sun, 1 Feb 2009 23:06:03 -0800
Subject: [LLVMdev] Adding legal integer sizes to TargetData
Message-ID:

Now that 2.5 is about to branch, I'd like to bring up one of Scott's
favorite topics: certain optimizers widen or narrow arithmetic, without
regard for whether the type is legal for the target. In his specific
case, instcombine is turning an i32 multiply into an i64 multiply in
order to eliminate a cast. This does simplify/reduce the number of IR
operations, but an i64 multiply is dramatically more expensive than an
i32 multiply on CellSPU.

There are a couple of different ways to look at this.

On the one hand, I still strongly believe that codegen should be able to
re-narrow operations (and it does on his testcase on i386). However,
codegen is currently doing these optimizations on a per-basic block
basis, and we're not likely to have whole-function dags in the near
future, so there is an inherent limit to its power.

An earlier place to handle this is in codegen prepare, which is global.
However, the bad thing about this is that it would effectively require
duplicating all the type legalization code in CGP, which is a pass we
want to shrink, not grow. OTOH, the whole CGP pass is really a hack
around selection dags not being whole-function.

A third way to handle this is to add to target data a notion of "native
types". Instcombine could then be constrained to not do the
widening/narrowing transformations when the original type (i32 in this
case) was native but the destination type (i64) is non-native. On the
one hand, adding this to targetdata is simple and straightforward with
well-defined semantics. OTOH, it is somewhat ugly that IR
canonicalization gets a bit more target-specific. On the third hand,
instcombine already promotes indices of GEPs to match the pointer size
etc, so it wouldn't be too crazy for it to do this.

What do others think about this?

-Chris

From jay.foad at gmail.com  Mon Feb  2 09:00:06 2009
From: jay.foad at gmail.com (Jay Foad)
Date: Mon, 2 Feb 2009 15:00:06 +0000
Subject: [LLVMdev] bug 3367
Message-ID:

Please can you consider fixing bug 3367 in the trunk and 2.5 branch?
It's an assertion failure in an optimization pass that I hit when
compiling a real C++ application. It's caused by -inline not updating
the call graph when it replaces a call with an invoke. There's a very
small test case attached to the bug, as well as a pretty obvious fix.

http://llvm.org/bugs/show_bug.cgi?id=3367

Thanks!
Jay.

From listiges at arcor.de  Mon Feb  2 09:14:55 2009
From: listiges at arcor.de (Nico)
Date: Mon, 2 Feb 2009 16:14:55 +0100
Subject: [LLVMdev] OpenCL kernel to bitcode
Message-ID:

Hi,

is there any possibility to compile OpenCL kernels into LLVM-bitcode?

Thanx,
Nico

From criswell at cs.uiuc.edu  Mon Feb  2 09:17:30 2009
From: criswell at cs.uiuc.edu (John Criswell)
Date: Mon, 2 Feb 2009 09:17:30 -0600
Subject: [LLVMdev] Test Email
Message-ID: <49870E8A.8020902@cs.uiuc.edu>

Dear All,

This is a test. Please ignore.

-- John T.

From snaroff at apple.com  Mon Feb  2 09:24:21 2009
From: snaroff at apple.com (steve naroff)
Date: Mon, 2 Feb 2009 10:24:21 -0500
Subject: [LLVMdev] [cfe-commits] r63168 - /cfe/trunk/Driver/clang.cpp
In-Reply-To: <7925cd330901280444i2fbf835dh92eaa699de56b855@mail.gmail.com>
References: <200901280243.n0S2haQj002955@zion.cs.uiuc.edu> <7925cd330901280444i2fbf835dh92eaa699de56b855@mail.gmail.com>
Message-ID:

Hi Piotr,

This also breaks the hand-built VC++ project.

Any clues on where I should define this?

Thanks,

snaroff

On Jan 28, 2009, at 7:44 AM, Piotr Rak wrote:

> 2009/1/28 Mike Stump:
>> Author: mrs
>> Date: Tue Jan 27 20:43:35 2009
>> New Revision: 63168
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=63168&view=rev
>> Log:
>> Add a preliminary version number.
>>
>> Modified:
>>    cfe/trunk/Driver/clang.cpp
>>
>> Modified: cfe/trunk/Driver/clang.cpp
>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/Driver/clang.cpp?rev=63168&r1=63167&r2=63168&view=diff
>>
>> ==============================================================================
>> --- cfe/trunk/Driver/clang.cpp (original)
>> +++ cfe/trunk/Driver/clang.cpp Tue Jan 27 20:43:35 2009
>> @@ -1620,6 +1620,10 @@
>>      }
>>    }
>>
>> +  if (Verbose)
>> +    fprintf(stderr, "clang version 1.0 based upon " PACKAGE_STRING
>> +            " hosted on " LLVM_HOSTTRIPLE "\n");
>> +
>>    if (unsigned NumDiagnostics = Diags.getNumDiagnostics())
>>      fprintf(stderr, "%d diagnostic%s generated.\n", NumDiagnostics,
>>              (NumDiagnostics == 1 ? "" : "s"));
>>
>>
>> _______________________________________________
>> cfe-commits mailing list
>> cfe-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>
>
> Hi,
>
> This commit breaks cmake build for me. PACKAGE_STRING is not set by
> llvm toplevel CMakeList.txt, and later defined by
> 'include/llvm/Config/config.h.cmake'.
> I also changed PACKAGE_VERSION to match with one from 'configure.ac'.
>
> Attached fix (for llvm).
>
> Piotr
> <cmake_package_string.diff>

From zhousheng00 at gmail.com  Mon Feb  2 09:48:24 2009
From: zhousheng00 at gmail.com (Zhou Sheng)
Date: Mon, 2 Feb 2009 23:48:24 +0800
Subject: [LLVMdev] Proposal: Debug information improvement - keep the line number with optimizations
Message-ID: <8abe0dc60902020748l8cbda49h3552a1954e3e43be@mail.gmail.com>

Hi,

I've been thinking about how to keep the line number with the llvm
transform/Analysis passes.
Basically, I agree with Chris's notes
(http://www.nondot.org/sabre/LLVMNotes/DebugInfoImprovements.txt), and I will
follow his way to turn on the line number information when optimization is
enabled. Here is a detailed proposal:

1. Introduction

At the time of this writing, LLVM's DWARF debug info generation works
reasonably well at -O0, but it is completely disabled at optimization levels
-O1 and higher. This is because our debug info representation interferes
with optimizations, transparently disabling them in cases where they would
not update it correctly. This is useful for preserving correct debug info,
but it is not what people expect when they use 'llvm-gcc -O3 -g foo.c'.
(From Chris Lattner)

... ...

This document describes a path forward that will get us to a place where
turning on debug info does not pessimize code, and still preserves the
invariant that we don't produce bogus debug info.

*1.1 Goals*

The goals for this project are to:

1. Enable optimization when line number info is turned on.
2. Do not generate incorrect/bogus line number info.

The goals of this proposal are to:

1. Clearly state the requirements.
2. Identify a work plan that will satisfy those requirements.
3. Estimate time, schedule and cost for the work plan.

*1.2 Resources, Tools and Methods*

This work will be accomplished using the following:

1. Intel(R) Xeon(R) CPU E5420 2.50GHz hardware
2. Linux 2.6 Kernel (Fedora Core 6 or Cent OS 5)
3. GCC 4.1 compiler
4. C++ programming language
5. LLVM, RELEASE 2.3
6. LLVM-GCC4.2 RELEASE 2.3

The method for performing this work will use the LLVM project's open source
development policies, which include:

1. Incremental development. A progression of small changes to the LLVM code
   base will be made. To incrementally move LLVM in the desired direction,
   this generally means adding features and testing them before removing the
   functionality they are intended to replace. This approach maintains
   compatibility with previous designs until new designs are provably
   correct.
2. Validation with test cases. With each incremental change, the llvm/test
   and llvm-test test suites should be run to (a) prove the new
   functionality and (b) expose potential weaknesses in the implementation.
   Additionally, sets of unit test cases should be developed for each new
   feature.
3. Milestone Validation. Each milestone will be recognized as completed when
   the associated set of test cases functions correctly.
4. Open Source with peer review (Optional). Each patch is submitted to the
   LLVM commit-list via email for review and comment. This is for
   contribution back to LLVM in the future.
5. Incremental documentation. As new features are added to LLVM, the
   documentation will be updated at the same time.
6. Bugzilla Tracking.

2. Requirements

As LLVM optimization passes change the original input source code a lot, it
is not trivial work to keep the debug information in the optimized code. The
point is that we should make sure the debug information in the optimized code
is totally correct. For now, I think there is no absolute solution for this
project. A reasonable scheme is to keep the correct debug info and, if the
debug info becomes incorrect after optimization, just remove it. It does not
generate silently broken information.

(From Chris) This is a long project, and will take quite a bit of work in
all areas before we can declare "success", but it is worthwhile, and
important and useful steps can be made without solving the whole problem.

This proposal should solve half of this problem, namely keeping the line
number information in optimized code.

(From Chris) The following sub-sections define specific requirements to
improve the debug information in LLVM.

*2.1 Verification Flow*

The most important part of this project is to make sure the debug
information does not block any optimization by LLVM transform passes.
Here I propose a way to determine whether codegen is being impacted by debug
info. This is also useful for scanning the LLVM transform pass list to find
which passes need to be updated to work with debug information.

*From Chris:*

Add a -strip-debug pass that removes all debug info from the LLVM IR. Given
this, it would allow us to do:

  $ llvm-gcc -O3 -c -o - | llc > good.s
  $ llvm-gcc -O3 -c -g -o - | opt -strip-debug | llc > test.s
  $ diff good.s test.s

If the two .s files differed, then badness happened. This obviously only
catches badness that happens in the LLVM optimizer, if the code generator is
broken, we'll need something more sophisticated that strips debug info out
of the .s file. In any case, this is a good place to start, and should be
turned into a llvm-test TEST/report.

Incidentally, we have to go through codegen, we can't diff .ll files after
debug info is stripped out. This is because debug info is allowed to (and
probably does) impact local names within functions, but these functions are
removed at codegen and are not important to preserve.

*End*

*2.2 A Pass to clean up the debug info*

LLVM already has a transform pass "-strip-debug", which removes all the
debug information. But for the first half of this project, we want to keep
just the line number information (stop point) in the optimized code. So we
need a new transform pass that removes only the variable declaration
information.

Pass "-strip-debug" also doesn't clean up the dead variables and function
calls left behind by debug information; it assumes other passes like "-dce"
or "-globaldce" can handle this. But as we are also going to update those
passes, we can't use them in the verification flow, otherwise it may output
incorrect check results.

The new pass "-strip-debug-pro" should have the following functions:

1. Just remove the variable declaration information and clean up the dead
   debug information.
2. Just remove the line number information and clean up the dead debug
   information.
3. Remove all the debug information and clean up.

*2.3 Front End Changes*

For the first half of the project, we just aim to handle the line number
debug information. So we need to force llvm-gcc not to emit any variable
declaration information.

*2.4 Optimization Transform Changes*

From the output of the check script, we get a list of passes to update. We
then follow the list and update the passes one by one. After finishing a
single pass, go back and run llvm/test and llvm-test, applying the pass
"-strip-debug-pro" right after the updated pass to see whether it works
correctly.

3. Proposed Work Plan

This section defines a proposed work plan to accomplish the requirements
that we desire. The work plan is broken into several distinct phases that
follow a logical progression of modifications to the LLVM software.

*3.1 Phase 1: Establish the testing system*

One of the most useful things to get started is to have some way to
determine whether codegen is being impacted by debug info. It is important
to be able to tell when this happens so that we can track down these places
and fix them.

*3.1.1 Pass Scanning Script*

Following the way proposed by Chris, it is good to have a script to scan the
standard LLVM transform pass list.
We can get the standard compile optimization pass list by:

  $ opt -std-compile-opts -debug-pass=Arguments foo.bc > /dev/null
  Pass Arguments: -preverify -domtree -verify -lowersetjmp -raiseallocs
    -simplifycfg -domtree -domfrontier -mem2reg -globalopt -globaldce
    -ipconstprop -deadargelim -instcombine -simplifycfg -basiccg -prune-eh
    -inline -argpromotion -tailduplicate -simplify-libcalls -instcombine
    -jump-threading -simplifycfg -domtree -domfrontier -scalarrepl
    -instcombine -break-crit-edges -condprop -tailcallelim -simplifycfg
    -reassociate -domtree -loops -loopsimplify -domfrontier
    -scalar-evolution -lcssa -loop-rotate -licm -lcssa -loop-unswitch
    -scalar-evolution -lcssa -loop-index-split -instcombine
    -scalar-evolution -domfrontier -lcssa -indvars -domfrontier
    -scalar-evolution -lcssa -loop-unroll -instcombine -domtree -memdep
    -gvn -memcpyopt -sccp -instcombine -break-crit-edges -condprop -memdep
    -dse -mergereturn -postdomtree -postdomfrontier -adce -simplifycfg
    -strip-dead-prototypes -printusedtypes -deadtypeelim -constmerge
    -preverify -domtree -verify

The script should look like:

  #!/bin/sh
  # Pass list as printed by "opt -std-compile-opts -debug-pass=Arguments".
  OPTS="-preverify -domtree -verify -lowersetjmp -raiseallocs
    -simplifycfg -domtree -domfrontier -mem2reg -globalopt -globaldce
    -ipconstprop -deadargelim -instcombine -simplifycfg -basiccg -prune-eh
    -inline -argpromotion -tailduplicate -simplify-libcalls -instcombine
    -jump-threading -simplifycfg -domtree -domfrontier -scalarrepl
    -instcombine -break-crit-edges -condprop -tailcallelim -simplifycfg
    -reassociate -domtree -loops -loopsimplify -domfrontier
    -scalar-evolution -lcssa -loop-rotate -licm -lcssa -loop-unswitch
    -scalar-evolution -lcssa -loop-index-split -instcombine
    -scalar-evolution -domfrontier -lcssa -indvars -domfrontier
    -scalar-evolution -lcssa -loop-unroll -instcombine -domtree -memdep
    -gvn -memcpyopt -sccp -instcombine -break-crit-edges -condprop -memdep
    -dse -mergereturn -postdomtree -postdomfrontier -adce -simplifycfg
    -strip-dead-prototypes -printusedtypes -deadtypeelim -constmerge
    -preverify -domtree -verify"

  # Build the input twice: with debug info (as text) and without (reference).
  llvm-gcc -g -emit-llvm -c "$1" -o "$1.db1.ll" -S
  llvm-gcc -emit-llvm -c "$1" -o good.bc
  # Drop the variable declarations so only line number info remains.
  sed '/call void @llvm.dbg.declare/d' "$1.db1.ll" > "$1.db2.ll"
  llvm-as "$1.db2.ll" -f

  for p in $OPTS; do
    # Run the pass on the debug build, then strip debug info and codegen.
    opt $p "$1.db2.bc" -o "$1.db2.bc" -f
    opt -strip-debug -deadtypeelim -dce -globaldce -deadtypeelim "$1.db2.bc" | llc > test.s -f
    # Run the same pass on the reference build and codegen.
    opt $p -strip-debug -deadtypeelim -dce -globaldce -deadtypeelim good.bc -o good.bc -f
    llc good.bc > good.s -f
    echo "PASS $p : " >> diff.log
    # diff exits 0 only if the two assembly files are identical.
    if diff good.s test.s >> diff.log 2>&1; then
      echo "PASS $p : SUCC"
    else
      echo "PASS $p : FAIL"
    fi
  done

For example:

foo.c:

  int foo(int x, int y) {
    return x + y;
  }

  $ ./check.sh foo.c
  PASS -preverify : SUCC
  PASS -domtree : SUCC
  PASS -verify : SUCC
  PASS -lowersetjmp : SUCC
  PASS -raiseallocs : SUCC
  PASS -simplifycfg : SUCC
  PASS -domtree : SUCC
  PASS -domfrontier : SUCC
  PASS -mem2reg : FAIL
  PASS -globalopt : FAIL
  PASS -globaldce : FAIL
  PASS -ipconstprop : FAIL
  PASS -deadargelim : FAIL
  PASS -instcombine : FAIL
  PASS -simplifycfg : FAIL

Check the log file:

  PASS -preverify :
  PASS -domtree :
  PASS -verify :
  PASS -lowersetjmp :
  PASS -raiseallocs :
  PASS -simplifycfg :
  PASS -domtree :
  PASS -domfrontier :
  PASS -mem2reg :
  8,9c8,14
  <       movl    4(%esp), %eax
  <       addl    8(%esp), %eax
  ---
  >       subl    $8, %esp
  >       movl    12(%esp), %eax
  >       movl    %eax, 4(%esp)
  >       movl    16(%esp), %eax
  >       movl    %eax, (%esp)
  >       addl    4(%esp), %eax
  >       addl    $8, %esp

For the above example, we found that the transform pass "mem2reg" obviously
did not do its work when the debug information was kept. We then know that
we need to update it and re-test.

*3.1.2 Update the LLVM testing system*

The LLVM testing infrastructure contains two major categories of tests: code
fragments and whole programs. Code fragments are referred to as the "DejaGNU
tests" and are in the llvm module in subversion under the llvm/test
directory. The whole program tests are referred to as the "Test suite" and
are in the test-suite module in subversion.
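
The comparison step of the check script, which is the part the test system
would reuse, can be isolated into a small helper. The following is only a
sketch under assumptions: the function name report_pass and its argument
order are hypothetical and not part of the script as posted, and it assumes
a POSIX shell with diff available. It needs no LLVM tools, so it can be tried
on any two text files.

```shell
#!/bin/sh
# Hypothetical helper (not part of check.sh as posted): compare the assembly
# produced without debug info against the assembly produced with debug info
# that was stripped afterwards, and report the result for one pass.
#   $1 = pass name   $2 = reference .s file   $3 = .s file under test
#   $4 = log file that collects the diff output
report_pass() {
    pass=$1
    good=$2
    cand=$3
    log=$4
    echo "PASS $pass : " >> "$log"
    # diff exits 0 only when both files are identical; any differences are
    # appended to the log so they can be inspected later.
    if diff "$good" "$cand" >> "$log" 2>&1; then
        echo "PASS $pass : SUCC"
    else
        echo "PASS $pass : FAIL"
    fi
}
```

With such a helper, the end of the loop body in the script would reduce to a
single call such as: report_pass "$p" good.s test.s diff.log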

Scan all the test cases, find those that use the specified transform, and
add a script similar to the one previously mentioned. Write the results into
an llvm-test TEST/report.

*3.2 Phase 2: New Pass to Strip Debug Information*

LLVM already has a transform pass "-strip-debug", which removes all the
debug information. But for the first half of this project, we want to keep
just the line number information (stop point) in the optimized code. So we
need a new transform pass that removes only the variable declaration
information.

Pass "-strip-debug" also doesn't clean up the dead variables and function
calls left behind by debug information; it assumes other passes like "-dce"
or "-globaldce" can handle this. But as we are also going to update those
passes, we can't use them in the verification flow, otherwise it may output
incorrect check results.

The new pass "-strip-debug-pro" should have the following functions:

1. Just remove the variable declaration information and clean up the dead
   debug information.
2. Remove all the debug information and clean up.

*3.2.1 Work Plan*

1. Use the transform pass StripSymbol.cpp as a reference.
2. Based on StripSymbol.cpp, add an option that just removes debug
   information, like "-rm-debug".
3. Add an option that just removes the variable declaration information,
   like "-rm-debug=2".
4. Add a procedure to clean up the dead variables and function calls that
   exist only for debug purposes.

*3.3 Phase 3: Extend llvm-gcc*

Once we have a way to verify what is happening, I propose that we aim for an
intermediate point: instead of having -O disable all debug info, we should
make it disable just variable information, but keep emitting line number
info. This would allow stepping through the program, getting stack traces,
use performance tools like shark, etc.
We need the front-end llvm-gcc to have a mode that causes it to emit line number info but not variable info, we can go through the process above to identify passes that change behavior when line number intrinsics are in the code. *1.3.1 **Work Plan* 1. First locate the file position that llvm-gcc handle the parameter options. 2. Add a new option to control the llvm-gcc to emit specified debug information: like ?g1. ?g1 to only emit line number. 3. Building the new llvm-gcc 4. Testing through llvm/test, llvm-test *2.4 ** Phase 4: Update Transform Passes for Line Number Info.* When the front-end has a mode that causes it to emit line number info but not variable info, we can go through the process above to identify passes that change behavior when line number intrinsics are in the code. Obvious cases are things like loop unroll and inlining: they 'measure' the size of some code to determine whether to unroll it or not. This means that it should be enhanced to ignore debug intrinsics for the sake of code size estimation. Another example is optimizations like SimplifyCFG when it merges if/then/else into select instructions. SimplifyCFG will have to be enhanced to ignore debug intrinsics when doing its safety/profitability analysis, but then it will also have to be updated to just delete the line number intrinsics when it does the xform. This is simplifycfg's way of "updating" the debug info for this example transformation. As we progress through various optimizations, we will find cases where it is possible to update (e.g. loop unroll or inlining, which doesn't have to do anything special to update line #'s) and places where it isn't. As long as the debug intrinsics don't affect codegen, we are happy, even if the debug intrinsics are deleted in cases where it would be possible to update them (this becomes a optimized debugging QoI issue). *3.4.1** Work Plan* 1. Update transform pass mem2reg 2. Testing through llvm/test, llvm-test 3. Update transform pass simplifycfg 4. 
Testing through llvm/test, llvm-test
5. Likewise, update transform passes globalopt, globaldce, ipconstprop,
   deadargelim, instcombine...
6. Update other passes and test them.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090202/cbcd8e52/attachment-0001.html

From dag at cray.com Mon Feb 2 12:08:11 2009
From: dag at cray.com (David Greene)
Date: Mon, 2 Feb 2009 12:08:11 -0600
Subject: [LLVMdev] undefs in phis
In-Reply-To:
References: <200901291647.26270.dag@cray.com> <200901301552.24534.dag@cray.com>
Message-ID: <200902021208.12714.dag@cray.com>

On Friday 30 January 2009 16:54, Evan Cheng wrote:

> I don't have the whole context to understand why you think this is a
> bug. An implicit_def doesn't actually define any value. So we don't
> care if a live interval overlaps live ranges defined by an implicit_def.

It's a bug because the coalescer does illegal coalescing.

Our last episode left us here:

bb134:
  2696 %reg1645 = FsMOVAPSrr %reg1458 ; srcLine 0

bb74:
  2700 %reg1176 = FsMOVAPSrr %reg1645 ; srcLine 0
  [deleted copy]
  2708 %reg1178 = FsMOVAPSrr %reg1647 ; srcLine 0 *** u before d
  2712 TEST64rr %reg1173, %reg1173, %EFLAGS ; srcLine 30
  2716 JLE mbb, %EFLAGS ; srcLine 0

bb108:
  [...]
  4352 %reg1253 = MAXSSrr %reg1253, %reg1588 ; srcLine 60
  4356 %reg1645 = FsMOVAPSrr %reg1253 ; srcLine 0
  4360 %reg1177 = FsMOVAPSrr %reg1176 ; srcLine 0 *** updated
  4364 %reg1647 = FsMOVAPSrr %reg1243 ; srcLine 0
  4368 JMP mbb ; srcLine 0

This still looks correct. The coalescer then says:

4360 %reg1177 = FsMOVAPSrr %reg1176 ; srcLine 0
Inspecting %reg1176,0 = [2702,4362:0) 0 at 2702-(4362) and
%reg1177,0 = [2700,3712:0)[3768,3878:0)[4362,4372:0) 0 at 4362-(3878):
Joined.
Result = %reg1177,0 = [2700,4372:0) 0 at 2702-(4362)

Now let's look at the resulting code:

bb134:
  2696 %reg1645 = FsMOVAPSrr %reg1458 ; srcLine 0

bb74:
  2700 %reg1177 = FsMOVAPSrr %reg1645 ; srcLine 0 *** u
  [deleted copy]
  2708 %reg1178 = FsMOVAPSrr %reg1647 ; srcLine 0 *** u before d
  2712 TEST64rr %reg1173, %reg1173, %EFLAGS ; srcLine 30
  2716 JLE mbb, %EFLAGS ; srcLine 0

bb108:
  [...]
  4352 %reg1253 = MAXSSrr %reg1253, %reg1588 ; srcLine 60
  4356 %reg1645 = FsMOVAPSrr %reg1253 ; srcLine 0
  [deleted copy]
  4364 %reg1647 = FsMOVAPSrr %reg1243 ; srcLine 0
  4368 JMP mbb ; srcLine 0

The very first instruction in bb74 is wrong. The coalescer has said that y
always has the same value as x and that's incorrect. y is always one value
"behind" x in the original source.

The coalescer thinks it can do this because %reg1177 only has one value
number and that VN is marked as a copy from %reg1176. What I'm saying is
that while %reg1177 really is copied from %reg1176, it also has some
initial value ("undef") coming into the loop. That value is not captured by
the live interval information. It's because of this value that we cannot
coalesce %reg1177 and %reg1176. It's because of this value that %reg1177 is
always one value "behind" %reg1176.

Now, if there's some other way to tell the coalescer that the coalescing is
illegal, that's fine. I don't care about the undef value number itself. I
care about the coalescer behaving itself. :) I just don't know how to tell
the coalescer not to coalesce this without disabling all coalescings of
interfering live intervals, even those that are just copies from the source
register.

Can you think of another way to fix this that's quick and easy?

> That said, the way we model undef in machine instructions has always
> bugged me. I thought about adding a MachineOperand type to represent
> undef. Then we don't have to muddle the semantics of a "def". To me,
> that's a cleaner representation, but it will require work.
Long-term I don't have an opinion on what happens here. But right now I
need to fix this.

-Dave

From piotr.rak at gmail.com Mon Feb 2 12:26:58 2009
From: piotr.rak at gmail.com (Piotr Rak)
Date: Mon, 2 Feb 2009 19:26:58 +0100
Subject: [LLVMdev] [cfe-commits] r63168 - /cfe/trunk/Driver/clang.cpp
In-Reply-To:
References: <200901280243.n0S2haQj002955@zion.cs.uiuc.edu> <7925cd330901280444i2fbf835dh92eaa699de56b855@mail.gmail.com>
Message-ID: <7925cd330902021026s78d8d615w86fa30249ef40d87@mail.gmail.com>

2009/2/2 steve naroff :
> Hi Piotr,
>
> This also breaks the hand-built VC++ project.
>
> Any clues on where I should define this?
>
> Thanks,
>
> snaroff
>
> On Jan 28, 2009, at 7:44 AM, Piotr Rak wrote:
>
>> 2009/1/28 Mike Stump :
>>>
>>> Author: mrs
>>> Date: Tue Jan 27 20:43:35 2009
>>> New Revision: 63168
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=63168&view=rev
>>> Log:
>>> Add a preliminary version number.
>>>
>>> Modified:
>>> cfe/trunk/Driver/clang.cpp
>>>
>>> Modified: cfe/trunk/Driver/clang.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/Driver/clang.cpp?rev=63168&r1=63167&r2=63168&view=diff
>>>
>>> ==============================================================================
>>> --- cfe/trunk/Driver/clang.cpp (original)
>>> +++ cfe/trunk/Driver/clang.cpp Tue Jan 27 20:43:35 2009
>>> @@ -1620,6 +1620,10 @@
>>>      }
>>>    }
>>>
>>> +  if (Verbose)
>>> +    fprintf(stderr, "clang version 1.0 based upon " PACKAGE_STRING
>>> +            " hosted on " LLVM_HOSTTRIPLE "\n");
>>> +
>>>    if (unsigned NumDiagnostics = Diags.getNumDiagnostics())
>>>      fprintf(stderr, "%d diagnostic%s generated.\n", NumDiagnostics,
>>>              (NumDiagnostics == 1 ? "" : "s"));
>>>
>>>
>>> _______________________________________________
>>> cfe-commits mailing list
>>> cfe-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>
>>
>> Hi,
>>
>> This commit breaks cmake build for me.
>> PACKAGE_STRING is not set by
>> llvm toplevel CMakeLists.txt, and later defined by
>> 'include/llvm/Config/config.h.cmake'.
>> I also changed PACKAGE_VERSION to match the one from 'configure.ac'.
>>
>> Attached fix (for llvm).
>>
>> Piotr
>> <cmake_package_string.diff>
>> _______________________________________________
>> cfe-commits mailing list
>> cfe-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>
>

Hi Steve,

No idea, sorry, I use linux x86 now. There must be some way to generate a
correct llvm/config.h there though. Maybe you should check how
PACKAGE_VERSION and similar are handled there?

Piotr

From evan.cheng at apple.com Mon Feb 2 13:14:08 2009
From: evan.cheng at apple.com (Evan Cheng)
Date: Mon, 2 Feb 2009 11:14:08 -0800
Subject: [LLVMdev] undefs in phis
In-Reply-To: <200902021208.12714.dag@cray.com>
References: <200901291647.26270.dag@cray.com> <200901301552.24534.dag@cray.com> <200902021208.12714.dag@cray.com>
Message-ID: <472926BB-086F-48B8-BA9E-978C7C84A958@apple.com>

On Feb 2, 2009, at 10:08 AM, David Greene wrote:

> On Friday 30 January 2009 16:54, Evan Cheng wrote:
>
>> I don't have the whole context to understand why you think this is a
>> bug. An implicit_def doesn't actually define any value. So we don't
>> care if a live interval overlaps live ranges defined by an
>> implicit_def.
>
> It's a bug because the coalescer does illegal coalescing.
>
> Our last episode left us here:
>
> bb134:
>   2696 %reg1645 = FsMOVAPSrr %reg1458 ; srcLine 0
>
> bb74:
>   2700 %reg1176 = FsMOVAPSrr %reg1645 ; srcLine 0
>   [deleted copy]
>   2708 %reg1178 = FsMOVAPSrr %reg1647 ; srcLine 0 *** u before d
>   2712 TEST64rr %reg1173, %reg1173, %EFLAGS ; srcLine 30
>   2716 JLE mbb, %EFLAGS ; srcLine 0
>
> bb108:
>   [...]
>   4352 %reg1253 = MAXSSrr %reg1253, %reg1588 ; srcLine 60
>   4356 %reg1645 = FsMOVAPSrr %reg1253 ; srcLine 0
>   4360 %reg1177 = FsMOVAPSrr %reg1176 ; srcLine 0 *** updated
>   4364 %reg1647 = FsMOVAPSrr %reg1243 ; srcLine 0
>   4368 JMP mbb ; srcLine 0
>
> This still looks correct. The coalescer then says:
>
> 4360 %reg1177 = FsMOVAPSrr %reg1176 ; srcLine 0
> Inspecting %reg1176,0 = [2702,4362:0) 0 at 2702-(4362) and
> %reg1177,0 = [2700,3712:0)[3768,3878:0)[4362,4372:0) 0 at 4362-(3878):
> Joined. Result = %reg1177,0 = [2700,4372:0) 0 at 2702-(4362)
>
> Now let's look at the resulting code:
>
> bb134:
>   2696 %reg1645 = FsMOVAPSrr %reg1458 ; srcLine 0
>
> bb74:
>   2700 %reg1177 = FsMOVAPSrr %reg1645 ; srcLine 0 *** u
>   [deleted copy]
>   2708 %reg1178 = FsMOVAPSrr %reg1647 ; srcLine 0 *** u before d
>   2712 TEST64rr %reg1173, %reg1173, %EFLAGS ; srcLine 30
>   2716 JLE mbb, %EFLAGS ; srcLine 0
>
> bb108:
>   [...]
>   4352 %reg1253 = MAXSSrr %reg1253, %reg1588 ; srcLine 60
>   4356 %reg1645 = FsMOVAPSrr %reg1253 ; srcLine 0
>   [deleted copy]
>   4364 %reg1647 = FsMOVAPSrr %reg1243 ; srcLine 0
>   4368 JMP mbb ; srcLine 0
>
> The very first instruction in bb74 is wrong. The coalescer has said
> that y always has the same value as x and that's incorrect. y is
> always one value "behind" x in the original source.
>
> The coalescer thinks it can do this because %reg1177 only has one
> value number and that VN is marked as a copy from %reg1176. What I'm
> saying is that while %reg1177 really is copied from %reg1176, it also
> has some initial value ("undef") coming into the loop. That value is
> not captured by the live interval information. It's because of this
> value that we cannot coalesce %reg1177 and %reg1176. It's because of
> this value that %reg1177 is always one value "behind" %reg1176.

I am sorry I don't really follow it. Is this what you are describing?

%v1177 = undef
...
loop:
...
%v1176 = op ...
       = %v1177
%v1177 = %v1176
jmp loop

Why is it not safe to coalesce the 2 registers?

Evan

> Now, if there's some other way to tell the coalescer that the
> coalescing is illegal, that's fine. I don't care about the undef
> value number itself. I care about the coalescer behaving itself. :)
> I just don't know how to tell the coalescer not to coalesce this
> without disabling all coalescings of interfering live intervals, even
> those that are just copies from the source register.
>
> Can you think of another way to fix this that's quick and easy?
>
>> That said, the way we model undef in machine instructions has always
>> bugged me. I thought about adding a MachineOperand type to represent
>> undef. Then we don't have to muddle the semantics of a "def". To me,
>> that's a cleaner representation, but it will require work.
>
> Long-term I don't have an opinion on what happens here. But right
> now I need to fix this.
>
> -Dave

From alenhar2 at uiuc.edu Mon Feb 2 13:17:21 2009
From: alenhar2 at uiuc.edu (Andrew Lenharth)
Date: Mon, 2 Feb 2009 13:17:21 -0600
Subject: [LLVMdev] GEPping GEPs and first-class structs
In-Reply-To: <200902011338.25003.jon@ffconsultancy.com>
References: <200902011338.25003.jon@ffconsultancy.com>
Message-ID: <85dfcd7f0902021117m28d9d157x5ab99a1767403c45@mail.gmail.com>

On Sun, Feb 1, 2009 at 7:38 AM, Jon Harrop wrote:
>
> As I understand it, first-class structs will allow structs to be passed as
> function arguments and returned as results (i.e. multiple return values)
> instead of passing pointers to structs. However, the GEP instruction only
> handles pointer types. So I do not understand how you will be able to extract
> the fields of a struct when it is received as a value type.
>
> Will the GEP instruction be altered so that it can be applied to structs
> directly?
No, see:
http://llvm.org/docs/LangRef.html#i_extractvalue

Andrew

From gohman at apple.com Mon Feb 2 13:20:43 2009
From: gohman at apple.com (Dan Gohman)
Date: Mon, 2 Feb 2009 11:20:43 -0800
Subject: [LLVMdev] GEPping GEPs and first-class structs
In-Reply-To: <200902011338.25003.jon@ffconsultancy.com>
References: <200902011338.25003.jon@ffconsultancy.com>
Message-ID: <683C2898-842B-4C42-A6D2-D42BFFD95E1B@apple.com>

On Feb 1, 2009, at 5:38 AM, Jon Harrop wrote:

> As I understand it, first-class structs will allow structs to be
> passed as function arguments and returned as results (i.e. multiple
> return values) instead of passing pointers to structs. However, the
> GEP instruction only handles pointer types. So I do not understand
> how you will be able to extract the fields of a struct when it is
> received as a value type.

Use the extractvalue instruction.

Dan

From tonic at nondot.org Mon Feb 2 13:20:53 2009
From: tonic at nondot.org (Tanya M. Lattner)
Date: Mon, 2 Feb 2009 11:20:53 -0800 (PST)
Subject: [LLVMdev] Reminder: 2.5 branch re-creation tonight.
Message-ID:

Just a reminder, I'll be re-creating the 2.5 branch tonight at 9pm PST.

-Tanya

From clattner at apple.com Mon Feb 2 13:25:46 2009
From: clattner at apple.com (Chris Lattner)
Date: Mon, 2 Feb 2009 11:25:46 -0800
Subject: [LLVMdev] GEPping GEPs and first-class structs
In-Reply-To: <200902011338.25003.jon@ffconsultancy.com>
References: <200902011338.25003.jon@ffconsultancy.com>
Message-ID: <796EE1FB-ECE7-4F14-A794-6454DD681EFE@apple.com>

On Feb 1, 2009, at 5:38 AM, Jon Harrop wrote:

> As I understand it, first-class structs will allow structs to be
> passed as

first-class structs already exist. :)

> function arguments and returned as results (i.e. multiple return
> values) instead of passing pointers to structs. However, the GEP
> instruction only handles pointer types. So I do not understand how
> you will be able to extract the fields of a struct when it is
> received as a value type.
Use the extractvalue instruction.

> Will the GEP instruction be altered so that it can be applied to
> structs directly?

No.

-Chris

From bob.wilson at apple.com Mon Feb 2 13:31:02 2009
From: bob.wilson at apple.com (Bob Wilson)
Date: Mon, 2 Feb 2009 11:31:02 -0800
Subject: [LLVMdev] type legalizer promoting BUILD_VECTORs
Message-ID:

LLVM's type legalizer is changing the types of BUILD_VECTORs in a way that
seems wrong to me, but I'm not sure if this is a bug or if some targets may
be relying on it.

On a 32-bit target, the default action for legalizing i8 and i16 types is
to promote them. If you then have a BUILD_VECTOR to construct a legal
vector type composed of i8 or i16 values, the type legalizer will look at
the BUILD_VECTOR operands and decide that it needs to promote them to i32
types. You end up with a BUILD_VECTOR that constructs a vector of i32
values that are then bitcast to the original vector type.

This works fine for SSE, where it appears that BUILD_VECTORs are
intentionally canonicalized to use i32 elements for the benefit of CSE.
I'm looking at implementing something where I think I'd like to keep the
original vector types. Is this behavior in the type legalizer something
that should be changed?

From gordonhenriksen at me.com Mon Feb 2 13:39:16 2009
From: gordonhenriksen at me.com (Gordon Henriksen)
Date: Mon, 02 Feb 2009 14:39:16 -0500
Subject: [LLVMdev] GEPping GEPs and first-class structs
In-Reply-To: <200902011338.25003.jon@ffconsultancy.com>
References: <200902011338.25003.jon@ffconsultancy.com>
Message-ID: <42DEED7A-2118-4BD6-8AB4-FB6D93F54BA1@me.com>

On Feb 1, 2009, at 08:38, Jon Harrop wrote:

> As I understand it, first-class structs will allow structs to be
> passed as function arguments and returned as results (i.e. multiple
> return values) instead of passing pointers to structs. However, the
> GEP instruction only handles pointer types.
> So I do not understand
> how you will be able to extract the fields of a struct when it is
> received as a value type.
>
> Will the GEP instruction be altered so that it can be applied to
> structs directly?

You can't take the address of a register, much less an element thereof.
extractvalue and insertvalue are the instructions to manipulate aggregate
registers.

http://llvm.org/docs/LangRef.html#aggregateops

-- Gordon

From jon at ffconsultancy.com Mon Feb 2 14:02:35 2009
From: jon at ffconsultancy.com (Jon Harrop)
Date: Mon, 2 Feb 2009 20:02:35 +0000
Subject: [LLVMdev] OCaml Journal article: Building a Virtual Machine with LLVM
In-Reply-To: <3FACAF52-7D9B-4B55-9C5E-82C39634F2FB@grahamwakefield.net>
References: <200901251516.22805.jon@ffconsultancy.com> <3FACAF52-7D9B-4B55-9C5E-82C39634F2FB@grahamwakefield.net>
Message-ID: <200902022002.35147.jon@ffconsultancy.com>

On Sunday 01 February 2009 20:42:44 Graham Wakefield wrote:
> I'd love to read this article, but I can't justify paying to register.
> Will it become a 'freely available' article at any point soon?

No. :-)

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

From dag at cray.com Mon Feb 2 14:12:01 2009
From: dag at cray.com (David Greene)
Date: Mon, 2 Feb 2009 14:12:01 -0600
Subject: [LLVMdev] undefs in phis
In-Reply-To: <472926BB-086F-48B8-BA9E-978C7C84A958@apple.com>
References: <200901291647.26270.dag@cray.com> <200902021208.12714.dag@cray.com> <472926BB-086F-48B8-BA9E-978C7C84A958@apple.com>
Message-ID: <200902021412.01657.dag@cray.com>

On Monday 02 February 2009 13:14, Evan Cheng wrote:

> I am sorry I don't really follow it. Is this what you are describing?
>
> %v1177 = undef
> ...
> loop:
> ...
> %v1176 = op ...
>        = %v1177
> %v1177 = %v1176
> jmp loop
>
> Why is not safe to coalesce the 2 registers?

Not quite. The original code is:

%v1177 = undef
%v1645 = ...
loop:
%v1176 = %v1645
...
       = %v1176
       = %v1177
%v1645 = op ...
%v1177 = %v1176
jmp loop

We can't coalesce %v1177 and %v1176 legally. But we do.

-Dave

From Micah.Villmow at amd.com Mon Feb 2 14:19:00 2009
From: Micah.Villmow at amd.com (Villmow, Micah)
Date: Mon, 2 Feb 2009 12:19:00 -0800
Subject: [LLVMdev] 16 bit to 32 bit conversion
Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C785549@ssanexmb1.amd.com>

It seems that LLVM is converting all the 16 bit ints into 32 bit ints. Is
there a way I can tell LLVM that 16 bit ints are valid and legal and not
to do any conversions on them?

Thanks,

Micah Villmow
Systems Engineer
Advanced Technology & Performance
Advanced Micro Devices Inc.
S1-609 One AMD Place
Sunnyvale, CA. 94085
P: 408-749-3966

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090202/fa445a4e/attachment.html

From evan.cheng at apple.com Mon Feb 2 14:29:45 2009
From: evan.cheng at apple.com (Evan Cheng)
Date: Mon, 2 Feb 2009 12:29:45 -0800
Subject: [LLVMdev] undefs in phis
In-Reply-To: <200902021412.01657.dag@cray.com>
References: <200901291647.26270.dag@cray.com> <200902021208.12714.dag@cray.com> <472926BB-086F-48B8-BA9E-978C7C84A958@apple.com> <200902021412.01657.dag@cray.com>
Message-ID: <193746F5-D74B-4A7A-95B9-65C9D52BAAD0@apple.com>

On Feb 2, 2009, at 12:12 PM, David Greene wrote:

> On Monday 02 February 2009 13:14, Evan Cheng wrote:
>
>> I am sorry I don't really follow it. Is this what you are describing?
>>
>> %v1177 = undef
>> ...
>> loop:
>> ...
>> %v1176 = op ...
>>        = %v1177
>> %v1177 = %v1176
>> jmp loop
>>
>> Why is not safe to coalesce the 2 registers?
>
> Not quite. The original code is:
>
> %v1177 = undef
> %v1645 = ...
> loop:
> %v1176 = %v1645
> ...
>        = %v1176
>        = %v1177
> %v1645 = op ...
> %v1177 = %v1176
> jmp loop
>
> We can't coalesce %v1177 and %v1176 legally. But we do.

Seriously, why not? In the first iteration, it's totally legal for v1177
to have the same value as v1176.
It's defined by an undef, so it's allowed to contain any value.

Evan

>
>
> -Dave
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From evan.cheng at apple.com Mon Feb 2 14:34:22 2009
From: evan.cheng at apple.com (Evan Cheng)
Date: Mon, 2 Feb 2009 12:34:22 -0800
Subject: [LLVMdev] 16 bit to 32 bit conversion
In-Reply-To: <5BA674C5FF7B384A92C2C95D8CC71E1C785549@ssanexmb1.amd.com>
References: <5BA674C5FF7B384A92C2C95D8CC71E1C785549@ssanexmb1.amd.com>
Message-ID: <8BD10072-E134-4050-A301-8B2416C8F131@apple.com>

Are you marking i16 a legal type? In XXISelLowering.cpp, you should assign
it a register class, e.g.

addRegisterClass(MVT::i16, XX::i16RegisterClass)

Evan

On Feb 2, 2009, at 12:19 PM, Villmow, Micah wrote:

> It seems that LLVM is converting all the 16 bit ints into 32 bit
> ints. Is there a way I can tell LLVM that 16 bit ints are valid and
> legal and not to do any conversions on them?
>
> Thanks,
>
> Micah Villmow
> Systems Engineer
> Advanced Technology & Performance
> Advanced Micro Devices Inc.
> S1-609 One AMD Place
> Sunnyvale, CA. 94085
> P: 408-749-3966
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090202/29c0b7f7/attachment-0001.html

From dag at cray.com Mon Feb 2 14:37:57 2009
From: dag at cray.com (David Greene)
Date: Mon, 2 Feb 2009 14:37:57 -0600
Subject: [LLVMdev] Reminder: 2.5 branch re-creation tonight.
In-Reply-To:
References:
Message-ID: <200902021437.57867.dag@cray.com>

On Monday 02 February 2009 13:20, Tanya M. Lattner wrote:
> Just a reminder, I'll be re-creating the 2.5 branch tonight at 9pm PST.

What does re-creating mean?
Why can't the previously-created 2.5 branch simply be updated?

I ask because svn history will look a little weird and it makes it harder
for third parties to track revisions and do merges.

-Dave

From tonic at nondot.org Mon Feb 2 14:54:19 2009
From: tonic at nondot.org (Tanya M. Lattner)
Date: Mon, 2 Feb 2009 12:54:19 -0800 (PST)
Subject: [LLVMdev] Reminder: 2.5 branch re-creation tonight.
In-Reply-To: <200902021437.57867.dag@cray.com>
References: <200902021437.57867.dag@cray.com>
Message-ID:

> On Monday 02 February 2009 13:20, Tanya M. Lattner wrote:
>> Just a reminder, I'll be re-creating the 2.5 branch tonight at 9pm PST.
>
> What does re-creating mean? Why can't the previously-created 2.5 branch
> simply be updated?

It means deleting the branch and creating a new one.

> I ask because svn history will look a little weird and it makes it harder
> for third parties to track revisions and do merges.

I don't really understand why this is an issue. I'm sure svn is capable of
extracting the information that you want. I'd rather not do one giant
merge to our existing branch which will only create more noise to
llvm-commits.

-Tanya

From Micah.Villmow at amd.com Mon Feb 2 14:54:57 2009
From: Micah.Villmow at amd.com (Villmow, Micah)
Date: Mon, 2 Feb 2009 12:54:57 -0800
Subject: [LLVMdev] 16 bit to 32 bit conversion
In-Reply-To: <8BD10072-E134-4050-A301-8B2416C8F131@apple.com>
References: <5BA674C5FF7B384A92C2C95D8CC71E1C785549@ssanexmb1.amd.com> <8BD10072-E134-4050-A301-8B2416C8F131@apple.com>
Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C785554@ssanexmb1.amd.com>

Thanks for making me double check my own mistake. I had
addRegisterClass(MVT::i16, XX::i32RegisterClass).

Micah

________________________________
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Evan Cheng
Sent: Monday, February 02, 2009 12:34 PM
To: LLVM Developers Mailing List
Subject: Re: [LLVMdev] 16 bit to 32 bit conversion

Are you marking i16 a legal type?
In XXISelLowering.cpp, you should assign it a register class, e.g.

addRegisterClass(MVT::i16, XX::i16RegisterClass)

Evan

On Feb 2, 2009, at 12:19 PM, Villmow, Micah wrote:

It seems that LLVM is converting all the 16 bit ints into 32 bit ints. Is
there a way I can tell LLVM that 16 bit ints are valid and legal and not
to do any conversions on them?

Thanks,

Micah Villmow
Systems Engineer
Advanced Technology & Performance
Advanced Micro Devices Inc.
S1-609 One AMD Place
Sunnyvale, CA. 94085
P: 408-749-3966

_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090202/8f18920c/attachment.html

From dag at cray.com Mon Feb 2 15:05:04 2009
From: dag at cray.com (David Greene)
Date: Mon, 2 Feb 2009 15:05:04 -0600
Subject: [LLVMdev] undefs in phis
In-Reply-To: <193746F5-D74B-4A7A-95B9-65C9D52BAAD0@apple.com>
References: <200901291647.26270.dag@cray.com> <200902021412.01657.dag@cray.com> <193746F5-D74B-4A7A-95B9-65C9D52BAAD0@apple.com>
Message-ID: <200902021505.05179.dag@cray.com>

On Monday 02 February 2009 14:29, Evan Cheng wrote:
> On Feb 2, 2009, at 12:12 PM, David Greene wrote:
> > On Monday 02 February 2009 13:14, Evan Cheng wrote:
> >> I am sorry I don't really follow it. Is this what you are describing?
> >>
> >> %v1177 = undef
> >> ...
> >> loop:
> >> ...
> >> %v1176 = op ...
> >>        = %v1177
> >> %v1177 = %v1176
> >> jmp loop
> >>
> >> Why is not safe to coalesce the 2 registers?
> >
> > Not quite. The original code is:
> >
> > %v1177 = undef
> > %v1645 = ...
> > loop:
> > %v1176 = %v1645
> > ...
> >        = %v1176
> >        = %v1177
> > %v1645 = op ...
> > %v1177 = %v1176
> > jmp loop
> >
> > We can't coalesce %v1177 and %v1176 legally. But we do.
>
> Seriously, why not?
> In the first iteration, it's totally legal for
> v1177 to have the same value as v1176. It's defined by an undef, so
> it's allowed to contain any value.

Think about what will happen in the 2nd iteration. %v1177 will have the
value of %v1645, which is wrong. This is because %v1176 in bb74 will be
replaced with %v1177. That's incorrect.

-Dave

From dag at cray.com Mon Feb 2 15:22:17 2009
From: dag at cray.com (David Greene)
Date: Mon, 2 Feb 2009 15:22:17 -0600
Subject: [LLVMdev] Reminder: 2.5 branch re-creation tonight.
In-Reply-To:
References: <200902021437.57867.dag@cray.com>
Message-ID: <200902021522.17256.dag@cray.com>

On Monday 02 February 2009 14:54, Tanya M. Lattner wrote:
> > On Monday 02 February 2009 13:20, Tanya M. Lattner wrote:
> >> Just a reminder, I'll be re-creating the 2.5 branch tonight at 9pm PST.
> >
> > What does re-creating mean? Why can't the previously-created 2.5 branch
> > simply be updated?
>
> It means deleting the branch and creating a new one.

That's what I was afraid of.

> > I ask because svn history will look a little weird and it makes it harder
> > for third parties to track revisions and do merges.
>
> I don't really understand why this is an issue. I'm sure svn is capable of
> extracting the information that you want. I'd rather not do one giant
> merge to our existing branch which will only create more noise to
> llvm-commits.

A delete followed by a re-copy is more noise than a single commit.

It's a problem because we'll essentially have created two branches with
the same name. Yes, I can extract the information "by eye" but we have
scripts here that walk svn history to do merges and a re-created branch is
probably going to confuse them.
-Dave

From dalej at apple.com Mon Feb 2 15:26:45 2009
From: dalej at apple.com (Dale Johannesen)
Date: Mon, 2 Feb 2009 13:26:45 -0800
Subject: [LLVMdev] Adding legal integer sizes to TargetData
In-Reply-To:
References:
Message-ID: <73749B32-92D4-4C77-98F3-C27F57BA8ABD@apple.com>

On Feb 1, 2009, at 11:06 PM PST, Chris Lattner wrote:

> Now that 2.5 is about to branch, I'd like to bring up one of Scott's
> favorite topics: certain optimizers widen or narrow arithmetic,
> without regard for whether the type is legal for the target. In his
> specific case, instcombine is turning an i32 multiply into an i64
> multiply in order to eliminate a cast. This does simplify/reduce the
> number of IR operations, but an i64 multiply is dramatically more
> expensive than an i32 multiply on CellSPU.

I basically agree with Scott on this: we shouldn't reintroduce types that
are illegal for the target after Legalize.

> There are a couple of different ways to look at this. On the one
> hand, I still strongly believe that codegen should be able to
> re-narrow operations (and it does on his testcase on i386). However,
> codegen is currently doing these optimizations on a per-basic block
> basis, and we're not likely to have whole-function dags in the near
> future, so there is an inherent limit to its power.
>
> An earlier place to handle this is in codegen prepare, which is
> global. However, the bad thing about this is that it would
> effectively require duplicating all the type legalization code in CGP,
> which is a pass we want to shrink, not grow. OTOH, the whole CGP pass
> is really a hack around selection dags not being whole-function.
>
> A third way to handle this is to add to target data a notion of
> "native types". Instcombine could then be constrained to not do the
> widening/narrowing transformations when the original type (i32 in this
> case) was native but the destination type (i64) is non-native.
>
> On the one hand, adding this to targetdata is simple and
> straight-forward with well-defined semantics. OTOH, it is somewhat
> ugly that IR canonicalization gets a bit more target-specific.

IR after Legalize is target-specific (indeed that's Legalizer's job), so I
don't see why you should expect to treat it in a target-independent way.
This seems like the right fix to me. (I don't offhand see why the
separation into legal and illegal types that we already have isn't enough,
but no doubt you're right.)

> On the third
> hand, instcombine already promotes indices of GEPs to match the
> pointer size etc, so it wouldn't be too crazy for it to do this.
>
> What do others think about this?
>
> -Chris
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090202/1cf8bdd3/attachment-0001.html

From howarth at bromo.med.uc.edu Mon Feb 2 15:27:11 2009
From: howarth at bromo.med.uc.edu (Jack Howarth)
Date: Mon, 2 Feb 2009 16:27:11 -0500
Subject: [LLVMdev] building libLTO.dylib
Message-ID: <20090202212711.GA14420@bromo.med.uc.edu>

Is there a particular method one should use to build the libLTO.dylib
from the llvm svn on darwin9? Unless I am confused, it seems that
libLTO.dylib gets built when I do a configure and make directly in the
llvm svn tree but not when I create a separate llvm_objdir directory and
execute configure and make within it. Thanks in advance for any
clarifications.
Jack

From clattner at apple.com Mon Feb 2 15:29:05 2009
From: clattner at apple.com (Chris Lattner)
Date: Mon, 2 Feb 2009 13:29:05 -0800
Subject: [LLVMdev] Adding legal integer sizes to TargetData
In-Reply-To: <73749B32-92D4-4C77-98F3-C27F57BA8ABD@apple.com>
References: <73749B32-92D4-4C77-98F3-C27F57BA8ABD@apple.com>
Message-ID: <9E72358E-8056-42DA-846D-E09B5F2E15DD@apple.com>

On Feb 2, 2009, at 1:26 PM, Dale Johannesen wrote:

>
> On Feb 1, 2009, at 11:06 PM PST, Chris Lattner wrote:
>
>> Now that 2.5 is about to branch, I'd like to bring up one of Scott's
>> favorite topics: certain optimizers widen or narrow arithmetic,
>> without regard for whether the type is legal for the target. In his
>> specific case, instcombine is turning an i32 multiply into an i64
>> multiply in order to eliminate a cast. This does simplify/reduce the
>> number of IR operations, but an i64 multiply is dramatically more
>> expensive than an i32 multiply on CellSPU.
>
> I basically agree with Scott on this: we shouldn't reintroduce
> types that are illegal for the target after Legalize.

I'm sorry, to be clear, this is mostly talking about an instcombine
change. Obviously anything in codegen should respect current
restrictions. The question is whether the mid-level optimizer should try
to avoid introducing illegal types.

-Chris

From howarth at bromo.med.uc.edu Mon Feb 2 15:31:44 2009
From: howarth at bromo.med.uc.edu (Jack Howarth)
Date: Mon, 2 Feb 2009 16:31:44 -0500
Subject: [LLVMdev] building llvm-gcc-4.2 with llvm installed
Message-ID: <20090202213144.GA14448@bromo.med.uc.edu>

I tried constructing a llvm package for fink to go with my llvm-gcc42
packaging this weekend. My hope was that I could have the llvm package
installed and use those static libraries to build llvm-gcc-4.2 with rather
than rebuilding llvm again. Unfortunately, --enable-llvm= seems to demand
the llvm build directory.
Is this absolutely necessary or could the llvm-gcc-4.2 build be adjusted
to work with just the installed llvm headers and static libraries to
complete the llvm-gcc-4.2 build? Thanks in advance for any
clarifications.
              Jack

From dalej at apple.com Mon Feb 2 15:33:25 2009
From: dalej at apple.com (Dale Johannesen)
Date: Mon, 2 Feb 2009 13:33:25 -0800
Subject: [LLVMdev] Adding legal integer sizes to TargetData
In-Reply-To: <9E72358E-8056-42DA-846D-E09B5F2E15DD@apple.com>
References: <73749B32-92D4-4C77-98F3-C27F57BA8ABD@apple.com> <9E72358E-8056-42DA-846D-E09B5F2E15DD@apple.com>
Message-ID:

On Feb 2, 2009, at 1:29 PM PST, Chris Lattner wrote:

>
> On Feb 2, 2009, at 1:26 PM, Dale Johannesen wrote:
>
>>
>> On Feb 1, 2009, at 11:06 PM PST, Chris Lattner wrote:
>>
>>> Now that 2.5 is about to branch, I'd like to bring up one of Scott's
>>> favorite topics: certain optimizers widen or narrow arithmetic,
>>> without regard for whether the type is legal for the target.  In his
>>> specific case, instcombine is turning an i32 multiply into an i64
>>> multiply in order to eliminate a cast.  This does simplify/reduce
>>> the number of IR operations, but an i64 multiply is dramatically
>>> more expensive than an i32 multiply on CellSPU.
>>
>> I basically agree with Scott on this: we shouldn't reintroduce
>> types that are illegal for the target after Legalize.
>
> I'm sorry, to be clear, this is mostly talking about an instcombine
> change.  Obviously anything in codegen should respect current
> restrictions.  The question is whether the mid-level optimizer should
> try to avoid introducing illegal types.

I understand; I was stating a general principle, which I believe to be
a good one.

From eli.friedman at gmail.com Mon Feb 2 15:41:42 2009
From: eli.friedman at gmail.com (Eli Friedman)
Date: Mon, 2 Feb 2009 13:41:42 -0800
Subject: [LLVMdev] Optimized code analysis problems
In-Reply-To:
References:
Message-ID:

On Sun, Feb 1, 2009 at 11:45 AM, Nipun Arora wrote:
> Well I think a way to hack it might be better for my purposes, can you
> suggest any ways of getting started on that and where.

Assuming you're using llvm-gcc for the analysis, look for
NeedAlwaysInliner in llvm-backend.cpp.

On a side note, depending on what exactly you're doing, a source-level
analysis tool might be more convenient than LLVM; a couple
possibilities are Dehydra (https://developer.mozilla.org/en/Dehydra)
and clang (http://clang.llvm.org/).

-Eli

From dpatel at apple.com Mon Feb 2 15:49:43 2009
From: dpatel at apple.com (Devang Patel)
Date: Mon, 2 Feb 2009 13:49:43 -0800
Subject: [LLVMdev] Proposal: Debug information improvement - keep the line number with optimizations
In-Reply-To: <8abe0dc60902020748l8cbda49h3552a1954e3e43be@mail.gmail.com>
References: <8abe0dc60902020748l8cbda49h3552a1954e3e43be@mail.gmail.com>
Message-ID:

Hi Zhou,

There is certainly an interest to preserve line number information (and
valid variable info) during optimizations in llvm. More info below...

On Feb 2, 2009, at 7:48 AM, Zhou Sheng wrote:

>
> The following sub-sections define specific requirements to improve
> the debug information in LLVM.
>
>
> 2.1 Verification Flow
> The most important of this project is to make the debug information
> do not block any optimization by LLVM transform passes.  Here I
> propose a way to determine whether codegen is being impacted by
> debug info.  This is also useful for us to scan the LLVM transform
> pass list to find which pass need to update to work with debug
> information.
>
> From Chris: Add a -strip-debug pass that removes all debug info from
> the LLVM IR.
> Given this, it would allow us to do:
>   $ llvm-gcc -O3 -c -o - | llc > good.s
>   $ llvm-gcc -O3 -c -g -o - | opt -strip-debug | llc > test.s
>   $ diff good.s test.s
> If the two .s files differed, then badness happened.

This may not work perfectly because presence of debug info may
influence compiler generated symbol names and label numbers.

> This obviously only catches badness that happens in the LLVM
> optimizer,

There is an established way to check this. See
http://llvm.org/docs/SourceLevelDebugging.html#debugopt

> if the code generator is broken, we'll need something more
> sophisticated that strips debug info out of the .s file.  In any
> case, this is a good place to start, and should be turned into a
> llvm-test TEST/report.
>
> Incidentally, we have to go through codegen, we can't diff .ll files
> after debug info is stripped out.  This is because debug info is
> allowed to (and probably does) impact local names within functions,
> but these functions are removed at codegen and are not important to
> preserve.  End
>
>
>
> 2.2 A Pass to clean up the debug info
> LLVM already has a transform pass "-strip-debug", it removes all the
> debug information.  But for the first half of this project, we want
> to just keep the line number information (stop point) in the
> optimized code.  So we need a new transform pass to just removes the
> variable declaration information.

FWIW, mem2reg already does this.

> Pass "-strip-debug" also doesn't cleanup the dead variable and
> function calling for debug information, it thinks other pass like
> "-dce" or "-globaldce" can handle this.

Yes.

> But as we are also going to update those passes, we can't use them
> in the verification flow, otherwise, it may output incorrect check
> results.

I am not sure I follow this.

>
> The new pass "-strip-debug-pro" should have the following functions:
> 1. Just remove the variable declaration information and
> clean up the dead debug information.

These are two separate tasks.
1) Remove variable declaration info.
   This is already done (indirectly) by mem2reg. But a separate pass to
   do so won't hurt either.
2) Remove dead debug information.
   This is very useful as a separate pass and can be used while
   debugging non optimized code (for example, to remove type info for
   the types that are not used at all).

> 2. Just remove the line number information and clean up the
> dead debug information.

I am not sure what is the purpose of this ?

> 3. Remove all the debug information and clean up.

That's what, "Remove Debug Info", -strip-debug does.

If you put -strip-debug + -dce in one pass then you're not comparing
apple and apple in your 2.1 style verification. Or I am missing
something.

> 2.3 Front End Changes
> For the first half of the project, we just aim to handle the line
> number debug information.  So we need to force llvm-gcc not to emit
> any variable declaration information.
>
> 2.4 Optimization Transform Changes
> According to the output of the check script, we can get a
> pass-to-update list.  Just follow the list to update the pass one by
> one.  When done a single pass, turn back to run the llvm/test and
> llvm-test, note apply the pass "-strip-debug-pro" right after the
> updated pass to see if it work correctly.
>
> 2. Proposed Work Plan
> This section defines a proposed work plan to accomplish the
> requirements that we desires.  The work plan is broken into several
> distinct phases that follow a logical progression of modifications
> to the LLVM software.
>
> 2.1 Phase 1: Establish the testing system
> One of the most useful things to get started is to have some way to
> determine whether codegen is being impacted by debug info.  It is
> important to be able to tell when this happens so that we can track
> down these places and fix them.
>
> 2.1.1 Pass Scanning Script
> Following the way proposed by Chris, it is good to have a script to
> scan the standard LLVM transform pass list.
> We can get the standard compile optimization pass list by:

You can use http://llvm.org/docs/SourceLevelDebugging.html#debugopt as
a starting point here.

>
> $ opt -std-compile-opts -debug-pass=Arguments foo.bc > /dev/null
> Pass Arguments: -preverify -domtree -verify -lowersetjmp
>   -raiseallocs -simplifycfg -domtree -domfrontier -mem2reg -globalopt
>   -globaldce -ipconstprop -deadargelim -instcombine -simplifycfg
>   -basiccg -prune-eh -inline -argpromotion -tailduplicate
>   -simplify-libcalls -instcombine -jump-threading -simplifycfg
>   -domtree -domfrontier -scalarrepl -instcombine -break-crit-edges
>   -condprop -tailcallelim -simplifycfg -reassociate -domtree -loops
>   -loopsimplify -domfrontier -scalar-evolution -lcssa -loop-rotate
>   -licm -lcssa -loop-unswitch -scalar-evolution -lcssa
>   -loop-index-split -instcombine -scalar-evolution -domfrontier
>   -lcssa -indvars -domfrontier -scalar-evolution -lcssa -loop-unroll
>   -instcombine -domtree -memdep -gvn -memcpyopt -sccp -instcombine
>   -break-crit-edges -condprop -memdep -dse -mergereturn -postdomtree
>   -postdomfrontier -adce -simplifycfg -strip-dead-prototypes
>   -printusedtypes -deadtypeelim -constmerge -preverify -domtree
>   -verify
>
>
>
> The script should look like:
> #!/bin/sh
>
> OPTS="-preverify -domtree -verify -lowersetjmp -raiseallocs \
>   -simplifycfg -domtree -domfrontier -mem2reg -globalopt -globaldce \
>   -ipconstprop -deadargelim -instcombine -simplifycfg -basiccg \
>   -prune-eh -inline -argpromotion -tailduplicate -simplify-libcalls \
>   -instcombine -jump-threading -simplifycfg -domtree -domfrontier \
>   -scalarrepl -instcombine -break-crit-edges -condprop -tailcallelim \
>   -simplifycfg -reassociate -domtree -loops -loopsimplify \
>   -domfrontier -scalar-evolution -lcssa -loop-rotate -licm -lcssa \
>   -loop-unswitch -scalar-evolution -lcssa -loop-index-split \
>   -instcombine -scalar-evolution -domfrontier -lcssa -indvars \
>   -domfrontier -scalar-evolution -lcssa -loop-unroll -instcombine \
>   -domtree -memdep -gvn -memcpyopt -sccp -instcombine \
>   -break-crit-edges -condprop -memdep -dse -mergereturn -postdomtree \
>   -postdomfrontier -adce -simplifycfg -strip-dead-prototypes \
>   -printusedtypes -deadtypeelim -constmerge -preverify -domtree \
>   -verify"
>
> llvm-gcc -g -emit-llvm -c $1 -o $1.db1.ll -S
> llvm-gcc -emit-llvm -c $1 -o good.bc
>
> sed '/call void @llvm.dbg.declare/d' $1.db1.ll > $1.db2.ll
>
> llvm-as $1.db2.ll -f
>
> for p in $OPTS; do
>   opt $p $1.db2.bc -o $1.db2.bc -f
>   opt -strip-debug -deadtypeelim -dce -globaldce -deadtypeelim $1.db2.bc | llc > test.s -f
>   opt $p -strip-debug -deadtypeelim -dce -globaldce -deadtypeelim good.bc -o good.bc -f
>   llc good.bc > good.s -f
>   echo "PASS $p : " >> diff.log
>   if `diff good.s test.s >> diff.log 2>&1 ` ; then
>     echo "PASS $p : SUCC"
>   else
>     echo "PASS $p : FAIL"
>   fi
> done
>
> For example:
> Foo.c:
> int foo(int x, int y) {
>   return x + y;
> }
>
> $ ./check.sh foo.c
> PASS -preverify : SUCC
> PASS -domtree : SUCC
> PASS -verify : SUCC
> PASS -lowersetjmp : SUCC
> PASS -raiseallocs : SUCC
> PASS -simplifycfg : SUCC
> PASS -domtree : SUCC
> PASS -domfrontier : SUCC
> PASS -mem2reg : FAIL
> PASS -globalopt : FAIL
> PASS -globaldce : FAIL
> PASS -ipconstprop : FAIL
> PASS -deadargelim : FAIL
> PASS -instcombine : FAIL
> PASS -simplifycfg : FAIL
>
> Check the log file:
> PASS -preverify :
> PASS -domtree :
> PASS -verify :
> PASS -lowersetjmp :
> PASS -raiseallocs :
> PASS -simplifycfg :
> PASS -domtree :
> PASS -domfrontier :
> PASS -mem2reg :
> 8,9c8,14
> <       movl    4(%esp), %eax
> <       addl    8(%esp), %eax
> ---
> >       subl    $8, %esp
> >       movl    12(%esp), %eax
> >       movl    %eax, 4(%esp)
> >       movl    16(%esp), %eax
> >       movl    %eax, (%esp)
> >       addl    4(%esp), %eax
> >       addl    $8, %esp
> For the above example, we found that the transform pass "mem2reg"
> obviously has not done its work when the debug information is kept.
> Then we know we need to update it and re-test
>
>
> 2.1.2 Update the LLVM testing system
> The LLVM testing infrastructure contains two major categories of
> tests: code fragments and whole programs.  Code fragments are
> referred to as the "DejaGNU tests" and are in the llvm module in
> subversion under the llvm/test directory.  The whole programs tests
> are referred to as the "Test suite" and are in the test-suite module
> in subversion.
> Scanning all the test cases, find those using the specified
> transform and add the script similar to that previously mentioned.
> Make the result write into llvm-test TEST/report.
>
>
> 2.2 Phase 2: New Pass to Strip Debug Information
> LLVM already has a transform pass "-strip-debug", it removes all the
> debug information.  But for the first half of this project, we want
> to just keep the line number information (stop point) in the
> optimized code.  So we need a new transform pass to just removes the
> variable declaration information.  Pass "-strip-debug" also doesn't
> cleanup the dead variable and function calling for debug
> information, it thinks other pass like "-dce" or "-globaldce" can
> handle this.  But as we are also going to update those passes, we
> can't use them in the verification flow, otherwise, it may output
> incorrect check results.
>
> The new pass "-strip-debug-pro" should have the following functions:
> 1. Just remove the variable declaration information and
> clean up the dead debug information.
>
> 2. Remove all the debug information and clean up
>
> 3.2.1 Work Plan
> 1. Take a reference to transform pass StripSymbol.cpp
> 2. Based on the StripSymbol.cpp, add an option to it to just
> remove debug information, like "-rm-debug"

That's what -strip-debug is doing.

> 3. Add an option to just remove the variable declaration
> information, like "-rm-debug=2"

Why not -strip-debug=2 if you want a way to remove variable
declarations ..?

> 4.
> Add a procedure to clean up the dead variables and
> function calls for debug purpose.
>
> 2.3 Phase 3: Extend llvm-gcc
> Once we have a way to verify what is happening, I propose that we
> aim for an intermediate point: instead of having -O disable all
> debug info, we should make it disable just variable information, but
> keep emitting line number info.  This would allow stepping through
> the program, getting stack traces, use performance tools like shark,
> etc.
>
> We need the front-end llvm-gcc to have a mode that causes it to emit
> line number info but not
> variable info, we can go through the process above to identify
> passes that change behavior when line number intrinsics are in the
> code.
>
> 1.3.1 Work Plan
> 1. First locate the file position that llvm-gcc handle the
> parameter options.
> 2. Add a new option to control the llvm-gcc to emit
> specified debug information: like -g1. -g1 to only emit line number
> 3. Building the new llvm-gcc
> 4. Testing through llvm/test, llvm-test
>
> 2.4 Phase 4: Update Transform Passes for Line Number Info.
> When the front-end has a mode that causes it to emit line number
> info but not variable info, we can go through the process above to
> identify passes that change behavior when line number intrinsics are
> in the code.

I think, the optimizer is not changing behavior when dbg info is
present. Try running dbgopt tests.

> Obvious cases are things like loop unroll and inlining: they
> 'measure' the size of some code to determine whether to unroll it or
> not.  This means that it should be enhanced to ignore debug
> intrinsics for the sake of code size estimation.

The loop unrolling pass already ignores the debug info! See
LoopUnroll.cpp::ApproximateLoopSize()

>
> Another example is optimizations like SimplifyCFG when it merges
> if/then/else into select instructions.
> SimplifyCFG will have to be
> enhanced to ignore debug intrinsics when doing its
> safety/profitability analysis,

I think, it handles this part well, but ...

> but then it will also have to be updated to just delete the line
> number intrinsics when it does the xform.  This is simplifycfg's way
> of "updating" the debug info for this example transformation.

.. the second part has not received full attention.

> As we progress through various optimizations, we will find cases
> where it is possible to update (e.g. loop unroll or inlining, which
> doesn't have to do anything special to update line #'s) and places
> where it isn't.  As long as the debug intrinsics don't affect
> codegen, we are happy, even if the debug intrinsics are deleted in
> cases where it would be possible to update them (this becomes a
> optimized debugging QoI issue).
>
>
>
> 3.4.1 Work Plan
> 1. Update transform pass mem2reg
> 2. Testing through llvm/test, llvm-test
> 3. Update transform pass simplifycfg
> 4. Testing through llvm/test, llvm-test
> 5. Likewise, update transform passes globalopt, globaldce,
> ipconstprop, deadargelim, instcombine...
> 6. Update other passes and testing them.
>

I'm looking forward to your contributions in this area.

-
Devang

From eli.friedman at gmail.com Mon Feb 2 15:51:32 2009
From: eli.friedman at gmail.com (Eli Friedman)
Date: Mon, 2 Feb 2009 13:51:32 -0800
Subject: [LLVMdev] type legalizer promoting BUILD_VECTORs
In-Reply-To:
References:
Message-ID:

On Mon, Feb 2, 2009 at 11:31 AM, Bob Wilson wrote:
> LLVM's type legalizer is changing the types of BUILD_VECTORs in a way
> that seems wrong to me, but I'm not sure if this is a bug or if some
> targets may be relying on it.
>
> On a 32-bit target, the default action for legalizing i8 and i16 types
> is to promote them.

This isn't true on x86.

> If you then have a BUILD_VECTOR to construct a
> legal vector type composed of i8 or i16 values, the type legalizer
> will look at the BUILD_VECTOR operands and decide that it needs to
> promote them to i32 types.  You end up with a BUILD_VECTOR that
> constructs a vector of i32 values that are then bitcast to the
> original vector type.

I'm pretty sure the target-independent legalizer does no such thing.
Can you point to the code you're talking about?

That said, I do recall the x86 backend does a similar
platform-specific transformation with constants... do you care about
x86 in particular?

> This works fine for SSE, where it appears that BUILD_VECTORs are
> intentionally canonicalized to use i32 elements for the benefit of
> CSE.  I'm looking at implementing something where I think I'd like to
> keep the original vector types.  Is this behavior in the type
> legalizer something that should be changed?

I'm not sure what behavior you're talking about...

-Eli

From monping at apple.com Mon Feb 2 15:53:50 2009
From: monping at apple.com (Mon Ping Wang)
Date: Mon, 2 Feb 2009 13:53:50 -0800
Subject: [LLVMdev] Adding legal integer sizes to TargetData
In-Reply-To:
References: <73749B32-92D4-4C77-98F3-C27F57BA8ABD@apple.com> <9E72358E-8056-42DA-846D-E09B5F2E15DD@apple.com>
Message-ID: <390AB299-CBA8-4754-92E0-F5A2204D6D14@apple.com>

I have always found this issue a little thorny on compilers. On one
hand, we want to remove unnecessary IR instructions in general, as that
would reduce the IR instruction count, which could speed up compilation,
and it can also simplify target independent optimization passes because
they can assume certain code patterns will not occur after running some
cleanup phase. On the other hand, I agree with Dale that it's generally
not a good idea for a transformation to introduce an illegal type that
would need to be undone in the code generation phase.
For this particular case, it sounds like undoing this transformation in
CodeGen is not easy given our current framework, and doing the
optimization doesn't help make other transformations simpler (e.g.,
loop bound calculations). If that is the case, the fix that you are
suggesting seems to be the right approach.

-- Mon Ping

On Feb 2, 2009, at 1:33 PM, Dale Johannesen wrote:

>
> On Feb 2, 2009, at 1:29 PM PST, Chris Lattner wrote:
>
>>
>> On Feb 2, 2009, at 1:26 PM, Dale Johannesen wrote:
>>
>>>
>>> On Feb 1, 2009, at 11:06 PM PST, Chris Lattner wrote:
>>>
>>>> Now that 2.5 is about to branch, I'd like to bring up one of
>>>> Scott's favorite topics: certain optimizers widen or narrow
>>>> arithmetic, without regard for whether the type is legal for the
>>>> target.  In his specific case, instcombine is turning an i32
>>>> multiply into an i64 multiply in order to eliminate a cast.  This
>>>> does simplify/reduce the number of IR operations, but an i64
>>>> multiply is dramatically more expensive than an i32 multiply on
>>>> CellSPU.
>>>
>>> I basically agree with Scott on this: we shouldn't reintroduce
>>> types that are illegal for the target after Legalize.
>>
>> I'm sorry, to be clear, this is mostly talking about an instcombine
>> change.  Obviously anything in codegen should respect current
>> restrictions.  The question is whether the mid-level optimizer should
>> try to avoid introducing illegal types.
>
> I understand; I was stating a general principle, which I believe to be
> a good one.
>

From dpatel at apple.com Mon Feb 2 15:56:37 2009
From: dpatel at apple.com (Devang Patel)
Date: Mon, 2 Feb 2009 13:56:37 -0800
Subject: [LLVMdev] building llvm-gcc-4.2 with llvm installed
In-Reply-To: <20090202213144.GA14448@bromo.med.uc.edu>
References: <20090202213144.GA14448@bromo.med.uc.edu>
Message-ID: <33E698D8-BE35-4A33-92C5-032BED0682AA@apple.com>

On Feb 2, 2009, at 1:31 PM, Jack Howarth wrote:

> I tried constructing a llvm package for fink to
> go with my llvm-gcc42 packaging this weekend. My hope
> was that I could have the llvm package installed and
> use those static libraries to build llvm-gcc-4.2 with
> rather than rebuilding llvm again. Unfortunately,
> --enable-llvm= seems to demand the llvm build directory.
> Is this absolutely necessary or could the llvm-gcc-4.2
> build be adjusted to work with just the installed llvm headers
> and static libraries to complete the llvm-gcc-4.2 build?
> Thanks in advance for any clarifications.

If you have installed llvm binaries (llc, opt etc.) and headers in
/usr/local/bin and /usr/local/include respectively, then using
--enable-llvm=/usr/local should work.

-
Devang

From bob.wilson at apple.com Mon Feb 2 17:34:13 2009
From: bob.wilson at apple.com (Bob Wilson)
Date: Mon, 2 Feb 2009 15:34:13 -0800
Subject: [LLVMdev] type legalizer promoting BUILD_VECTORs
In-Reply-To:
References:
Message-ID: <98EAF9B2-D839-4879-893F-56F907E9AA2D@apple.com>

On Feb 2, 2009, at 1:51 PM, Eli Friedman wrote:

> On Mon, Feb 2, 2009 at 11:31 AM, Bob Wilson wrote:
>> LLVM's type legalizer is changing the types of BUILD_VECTORs in a way
>> that seems wrong to me, but I'm not sure if this is a bug or if some
>> targets may be relying on it.
>>
>> On a 32-bit target, the default action for legalizing i8 and i16
>> types is to promote them.
>
> This isn't true on x86.

No, it's not, but it is that way on PowerPC.

>
>> If you then have a BUILD_VECTOR to construct a
>> legal vector type composed of i8 or i16 values, the type legalizer
>> will look at the BUILD_VECTOR operands and decide that it needs to
>> promote them to i32 types.  You end up with a BUILD_VECTOR that
>> constructs a vector of i32 values that are then bitcast to the
>> original vector type.
>
> I'm pretty sure the target-independent legalizer does no such thing.
> Can you point to the code you're talking about?

In DAGTypeLegalizer::run(), the loop after the "ScanOperands" label
looks at the operands of the BUILD_VECTOR and the getTypeAction method
returns PromoteInteger (because i8 and i16 are promoted to fit in
32-bit registers on PPC).

>
>
> That said, I do recall the x86 backend does a similar
> platform-specific transformation with constants... do you care about
> x86 in particular?
>
>> This works fine for SSE, where it appears that BUILD_VECTORs are
>> intentionally canonicalized to use i32 elements for the benefit of
>> CSE.  I'm looking at implementing something where I think I'd like to
>> keep the original vector types.  Is this behavior in the type
>> legalizer something that should be changed?
>
> I'm not sure what behavior you're talking about...

If I have a BUILD_VECTOR that creates a v16i8 type, and v16i8 is a
legal type for the target that does not require any promotion or
expansion, I would not expect the type legalizer to change it.  But, it
does.  It looks at the i8 elements and decides that they need to be
promoted to i32, so it changes the BUILD_VECTOR to create a v4i32
vector, which is then bitcast back to v16i8.

I'm not specifically working on SSE right now, I was just explaining
why this issue doesn't arise for that target.
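
The rewrite described above (a v16i8 BUILD_VECTOR rebuilt as a v4i32
BUILD_VECTOR and bitcast back) can only preserve the vector if each i32
element packs four adjacent i8 elements in the target's byte order.
What follows is a minimal C model of that bit-level equivalence, not
LLVM code: pack_v16i8_as_v4i32 and roundtrip_ok are hypothetical names,
memcpy stands in for the bitcast, and the host's endianness stands in
for the target's.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian host. */
static int is_little_endian(void) {
    uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Hypothetical helper: pack 16 byte-sized elements into 4 word-sized
   elements so that the in-memory bytes of the 4 words equal the
   original 16 bytes. The shift for each byte depends on byte order. */
static void pack_v16i8_as_v4i32(const uint8_t in[16], uint32_t out[4]) {
    int le = is_little_endian();
    for (int w = 0; w < 4; ++w) {
        uint32_t word = 0;
        for (int b = 0; b < 4; ++b) {
            int shift = le ? 8 * b : 8 * (3 - b);
            word |= (uint32_t)in[4 * w + b] << shift;
        }
        out[w] = word;
    }
}

/* Hypothetical check: build the "v4i32", reinterpret its storage as 16
   bytes (the stand-in for a bitcast), and compare with the input.
   Returns 1 when the bytes match. */
static int roundtrip_ok(void) {
    uint8_t elems[16], back[16];
    uint32_t words[4];
    for (int i = 0; i < 16; ++i)
        elems[i] = (uint8_t)(i * 17 + 3);
    pack_v16i8_as_v4i32(elems, words);
    memcpy(back, words, sizeof back);
    return memcmp(elems, back, sizeof elems) == 0;
}
```

A caller could check the property with assert(roundtrip_ok()). The
model only shows that the two forms can describe the same bits; it says
nothing about whether a target can select the v4i32 form as cheaply as
the original v16i8 one, which is the concern raised in this thread.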

From eli.friedman at gmail.com Mon Feb 2 17:50:18 2009
From: eli.friedman at gmail.com (Eli Friedman)
Date: Mon, 2 Feb 2009 15:50:18 -0800
Subject: [LLVMdev] type legalizer promoting BUILD_VECTORs
In-Reply-To: <98EAF9B2-D839-4879-893F-56F907E9AA2D@apple.com>
References: <98EAF9B2-D839-4879-893F-56F907E9AA2D@apple.com>
Message-ID:

On Mon, Feb 2, 2009 at 3:34 PM, Bob Wilson wrote:
> If I have a BUILD_VECTOR that creates a v16i8 type, and v16i8 is a
> legal type for the target that does not require any promotion or
> expansion, I would not expect the type legalizer to change it.  But,
> it does.  It looks at the i8 elements and decides that they need to be
> promoted to i32, so it changes the BUILD_VECTOR to create a v4i32
> vector, which is then bitcast back to v16i8.

Ah, this is the code in DAGTypeLegalizer::PromoteIntOp_BUILD_VECTOR.
Thanks for the pointer; this is making a lot more sense now.

I somehow find it surprising that the operands of a BUILD_VECTOR are
required to be the same type as the elements of the resulting vector.
Besides changing that, I don't have any particularly good suggestions.

-Eli

From evan.cheng at apple.com Mon Feb 2 17:54:10 2009
From: evan.cheng at apple.com (Evan Cheng)
Date: Mon, 2 Feb 2009 15:54:10 -0800
Subject: [LLVMdev] undefs in phis
In-Reply-To: <200902021505.05179.dag@cray.com>
References: <200901291647.26270.dag@cray.com> <200902021412.01657.dag@cray.com> <193746F5-D74B-4A7A-95B9-65C9D52BAAD0@apple.com> <200902021505.05179.dag@cray.com>
Message-ID:

On Feb 2, 2009, at 1:05 PM, David Greene wrote:

> On Monday 02 February 2009 14:29, Evan Cheng wrote:
>> On Feb 2, 2009, at 12:12 PM, David Greene wrote:
>>> On Monday 02 February 2009 13:14, Evan Cheng wrote:
>>>> I am sorry I don't really follow it. Is this what you are
>>>> describing?
>>>>
>>>> %v1177 = undef
>>>> ...
>>>> loop:
>>>> ...
>>>> %v1176 = op ...
>>>>        = %v1177
>>>> %v1177 = %v1176
>>>> jmp loop
>>>>
>>>> Why is it not safe to coalesce the 2 registers?
>>>
>>> Not quite.
>>> The original code is:
>>>
>>> %v1177 = undef
>>> %v1645 = ...
>>> loop:
>>> %v1176 = %v1645
>>> ...
>>>        = %v1176
>>>        = %v1177
>>> %v1645 = op ...
>>> %v1177 = %v1176
>>> jmp loop
>>>
>>> We can't coalesce %v1177 and %v1176 legally.  But we do.
>>
>> Seriously, why not? In the first iteration, it's totally legal for
>> v1177 to have the same value as v1176. It's defined by an undef, so
>> it's allowed to contain any value.
>
> Think about what will happen the 2nd iteration.  %v1177 will have
> the value of %v1645 which is wrong.  This is because %v1176 in bb74
> will be replaced with %v1177.  That's incorrect.

Ok, right. The trick to fixing it is to make sure the valno of the def
of v1177 has hasPHIKill set to true, and to make sure the coalescer
checks it.

Evan

>
>
> -Dave

From gohman at apple.com Mon Feb 2 17:58:59 2009
From: gohman at apple.com (Dan Gohman)
Date: Mon, 2 Feb 2009 15:58:59 -0800
Subject: [LLVMdev] Aliasing (was Performance vs other VMs)
In-Reply-To: <200902011601.20343.jon@ffconsultancy.com>
References: <200901302056.44693.jon@ffconsultancy.com> <200902011601.20343.jon@ffconsultancy.com>
Message-ID:

On Feb 1, 2009, at 8:01 AM, Jon Harrop wrote:

>
> The LLVM 2.1 release notes say that llvm-gcc got alias analysis and
> understood the "restrict" keyword but when I add it to the C code for
> SciMark2 it makes no difference. Can anyone else get this to work?

It works for me.  LLVM doesn't yet perform many of the optimizations
that typically benefit from this type of information being available
though.

Dan

From tema13tema at yahoo.de Mon Feb 2 03:01:13 2009
From: tema13tema at yahoo.de (Rudskyy)
Date: Mon, 2 Feb 2009 10:01:13 +0100
Subject: [LLVMdev] LLVM and backend
Message-ID: <000301c98514$cd67ecb0$07362c8d@MOIPC7>

Hello!

I have found the LLVM project and hope it can be useful for me in my
work.

There is a processor with a simple assembly code (http://hilscher.com/
xPEC-processor).

The task is that I need a "translator" from C/C++ to native assembly
code. I understand that I need to write a backend specifying my target
(converting LLVM IR code to assembly).

Can you help and suggest where to start?

Best regards,
Rudskyy Artem

From nicholas at mxc.ca Mon Feb 2 22:35:15 2009
From: nicholas at mxc.ca (Nick Lewycky)
Date: Mon, 02 Feb 2009 20:35:15 -0800
Subject: [LLVMdev] bug 3367
In-Reply-To:
References:
Message-ID: <4987C983.4030307@mxc.ca>

Jay Foad wrote:
> Please can you consider fixing bug 3367 in the trunk and 2.5 branch?
> It's an assertion failure in an optimization pass that I hit when
> compiling a real C++ application.
>
> It's caused by -inline not updating the call graph when it replaces a
> call with an invoke. There's a very small test case attached to the
> bug, as well as a pretty obvious fix.

I agree, that's an obvious fix. I've applied it in r63600. Thanks for
the patch and testcase!

Nick

> http://llvm.org/bugs/show_bug.cgi?id=3367
>
> Thanks!
> Jay.
>

From zhousheng at autoesl.com Mon Feb 2 22:37:47 2009
From: zhousheng at autoesl.com (Sheng Zhou)
Date: Tue, 03 Feb 2009 12:37:47 +0800
Subject: [LLVMdev] Proposal: Debug information improvement - keep the line number with optimizations
Message-ID: <4987CA1B.6040804@gmail.com>

Hi Patel,

Thanks for your comments, some reply below...
(This is the first part, I'll send the second part later)

>> 2.1 Verification Flow
>> The most important of this project is to make the debug information
>> do not block any optimization by LLVM transform passes.  Here I
>> propose a way to determine whether codegen is being impacted by
>> debug info.  This is also useful for us to scan the LLVM transform
>> pass list to find which pass need to update to work with debug
>> information.
>>
>> From Chris: Add a -strip-debug pass that removes all debug info from
>> the LLVM IR.  Given this, it would allow us to do:
>>   $ llvm-gcc -O3 -c -o - | llc > good.s
>>   $ llvm-gcc -O3 -c -g -o - | opt -strip-debug | llc > test.s
>>   $ diff good.s test.s
>>
>> If the two .s files differed, then badness happened.
>
> This may not work perfectly because presence of debug info may
> influence compiler generated symbol names and label numbers.

I see. But if the optimizations do the right thing with debug info, the
two .s files should be very similar. And the labels in the .s file
basically correspond to the basic blocks in the .ll file. I think we
can find some way to work around this, like adding some filter before
the diff. I'm not sure how the debug info influences the symbol names
in the assembly file; do you have an example?

>> This obviously only catches badness that happens in the LLVM
>> optimizer,
>
> There is an established way to check this. See
> http://llvm.org/docs/SourceLevelDebugging.html#debugopt

That's great, thanks.

>> if the code generator is broken, we'll need something more
>> sophisticated that strips debug info out of the .s file.  In any
>> case, this is a good place to start, and should be turned into a
>> llvm-test TEST/report.
>>
>> Incidentally, we have to go through codegen, we can't diff .ll files
>> after debug info is stripped out.
This is because debug info is >> > allowed to (and probably does) impact local names within functions, >> > but these functions are removed at codegen and are not important to >> > preserve. End >> > >> > >> > >> > 2.2 A Pass to clean up the debug info >> > LLVM already has a transform pass "-strip-debug", it removes all the >> > debug information. But for the first half of this project, we want >> > to just keep the line number information (stop point) in the >> > optimized code. So we need a new transform pass to just removes the >> > variable declaration information. >> > > FWIW, mem2reg already does this. > Seems mem2reg now can work very well with the line number information. > >> > Pass "-strip-debug" also doesn't cleanup the dead variable and >> > function calling for debug information, it thinks other pass like "- >> > dce" or "-globaldce" can handle this. >> > > Yes. > > >> > But as we are also going to update those passes, we can't use them >> > in the verification flow, otherwise, it may output incorrect check >> > results. >> > > I am not sure, I follow this. > > >> > >> > The new pass "-strip-debug-pro" should have the following functions: >> > 1. Just remove the variable declaration information and >> > clean up the dead debug information. >> > > This are two separate tasks. > 1) Remove variable declaration info. > This is already done (indirectly) by mem2reg. But a separate pass to > do so won't hurt either. > 2) Remove dead debug information. > This is very useful as a separate pass and can be used while > debugging non optimized code (for example, to remove type info for the > types that are not used at all). > > >> > 2. Just remove the line number information and clean up the >> > dead debug information. >> > > I am not sure what is the purpose of this ? > Eh...just forget this. > >> > 3. Remove all the debug information and clean up. >> > > That's what, "Remove Debug Info", -strip-debug does. 
>
> If you put -strip-debug + -dce in one pass then you're not comparing
> apple and apple in your 2.1 style verification. Or I am missing
> something.
>
>
Don't remember why I wrote this, skip it.
>
>> > 2.3 Front End Changes
>> > For the first half of the project, we just aim to handle the line
>> > number debug information. So we need to force llvm-gcc not to emit
>> > any variable declaration information.
>> >
>> > 2.4 Optimization Transform Changes
>> > According to the output of the check script, we can get a pass-to-
>> > update list. Just follow the list to update the pass one by one.
>> > When done a single pass, turn back to run the llvm/test and llvm-
>> > test, note apply the pass "-strip-debug-pro" right after the updated
>> > pass to see if it work correctly.
>> >

From tonic at nondot.org Mon Feb 2 23:39:01 2009
From: tonic at nondot.org (Tanya Lattner)
Date: Mon, 2 Feb 2009 21:39:01 -0800
Subject: [LLVMdev] 2.5 Branch Created
Message-ID: <5F5DBD18-8EFB-460A-BD02-7226AD2E6ACD@nondot.org>

LLVMers,

The 2.5 release branch has been created. You may check it out with the
following commands:

svn co https://llvm.org/svn/llvm-project/llvm/branches/release_25
svn co https://llvm.org/svn/llvm-project/llvm-gcc-4.2/branches/release_25
svn co https://llvm.org/svn/llvm-project/test-suite/branches/release_25

Please do not commit anything to the release branch. If you have a
patch that needs to be merged in, you must get it approved by a code
owner and forward me the link to the llvm-commits message for your patch.

Here is the updated release schedule:
Feb 2 - Code Freeze/Branch Creation (9PM PST).
Feb 5 - Pre-release 1 testing begins.
Feb 12 - Pre-release 1 testing ends.
Feb 16 - Pre-release 2 testing begins.
Feb 23 - Pre-release 2 testing ends.
Feb 25 - Release.

We apologize for this setback.

Thanks,
Tanya
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090202/ba326aad/attachment.html From evan.cheng at apple.com Mon Feb 2 23:55:46 2009 From: evan.cheng at apple.com (Evan Cheng) Date: Mon, 2 Feb 2009 21:55:46 -0800 Subject: [LLVMdev] undefs in phis In-Reply-To: References: <200901291647.26270.dag@cray.com> <200902021412.01657.dag@cray.com> <193746F5-D74B-4A7A-95B9-65C9D52BAAD0@apple.com> <200902021505.05179.dag@cray.com> Message-ID: <43E485F3-6440-4F76-9E8B-D1BE97D9C937@apple.com> On Feb 2, 2009, at 3:54 PM, Evan Cheng wrote: > > On Feb 2, 2009, at 1:05 PM, David Greene wrote: > >> On Monday 02 February 2009 14:29, Evan Cheng wrote: >>> On Feb 2, 2009, at 12:12 PM, David Greene wrote: >>>> On Monday 02 February 2009 13:14, Evan Cheng wrote: >>>>> I am sorry I don't really follow it. Is this what you are >>>>> describing? >>>>> >>>>> %v1177 = undef >>>>> ... >>>>> loop: >>>>> ... >>>>> %v1176 = op ... >>>>> = %v1177 >>>>> %v1177 = %v1176 >>>>> jmp loop >>>>> >>>>> Why is not safe to coalesce the 2 registers? >>>> >>>> Not quite. The original code is: >>>> >>>> %v1177 = undef >>>> %v1645 = ... >>>> loop: >>>> %v1176 = %v1645 >>>> ... >>>> = %v1176 >>>> = %v1177 >>>> %v1645 = op ... >>>> %v1177 = %v1176 >>>> jmp loop >>>> >>>> We can't coalesce %v1177 and %v1176 legally. But we do. >>> >>> Seriously, why not? In the first iteration, it's totally legal for >>> v1177 has the same value as v1176. It's defined by an undef, it's >>> allowed to have contain any value. >> >> Think about what will happen the 2nd iteration. %v1177 will have >> the value of >> %v1645 which is wrong. This is because %v1176 in bb74 will be >> replaced with >> %v1177. That's incorrect. > > Ok, right. The trick to fixing is to make sure the valno of the def of > v1177 hasPHIKill to true and make sure the coalescer checks it. Actually liveintervals can construct a v1177 live range starting from the beginning mbb with a val# of unknown def. 
Evan

>
> Evan
>
>>
>>
>> -Dave
>

From isanbard at gmail.com Tue Feb 3 00:05:55 2009
From: isanbard at gmail.com (Bill Wendling)
Date: Mon, 2 Feb 2009 22:05:55 -0800
Subject: [LLVMdev] LLVM and backend
In-Reply-To: <000301c98514$cd67ecb0$07362c8d@MOIPC7>
References: <000301c98514$cd67ecb0$07362c8d@MOIPC7>
Message-ID: <0613A1E9-2ECF-431A-A4ED-81215AAC0369@gmail.com>

On Feb 2, 2009, at 1:01 AM, Rudskyy wrote:

> Hallo!
>
> I have found the LLVM project and hope it can be useful for me in my
> work.
> There is a processor with a simple assembly language
> (http://hilscher.com/ xPEC processor).
> The task is that I need a "translator" from C/C++ to native assembly
> code.
> I understand that I need to write a backend specifying my target
> (conversion of LLVM IR code to assembly).
> Can you help and suggest where to start?
>
Hi Rudskyy,

This is a webpage explaining how to write a backend:

http://llvm.org/docs/WritingAnLLVMBackend.html

It would also be good to peruse the other LLVM docs to get a feel for
how it all works:

http://llvm.org/docs/

-bw

From zhousheng at autoesl.com Tue Feb 3 01:50:19 2009
From: zhousheng at autoesl.com (Sheng Zhou)
Date: Tue, 03 Feb 2009 15:50:19 +0800
Subject: [LLVMdev] Proposal: Debug information improvement - keep the line number with optimizations
Message-ID: <4987F73B.6080400@gmail.com>

Hi Patel,

Here is the second part of my reply.

> 2. Proposed Work Plan
> > This section defines a proposed work plan to accomplish the
> > requirements that we desires. The work plan is broken into several
> > distinct phases that follow a logical progression of modifications
> > to the LLVM software.
> > > > 2.1 Phase 1: Establish the testing system > > One of the most useful things to get started is to have some way to > > determine whether codegen is being impacted by debug info. It is > > important to be able to tell when this happens so that we can track > > down these places and fix them. > > > > 2.1.1 Pass Scanning Script > > Following the way proposed by Chris, it is good to have a script to > > scan the standard LLVM transform pass list. We can get the standard > > compile optimization pass list by: > > > You can use http://llvm.org/docs/SourceLevelDebugging.html#debugopt as > a starting point here. > Ok. >> > >> > $ opt -std-compile-opts -debug-pass=Arguments foo.bc > /dev/ >> > null ... ... > 2.2 Phase 2: New Pass to Strip Debug Information > > LLVM already has a transform pass "-strip-debug", it removes all the > > debug information. But for the first half of this project, we want > > to just keep the line number information (stop point) in the > > optimized code. So we need a new transform pass to just removes the > > variable declaration information. Pass "-strip-debug" also doesn't > > cleanup the dead variable and function calling for debug > > information, it thinks other pass like "-dce" or "-globaldce" can > > handle this. But as we are also going to update those passes, we > > can't use them in the verification flow, otherwise, it may output > > incorrect check results. > > > > The new pass "-strip-debug-pro" should have the following functions: > > 1. Just remove the variable declaration information and > > clean up the dead debug information. > > > > 2. Remove all the debug information and clean up > > > > 3.2.1 Work Plan > > 1. Take a reference to transform pass StripSymbol.cpp > > 2. Based on the StripSymbol.cpp, add an option to it to just > > remove debug information, like "-rm-debug" > > > That's what -strip-debug is doing. > > >> > 3. 
Add an option to just remove the variable declaration
>> > information, like "-rm-debug=2"
>> >
> Why not -strip-debug=2 if you want a way to remove variable
> declarations ..?
>
Agree.
>
>> > 4. Add a procedure to clean up the dead variables and
>> > function calls for debug purpose.
>> >
>> > 2.3 Phase 3: Extend llvm-gcc
>> > Once we have a way to verify what is happening, I propose that we
>> > aim for an intermediate point: instead of having -O disable all
>> > debug info, we should make it disable just variable information, but
>> > keep emitting line number info. This would allow stepping through
>> > the program, getting stack traces, use performance tools like shark,
>> > etc.
>> >
>> > We need the front-end llvm-gcc to have a mode that causes it to emit
>> > line number info but not
>> > variable info, we can go through the process above to identify
>> > passes that change behavior when line number intrinsics are in the
>> > code.
>> >
>> > 1.3.1 Work Plan
>> > 1. First locate the file position that llvm-gcc handle the
>> > parameter options.
>> > 2. Add a new option to control the llvm-gcc to emit
>> > specified debug information: like -g1. -g1 to only emit line number
>> > >
>> > 3. Building the new llvm-gcc
>> > 4. Testing through llvm/test, llvm-test
>> >
>> > 2.4 Phase 4: Update Transform Passes for Line Number Info.
>> > When the front-end has a mode that causes it to emit line number
>> > info but not variable info, we can go through the process above to
>> > identify passes that change behavior when line number intrinsics are
>> > in the code.
>> >
> I think, the optimizer is not changing behavior when dbg info is
> present. Try running dbgopt tests.
>
>
>> > Obvious cases are things like loop unroll and inlining: they
>> > 'measure' the size of some code to determine whether to unroll it or
>> > not. This means that it should be enhanced to ignore debug
>> > intrinsics for the sake of code size estimation.
>> > > The loop unrolling pass already ignores the debug info! See > LoopUnroll.cpp::ApproximateLoopSize() > ok, I see. > >> > >> > Another example is optimizations like SimplifyCFG when it merges if/ >> > then/else into select instructions. SimplifyCFG will have to be >> > enhanced to ignore debug intrinsics when doing its safety/ >> > profitability analysis, >> > > I think, it handles this part well, but ... > >> > but then it will also have to be updated to just delete the line >> > number intrinsics when it does the xform. This is simplifycfg's way >> > of "updating" the debug info for this example transformation. >> > > .. the second part has not received full attention. > > >> > As we progress through various optimizations, we will find cases >> > where it is possible to update (e.g. loop unroll or inlining, which >> > doesn't have to do anything special to update line #'s) and places >> > where it isn't. As long as the debug intrinsics don't affect >> > codegen, we are happy, even if the debug intrinsics are deleted in >> > cases where it would be possible to update them (this becomes a >> > optimized debugging QoI issue). From baldrick at free.fr Tue Feb 3 03:39:26 2009 From: baldrick at free.fr (Duncan Sands) Date: Tue, 3 Feb 2009 10:39:26 +0100 Subject: [LLVMdev] type legalizer promoting BUILD_VECTORs Message-ID: <200902031039.26397.baldrick@free.fr> (Resend, since it didn't seem to reach the mailing list the first time) Hi Bob, > LLVM's type legalizer is changing the types of BUILD_VECTORs in a way > that seems wrong to me, but I'm not sure if this is a bug or if some > targets may be relying on it. > > On a 32-bit target, the default action for legalizing i8 and i16 types > is to promote them. If you then have a BUILD_VECTOR to construct a > legal vector type composed of i8 or i16 values, the type legalizer > will look at the BUILD_VECTOR operands and decide that it needs to > promote them to i32 types. 
You end up with a BUILD_VECTOR that > constructs a vector of i32 values that are then bitcast to the > original vector type. > > This works fine for SSE, where it appears that BUILD_VECTORs are > intentionally canonicalized to use i32 elements for the benefit of > CSE. I'm looking at implementing something where I think I'd like to > keep the original vector types. Is this behavior in the type > legalizer something that should be changed? another way this could be done is to say that the operands of a BUILD_VECTOR don't have to have the same type as the element type of the built vector. Then when the type legalizer sees a v4i16 = BUILD_VECTOR(i16, i16, i16, i16) it can turn this into a v4i16 = BUILD_VECTOR(i32, i32, i32, i32) and it will be happy (all result and operand types are legal). This requires changing the definition of BUILD_VECTOR slightly. Targets will need to understand that only the bottom 16 bits of the operands are to be used, but I doubt that's a problem. Would this solve your problem? Ciao, Duncan. PS: Can you please give a concrete example where the current behavior causes trouble for you? From baldrick at free.fr Mon Feb 2 13:52:12 2009 From: baldrick at free.fr (Duncan Sands) Date: Mon, 2 Feb 2009 20:52:12 +0100 Subject: [LLVMdev] GEPping GEPs and first-class structs In-Reply-To: <200902011338.25003.jon@ffconsultancy.com> References: <200902011338.25003.jon@ffconsultancy.com> Message-ID: <200902022052.12984.baldrick@free.fr> Hi Jon, check out http://llvm.org/docs/LangRef.html#aggregateops Ciao, Duncan. From baldrick at free.fr Mon Feb 2 15:22:16 2009 From: baldrick at free.fr (Duncan Sands) Date: Mon, 2 Feb 2009 22:22:16 +0100 Subject: [LLVMdev] type legalizer promoting BUILD_VECTORs In-Reply-To: References: Message-ID: <200902022222.17222.baldrick@free.fr> Hi Bob, > LLVM's type legalizer is changing the types of BUILD_VECTORs in a way > that seems wrong to me, but I'm not sure if this is a bug or if some > targets may be relying on it. 
> > On a 32-bit target, the default action for legalizing i8 and i16 types > is to promote them. If you then have a BUILD_VECTOR to construct a > legal vector type composed of i8 or i16 values, the type legalizer > will look at the BUILD_VECTOR operands and decide that it needs to > promote them to i32 types. You end up with a BUILD_VECTOR that > constructs a vector of i32 values that are then bitcast to the > original vector type. > > This works fine for SSE, where it appears that BUILD_VECTORs are > intentionally canonicalized to use i32 elements for the benefit of > CSE. I'm looking at implementing something where I think I'd like to > keep the original vector types. Is this behavior in the type > legalizer something that should be changed? another way this could be done is to say that the operands of a BUILD_VECTOR don't have to have the same type as the element type of the built vector. Then when the type legalizer sees a v4i16 = BUILD_VECTOR(i16, i16, i16, i16) it can turn this into a v4i16 = BUILD_VECTOR(i32, i32, i32, i32) and it will be happy (all result and operand types are legal). This requires changing the definition of BUILD_VECTOR slightly. Targets will need to understand that only the bottom 16 bits of the operands are to be used, but I doubt that's a problem. Would this solve your problem? Ciao, Duncan. PS: Can you please give a concrete example where the current behavior causes trouble for you? From howarth at bromo.med.uc.edu Tue Feb 3 08:53:01 2009 From: howarth at bromo.med.uc.edu (Jack Howarth) Date: Tue, 3 Feb 2009 09:53:01 -0500 Subject: [LLVMdev] building libLTO.dylib Message-ID: <20090203145301.GA31604@bromo.med.uc.edu> I puzzled out that the --with-pic option was required to cause the libLTO.dylib to be built. I must have done a make before directly in the lto tools subdirectory when I was building directly in the source tree. 
Jack From bob.wilson at apple.com Tue Feb 3 11:02:58 2009 From: bob.wilson at apple.com (Bob Wilson) Date: Tue, 3 Feb 2009 09:02:58 -0800 Subject: [LLVMdev] type legalizer promoting BUILD_VECTORs In-Reply-To: <200902022222.17222.baldrick@free.fr> References: <200902022222.17222.baldrick@free.fr> Message-ID: <9258AF9B-A22B-4686-8EC8-C1287169C926@apple.com> On Feb 2, 2009, at 1:22 PM, Duncan Sands wrote: > another way this could be done is to say that the operands of a > BUILD_VECTOR don't have to have the same type as the element type > of the built vector. Then when the type legalizer sees a > v4i16 = BUILD_VECTOR(i16, i16, i16, i16) it can turn this into a > v4i16 = BUILD_VECTOR(i32, i32, i32, i32) and it will be happy > (all result and operand types are legal). This requires changing > the definition of BUILD_VECTOR slightly. Targets will need to > understand that only the bottom 16 bits of the operands are to > be used, but I doubt that's a problem. Would this solve your > problem? That might be good. An alternative would be to leave the current definition of BUILD_VECTOR but change the type legalizer to ignore the BUILD_VECTOR operands. I would prefer the latter. Unless the BUILD_VECTOR is being expanded and the individual element values need to be instantiated separately, I don't see why the element types would need to be legal on their own. I'm new to LLVM so maybe there are some constraints that I don't know about. But, it seems like if a target says that a particular vector type is legal and if it does not lower a BUILD_VECTOR of that type, then nothing more needs to be done to the vector elements. > > PS: Can you please give a concrete example where the current > behavior causes trouble for you? Well, at this point, I've already worked around the immediate issue. Changing it is not urgent. I may or may not run into this again later in the project I am working on. 
For now I wanted to raise the issue because I found the current behavior surprising and it took me a while to work out why it was happening. Thanks for thinking about it. From eli.friedman at gmail.com Tue Feb 3 11:31:22 2009 From: eli.friedman at gmail.com (Eli Friedman) Date: Tue, 3 Feb 2009 09:31:22 -0800 Subject: [LLVMdev] type legalizer promoting BUILD_VECTORs In-Reply-To: <9258AF9B-A22B-4686-8EC8-C1287169C926@apple.com> References: <200902022222.17222.baldrick@free.fr> <9258AF9B-A22B-4686-8EC8-C1287169C926@apple.com> Message-ID: On Tue, Feb 3, 2009 at 9:02 AM, Bob Wilson wrote: > An alternative would be to leave the current > definition of BUILD_VECTOR but change the type legalizer to ignore the > BUILD_VECTOR operands. Won't work; if we don't promote the operands, we'll be left with nodes with illegal result types, which PPC doesn't know how to generate. -Eli From Micah.Villmow at amd.com Tue Feb 3 12:23:44 2009 From: Micah.Villmow at amd.com (Villmow, Micah) Date: Tue, 3 Feb 2009 10:23:44 -0800 Subject: [LLVMdev] Promoting i1,i8,i16 Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C785721@ssanexmb1.amd.com> Is there a way to force llvm to promote all smaller types to i32 instead of i16? Thanks, Micah Villmow Systems Engineer Advanced Technology & Performance Advanced Micro Devices Inc. S1-609 One AMD Place Sunnyvale, CA. 94085 P: 408-749-3966 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090203/7fa691f3/attachment.html

From isanbard at gmail.com Tue Feb 3 13:01:27 2009
From: isanbard at gmail.com (Bill Wendling)
Date: Tue, 3 Feb 2009 11:01:27 -0800
Subject: [LLVMdev] Promoting i1,i8,i16
In-Reply-To: <5BA674C5FF7B384A92C2C95D8CC71E1C785721@ssanexmb1.amd.com>
References: <5BA674C5FF7B384A92C2C95D8CC71E1C785721@ssanexmb1.amd.com>
Message-ID: <16e5fdf90902031101x53ee4035x9fdd71a28b40a44c@mail.gmail.com>

On Tue, Feb 3, 2009 at 10:23 AM, Villmow, Micah wrote:
> Is there a way to force llvm to promote all smaller types to i32 instead of
> i16?
>
This might help. In TargetLowering.h:

  /// AddPromotedToType - If Opc/OrigVT is specified as being promoted, the
  /// promotion code defaults to trying a larger integer/fp until it can find
  /// one that works. If that default is insufficient, this method can be used
  /// by the target to override the default.
  void AddPromotedToType(unsigned Opc, MVT OrigVT, MVT DestVT) {
    PromoteToType[std::make_pair(Opc, OrigVT.getSimpleVT())] =
      DestVT.getSimpleVT();
  }

-bw

From eli.friedman at gmail.com Tue Feb 3 13:07:47 2009
From: eli.friedman at gmail.com (Eli Friedman)
Date: Tue, 3 Feb 2009 11:07:47 -0800
Subject: [LLVMdev] Promoting i1,i8,i16
In-Reply-To: <5BA674C5FF7B384A92C2C95D8CC71E1C785721@ssanexmb1.amd.com>
References: <5BA674C5FF7B384A92C2C95D8CC71E1C785721@ssanexmb1.amd.com>
Message-ID:

On Tue, Feb 3, 2009 at 10:23 AM, Villmow, Micah wrote:
> Is there a way to force llvm to promote all smaller types to i32 instead of
> i16?

It should just work if i16 is also set to promote... what are you
trying to do?
-Eli From Micah.Villmow at amd.com Tue Feb 3 13:31:54 2009 From: Micah.Villmow at amd.com (Villmow, Micah) Date: Tue, 3 Feb 2009 11:31:54 -0800 Subject: [LLVMdev] Promoting i1,i8,i16 In-Reply-To: References: <5BA674C5FF7B384A92C2C95D8CC71E1C785721@ssanexmb1.amd.com> Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C78575B@ssanexmb1.amd.com> I want to promote i1 to i32 and not i16 as i32 is my native type and i16 is emulated, but I need to handle i16 as a special case so I don't want to promote it. I will see if what Bill pointed out is what I need. Micah -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Eli Friedman Sent: Tuesday, February 03, 2009 11:08 AM To: LLVM Developers Mailing List Subject: Re: [LLVMdev] Promoting i1,i8,i16 On Tue, Feb 3, 2009 at 10:23 AM, Villmow, Micah wrote: > Is there a way to force llvm to promote all smaller types to i32 instead of > i16? It should just work if i16 is also set to promote... what are you trying to do? -Eli _______________________________________________ LLVM Developers mailing list LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From kotha.aparna at gmail.com Tue Feb 3 14:33:47 2009 From: kotha.aparna at gmail.com (aparna kotha) Date: Tue, 3 Feb 2009 15:33:47 -0500 Subject: [LLVMdev] multithreaded applications Message-ID: <326a2f490902031233s1ce4e68eg736e581299ec667b@mail.gmail.com> Hi all: I am working on a project using llvm and we need to deal with multithreaded applications. I wanted to know if there was a C front end for llvm that could parse multithreaded applications? I tried llvm-gcc (4.2) and could not get it to work. Is there an extra parameter that I need to pass or something ? Thanks a lot for your help. 
Regards -- -- Aparna Graduate Student Department of Electrical and Computer Engineering University of Maryland, College Park -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090203/ed882c36/attachment.html From luked at cs.rochester.edu Tue Feb 3 14:47:40 2009 From: luked at cs.rochester.edu (Luke Dalessandro) Date: Tue, 03 Feb 2009 15:47:40 -0500 Subject: [LLVMdev] multithreaded applications In-Reply-To: <326a2f490902031233s1ce4e68eg736e581299ec667b@mail.gmail.com> References: <326a2f490902031233s1ce4e68eg736e581299ec667b@mail.gmail.com> Message-ID: <4988AD6C.1080509@cs.rochester.edu> aparna kotha wrote: > Hi all: > > I am working on a project using llvm and we need to deal with > multithreaded applications. I wanted to know if there was a C front end > for llvm that could parse multithreaded applications? I tried llvm-gcc > (4.2) and could not get it to work. Is there an extra parameter that I > need to pass or something ? Just the standard flags should work. We define -D_REENTRANT during compilation and -lpthread during linking. We also have no problem with OpenMP (-fopenmp -lgomp) if that's what you are using. You'll need to adjust this for whatever threading package you use. Luke > > > > Thanks a lot for your help. 
> > > > Regards > > -- > -- Aparna > > Graduate Student > Department of Electrical and Computer Engineering > University of Maryland, College Park > > > ------------------------------------------------------------------------ > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From isanbard at gmail.com Tue Feb 3 14:51:14 2009 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 3 Feb 2009 12:51:14 -0800 Subject: [LLVMdev] multithreaded applications In-Reply-To: <326a2f490902031233s1ce4e68eg736e581299ec667b@mail.gmail.com> References: <326a2f490902031233s1ce4e68eg736e581299ec667b@mail.gmail.com> Message-ID: <16e5fdf90902031251l702bde8fy56528ea976534427@mail.gmail.com> On Tue, Feb 3, 2009 at 12:33 PM, aparna kotha wrote: > Hi all: > > I am working on a project using llvm and we need to deal with multithreaded > applications. I wanted to know if there was a C front end for llvm that > could parse multithreaded applications? I tried llvm-gcc (4.2) and could not > get it to work. Is there an extra parameter that I need to pass or something > ? > What type of multithreaded language are you using? If it's something other than C/C++/ObjC using MPI, pthreads, or OpenMP, then you'll have to find a front-end that will parse your language and then add support to it to emit LLVM. -bw From kotha.aparna at gmail.com Tue Feb 3 14:55:34 2009 From: kotha.aparna at gmail.com (aparna kotha) Date: Tue, 3 Feb 2009 15:55:34 -0500 Subject: [LLVMdev] multithreaded applications In-Reply-To: <16e5fdf90902031251l702bde8fy56528ea976534427@mail.gmail.com> References: <326a2f490902031233s1ce4e68eg736e581299ec667b@mail.gmail.com> <16e5fdf90902031251l702bde8fy56528ea976534427@mail.gmail.com> Message-ID: <326a2f490902031255ke09f9bg57366eb727fc61ea@mail.gmail.com> I am using pthreads. I was also wondering what will the llvm IR be for pthreads ? 
On Tue, Feb 3, 2009 at 3:51 PM, Bill Wendling wrote:

> On Tue, Feb 3, 2009 at 12:33 PM, aparna kotha wrote:
> > Hi all:
> >
> > I am working on a project using llvm and we need to deal with multithreaded
> > applications. I wanted to know if there was a C front end for llvm that
> > could parse multithreaded applications? I tried llvm-gcc (4.2) and could not
> > get it to work. Is there an extra parameter that I need to pass or something
> > ?
> >
> What type of multithreaded language are you using? If it's something
> other than C/C++/ObjC using MPI, pthreads, or OpenMP, then you'll have
> to find a front-end that will parse your language and then add support
> to it to emit LLVM.
>
> -bw
>

--
-- Aparna

From isanbard at gmail.com Tue Feb 3 14:59:25 2009
From: isanbard at gmail.com (Bill Wendling)
Date: Tue, 3 Feb 2009 12:59:25 -0800
Subject: [LLVMdev] multithreaded applications
In-Reply-To: <326a2f490902031255ke09f9bg57366eb727fc61ea@mail.gmail.com>
References: <326a2f490902031233s1ce4e68eg736e581299ec667b@mail.gmail.com> <16e5fdf90902031251l702bde8fy56528ea976534427@mail.gmail.com> <326a2f490902031255ke09f9bg57366eb727fc61ea@mail.gmail.com>
Message-ID: <16e5fdf90902031259x1853882xa50576953ee698fe@mail.gmail.com>

On Tue, Feb 3, 2009 at 12:55 PM, aparna kotha wrote:
> I am using pthreads.
>
> I was also wondering what will the llvm IR be for pthreads ?
>
Okay. Luke gave hints on how to get pthreads to work. LLVM doesn't do
anything special for pthreads calls. So they should look like regular
calls into a library.
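
To illustrate the point above, here is a minimal sketch of a pthreads
program in C. The helper names add_one and run_in_thread are made up for
this example and are not part of any library. Following Luke's hints, it
would be built with something like -D_REENTRANT at compile time and
-lpthread at link time; in the resulting LLVM IR, pthread_create and
pthread_join should simply show up as ordinary calls to external
functions.

```c
#include <pthread.h>
#include <stddef.h>

/* Thread body (hypothetical helper): increments the int that arg points to. */
static void *add_one(void *arg) {
    int *value = arg;
    *value += 1;
    return NULL;
}

/* Hypothetical helper: runs add_one on a separate thread, waits for it,
 * and returns the updated value, or -1 if a pthread call fails. */
static int run_in_thread(int start) {
    pthread_t tid;
    int value = start;

    /* Both of these are plain library calls from the compiler's view. */
    if (pthread_create(&tid, NULL, add_one, &value) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;

    return value;
}
```

Calling run_in_thread(41) from main would be expected to return 42 once
the worker thread has been joined.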
-bw

From kotha.aparna at gmail.com Tue Feb 3 15:04:01 2009
From: kotha.aparna at gmail.com (aparna kotha)
Date: Tue, 3 Feb 2009 16:04:01 -0500
Subject: [LLVMdev] multithreaded applications
In-Reply-To: <16e5fdf90902031259x1853882xa50576953ee698fe@mail.gmail.com>
References: <326a2f490902031233s1ce4e68eg736e581299ec667b@mail.gmail.com> <16e5fdf90902031251l702bde8fy56528ea976534427@mail.gmail.com> <326a2f490902031255ke09f9bg57366eb727fc61ea@mail.gmail.com> <16e5fdf90902031259x1853882xa50576953ee698fe@mail.gmail.com>
Message-ID: <326a2f490902031304jb546f5aic78eb9c77db85e34@mail.gmail.com>

Thanks Luke and Bill.

Aparna

On Tue, Feb 3, 2009 at 3:59 PM, Bill Wendling wrote:

> On Tue, Feb 3, 2009 at 12:55 PM, aparna kotha wrote:
> > I am using pthreads.
> >
> > I was also wondering what will the llvm IR be for pthreads ?
> >
> Okay. Luke gave hints on how to get pthreads to work. LLVM doesn't do
> anything special for pthreads calls. So they should look like regular
> calls into a library.
>
> -bw
>

--
-- Aparna

From kasra_n500 at yahoo.com Tue Feb 3 16:28:17 2009
From: kasra_n500 at yahoo.com (Kasra)
Date: Tue, 3 Feb 2009 14:28:17 -0800 (PST)
Subject: [LLVMdev] rol/ror llvm instruction set
Message-ID: <712464.403.qm@web110007.mail.gq1.yahoo.com>

Hi,

I was looking around the LLVM instruction set and I failed to find ROL
and ROR instructions. Are there any plans to add these instructions to
LLVM?

The reason I am asking is that for cryptographic algorithms, which are
becoming ever more important, rotation is a major operation.
Thus including such an instruction could reduce three instructions
{shl, shr, or} to one {rol | ror}, which could improve performance
considerably. The benefit is not confined to crypto algorithms, however.

-- Kasra

From mrs at apple.com Tue Feb 3 16:35:53 2009
From: mrs at apple.com (Mike Stump)
Date: Tue, 3 Feb 2009 14:35:53 -0800
Subject: [LLVMdev] rol/ror llvm instruction set
In-Reply-To: <712464.403.qm@web110007.mail.gq1.yahoo.com>
References: <712464.403.qm@web110007.mail.gq1.yahoo.com>
Message-ID:

On Feb 3, 2009, at 2:28 PM, Kasra wrote:
> I was looking around the LLVM instruction set and I failed to find
> ROL and ROR instructions. Is there any plans on adding these
> instructions to LLVM?

Not sure what you mean:

$ cat t.c
unsigned int rol(unsigned int i) {
   return i << 1 | i >> 31;
}
mrs $ clang -S t.c -O2
mrs $ cat t.s


         .text
         .align  4,0x90
         .globl  _rol
_rol:
         movl    4(%esp), %eax
         roll    %eax
         ret

?

From dalej at apple.com Tue Feb 3 16:45:30 2009
From: dalej at apple.com (Dale Johannesen)
Date: Tue, 3 Feb 2009 14:45:30 -0800
Subject: [LLVMdev] rol/ror llvm instruction set
In-Reply-To:
References: <712464.403.qm@web110007.mail.gq1.yahoo.com>
Message-ID:

On Feb 3, 2009, at 2:35 PM PST, Mike Stump wrote:

> On Feb 3, 2009, at 2:28 PM, Kasra wrote:
>> I was looking around the LLVM instruction set and I failed to find
>> ROL and ROR instructions. Is there any plans on adding these
>> instructions to LLVM?
>
> Not sure what you mean:

He's referring to the LLVM IR, I think, and it's true that doesn't
have rotates. The LLVM back ends do know about rotate instructions on
targets that have them, though, and the llvm optimizers are pretty
smart about recognizing the usual ways to express rotate with
shift/and/or, as below.

> $ cat t.c
> unsigned int rol(unsigned int i) {
>    return i << 1 | i >> 31;
> }
> mrs $ clang -S t.c -O2
> mrs $ cat t.s
>
>
>          .text
>          .align  4,0x90
>          .globl  _rol
> _rol:
>          movl    4(%esp), %eax
>          roll    %eax
>          ret
>
> ?
> _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From isanbard at gmail.com Tue Feb 3 16:52:35 2009 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 3 Feb 2009 14:52:35 -0800 Subject: [LLVMdev] rol/ror llvm instruction set In-Reply-To: References: <712464.403.qm@web110007.mail.gq1.yahoo.com> Message-ID: <16e5fdf90902031452i4f10638w5285cd677528f0ac@mail.gmail.com> On Tue, Feb 3, 2009 at 2:45 PM, Dale Johannesen wrote: > > On Feb 3, 2009, at 2:35 PMPST, Mike Stump wrote: > >> On Feb 3, 2009, at 2:28 PM, Kasra wrote: >>> I was looking around the LLVM instruction set and I failed to find >>> ROL and ROR instructions. Is there any plans on adding these >>> instructions to LLVM? >> >> Not sure what you mean: > > He's referring to the LLVM IR, I think, and it's true that doesn't > have rotates. The LLVM back ends do know about rotate instructions on > targets that have them, though, and the llvm optimizers are pretty > smart about recognizing the usual ways to express rotate with shift/ > and/or, as below. > Look in the DAGCombiner.cpp file to see which patterns it translates into ROTL and ROTR instructions. -bw From scooter.phd at gmail.com Tue Feb 3 17:31:22 2009 From: scooter.phd at gmail.com (Scott Michel) Date: Tue, 3 Feb 2009 15:31:22 -0800 Subject: [LLVMdev] GetConstantBuildVectorBits, isConstantSplat Message-ID: <258cd3200902031531i14b0ba6awcb77e2a793e3f0c9@mail.gmail.com> Is anyone going to develop acute heartburn if I move these two functions as TargetLowering methods? It seems that they're used frequently enough across multiple backends and provide common functionality. I could be convinced that TargetLowering is the wrong place to put these functions as methods and that they're better off in a separate BUILD_VECTOR SDValue class. Comments? -scooter -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090203/a4d53556/attachment.html From kasra_n500 at yahoo.com Tue Feb 3 17:54:23 2009 From: kasra_n500 at yahoo.com (Kasra) Date: Tue, 3 Feb 2009 15:54:23 -0800 (PST) Subject: [LLVMdev] rol/ror llvm instruction set In-Reply-To: <16e5fdf90902031452i4f10638w5285cd677528f0ac@mail.gmail.com> Message-ID: <368936.96481.qm@web110012.mail.gq1.yahoo.com> --- On Tue, 2/3/09, Bill Wendling wrote: > From: Bill Wendling > Subject: Re: [LLVMdev] rol/ror llvm instruction set > To: "LLVM Developers Mailing List" > Cc: kasra_n500 at yahoo.com > Date: Tuesday, February 3, 2009, 2:52 PM > On Tue, Feb 3, 2009 at 2:45 PM, Dale Johannesen > wrote: > > > > On Feb 3, 2009, at 2:35 PMPST, Mike Stump wrote: > > > >> On Feb 3, 2009, at 2:28 PM, Kasra wrote: > >>> I was looking around the LLVM instruction set > and I failed to find > >>> ROL and ROR instructions. Is there any plans > on adding these > >>> instructions to LLVM? > >> > >> Not sure what you mean: > > > > He's referring to the LLVM IR, I think, and > it's true that doesn't > > have rotates. The LLVM back ends do know about rotate > instructions on > > targets that have them, though, and the llvm > optimizers are pretty > > smart about recognizing the usual ways to express > rotate with shift/ > > and/or, as below. > > > Look in the DAGCombiner.cpp file to see which patterns it > translates > into ROTL and ROTR instructions. > > -bw I guess the backends could know about the instructions. But I am not convinced why it is beneficial not to have ROR and ROL instructions within llvm. > Look in the DAGCombiner.cpp file to see which patterns it > translates > into ROTL and ROTR instructions. Right, I sure will do. 
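For reference, a minimal C sketch of the shift/or rotate idiom discussed in this thread. The helper names rotl32 and rotr32 are hypothetical and do not come from LLVM. Masking the count is an added assumption, so that a count of 0 never shifts by the full 32-bit width, which C leaves undefined. Whether a particular compiler version folds this masked form into a single rotate instruction is not confirmed by the thread; it is best checked in the generated assembly, as Mike Stump did with t.c.

```c
#include <stdint.h>

/* Rotate x left by n bits using only shifts and OR.
   Both shift counts are masked into the range 0..31, so n == 0
   (or n >= 32) never produces a shift by the full width. */
static uint32_t rotl32(uint32_t x, unsigned n) {
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* Rotate x right by n bits; the mirror image of rotl32. */
static uint32_t rotr32(uint32_t x, unsigned n) {
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}
```

With these definitions, rotl32(0x80000001u, 1) gives 0x00000003 and rotr32(0x00000003u, 1) gives 0x80000001. The constant-count form in t.c (i << 1 | i >> 31) is the same pattern with n fixed at 1.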
-- Kasra From isanbard at gmail.com Tue Feb 3 18:17:49 2009 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 3 Feb 2009 16:17:49 -0800 Subject: [LLVMdev] rol/ror llvm instruction set In-Reply-To: <368936.96481.qm@web110012.mail.gq1.yahoo.com> References: <16e5fdf90902031452i4f10638w5285cd677528f0ac@mail.gmail.com> <368936.96481.qm@web110012.mail.gq1.yahoo.com> Message-ID: <16e5fdf90902031617ge95a33ap6981a2bbd23e41e2@mail.gmail.com> On Tue, Feb 3, 2009 at 3:54 PM, Kasra wrote: > > I guess the backends could know about the instructions. But I am not convinced why it is beneficial not to have ROR and ROL instructions within llvm. > I guess I could ask you the opposite question: What is the benefit of having these? They would have to be mappable to the source language in some way. I'm not sure about Ada, but I don't know of a "rotate" operator for any of the C variants, or any other high-level language. (This could be my lack of knowledge about other languages.) In C, you specify a rotate by doing shifts and bit-wise operations. This isn't to say that LLVM IR is C-specific. Just that if you did have an LLVM rotate instruction, it would have to be generated by the front-end -- currently by recognizing the same things that the DAG combiner recognizes. And then it may need to be "lowered" for various platforms that don't support it, which is greater than the number of platforms that don't have shifts. If a language came along that had rotate as a primitive and that generated LLVM IR, then you could probably convince people that having the rotates as LLVM IR instructions would be a benefit. We're not above changing the language to support good things. 
:-) -bw From resistor at mac.com Tue Feb 3 18:20:14 2009 From: resistor at mac.com (Owen Anderson) Date: Tue, 03 Feb 2009 16:20:14 -0800 Subject: [LLVMdev] rol/ror llvm instruction set In-Reply-To: <368936.96481.qm@web110012.mail.gq1.yahoo.com> References: <368936.96481.qm@web110012.mail.gq1.yahoo.com> Message-ID: <6CF66EEC-1A0A-481C-BA2A-0297B7A47A53@mac.com> On Feb 3, 2009, at 3:54 PM, Kasra wrote: > I guess the backends could know about the instructions. But I am not > convinced why it is beneficial not to have ROR and ROL instructions > within llvm. > How would it be beneficial to have them, if we already generate them at the target level properly? Adding instructions "just because" doesn't seem wise. -Owen From kasra_n500 at yahoo.com Tue Feb 3 18:27:46 2009 From: kasra_n500 at yahoo.com (Kasra) Date: Tue, 3 Feb 2009 16:27:46 -0800 (PST) Subject: [LLVMdev] rol/ror llvm instruction set In-Reply-To: <16e5fdf90902031617ge95a33ap6981a2bbd23e41e2@mail.gmail.com> Message-ID: <20628.49671.qm@web110013.mail.gq1.yahoo.com> --- On Tue, 2/3/09, Bill Wendling wrote: > From: Bill Wendling > Subject: Re: [LLVMdev] rol/ror llvm instruction set > To: kasra_n500 at yahoo.com, "LLVM Developers Mailing List" > Date: Tuesday, February 3, 2009, 4:17 PM > On Tue, Feb 3, 2009 at 3:54 PM, Kasra > wrote: > > > > I guess the backends could know about the > instructions. But I am not convinced why it is beneficial > not to have ROR and ROL instructions within llvm. > > > I guess I could ask you the opposite question: What is the > benefit of > having these? They would have to be mappable to the source > language in > some way. I'm not sure about Ada, but I don't know > of a "rotate" > operator for any of the C variants, or any other high-level > language. > (This could be my lack of knowledge about other languages.) > In C, you > specify a rotate by doing shifts and bit-wise operations. > > This isn't to say that LLVM IR is C-specific. 
Just that > if you did > have an LLVM rotate instruction, it would have to be > generated by the > front-end -- currently by recognizing the same things that > the DAG > combiner recognizes. And then it may need to be > "lowered" for various > platforms that don't support it, which is greater than > the number of > platforms that don't have shifts. > > If a language came along that had rotate as a primitive and > that > generated LLVM IR, then you could probably convince people > that having > the rotates as LLVM IR instructions would be a benefit. > We're not > above changing the language to support good things. :-) > > -bw You could not be more right. However, rotations were not widely implemented on machines when C and C++ evolved. Python and other high-level languages are just too high level (hence, inefficient) to have such semantics. The argument sounds fine; however, remember that the number of platforms that don't implement an instruction is not a good point of comparison. Say we have 100 Linux distributions that do not have a certain feature: we can't conclude that the feature is uncommon. We should consider the mainstream Linux distributions (and, if we may, Windows, the giant of current desktops). Since x86 is about (if I am not wrong) 95% of desktop PCs, we could say rotation is implemented on most of the machines where LLVM will be running. I bet the majority of the people reading/following this thread are running x86 anyway :D So I guess what I am saying is that rotation is very widespread: there are many machines out there that do not implement it, but the number of machines that do is far greater.
-- Kasra From kasra_n500 at yahoo.com Tue Feb 3 18:37:33 2009 From: kasra_n500 at yahoo.com (Kasra) Date: Tue, 3 Feb 2009 16:37:33 -0800 (PST) Subject: [LLVMdev] rol/ror llvm instruction set In-Reply-To: <6CF66EEC-1A0A-481C-BA2A-0297B7A47A53@mac.com> Message-ID: <351260.97163.qm@web110005.mail.gq1.yahoo.com> --- On Tue, 2/3/09, Owen Anderson wrote: > From: Owen Anderson > Subject: Re: [LLVMdev] rol/ror llvm instruction set > To: kasra_n500 at yahoo.com, "LLVM Developers Mailing List" > Date: Tuesday, February 3, 2009, 4:20 PM > On Feb 3, 2009, at 3:54 PM, Kasra wrote: > > I guess the backends could know about the > instructions. But I am not convinced why it is beneficial > not to have ROR and ROL instructions within llvm. > > > > How would it be beneficial to have them, if we already > generate them at the target level properly? Adding > instructions "just because" doesn't seem wise. > > -Owen If you look at it that way, it sounds fine. :D However, if we have 1 instruction we reduce the amount of time we will spend optimizing. I argued in my previous post that rotations are implemented on most machines (the x86 platform). Thus it seems right to include 1 instruction in llvm and translate it to 3 instructions on older architectures that have not implemented the rotation instruction yet. I am sure that it is only a matter of time before architectures without a rotation instruction implement it. Because of cryptography, it is becoming much more popular, with about 500% growth over the past decade (AES competitors against SHA-3 competitors).
Crypto algorithms really like rotations since it is easily analysed and could be implemented efficiently -- Kasra From jon at ffconsultancy.com Tue Feb 3 18:48:06 2009 From: jon at ffconsultancy.com (Jon Harrop) Date: Wed, 4 Feb 2009 00:48:06 +0000 Subject: [LLVMdev] rol/ror llvm instruction set In-Reply-To: <16e5fdf90902031617ge95a33ap6981a2bbd23e41e2@mail.gmail.com> References: <16e5fdf90902031452i4f10638w5285cd677528f0ac@mail.gmail.com> <368936.96481.qm@web110012.mail.gq1.yahoo.com> <16e5fdf90902031617ge95a33ap6981a2bbd23e41e2@mail.gmail.com> Message-ID: <200902040048.06648.jon@ffconsultancy.com> On Wednesday 04 February 2009 00:17:49 Bill Wendling wrote: > If a language came along that had rotate as a primitive and that > generated LLVM IR, then you could probably convince people that having > the rotates as LLVM IR instructions would be a benefit. We're not > above changing the language to support good things. :-) I would like to include rotations in my HLL implementation but I am not averse to encoding them myself. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From isanbard at gmail.com Tue Feb 3 18:51:14 2009 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 3 Feb 2009 16:51:14 -0800 Subject: [LLVMdev] rol/ror llvm instruction set In-Reply-To: <20628.49671.qm@web110013.mail.gq1.yahoo.com> References: <16e5fdf90902031617ge95a33ap6981a2bbd23e41e2@mail.gmail.com> <20628.49671.qm@web110013.mail.gq1.yahoo.com> Message-ID: <16e5fdf90902031651v47d72708y7975b11c9dad4dd1@mail.gmail.com> On Tue, Feb 3, 2009 at 4:27 PM, Kasra wrote: > > You could not be more right. However, rotations was not widely implemented on machines when C and C++ was evolved. Python and other high level languages are just too high level (hence, inefficient) to have such semantics. > > The argument sounds fine, however, remember that the number of platforms that don't implement an instruction could not be a point of simile. 
> > Say we have 100 Linux distribution that do not have a certain feature, we can't conclude that that feature is not a common feature. We should consider the mainstream Linux distributions (and even if we may the giant of current desktops Windows). > > Since x86 is about (if I am not wrong) 95% of the desktop PC's we could say rotation is implemented on most of the machines where LLVM will be running. > > I bet out of the people who are reading/following this thread majority are running under x86 any way :D > > So I guess what I am saying is rotations is very wide spread except that there are many machines out there that do not implement it however, the number of machine that do is far grater. > I would not bet against you on that. :-) And, truth be told, it probably wouldn't be too much of a hardship for platforms that don't support rotates to lower them. But it's currently just extra work that we don't need until we have a reason to add the rotate instructions. -bw From schlie at comcast.net Tue Feb 3 21:22:47 2009 From: schlie at comcast.net (Paul Schlie) Date: Tue, 03 Feb 2009 22:22:47 -0500 Subject: [LLVMdev] rol/ror llvm instruction set Message-ID: Dale Johannesen wrote: >On Feb 3, 2009, at 2:35 PMPST, Mike Stump wrote: >> On Feb 3, 2009, at 2:28 PM, Kasra wrote: >>> I was looking around the LLVM instruction set and I failed to find >>> ROL and ROR instructions. Is there any plans on adding these >>> instructions to LLVM? >> >> Not sure what you mean: > > He's referring to the LLVM IR, I think, and it's true that doesn't > have rotates. The LLVM back ends do know about rotate instructions on > targets that have them, though, and the llvm optimizers are pretty > smart about recognizing the usual ways to express rotate with shift/ > and/or, as below. Similarly, ones-complement addition would be helpful (i.e. 
end around carry checksums like Fletcher and such), as it's not apparent, at least to me, how such sums may be otherwise cleanly defined in a efficient target independent manner (especially when the sums may require precision equal to or exceeding the natural operand size of the target machine)? From nicholas at mxc.ca Tue Feb 3 22:35:11 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Tue, 03 Feb 2009 20:35:11 -0800 Subject: [LLVMdev] rol/ror llvm instruction set In-Reply-To: <351260.97163.qm@web110005.mail.gq1.yahoo.com> References: <351260.97163.qm@web110005.mail.gq1.yahoo.com> Message-ID: <49891AFF.9020108@mxc.ca> Kasra wrote: > --- On Tue, 2/3/09, Owen Anderson wrote: > >> From: Owen Anderson >> Subject: Re: [LLVMdev] rol/ror llvm instruction set >> To: kasra_n500 at yahoo.com, "LLVM Developers Mailing List" >> Date: Tuesday, February 3, 2009, 4:20 PM >> On Feb 3, 2009, at 3:54 PM, Kasra wrote: >>> I guess the backends could know about the >> instructions. But I am not convinced why it is beneficial >> not to have ROR and ROL instructions within llvm. >> How would it be beneficial to have them, if we already >> generate them at the target level properly? Adding >> instructions "just because" doesn't seem wise. >> >> -Owen > > If you look at it the way you are it sounds fine. :D > > However, if we have 1 instruction we reduce the amount of time we will spend optomising. No, it's the other way around. Adding rotate means we have two instructions that represent nearly the same thing, and the optimizers will have to match them both. Designing the IR for a compiler is an art. LLVM has always tried to keep the number of instructions minimal, within reason. This means we write less code when working on an optimizer, analyzer, file formats, etc. Unless you can show a strong case that rotate expresses something that is difficult to capture in the current IR, I don't think we're interested in adding it. 
You'd have better luck trying to convince me that we should replace shift with rotate than to suggest we have both. Nick > I argued on my previous post that rotations are implemented on most machine (x86 platform). Thus it seems right to include 1 instruction in llvm and translate it to 3 instruction older architectures that have not implemented the rotation instruction yet. > > I am sure that it is only matter of time before architectures without rotation instruction implementing it. Because of cryptography, it is becoming much more popular about 500% growth over the past decade (AES competitors against SHA-3 competitors). Crypto algorithms really like rotations since it is easily analysed and could be implemented efficiently > > -- Kasra > > > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From baldrick at free.fr Wed Feb 4 00:23:02 2009 From: baldrick at free.fr (Duncan Sands) Date: Wed, 4 Feb 2009 07:23:02 +0100 Subject: [LLVMdev] rol/ror llvm instruction set In-Reply-To: <16e5fdf90902031617ge95a33ap6981a2bbd23e41e2@mail.gmail.com> References: <16e5fdf90902031452i4f10638w5285cd677528f0ac@mail.gmail.com> <368936.96481.qm@web110012.mail.gq1.yahoo.com> <16e5fdf90902031617ge95a33ap6981a2bbd23e41e2@mail.gmail.com> Message-ID: <200902040723.02715.baldrick@free.fr> Hi Bill, > I guess I could ask you the opposite question: What is the benefit of > having these? They would have to be mappable to the source language in > some way. I'm not sure about Ada, but I don't know of a "rotate" > operator for any of the C variants, or any other high-level language.. Ada has rotate. Ciao, Duncan. From zhousheng00 at gmail.com Wed Feb 4 01:33:35 2009 From: zhousheng00 at gmail.com (Zhou Sheng) Date: Wed, 4 Feb 2009 15:33:35 +0800 Subject: [LLVMdev] make TEST=dbgopt donesn't work? 
Message-ID: <8abe0dc60902032333j5705368dt8f4317606bbe997f@mail.gmail.com> Hi, I'm following http://llvm.org/docs/SourceLevelDebugging.html#debugopt to do the dbgopt testing. But seems, there is something wrong with the Makefile, it told me : llvm-gcc sse.expandfft.c -g --emit-llvm -c -o Output/sse.expandfft.bc llvm-gcc: sse.expandfft.c: No such file or directory llvm-gcc: no input files Am I missing something, like the configure option? -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090204/cb8b2766/attachment.html From baldrick at free.fr Wed Feb 4 01:51:40 2009 From: baldrick at free.fr (Duncan Sands) Date: Wed, 4 Feb 2009 08:51:40 +0100 Subject: [LLVMdev] Promoting i1,i8,i16 In-Reply-To: <5BA674C5FF7B384A92C2C95D8CC71E1C78575B@ssanexmb1.amd.com> References: <5BA674C5FF7B384A92C2C95D8CC71E1C785721@ssanexmb1.amd.com> <5BA674C5FF7B384A92C2C95D8CC71E1C78575B@ssanexmb1.amd.com> Message-ID: <200902040851.40281.baldrick@free.fr> Hi, > I want to promote i1 to i32 and not i16 as i32 is my native type and i16 > is emulated, but I need to handle i16 as a special case so I don't want > to promote it. you could make i16 an illegal type and custom lower nodes using i16. Ciao, Duncan. From dpatel at apple.com Wed Feb 4 11:45:21 2009 From: dpatel at apple.com (Devang Patel) Date: Wed, 4 Feb 2009 09:45:21 -0800 Subject: [LLVMdev] make TEST=dbgopt donesn't work? In-Reply-To: <8abe0dc60902032333j5705368dt8f4317606bbe997f@mail.gmail.com> References: <8abe0dc60902032333j5705368dt8f4317606bbe997f@mail.gmail.com> Message-ID: On Feb 3, 2009, at 11:33 PM, Zhou Sheng wrote: > Hi, > > I'm following http://llvm.org/docs/SourceLevelDebugging.html#debugopt > to do the dbgopt testing. 
But seems, there is something wrong with > the Makefile, it told me : > > llvm-gcc sse.expandfft.c -g --emit-llvm -c -o Output/sse.expandfft.bc > llvm-gcc: sse.expandfft.c: No such file or directory > llvm-gcc: no input files > > > Am I missing something, like the configure option? Are you able to run nightly test ? Here is what I see... $ make TEST=dbgopt /Developer/usr/bin//llvm-gcc sse.expandfft.c -g --emit-llvm -c -o Output/sse.expandfft.bc /Volumes/Nanpura/mainline/llvm/Debug/bin/opt Output/sse.expandfft.bc - strip-nondebug -strip-debug -std-compile-opts -strip -f -o Output/ sse.expandfft.t.bc /Volumes/Nanpura/mainline/llvm/Debug/bin/llvm-dis Output/ sse.expandfft.t.bc -f -o Output/sse.expandfft.first.ll /Volumes/Nanpura/mainline/llvm/Debug/bin/opt Output/sse.expandfft.bc - strip-nondebug -std-compile-opts -strip-debug -strip -f -o Output/ sse.expandfft.t.bc /Volumes/Nanpura/mainline/llvm/Debug/bin/llvm-dis Output/ sse.expandfft.t.bc -f -o Output/sse.expandfft.second.ll --------- TEST-PASS: sse.expandfft - Devang -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090204/b31904cc/attachment.html From isanbard at gmail.com Wed Feb 4 13:03:08 2009 From: isanbard at gmail.com (Bill Wendling) Date: Wed, 4 Feb 2009 11:03:08 -0800 Subject: [LLVMdev] rol/ror llvm instruction set In-Reply-To: <200902040723.02715.baldrick@free.fr> References: <16e5fdf90902031452i4f10638w5285cd677528f0ac@mail.gmail.com> <368936.96481.qm@web110012.mail.gq1.yahoo.com> <16e5fdf90902031617ge95a33ap6981a2bbd23e41e2@mail.gmail.com> <200902040723.02715.baldrick@free.fr> Message-ID: <16e5fdf90902041103k1603612dk53da75c2e394b34d@mail.gmail.com> On Tue, Feb 3, 2009 at 10:23 PM, Duncan Sands wrote: > Hi Bill, > >> I guess I could ask you the opposite question: What is the benefit of >> having these? They would have to be mappable to the source language in >> some way. 
I'm not sure about Ada, but I don't know of a "rotate" >> operator for any of the C variants, or any other high-level language.. > > Ada has rotate. > Ada has so much. :-) How do you stand on the "LLVM IR-level rotate instruction" issue? Has it been a pain for you to produce good code without it? -bw From baldrick at free.fr Wed Feb 4 13:07:55 2009 From: baldrick at free.fr (Duncan Sands) Date: Wed, 4 Feb 2009 20:07:55 +0100 Subject: [LLVMdev] rol/ror llvm instruction set In-Reply-To: <16e5fdf90902041103k1603612dk53da75c2e394b34d@mail.gmail.com> References: <16e5fdf90902031452i4f10638w5285cd677528f0ac@mail.gmail.com> <200902040723.02715.baldrick@free.fr> <16e5fdf90902041103k1603612dk53da75c2e394b34d@mail.gmail.com> Message-ID: <200902042007.56278.baldrick@free.fr> > > Ada has rotate. > > > Ada has so much. :-) Too right :) > How do you stand on the "LLVM IR-level rotate > instruction" issue? Has it been a pain for you to produce good code > without it? Nope. In the cases I've looked at (not many!) it produced a rotate machine operation in the final assembler. Ciao, Duncan. From clattner at apple.com Wed Feb 4 13:15:11 2009 From: clattner at apple.com (Chris Lattner) Date: Wed, 4 Feb 2009 11:15:11 -0800 Subject: [LLVMdev] -msse3 can degrade performance In-Reply-To: <200902022300.36741.jon@ffconsultancy.com> References: <200901310143.30492.jon@ffconsultancy.com> <200902022039.52256.jon@ffconsultancy.com> <822C6708-800E-49A2-8910-79666146AC74@apple.com> <200902022300.36741.jon@ffconsultancy.com> Message-ID: On Feb 2, 2009, at 3:00 PM, Jon Harrop wrote: > On Monday 02 February 2009 20:37:47 you wrote: >> On Feb 2, 2009, at 12:39 PM, Jon Harrop wrote: >>> On Monday 02 February 2009 06:10:26 Chris Lattner wrote: >>>> I'm seeing exactly identical .s files with -msse2 and -msse3 on the >>>> scimark version I have. 
Can you please send the output of: >>>> >>>> llvm-gcc -O3 MonteCarlo.c -S -msse2 -o MonteCarlo.2.s >>>> llvm-gcc -O3 MonteCarlo.c -S -msse3 -o MonteCarlo.3.s >>>> >>>> llvm-gcc -O3 MonteCarlo.c -S -msse2 -o MonteCarlo.2.ll -emit-llvm >>>> llvm-gcc -O3 MonteCarlo.c -S -msse3 -o MonteCarlo.3.ll -emit-llvm >>> >>> Can I just check that you had noticed that my timings for those >>> (sse2 vs sse3) >>> were the same and that the difference was occurring between -msse >>> and -msse2 >>> (see below)? >> > The x86 output is attached for those (which give the same results > here too) as > well as -O3 and -O3 -msse which give different results here. Here > are the > performance results I just got when redoing this on x86: > > MonteCarlo: Mflops: 212.20 -O3 > MonteCarlo: Mflops: 211.37 -O3 -msse > MonteCarlo: Mflops: 123.70 -O3 -msse2 > MonteCarlo: Mflops: 127.22 -O3 -msse3 Ok, thanks Jon! I diff'd the files and the -msse2 and -msse3 code is identical, so we're not doing anything wrong with -msse3 :). OTOH, the perf drop from sse -> sse2 is concerning. The difference here is that we do double math in SSE regs instead of FPStack regs. In this case, using the fp stack avoids some cross-class register copying. We could improve the code generator to notice and handle this, I added this note to the x86 backend with some details: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090202/073254.html This is a long-known issue, but a great example of it. > Two other points of interest: > > . I just retimed in x64 and could not reproduce the difference so > this only > afflicts x86 and not x64 as I had said previously. Right, this occurs because of the x86-32 ABI. x86-64 should not be affected. > . 
Pulling the whole benchmark into a single compilation unit changes > the > performance results completely (still x86): > > $ llvm-gcc -O3 -msse3 -lm all.c -o all > $ ./all > Composite Score: 570.07 > FFT Mflops: 599.40 (N=1024) > SOR Mflops: 476.97 (100 x 100) > MonteCarlo: Mflops: 278.17 > Sparse matmult Mflops: 582.54 (N=1000, nz=5000) > LU Mflops: 913.27 (M=100, N=100) > $ gcc -O3 -msse3 -lm all.c -o all > $ ./all > Composite Score: 539.20 > FFT Mflops: 516.05 (N=1024) > SOR Mflops: 472.29 (100 x 100) > MonteCarlo: Mflops: 167.25 > Sparse matmult Mflops: 633.20 (N=1000, nz=5000) > LU Mflops: 907.20 (M=100, N=100) > > Note that llvm-gcc is achieving almost 280MFLOPS on MonteCarlo here, > far > higher than any competitors, and it is outperforming gcc overall. Great! Do you see the same results with LTO? Inlining Random_nextDouble from random.c to MonteCarlo.c should be a big win. -Chris From nipun2512 at gmail.com Wed Feb 4 15:55:19 2009 From: nipun2512 at gmail.com (Nipun Arora) Date: Wed, 4 Feb 2009 16:55:19 -0500 Subject: [LLVMdev] Creating AST Message-ID: Hi all, Does LLVM provide any way to parse and extract the AST from C++ source files? Thanks Nipun -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090204/7dffd376/attachment.html From schlie at comcast.net Tue Feb 3 14:17:49 2009 From: schlie at comcast.net (Paul Schlie) Date: Tue, 03 Feb 2009 15:17:49 -0500 Subject: [LLVMdev] Adding legal integer sizes to TargetData Message-ID: > Now that 2.5 is about to branch, I'd like to bring up one of Scott's > favorite topics: certain optimizers widen or narrow arithmetic, > without regard for whether the type is legal for the target. In his > specific case, instcombine is turning an i32 multiply into an i64 > multiply in order to eliminate a cast. 
This does simplify/reduce the > number of IR operations, but an i64 multiply is dramatically more > expensive than an i32 multiply on CellSPU. > > There are a couple of different ways to look at this. ... It would seem most effective to maintain the minimum required precision for each operand of a transform (these may often differ), and simply let the target code generator selectively widen them as is most efficient, not before (although target-specific attributes may let target-neutral intermediate optimizers be better informed). Maintaining this canonical information lets mapping optimizations often narrow the minimum precision requirements of intermediate operands further, and thereby potentially improve the efficiency of target code generation; this may be particularly useful for targets with smaller native precision and/or multiple-operand vector units. IMHO. (Seemingly a bit late at this stage of the game, but possibly not?) From mrs at apple.com Wed Feb 4 17:29:13 2009 From: mrs at apple.com (Mike Stump) Date: Wed, 4 Feb 2009 15:29:13 -0800 Subject: [LLVMdev] Creating AST In-Reply-To: References: Message-ID: On Feb 4, 2009, at 1:55 PM, Nipun Arora wrote: > Does LLVM provide any way to parse and extract the AST from C++ > source files? Yes, see clang.llvm.org. If you want to do source to source, see the rewriter, otherwise the normal ast builder is fine. From jon at ffconsultancy.com Wed Feb 4 18:19:39 2009 From: jon at ffconsultancy.com (Jon Harrop) Date: Thu, 5 Feb 2009 00:19:39 +0000 Subject: [LLVMdev] IR in XML Message-ID: <200902050019.40002.jon@ffconsultancy.com> Is there a tool to spit LLVM's IR out in a more machine-friendly syntax like XML? -- Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e From mrs at apple.com Wed Feb 4 19:04:06 2009 From: mrs at apple.com (Mike Stump) Date: Wed, 4 Feb 2009 17:04:06 -0800 Subject: [LLVMdev] IR in XML In-Reply-To: <200902050019.40002.jon@ffconsultancy.com> References: <200902050019.40002.jon@ffconsultancy.com> Message-ID: <5D7DEBFC-EAE9-449C-8738-EE9094CDD9C2@apple.com> On Feb 4, 2009, at 4:19 PM, Jon Harrop wrote: > Is there a tool to spit LLVM's IR out in a more machine-friendly > syntax like XML? Nope, but, you could add one. :-) It should be a AST consumer. From me22.ca at gmail.com Wed Feb 4 19:11:51 2009 From: me22.ca at gmail.com (me22) Date: Wed, 4 Feb 2009 20:11:51 -0500 Subject: [LLVMdev] IR in XML In-Reply-To: <200902050019.40002.jon@ffconsultancy.com> References: <200902050019.40002.jon@ffconsultancy.com> Message-ID: On Wed, Feb 4, 2009 at 19:19, Jon Harrop wrote: > > Is there a tool to spit LLVM's IR out in a more machine-friendly syntax like > XML? > It seems like the correct, if unhelpful, answer to that is bitcode, which is far more machine-friendly (by my definition) than XML. What's your eventual goal? From jon at ffconsultancy.com Wed Feb 4 19:32:39 2009 From: jon at ffconsultancy.com (Jon Harrop) Date: Thu, 5 Feb 2009 01:32:39 +0000 Subject: [LLVMdev] IR in XML In-Reply-To: References: <200902050019.40002.jon@ffconsultancy.com> Message-ID: <200902050132.39888.jon@ffconsultancy.com> On Thursday 05 February 2009 01:11:51 me22 wrote: > On Wed, Feb 4, 2009 at 19:19, Jon Harrop wrote: > > Is there a tool to spit LLVM's IR out in a more machine-friendly syntax > > like XML? > > It seems like the correct, if unhelpful, answer to that is bitcode, > which is far more machine-friendly (by my definition) than XML. 
I am toying with the idea of a managed IR that converts LLVM IR into code for my VM that runs in a safe environment, allowing code compiled from any of LLVM's front-ends (C, C++, Fortran) to be run in a managed environment and then interoperated with more easily (and safely) from other managed languages. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From aoeullvm at brinckerhoff.org Wed Feb 4 19:30:46 2009 From: aoeullvm at brinckerhoff.org (John Clements) Date: Wed, 4 Feb 2009 17:30:46 -0800 Subject: [LLVMdev] IR in XML In-Reply-To: References: <200902050019.40002.jon@ffconsultancy.com> Message-ID: <94B21332-2F27-441A-8EBC-FAD0A2329756@brinckerhoff.org> On Feb 4, 2009, at 5:11 PM, me22 wrote: > On Wed, Feb 4, 2009 at 19:19, Jon Harrop > wrote: >> >> Is there a tool to spit LLVM's IR out in a more machine-friendly >> syntax like >> XML? >> > > It seems like the correct, if unhelpful, answer to that is bitcode, > which is far more machine-friendly (by my definition) than XML. > > What's your eventual goal? I'm not the one who asked, but I would love to see something like this, not because it's more machine-friendly, but because it's a representation that most languages can handle. To wit: I'm manipulating IR in Scheme. If there was a two-way map between XML terms and LLVM IR, my life would be much simpler. S-expressions would be even better, but I'm not holding my breath. :) John Clements From nipun2512 at gmail.com Wed Feb 4 20:51:38 2009 From: nipun2512 at gmail.com (Nipun Arora) Date: Wed, 4 Feb 2009 21:51:38 -0500 Subject: [LLVMdev] Creating AST In-Reply-To: References: Message-ID: Hi Mike, Thanks for the response, doesn't Clang only support C and not C++? Thanks Nipun On Wed, Feb 4, 2009 at 6:29 PM, Mike Stump wrote: > On Feb 4, 2009, at 1:55 PM, Nipun Arora wrote: > > Does LLVM provide any way to parse and extract the AST from C++ > > source files? > > Yes, see clang.llvm.org. 
If you want to do source to source, see the > rewriter, otherwise the normal ast builder is fine. > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090204/a1bfddf7/attachment.html From mrs at apple.com Wed Feb 4 21:03:10 2009 From: mrs at apple.com (Mike Stump) Date: Wed, 4 Feb 2009 19:03:10 -0800 Subject: [LLVMdev] Creating AST In-Reply-To: References: Message-ID: <80189F32-A621-474C-A398-20E842162E35@apple.com> On Feb 4, 2009, at 6:51 PM, Nipun Arora wrote: > Thanks for the response, doesn't Clang only support C and not C++? Check out the link I sent, under Current Status. From nipun2512 at gmail.com Wed Feb 4 21:18:10 2009 From: nipun2512 at gmail.com (Nipun Arora) Date: Wed, 4 Feb 2009 22:18:10 -0500 Subject: [LLVMdev] Installations problems CLANG Message-ID: Hi, I was having a little trouble installing clang.... 
while llvm installs properly but clang gives this error on invoking make in Clang make[2]: Leaving directory `/home/na2271/Desktop/llvm-2.3-x/tools/clang/lib/Headers' make[2]: Entering directory `/home/na2271/Desktop/llvm-2.3-x/tools/clang/lib/Basic' llvm[2]: Compiling SourceManager.cpp for Release build SourceManager.cpp: In member function 'void clang::LineTableInfo::clear()': SourceManager.cpp:124: error: 'class llvm::StringMap' has no member named 'clear' SourceManager.cpp: In member function 'const clang::SrcMgr::ContentCache* clang::SourceManager::getOrCreateContentCache(const clang::FileEntry*)': SourceManager.cpp:367: error: no matching function for call to 'llvm::BumpPtrAllocator::Allocate(int, unsigned int&)' SourceManager.cpp: In member function 'const clang::SrcMgr::ContentCache* clang::SourceManager::createMemBufferContentCache(const llvm::MemoryBuffer*)': SourceManager.cpp:382: error: no matching function for call to 'llvm::BumpPtrAllocator::Allocate(int, unsigned int&)' SourceManager.cpp: In function 'void ComputeLineNumbers(clang::SrcMgr::ContentCache*, llvm::BumpPtrAllocator&)': SourceManager.cpp:678: error: no matching function for call to 'llvm::BumpPtrAllocator::Allocate(size_t)' Thanks Nipun -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090204/64ec4d25/attachment.html From kremenek at apple.com Wed Feb 4 21:27:27 2009 From: kremenek at apple.com (Ted Kremenek) Date: Wed, 4 Feb 2009 19:27:27 -0800 Subject: [LLVMdev] Installations problems CLANG In-Reply-To: References: Message-ID: <56405859-FD68-4ED6-B60B-0081B43094C9@apple.com> Hi Nipun, You need to use top-of-tree LLVM (LLVM from SVN). There is no revision of Clang right now that is tied to a specific LLVM release. Ted On Feb 4, 2009, at 7:18 PM, Nipun Arora wrote: > Hi, > > I was having a little trouble installing clang.... 
while llvm > installs properly but clang gives this error on invoking make in Clang > > make[2]: Leaving directory `/home/na2271/Desktop/llvm-2.3-x/tools/ > clang/lib/Headers' > make[2]: Entering directory `/home/na2271/Desktop/llvm-2.3-x/tools/ > clang/lib/Basic' > llvm[2]: Compiling SourceManager.cpp for Release build > SourceManager.cpp: In member function 'void > clang::LineTableInfo::clear()': > SourceManager.cpp:124: error: 'class llvm::StringMap llvm::BumpPtrAllocator>' has no member named 'clear' > SourceManager.cpp: In member function 'const > clang::SrcMgr::ContentCache* > clang::SourceManager::getOrCreateContentCache(const > clang::FileEntry*)': > SourceManager.cpp:367: error: no matching function for call to > 'llvm::BumpPtrAllocator::Allocate(int, unsigned int&)' > SourceManager.cpp: In member function 'const > clang::SrcMgr::ContentCache* > clang::SourceManager::createMemBufferContentCache(const > llvm::MemoryBuffer*)': > SourceManager.cpp:382: error: no matching function for call to > 'llvm::BumpPtrAllocator::Allocate(int, unsigned int&)' > SourceManager.cpp: In function 'void > ComputeLineNumbers(clang::SrcMgr::ContentCache*, > llvm::BumpPtrAllocator&)': > SourceManager.cpp:678: error: no matching function for call to > 'llvm::BumpPtrAllocator::Allocate(size_t)' > > > Thanks > Nipun > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From tonic at nondot.org Wed Feb 4 23:10:12 2009 From: tonic at nondot.org (Tanya Lattner) Date: Wed, 4 Feb 2009 21:10:12 -0800 Subject: [LLVMdev] problems building googletest for 2.5 Message-ID: <8DA45531-786D-40A0-B316-027E0FEA4B3A@nondot.org> Google Test requires these CPP FLAGS "-Wno-missing-field-initializers - Wno-variadic-macros" in order to not output warnings. However, these flags are only available with gcc 4.X. 
We don't want to prevent users from building with gcc 3.X, which is what currently happens (http://llvm.org/PR3487). I've disabled building Google Test in the 2.5 branch. Hopefully someone can fix this issue and we can get it merged into the release branch.

Ideally, configure should detect what version of gcc you have and either use those CPP FLAGS or not (it will output warnings with 3.X). Or, we just disable building Google Test by default.

Can anyone help with this?

Thanks,
Tanya

From mrs at apple.com Wed Feb 4 23:26:28 2009
From: mrs at apple.com (Mike Stump)
Date: Wed, 4 Feb 2009 21:26:28 -0800
Subject: [LLVMdev] problems building googletest for 2.5
In-Reply-To: <8DA45531-786D-40A0-B316-027E0FEA4B3A@nondot.org>
References: <8DA45531-786D-40A0-B316-027E0FEA4B3A@nondot.org>
Message-ID: <56DE3698-E325-4D9F-A37E-724F0EDF8BE5@apple.com>

On Feb 4, 2009, at 9:10 PM, Tanya Lattner wrote:
> Ideally, configure should detect what version of gcc you have and
> either use those CPP FLAGS or not (it will output warnings with
> 3.X). Or, we just disable building Google Test by default.

> Can anyone help with this?

Here is a Makefile fragment that will dynamically test gcc and add flags, if the flag is supported. I'll leave it up to others to consider and/or integrate it and consider if /dev/null is portable enough.

FLAGS := $(shell gcc -Wall -fsyntax-only -xc /dev/null 2>/dev/null && echo -Wall)
CFLAGS := $(shell gcc -Wallme -fsyntax-only -xc /dev/null 2>/dev/null && echo -Wallme)
all:
	@echo flags are $(FLAGS)
	@echo flags are $(CFLAGS)

The downside is that these execute every time the fragment is read. If limited to just a few directories, it should be fine; putting it in Makefile.common would hurt.

From clattner at apple.com Wed Feb 4 23:39:29 2009 From: clattner at apple.com (Chris Lattner) Date: Wed, 4 Feb 2009 21:39:29 -0800 Subject: [LLVMdev] problems building googletest for 2.5 In-Reply-To: <56DE3698-E325-4D9F-A37E-724F0EDF8BE5@apple.com> References: <8DA45531-786D-40A0-B316-027E0FEA4B3A@nondot.org> <56DE3698-E325-4D9F-A37E-724F0EDF8BE5@apple.com> Message-ID: On Feb 4, 2009, at 9:26 PM, Mike Stump wrote: > On Feb 4, 2009, at 9:10 PM, Tanya Lattner wrote: >> Ideally, configure should detect what version of gcc you have and >> either use those CPP FLAGS or not (it will output warnings with >> 3.X). Or, we just disable building Google Test by default. > >> Can anyone help with this? > > Here is a Makefile fragment that will dynamically test gcc and add > flags, if the flag is supported. I'll leave it up to others to > consider and/or integrate it and consider if /dev/null is portable > enough. > > FLAGS := $(shell gcc -Wall -fsyntax-only -xc /dev/null 2>/dev/null && > echo -Wall) > CFLAGS := $(shell gcc -Wallme -fsyntax-only -xc /dev/null 2>/dev/null > && echo -Wallme) > all: > @echo flags are $(FLAGS) > @echo flags are $(CFLAGS) > > The down side, these execute every time the fragment is read. If > limited to just a few directories, it should be fine, Makefile.common > would hurt. Putting it into the one directory (utils/unittest) that needs it should be fine. Can you please test and apply a patch to mainline? Thanks Mike, -Chris From zhousheng at autoesl.com Wed Feb 4 23:59:30 2009 From: zhousheng at autoesl.com (Sheng Zhou) Date: Thu, 05 Feb 2009 13:59:30 +0800 Subject: [LLVMdev] make TEST=dbgopt donesn't work? Message-ID: <498A8042.2050608@gmail.com> > > Are you able to run nightly test ? > Yes, I can run nightly test. > Here is what I see... 
> > $ make TEST=dbgopt > /Developer/usr/bin//llvm-gcc sse.expandfft.c -g --emit-llvm -c -o > Output/sse.expandfft.bc > /Volumes/Nanpura/mainline/llvm/Debug/bin/opt Output/sse.expandfft.bc - > strip-nondebug -strip-debug -std-compile-opts -strip -f -o Output/ > sse.expandfft.t.bc > /Volumes/Nanpura/mainline/llvm/Debug/bin/llvm-dis Output/ > sse.expandfft.t.bc -f -o Output/sse.expandfft.first.ll > /Volumes/Nanpura/mainline/llvm/Debug/bin/opt Output/sse.expandfft.bc - > strip-nondebug -std-compile-opts -strip-debug -strip -f -o Output/ > sse.expandfft.t.bc > /Volumes/Nanpura/mainline/llvm/Debug/bin/llvm-dis Output/ > sse.expandfft.t.bc -f -o Output/sse.expandfft.second.ll > --------- TEST-PASS: sse.expandfft I modified the TEST.dbgopt.Makefile, just adding the full path of the input C/C++ source file, like sse.expandfft.c Now it works for SingleSource directory. But for MultiSource, I see... make[1]: Entering directory `/developer/home2/zsth/projects/llvm.org/build/llvmobj/projects/llvm-test/MultiSource' make[2]: Entering directory `/developer/home2/zsth/projects/llvm.org/build/llvmobj/projects/llvm-test/MultiSource/Applications' make[3]: Entering directory `/developer/home2/zsth/projects/llvm.org/build/llvmobj/projects/llvm-test/MultiSource/Applications/Burg' make[3]: *** No rule to make target `Output/burg.diff', needed by `test.dbgopt.burg'. Stop. 
make[3]: Leaving directory `/developer/home2/zsth/projects/llvm.org/build/llvmobj/projects/llvm-test/MultiSource/Applications/Burg'
make[2]: *** [Burg/.maketest] Error 2
make[2]: Leaving directory `/developer/home2/zsth/projects/llvm.org/build/llvmobj/projects/llvm-test/MultiSource/Applications'
make[1]: *** [Applications/.maketest] Error 2
make[1]: Leaving directory `/developer/home2/zsth/projects/llvm.org/build/llvmobj/projects/llvm-test/MultiSource'
make: *** [MultiSource/.maketest] Error 2

From jon at ffconsultancy.com Thu Feb 5 00:26:08 2009
From: jon at ffconsultancy.com (Jon Harrop)
Date: Thu, 5 Feb 2009 06:26:08 +0000
Subject: [LLVMdev] GEPping GEPs and first-class structs
In-Reply-To: <200902022052.12984.baldrick@free.fr>
References: <200902011338.25003.jon@ffconsultancy.com> <200902022052.12984.baldrick@free.fr>
Message-ID: <200902050626.08845.jon@ffconsultancy.com>

On Monday 02 February 2009 19:52:12 Duncan Sands wrote:
> Hi Jon, check out
>
> http://llvm.org/docs/LangRef.html#aggregateops

Wonderful, thank you. I missed this because it wasn't in the C and OCaml bindings in LLVM 2.4...

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

From jon at ffconsultancy.com Thu Feb 5 00:48:51 2009
From: jon at ffconsultancy.com (Jon Harrop)
Date: Thu, 5 Feb 2009 06:48:51 +0000
Subject: [LLVMdev] C++ ray tracer performance: gcc 4.3.2 vs llvm-gcc 4.2.1
Message-ID: <200902050648.51745.jon@ffconsultancy.com>

On the off chance anyone here is interested in more performance results, I compiled and ran the fastest of the implementations of the ray tracer in C++ from my language comparison:

http://www.ffconsultancy.com/languages/ray_tracer/

This is a small program with a relatively large hot path. Specifically, around 30% of the time is spent in the ray-sphere intersection but another 30% is also spent in the intersect function that recursively traverses a hierarchical scene.
This is on an eight-core 2.1GHz Opteron 2352. Compile time and then run time are shown for each compiler in 32- and 64-bit mode:

x86:

$ time g++ -O3 -msse3 -ffast-math ray.cpp -o ray
real 0m0.770s
$ time ./ray 9 512 >image.pgm
real 0m3.772s

$ time llvm-g++ -O3 -msse3 -ffast-math ray.cpp -o ray
real 0m0.746s
$ time ./ray 9 512 >image.pgm
real 0m3.278s

x64:

$ time g++ -O3 -ffast-math ray.cpp -o ray
real 0m0.774s
$ time ./ray 9 512 >image.pgm
real 0m3.068s

$ llvm-g++ -O3 -ffast-math ray.cpp -o ray
real 0m0.741s
$ time ./ray 9 512 >image.pgm
real 0m3.009s

Note that llvm-gcc is generating faster code than GCC in both cases and, in particular, is 15% faster on x86! In fact, I have tried many different command line options to GCC (including -march=barcelona) and none can beat LLVM.

I find these recent results so compelling that I intend to benchmark larger programs such as FFTW in the future.

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

From romix.llvm at googlemail.com Thu Feb 5 10:08:33 2009
From: romix.llvm at googlemail.com (Roman Levenstein)
Date: Thu, 5 Feb 2009 17:08:33 +0100
Subject: [LLVMdev] LLVM misses some cross-MBB and loop optimizations compared to GCC
Message-ID:

Hi,

While testing my new register allocators on some test-cases, I've noticed that LLVM sometimes misses optimization opportunities:

1) LocalSpiller::RewriteMBB seems not to propagate information about e.g. Spills between MBBs. In many cases where MBB B1 has only one predecessor MBB B2, B1 could reuse the information about the physical registers that are in the live-out set of B2. This could help to e.g. eliminate some useless reloads from spill slots, if the value is already available in the required physical register.

For instance, in the example below, the marked "movl 12(%esp), %ecx" instruction could be eliminated.
.LBB2_2: # bb31
        movl 12(%esp), %ecx
        movl 8(%esp), %eax
        cmpl $0, up+28(%eax,%ecx,4)
        je .LBB2_9 # bb569
.LBB2_3: # bb41 ; <--- bb31 is the only predecessor of bb41
        movl 12(%esp), %ecx ; <--- This could be eliminated!!!
        movl 4(%esp), %eax
        cmpl $0, down(%eax,%ecx,4)
        je .LBB2_9 # bb569

It is also worth mentioning that, as far as I can see, reloads from spill slots are currently not recorded in the Spills set using the addAvailable method. Wouldn't it make sense to do so?

I have the feeling that these improvements are rather easy to achieve and would not require too many changes to the LocalSpiller. Probably, we just need to keep the live-out set of the MBB around after rewriting it, so that its successors can use it in some cases as the initial value for the Spills set. Any opinions?

2) Hoisting of sub-expressions out of loops and replacement of array accesses by pointer-based induction variables is also not optimal in some situations.

In the example mentioned above, both blocks are executed inside a loop enclosing them, and they keep evaluating e.g. the down(%eax,%ecx,4) expression on every iteration. GCC, by contrast, hoists this expression out of the loop and replaces it with a simple pointer, as you can see below:

.LBB2_2:
        movl -32(%ebp), %edx
        movl 28(%edx), %eax
        testl %eax, %eax
        je .L5
.LBB2_3:
        movl -48(%ebp), %eax
        movl (%eax), %edi
        testl %edi, %edi
        je .L5

To make it possible for you to analyze this test-case, I attach the source file, the BC file and the output of the code produced by LLVM and by "GCC -O6".

-Roman
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 8q_speed.c.s
Type: application/octet-stream
Size: 10447 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090205/ae02bf82/attachment-0004.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 8q_speed.s.gcc
Type: application/octet-stream
Size: 12531 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090205/ae02bf82/attachment-0005.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 8q_speed.c.bc
Type: application/octet-stream
Size: 4720 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090205/ae02bf82/attachment-0006.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 8q_speed.c
Type: application/octet-stream
Size: 594 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090205/ae02bf82/attachment-0007.obj

From Micah.Villmow at amd.com Thu Feb 5 10:40:26 2009
From: Micah.Villmow at amd.com (Villmow, Micah)
Date: Thu, 5 Feb 2009 08:40:26 -0800
Subject: [LLVMdev] CallingConv
Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C785ADF@ssanexmb1.amd.com>

As I currently understand callingconv.td, I still need to lower three functions: FORMAL_ARGUMENTS, CALL, and RET. Is there any known way to have LLVM automagically generate code from tablegen without having to custom lower these functions? The reasoning for this is that all registers are virtual in my backend and I have specified for llvm to use its generic dynamic stack allocation. So if I can give llvm a list of registers to use, it should be able to handle these functions for me, correct?

Is this possible, or am I wanting too much?

Thanks,
Micah Villmow
Systems Engineer
Advanced Technology & Performance
Advanced Micro Devices Inc.
S1-609 One AMD Place
Sunnyvale, CA. 94085
P: 408-749-3966

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090205/865d18ba/attachment.html From gohman at apple.com Thu Feb 5 13:02:28 2009 From: gohman at apple.com (Dan Gohman) Date: Thu, 5 Feb 2009 11:02:28 -0800 Subject: [LLVMdev] CallingConv In-Reply-To: <5BA674C5FF7B384A92C2C95D8CC71E1C785ADF@ssanexmb1.amd.com> References: <5BA674C5FF7B384A92C2C95D8CC71E1C785ADF@ssanexmb1.amd.com> Message-ID: <6FA079C3-FDEA-4FFA-A96B-F78F93BE12CE@apple.com> No, the current tablegen CallingConv infrastructure isn't yet able to do that. I agree that this seems like something it should be able to do though. Patches would be welcome :-). Dan On Feb 5, 2009, at 8:40 AM, Villmow, Micah wrote: > Currently with my understanding of using callingconv.td I still need > to lower three functions, FORMAL_ARGUMENTS, CALL, and RET. Is there > any known way to have LLVM automagically generate code from tablegen > without having to custom lower these functions? The reasoning for > this is that all registers are virtual in my backend and I have > specified for llvm to use it?s generic dynamic stack allocation. So > if I can give llvm a list of registers to use, it should be able to > handle these functions for me, correct? > > Is this possible, or am I wanting to much? > > Thanks, > Micah Villmow > Systems Engineer > Advanced Technology & Performance > Advanced Micro Devices Inc. > S1-609 One AMD Place > Sunnyvale, CA. 94085 > P: 408-749-3966 > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From gohman at apple.com Thu Feb 5 13:07:30 2009 From: gohman at apple.com (Dan Gohman) Date: Thu, 5 Feb 2009 11:07:30 -0800 Subject: [LLVMdev] make TEST=dbgopt donesn't work? 
In-Reply-To: <8abe0dc60902032333j5705368dt8f4317606bbe997f@mail.gmail.com> References: <8abe0dc60902032333j5705368dt8f4317606bbe997f@mail.gmail.com> Message-ID: This is a long-standing quirk of the test-suite Makefiles. When you run the main llvm configure script, you need to have an llvm-gcc in your path. Building and testing llvm on a platform that doesn't already have an llvm-gcc seems to require this sequence: build llvm build and install llvm-gcc rerun llvm's configure script, with llvm-gcc in PATH build test-suite Dan On Feb 3, 2009, at 11:33 PM, Zhou Sheng wrote: > Hi, > > I'm following http://llvm.org/docs/SourceLevelDebugging.html#debugopt > to do the dbgopt testing. But seems, there is something wrong with > the Makefile, it told me : > > llvm-gcc sse.expandfft.c -g --emit-llvm -c -o Output/sse.expandfft.bc > llvm-gcc: sse.expandfft.c: No such file or directory > llvm-gcc: no input files > > > Am I missing something, like the configure option? > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From jon at ffconsultancy.com Thu Feb 5 13:22:14 2009 From: jon at ffconsultancy.com (Jon Harrop) Date: Thu, 5 Feb 2009 19:22:14 +0000 Subject: [LLVMdev] GEPping GEPs and first-class structs In-Reply-To: <796EE1FB-ECE7-4F14-A794-6454DD681EFE@apple.com> References: <200902011338.25003.jon@ffconsultancy.com> <796EE1FB-ECE7-4F14-A794-6454DD681EFE@apple.com> Message-ID: <200902051922.14719.jon@ffconsultancy.com> On Monday 02 February 2009 19:25:46 Chris Lattner wrote: > On Feb 1, 2009, at 5:38 AM, Jon Harrop wrote: > > As I understand it, first-class structs will allow structs to be > > passed as > > first-class structs already exist. :) Hmm, I cannot get them to work. I suspect the problem is somewhere between OCaml and their implementation within LLVM because I am not even seeing the instructions when I visualize my function. 
I have augmented "bindings/ocaml/llvm/llvm.ml*", "include/llvm-c/Core.h" and "lib/VMCore/Core.cpp" with functions to handle InsertValue and ExtractValue.

Any ideas? Maybe I should leave this until 2.5...

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

From Micah.Villmow at amd.com Thu Feb 5 13:47:50 2009
From: Micah.Villmow at amd.com (Villmow, Micah)
Date: Thu, 5 Feb 2009 11:47:50 -0800
Subject: [LLVMdev] 16 bit floats
Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C785B38@ssanexmb1.amd.com>

I need to support 16-bit floats for some operations. Outside of datatypes.td and the constants class, is there anything else I will need to modify to add f16 support?

Thanks,
Micah Villmow
Systems Engineer
Advanced Technology & Performance
Advanced Micro Devices Inc.
S1-609 One AMD Place
Sunnyvale, CA. 94085
P: 408-749-3966

From dekruijf at cs.wisc.edu Thu Feb 5 14:29:21 2009
From: dekruijf at cs.wisc.edu (Marc de Kruijf)
Date: Thu, 5 Feb 2009 14:29:21 -0600
Subject: [LLVMdev] Linking with OpenMP support
Message-ID:

I'm trying to compile and link an x86 assembly file with OpenMP calls using llvm-gcc 4.2.1 and I get the following errors:

/afs/cs.wisc.edu/p/vertical/tools/llvm-gcc-4.2/bin/../lib/gcc/i686-pc-linux-gnu/4.2.1/../../../libgomp.a(team.o): In function `gomp_team_start':
(.text+0x15a): undefined reference to `__sync_bool_compare_and_swap_4'
/afs/cs.wisc.edu/p/vertical/tools/llvm-gcc-4.2/bin/../lib/gcc/i686-pc-linux-gnu/4.2.1/../../../libgomp.a(team.o): In function `gomp_team_start':
(.text+0x192): undefined reference to `__sync_lock_test_and_set_4'
/afs/cs.wisc.edu/p/vertical/tools/llvm-gcc-4.2/bin/../lib/gcc/i686-pc-linux-gnu/4.2.1/../../../libgomp.a(team.o): In function `gomp_team_start':
(.text+0x31f): undefined reference to `__sync_bool_compare_and_swap_4'
/afs/cs.wisc.edu/p/vertical/tools/llvm-gcc-4.2/bin/../lib/gcc/i686-pc-linux-gnu/4.2.1/../../../libgomp.a(team.o): In function `gomp_team_start':
(.text+0x357): undefined reference to `__sync_lock_test_and_set_4'
/afs/cs.wisc.edu/p/vertical/tools/llvm-gcc-4.2/bin/../lib/gcc/i686-pc-linux-gnu/4.2.1/../../../libgomp.a(team.o): In function `gomp_team_end':
(.text+0x48d): undefined reference to `__sync_bool_compare_and_swap_4'
/afs/cs.wisc.edu/p/vertical/tools/llvm-gcc-4.2/bin/../lib/gcc/i686-pc-linux-gnu/4.2.1/../../../libgomp.a(mutex.o): In function `gomp_mutex_lock_slow':
(.text+0x20): undefined reference to `__sync_val_compare_and_swap_4'
/afs/cs.wisc.edu/p/vertical/tools/llvm-gcc-4.2/bin/../lib/gcc/i686-pc-linux-gnu/4.2.1/../../../libgomp.a(mutex.o): In function `gomp_mutex_lock_slow':
(.text+0x4e): undefined reference to `__sync_bool_compare_and_swap_4'
/afs/cs.wisc.edu/p/vertical/tools/llvm-gcc-4.2/bin/../lib/gcc/i686-pc-linux-gnu/4.2.1/../../../libgomp.a(bar.o): In function `gomp_barrier_wait_end':
(.text+0x2e): undefined reference to `__sync_lock_test_and_set_4'
/afs/cs.wisc.edu/p/vertical/tools/llvm-gcc-4.2/bin/../lib/gcc/i686-pc-linux-gnu/4.2.1/../../../libgomp.a(bar.o): In function `gomp_barrier_wait_end':
(.text+0x6c): undefined reference to `__sync_add_and_fetch_4'
/afs/cs.wisc.edu/p/vertical/tools/llvm-gcc-4.2/bin/../lib/gcc/i686-pc-linux-gnu/4.2.1/../../../libgomp.a(bar.o): In function `gomp_barrier_wait_end':
(.text+0x83): undefined reference to `__sync_lock_test_and_set_4'
/afs/cs.wisc.edu/p/vertical/tools/llvm-gcc-4.2/bin/../lib/gcc/i686-pc-linux-gnu/4.2.1/../../../libgomp.a(bar.o): In function `gomp_barrier_wait':
(.text+0xde): undefined reference to `__sync_bool_compare_and_swap_4'
collect2: ld returned 1 exit status

I don't get this problem when using gcc 4.2.1. Is this not supported then? Any ideas?

Marc
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090205/36e289b3/attachment.html From mrs at apple.com Thu Feb 5 14:47:34 2009 From: mrs at apple.com (Mike Stump) Date: Thu, 5 Feb 2009 12:47:34 -0800 Subject: [LLVMdev] problems building googletest for 2.5 In-Reply-To: References: <8DA45531-786D-40A0-B316-027E0FEA4B3A@nondot.org> <56DE3698-E325-4D9F-A37E-724F0EDF8BE5@apple.com> Message-ID: On Feb 4, 2009, at 9:39 PM, Chris Lattner wrote: > Putting it into the one directory (utils/unittest) that needs it > should be fine. Can you please test and apply a patch to mainline? Sure, but I'm skeptical this will actually help gcc 3.X. I checked in the code, and tested on a gcc 4.2 system. Someone else would have to try 3.X. From cr88192 at hotmail.com Thu Feb 5 14:51:03 2009 From: cr88192 at hotmail.com (BGB) Date: Fri, 6 Feb 2009 06:51:03 +1000 Subject: [LLVMdev] 16 bit floats References: <5BA674C5FF7B384A92C2C95D8CC71E1C785B38@ssanexmb1.amd.com> Message-ID: ----- Original Message ----- From: Villmow, Micah To: LLVM Developers Mailing List Sent: Friday, February 06, 2009 5:47 AM Subject: [LLVMdev] 16 bit floats I need to support 16 bit floats for some operations, outside of datatypes.td and the constants class, is there anything else I will need to modify to add f16 support? probably also code generation (can't give specifics, no real expert on the LLVM codebase). this would be because, even if the core typesystem knows of the type, the codegen might not know how to emit operations on that type. now, of note: in my project (not LLVM based), float16 had not been supported directly (since it is not known to the CPU), rather, some loader and saver thunks were used which converted to/from float32 (this used as the 'internal' representation of the type). in most cases, I would think this would be faster than directly operating on the float16, since the CPU supports float32, but float16 would have to be emulated. 
(unless of course newer CPUs are adding native float16 support or similar?...). Thanks, Micah Villmow Systems Engineer Advanced Technology & Performance Advanced Micro Devices Inc. S1-609 One AMD Place Sunnyvale, CA. 94085 P: 408-749-3966 ------------------------------------------------------------------------------ _______________________________________________ LLVM Developers mailing list LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090206/3e747c48/attachment.html From clattner at apple.com Thu Feb 5 14:53:06 2009 From: clattner at apple.com (Chris Lattner) Date: Thu, 5 Feb 2009 12:53:06 -0800 Subject: [LLVMdev] 16 bit floats In-Reply-To: References: <5BA674C5FF7B384A92C2C95D8CC71E1C785B38@ssanexmb1.amd.com> Message-ID: <8457B2F7-743D-47B1-867E-E5725B6BF9D6@apple.com> On Feb 5, 2009, at 12:51 PM, BGB wrote: > > ----- Original Message ----- > From: Villmow, Micah > To: LLVM Developers Mailing List > Sent: Friday, February 06, 2009 5:47 AM > Subject: [LLVMdev] 16 bit floats > > I need to support 16 bit floats for some operations, outside of > datatypes.td and the constants class, is there anything else I will > need to modify to add f16 support? > > probably also code generation (can't give specifics, no real expert > on the LLVM codebase). > this would be because, even if the core typesystem knows of the > type, the codegen might not know how to emit operations on that type. > > now, of note: > in my project (not LLVM based), float16 had not been supported > directly (since it is not known to the CPU), rather, some loader and > saver thunks were used which converted to/from float32 (this used as > the 'internal' representation of the type). 
in most cases, I would
> think this would be faster than directly operating on the float16,
> since the CPU supports float32, but float16 would have to be emulated.
>
> (unless of course newer CPUs are adding native float16 support or
> similar?...).
>

Right. Micah, does your CPU support float16 operations like add/sub etc natively?

-Chris

From Micah.Villmow at amd.com Thu Feb 5 15:34:36 2009
From: Micah.Villmow at amd.com (Villmow, Micah)
Date: Thu, 5 Feb 2009 13:34:36 -0800
Subject: [LLVMdev] 16 bit floats
In-Reply-To: <8457B2F7-743D-47B1-867E-E5725B6BF9D6@apple.com>
References: <5BA674C5FF7B384A92C2C95D8CC71E1C785B38@ssanexmb1.amd.com> <8457B2F7-743D-47B1-867E-E5725B6BF9D6@apple.com>
Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C785B65@ssanexmb1.amd.com>

BGB/Chris,

I need to do something similar, where I convert the 16-bit floats to 32-bit floats on memory operations for both scalar and vector formats. So can these operations be implemented without adding 16-bit float support natively to LLVM? If so, how?

________________________________
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Chris Lattner
Sent: Thursday, February 05, 2009 12:53 PM
To: LLVM Developers Mailing List
Subject: Re: [LLVMdev] 16 bit floats

On Feb 5, 2009, at 12:51 PM, BGB wrote:

----- Original Message -----
From: Villmow, Micah
To: LLVM Developers Mailing List
Sent: Friday, February 06, 2009 5:47 AM
Subject: [LLVMdev] 16 bit floats

I need to support 16 bit floats for some operations, outside of datatypes.td and the constants class, is there anything else I will need to modify to add f16 support?

probably also code generation (can't give specifics, no real expert on the LLVM codebase).
this would be because, even if the core typesystem knows of the type, the codegen might not know how to emit operations on that type.

now, of note:
in my project (not LLVM based), float16 had not been supported directly (since it is not known to the CPU), rather, some loader and saver thunks were used which converted to/from float32 (this used as the 'internal' representation of the type). in most cases, I would think this would be faster than directly operating on the float16, since the CPU supports float32, but float16 would have to be emulated.

(unless of course newer CPUs are adding native float16 support or similar?...).

Right. Micah, does your CPU support float16 operations like add/sub etc natively?

-Chris

From eli.friedman at gmail.com Thu Feb 5 15:53:53 2009
From: eli.friedman at gmail.com (Eli Friedman)
Date: Thu, 5 Feb 2009 13:53:53 -0800
Subject: [LLVMdev] 16 bit floats
In-Reply-To: <5BA674C5FF7B384A92C2C95D8CC71E1C785B65@ssanexmb1.amd.com>
References: <5BA674C5FF7B384A92C2C95D8CC71E1C785B38@ssanexmb1.amd.com> <8457B2F7-743D-47B1-867E-E5725B6BF9D6@apple.com> <5BA674C5FF7B384A92C2C95D8CC71E1C785B65@ssanexmb1.amd.com>
Message-ID:

On Thu, Feb 5, 2009 at 1:34 PM, Villmow, Micah wrote:
> I need to do a similar where I convert the 16bit floats to 32bit floats on
> memory operations for both scalar and vector formats. So can these
> operations be implemented without adding 16 bit float support natively to
> LLVM? If so, how?

In this case, you only really need two currently unsupported instructions: one that does f16->f32, and one that does f32->f16; adding target intrinsics to do that should be easy. You can make the intrinsics take an i16 so that the type system doesn't have to be aware of f16 values.
-Eli

From clattner at apple.com Thu Feb 5 16:33:40 2009
From: clattner at apple.com (Chris Lattner)
Date: Thu, 5 Feb 2009 14:33:40 -0800
Subject: [LLVMdev] 16 bit floats
In-Reply-To: <5BA674C5FF7B384A92C2C95D8CC71E1C785B65@ssanexmb1.amd.com>
References: <5BA674C5FF7B384A92C2C95D8CC71E1C785B38@ssanexmb1.amd.com> <8457B2F7-743D-47B1-867E-E5725B6BF9D6@apple.com> <5BA674C5FF7B384A92C2C95D8CC71E1C785B65@ssanexmb1.amd.com>
Message-ID: <0EC43A9F-F673-4CE0-B41A-6DB9DDC00B55@apple.com>

On Feb 5, 2009, at 1:34 PM, Villmow, Micah wrote:

> BGB/Chris,
> I need to do a similar where I convert the 16bit floats to 32bit
> floats on memory operations for both scalar and vector formats. So
> can these operations be implemented without adding 16 bit float
> support natively to LLVM? If so, how?

Just codegen them as i16 in LLVM IR, and use a library function to convert the i16 into a 32-bit float doing the necessary unpacking. Similarly for store.

-Chris

From Micah.Villmow at amd.com Thu Feb 5 16:33:22 2009
From: Micah.Villmow at amd.com (Villmow, Micah)
Date: Thu, 5 Feb 2009 14:33:22 -0800
Subject: [LLVMdev] 16 bit floats
In-Reply-To:
References: <5BA674C5FF7B384A92C2C95D8CC71E1C785B38@ssanexmb1.amd.com><8457B2F7-743D-47B1-867E-E5725B6BF9D6@apple.com><5BA674C5FF7B384A92C2C95D8CC71E1C785B65@ssanexmb1.amd.com>
Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C785B90@ssanexmb1.amd.com>

Eli,

This is similar to what I was originally thinking, but I also need to support i16 data type and conversions between it and floating point values. So would there be a way for me to distinguish between a half and a short?

For example, I have the

short a = load_from_memory(short_ptr, index);

and

half a = load_from_memory(half_ptr, index);

if I force it to use i16 wouldn't the function be the same in the IR?
i.e.
declare i16 @load_from_memory(i16*, i32)?

Thanks,
Micah

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Eli Friedman
Sent: Thursday, February 05, 2009 1:54 PM
To: LLVM Developers Mailing List
Subject: Re: [LLVMdev] 16 bit floats

On Thu, Feb 5, 2009 at 1:34 PM, Villmow, Micah wrote:
> I need to do a similar where I convert the 16bit floats to 32bit floats on
> memory operations for both scalar and vector formats. So can these
> operations be implemented without adding 16 bit float support natively to
> LLVM? If so, how?

In this case, you only really need two currently unsupported instructions: one that does f16->f32, and one that does f32->f16; adding target intrinsics to do that should be easy. You can make the intrinsics take an i16 so that the type system doesn't have to be aware of f16 values.

-Eli
_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From Micah.Villmow at amd.com Thu Feb 5 16:38:09 2009
From: Micah.Villmow at amd.com (Villmow, Micah)
Date: Thu, 5 Feb 2009 14:38:09 -0800
Subject: [LLVMdev] 16 bit floats
In-Reply-To: <0EC43A9F-F673-4CE0-B41A-6DB9DDC00B55@apple.com>
References: <5BA674C5FF7B384A92C2C95D8CC71E1C785B38@ssanexmb1.amd.com><8457B2F7-743D-47B1-867E-E5725B6BF9D6@apple.com><5BA674C5FF7B384A92C2C95D8CC71E1C785B65@ssanexmb1.amd.com> <0EC43A9F-F673-4CE0-B41A-6DB9DDC00B55@apple.com>
Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C785B92@ssanexmb1.amd.com>

Thanks for the clarification. This is probably what I will end up doing.
________________________________ From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Chris Lattner Sent: Thursday, February 05, 2009 2:34 PM To: LLVM Developers Mailing List Subject: Re: [LLVMdev] 16 bit floats On Feb 5, 2009, at 1:34 PM, Villmow, Micah wrote: BGB/Chris, I need to do a similar where I convert the 16bit floats to 32bit floats on memory operations for both scalar and vector formats. So can these operations be implemented without adding 16 bit float support natively to LLVM? If so, how? Just codegen them as i16 in LLVM IR, and use a library function to convert the i16 into a 32-bit float doing the necessary unpacking. Similarly for store. -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090205/6344154b/attachment.html From jon at ffconsultancy.com Thu Feb 5 17:42:58 2009 From: jon at ffconsultancy.com (Jon Harrop) Date: Thu, 5 Feb 2009 23:42:58 +0000 Subject: [LLVMdev] First class function pointers Message-ID: <200902052342.58602.jon@ffconsultancy.com> Unless I am mistaken, LLVM barfs if I try to pass an LLVM function to another function as an argument because functions are second class. However, if I const bitcast the LLVM function to a function pointer when it is defined then I can use that as a first-class function pointer. In particular, I can invoke it directly by emitting a "call" without having to cast it back beforehand. Is that correct and, if so, why are functions not handled as first-class function pointers by default? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. 
http://www.ffconsultancy.com/?e

From dag at cray.com Thu Feb 5 19:30:19 2009
From: dag at cray.com (David Greene)
Date: Thu, 5 Feb 2009 19:30:19 -0600
Subject: [LLVMdev] undefs in phis
In-Reply-To: <43E485F3-6440-4F76-9E8B-D1BE97D9C937@apple.com>
References: <200901291647.26270.dag@cray.com> <43E485F3-6440-4F76-9E8B-D1BE97D9C937@apple.com>
Message-ID: <200902051930.19876.dag@cray.com>

On Monday 02 February 2009 23:55, Evan Cheng wrote:

> >> Think about what will happen the 2nd iteration. %v1177 will have
> >> the value of
> >> %v1645 which is wrong. This is because %v1176 in bb74 will be
> >> replaced with
> >> %v1177. That's incorrect.
> >
> > Ok, right. The trick to fixing is to make sure the valno of the def of
> > v1177 hasPHIKill to true and make sure the coalescer checks it.

What does hasPHIKill mean, what are the consequences of using it and how do I know when I can set it? I assume this would have been set by someone. Or is that part of the bug?

> Actually liveintervals can construct a v1177 live range starting from
> the beginning mbb with a val# of unknown def.

What's "the beginning mbb?" The start of the bb containing the phi (bb74 in this case)? Again, how would I instruct liveintervals to do this? The IMPLICIT_DEF is already gone by that point.
-Dave From eli.friedman at gmail.com Thu Feb 5 19:47:40 2009 From: eli.friedman at gmail.com (Eli Friedman) Date: Thu, 5 Feb 2009 17:47:40 -0800 Subject: [LLVMdev] 16 bit floats In-Reply-To: <5BA674C5FF7B384A92C2C95D8CC71E1C785B90@ssanexmb1.amd.com> References: <5BA674C5FF7B384A92C2C95D8CC71E1C785B38@ssanexmb1.amd.com> <8457B2F7-743D-47B1-867E-E5725B6BF9D6@apple.com> <5BA674C5FF7B384A92C2C95D8CC71E1C785B65@ssanexmb1.amd.com> <5BA674C5FF7B384A92C2C95D8CC71E1C785B90@ssanexmb1.amd.com> Message-ID: On Thu, Feb 5, 2009 at 2:33 PM, Villmow, Micah wrote: > short a = load_from_memory(short_ptr, index); > and > half a = load_from_memory(half_ptr, index); I was suggesting something more like the following pseudo-C: short a = load_from_memory(short_ptr, index); float a = convert_f16_f32(load_from_memory(half_ptr, index)); -Eli From evan.cheng at apple.com Thu Feb 5 19:55:41 2009 From: evan.cheng at apple.com (Evan Cheng) Date: Thu, 5 Feb 2009 17:55:41 -0800 Subject: [LLVMdev] undefs in phis In-Reply-To: <200902051930.19876.dag@cray.com> References: <200901291647.26270.dag@cray.com> <43E485F3-6440-4F76-9E8B-D1BE97D9C937@apple.com> <200902051930.19876.dag@cray.com> Message-ID: <60412196-95B8-48A9-ADE6-05EC895413E6@apple.com> On Feb 5, 2009, at 5:30 PM, David Greene wrote: > On Monday 02 February 2009 23:55, Evan Cheng wrote: > >>>> Think about what will happen the 2nd iteration. %v1177 will have >>>> the value of >>>> %v1645 which is wrong. This is because %v1176 in bb74 will be >>>> replaced with >>>> %v1177. That's incorrect. >>> >>> Ok, right. The trick to fixing is to make sure the valno of the >>> def of >>> v1177 hasPHIKill to true and make sure the coalescer checks it. > > What does hasPHIKill mean, what are the consequences of using it and > how do I > know when I can set it? I assume this would have been set by > someone. Or is > that part of the bug? hasPHIKill just means it has a phi use so it's not possible to determine where the value is killed. 
Look for LiveIntervalAnalysis.cpp.

>
>
>> Actually liveintervals can construct a v1177 live range starting from
>> the beginning mbb with a val# of unknown def.
>
> What's "the beginning mbb?" The start of the bb containing the phi
> (bb74 in
> this case)? Again, how would I instruct liveintervals to do this?
> The
> IMPLICIT_DEF is already gone by that point.

%v1645 = ...
loop:
  %v1176 = %v1645
  ...
  = %v1176
  = %v1177
  %v1645 = op ...
  %v1177 = %v1176
  jmp loop

When the live interval for v1177 is created, the code should recognize v1177 is live-in to the MBB. So even though it's not clear where it's defined, it's still clear it's live-in to the MBB.

Evan

>
>
> -Dave

From dag at cray.com Thu Feb 5 20:45:36 2009
From: dag at cray.com (David Greene)
Date: Thu, 5 Feb 2009 20:45:36 -0600
Subject: [LLVMdev] undefs in phis
In-Reply-To: <60412196-95B8-48A9-ADE6-05EC895413E6@apple.com>
References: <200901291647.26270.dag@cray.com> <200902051930.19876.dag@cray.com> <60412196-95B8-48A9-ADE6-05EC895413E6@apple.com>
Message-ID: <200902052045.37234.dag@cray.com>

On Thursday 05 February 2009 19:55, Evan Cheng wrote:

> hasPHIKill just means it has a phi use so it's not possible to
> determine where the value is killed. Look for LiveIntervalAnalysis.cpp.

Ok.

> %v1645 = ...
> loop:
>   %v1176 = %v1645
>   ...
>   = %v1176
>   = %v1177
>   %v1645 = op ...
>   %v1177 = %v1176
>   jmp loop
>
> When the live interval for v1177 is created, the code should recognize
> v1177 is live-in to the MBB. So even though it's not clear where it's
> defined, it's still clear it's live-in to the MBB.

Yeah, ok, that should work. Thanks!
-Dave From clattner at apple.com Thu Feb 5 21:33:57 2009 From: clattner at apple.com (Chris Lattner) Date: Thu, 5 Feb 2009 19:33:57 -0800 Subject: [LLVMdev] First class function pointers In-Reply-To: <200902052342.58602.jon@ffconsultancy.com> References: <200902052342.58602.jon@ffconsultancy.com> Message-ID: <8D70503B-12DA-4B60-AA17-F259E53F6FE4@apple.com> On Feb 5, 2009, at 3:42 PM, Jon Harrop wrote: > > Unless I am mistaken, LLVM barfs if I try to pass an LLVM function > to another > function as an argument because functions are second class. Huh? can you give an example as llvm IR? You can certainly pass functions by-value as a function pointer. -Chris From jon at ffconsultancy.com Thu Feb 5 22:32:16 2009 From: jon at ffconsultancy.com (Jon Harrop) Date: Fri, 6 Feb 2009 04:32:16 +0000 Subject: [LLVMdev] First class function pointers In-Reply-To: <8D70503B-12DA-4B60-AA17-F259E53F6FE4@apple.com> References: <200902052342.58602.jon@ffconsultancy.com> <8D70503B-12DA-4B60-AA17-F259E53F6FE4@apple.com> Message-ID: <200902060432.16635.jon@ffconsultancy.com> On Friday 06 February 2009 03:33:57 Chris Lattner wrote: > On Feb 5, 2009, at 3:42 PM, Jon Harrop wrote: > > Unless I am mistaken, LLVM barfs if I try to pass an LLVM function > > to another > > function as an argument because functions are second class. > > Huh? can you give an example as llvm IR? You can certainly pass > functions by-value as a function pointer. Sorry, I seem to have confused myself. I was const bitcasting the function to a function pointer but that is redundant, presumably because the value returned when you define a function already represents a function pointer. Anyway, I still do not understand why functions are not listed as first-class types in the documentation if values of the "function" type can be produced by instructions, passed as arguments and used as operands to instructions? 
You may be interested in the IR anyway because it is a test of tail calls in LLVM:

define fastcc i32 @even(i32 (i32)*, i32) {
entry:
  %2 = add i32 %1, 1 ; [#uses=1]
  %3 = tail call fastcc i32 %0(i32 %2) ; [#uses=1]
  ret i32 %3
}

define fastcc i32 @odd(i32) {
entry:
  %1 = call i32 (i8*, ...)* @printf(i8* getelementptr ([4 x i8]* @buf, i32 0, i32 0), i32 %0) ; [#uses=0]
  %2 = icmp slt i32 %0, 1000000 ; [#uses=1]
  br i1 %2, label %pass, label %fail

fail: ; preds = %entry
  ret i32 0

pass: ; preds = %entry
  %3 = add i32 %0, 1 ; [#uses=1]
  %4 = tail call fastcc i32 @even(i32 (i32)* @odd, i32 %3) ; [#uses=1]
  ret i32 %4
}

declare i32 @printf(i8*, ...)

define i32 @eval() {
entry:
  %0 = call fastcc i32 @even(i32 (i32)* @odd, i32 0) ; [#uses=0]
  %1 = call i32 (i8*, ...)* @printf(i8* getelementptr ([3 x i8]* @buf1, i32 0, i32 0)) ; [#uses=0]
  %2 = call i32 (i8*, ...)* @printf(i8* getelementptr ([2 x i8]* @buf2, i32 0, i32 0)) ; [#uses=0]
  ret i32 0
}

Note that the branch location that "even" jumps to is dynamic, being passed in as an argument. LLVM passes this test but other VMs (such as Mono) fail.

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

From clattner at apple.com Thu Feb 5 22:37:41 2009
From: clattner at apple.com (Chris Lattner)
Date: Thu, 5 Feb 2009 20:37:41 -0800
Subject: [LLVMdev] First class function pointers
In-Reply-To: <200902060432.16635.jon@ffconsultancy.com>
References: <200902052342.58602.jon@ffconsultancy.com> <8D70503B-12DA-4B60-AA17-F259E53F6FE4@apple.com> <200902060432.16635.jon@ffconsultancy.com>
Message-ID:

On Feb 5, 2009, at 8:32 PM, Jon Harrop wrote:

> On Friday 06 February 2009 03:33:57 Chris Lattner wrote:
>> On Feb 5, 2009, at 3:42 PM, Jon Harrop wrote:
>>> Unless I am mistaken, LLVM barfs if I try to pass an LLVM function
>>> to another
>>> function as an argument because functions are second class.
>>
>> Huh? can you give an example as llvm IR? You can certainly pass
>> functions by-value as a function pointer.
> > Sorry, I seem to have confused myself. I was const bitcasting the > function to > a function pointer but that is redundant, presumably because the value > returned when you define a function already represents a function > pointer. > > Anyway, I still do not understand why functions are not listed as > first-class > types in the documentation if values of the "function" type can be > produced > by instructions, passed as arguments and used as operands to > instructions? "function types" are not first class values because you can't get one, store one, or do anything with one. All functions are referred to via pointer, and all pointer types are first class, including function pointers. If you refer to a function with @foo, you're getting a function pointer, not the function itself. -Chris From echeng at apple.com Fri Feb 6 00:40:50 2009 From: echeng at apple.com (Evan Cheng) Date: Thu, 5 Feb 2009 22:40:50 -0800 Subject: [LLVMdev] LLVM misses some cross-MBB and loop optimizations compared to GCC In-Reply-To: References: Message-ID: <568BBB9C-82E3-4C9B-91B3-EBCBBE3DF218@apple.com> Thanks. Can you file bugzilla reports? I'll look at the first one soon. Evan On Feb 5, 2009, at 8:08 AM, Roman Levenstein wrote: > Hi, > > While testing my new register allocators on some test-cases, I've > noticed that LLVM misses sometimes some optimization opportunities: > > 1) LocalSpiller::RewriteMBB seems not to propagate the information > about e.g. Spills between MBBs.In many cases, where MBB B1 has only > one predecessor MBB B2, B1 could reuse the information about the > physical registers that are in the live-out set of B2. This could help > to e.g. eliminate some useless reloads from spill slots, if the value > is available on the required physical register already. For example, > in the example below, the marked "movl 12(%esp), %ecx" instruction > could be eliminated. 
> > .LBB2_2: # bb31 > movl 12(%esp), %ecx > movl 8(%esp), %eax > cmpl $0, up+28(%eax,%ecx,4) > je .LBB2_9 # bb569 > .LBB2_3: # bb41 ; <--- bb31 is the only predecessor > of bb41 > movl 12(%esp), %ecx ; <--- This could be eliminated!!! > movl 4(%esp), %eax > cmpl $0, down(%eax,%ecx,4) > je .LBB2_9 # bb569 > > > It is also worth mentioning, that currently reloads from spill slots > are not recorded in the Spills set using the addAvailable method, as > far as I can see. Wouldn't it make sense? > > I have the feeling that these improvements are rather easy to achieve > and would not require too much changes to the LocalSpiller. Probably, > we just need to keep the live-out set of the MBB around after > rewriting it, so that its successors can use it in some cases as > initial value for the Spills set. > > Any opinions? > > 2) Moving of sub-expressions from loops and replacement of array > accesses via pointer-based induction variables is also not optimal in > some situations. > In the example mentioned above, both blocks are executed inside a > loop enclosing them. And they keep evaluating e.g. the > down(%eax,%ecx,4) expression on every iteration. GCC at the same time > hoists this expression outside of the loop and replaces it with a > simple pointer, as you can see below: > > .LBB2_2: > movl -32(%ebp), %edx > movl 28(%edx), %eax > testl %eax, %eax > je .L5 > > .LBB2_3: > movl -48(%ebp), %eax > movl (%eax), %edi > testl %edi, %edi > je .L5 > > > To make it possible for you to analyze this test-case, I attach the > source file, the BC file and the output of the code produced by LLVM > and by "GCC -O6". 
> > -Roman > <8q_speed.c.s><8q_speed.s.gcc><8q_speed.c.bc><8q_speed.c> From romix.llvm at googlemail.com Fri Feb 6 02:43:52 2009 From: romix.llvm at googlemail.com (Roman Levenstein) Date: Fri, 6 Feb 2009 09:43:52 +0100 Subject: [LLVMdev] LLVM misses some cross-MBB and loop optimizations compared to GCC In-Reply-To: <568BBB9C-82E3-4C9B-91B3-EBCBBE3DF218@apple.com> References: <568BBB9C-82E3-4C9B-91B3-EBCBBE3DF218@apple.com> Message-ID: Done. Please check these Bugzilla entries: http://llvm.org/bugs/show_bug.cgi?id=3495 (LocalSpiller problems) http://llvm.org/bugs/show_bug.cgi?id=3496 (Loop optimization problems) -Roman 2009/2/6 Evan Cheng : > Thanks. Can you file bugzilla reports? I'll look at the first one soon. > > Evan > On Feb 5, 2009, at 8:08 AM, Roman Levenstein wrote: > >> Hi, >> >> While testing my new register allocators on some test-cases, I've >> noticed that LLVM misses sometimes some optimization opportunities: >> >> 1) LocalSpiller::RewriteMBB seems not to propagate the information >> about e.g. Spills between MBBs.In many cases, where MBB B1 has only >> one predecessor MBB B2, B1 could reuse the information about the >> physical registers that are in the live-out set of B2. This could help >> to e.g. eliminate some useless reloads from spill slots, if the value >> is available on the required physical register already. For example, >> in the example below, the marked "movl 12(%esp), %ecx" instruction >> could be eliminated. >> >> .LBB2_2: # bb31 >> movl 12(%esp), %ecx >> movl 8(%esp), %eax >> cmpl $0, up+28(%eax,%ecx,4) >> je .LBB2_9 # bb569 >> .LBB2_3: # bb41 ; <--- bb31 is the only predecessor of bb41 >> movl 12(%esp), %ecx ; <--- This could be eliminated!!! >> movl 4(%esp), %eax >> cmpl $0, down(%eax,%ecx,4) >> je .LBB2_9 # bb569 >> >> >> It is also worth mentioning, that currently reloads from spill slots >> are not recorded in the Spills set using the addAvailable method, as >> far as I can see. Wouldn't it make sense? 
>> >> I have the feeling that these improvements are rather easy to achieve >> and would not require too much changes to the LocalSpiller. Probably, >> we just need to keep the live-out set of the MBB around after >> rewriting it, so that its successors can use it in some cases as >> initial value for the Spills set. >> >> Any opinions? >> >> 2) Moving of sub-expressions from loops and replacement of array >> accesses via pointer-based induction variables is also not optimal in >> some situations. >> In the example mentioned above, both blocks are executed inside a >> loop enclosing them. And they keep evaluating e.g. the >> down(%eax,%ecx,4) expression on every iteration. GCC at the same time >> hoists this expression outside of the loop and replaces it with a >> simple pointer, as you can see below: >> >> .LBB2_2: >> movl -32(%ebp), %edx >> movl 28(%edx), %eax >> testl %eax, %eax >> je .L5 >> >> .LBB2_3: >> movl -48(%ebp), %eax >> movl (%eax), %edi >> testl %edi, %edi >> je .L5 >> >> >> To make it possible for you to analyze this test-case, I attach the >> source file, the BC file and the output of the code produced by LLVM >> and by "GCC -O6". >> >> -Roman >> <8q_speed.c.s><8q_speed.s.gcc><8q_speed.c.bc><8q_speed.c> > > From tom.primozic at gmail.com Fri Feb 6 03:28:17 2009 From: tom.primozic at gmail.com (=?UTF-8?B?VG9tIFByaW1vxb5pxI0=?=) Date: Fri, 6 Feb 2009 10:28:17 +0100 Subject: [LLVMdev] Register variables Message-ID: Hello! I have been considering using LLVM for my compiler project for some time now, and have been extensively researching its capabilities. However, I have come to a sticking point that I cannot solve. My language is a multi paradigm language, but with the emphasis on "functional". As such, it is expected that programs will allocate a lot of small, short-lived data. This means, in particular, that allocations have to be very fast - increase the heap pointer and check for heap overflow. 
However, I have not yet found a way to tell the LLVM compiler to keep a global variable in a register at all times (except when using some foreign calling convention, when all registers are saved on the stack). Another reason that I would like to keep the current heap pointer in processor registers is that my language will support multi-threading, with every thread having its own heap (there will be a global heap, too, but allocations will be more expensive). Therefore, I cannot use a global memory location for the heap pointer, as it has to be different for every thread on the system.

If any of you has any ideas how to solve this issue, please tell me. I have also looked at some other projects (C--, and some other implementations of compilers for functional languages), but have not yet found anything useful.

- Tom Primožič

From edwintorok at gmail.com Fri Feb 6 03:37:47 2009
From: edwintorok at gmail.com (Török Edwin)
Date: Fri, 06 Feb 2009 11:37:47 +0200
Subject: [LLVMdev] Register variables
In-Reply-To:
References:
Message-ID: <498C04EB.7040200@gmail.com>

On 2009-02-06 11:28, Tom Primožič wrote:
> Hello!
>
> I have been considering using LLVM for my compiler project for some
> time now, and have been extensively researching its capabilities.
> However, I have come to a sticking point that I cannot solve.
>
> My language is a multi paradigm language, but with the emphasis on
> "functional". As such, it is expected that programs will allocate a
> lot of small, short-lived data. This means, in particular, that
> allocations have to be very fast - increase the heap pointer and check
> for heap overflow.
However, I have not yet found a way to tell the > LLVM compiler to keep a global variable in a register at all times > (except when using some foreign calling convention, when all registers > are saved on the stack). Another reason that I would like to keep the > current heap pointer in processor registers is that my language will > support multi-threading, with every thread having its own heap (there > will be a global heap, too, but allocations will be more expensive). > Therefore, I cannot use a global memory location for the heap pointer, > as it has to be different for every thread on the system. > > If any of you has any ideas how to solve this issue, please tell me. I > have also looked at some other projects (C--, and some other > implementations of compilers for functional languages), but have not > yet found anything useful. Wouldn't a thread-local global variable solve your problem? Best regards, --Edwin From jay.foad at gmail.com Fri Feb 6 03:42:30 2009 From: jay.foad at gmail.com (Jay Foad) Date: Fri, 6 Feb 2009 09:42:30 +0000 Subject: [LLVMdev] problems building googletest for 2.5 In-Reply-To: References: <8DA45531-786D-40A0-B316-027E0FEA4B3A@nondot.org> <56DE3698-E325-4D9F-A37E-724F0EDF8BE5@apple.com> Message-ID: > Someone else would have to > try 3.X. It works for me. I can now build LLVM on Cygwin with GCC 3.4.4, where I used to fall over this problem before. Thanks, Jay. 
From Christian.Sayer at dibcom.fr Fri Feb 6 04:22:01 2009
From: Christian.Sayer at dibcom.fr (Christian Sayer)
Date: Fri, 6 Feb 2009 11:22:01 +0100
Subject: [LLVMdev] list-td scheduler asserts on targets with implicitly defined registers
Message-ID: <57C38DA176A0A34A9B9F3CCCE33D3C4AEF27C76620@FRPAR1CL009.coe.adi.dibcom.com>

Hi,

I just switched to the 2.5 release branch and noticed that llc runs into the following assert in ScheduleDAGList::ScheduleNodeTopDown() using our custom backend:

assert(!I->isAssignedRegDep() &&
       "The list-td scheduler doesn't yet support physreg dependencies!");

It turns out that the register dependency concerns the condition code register which is modeled as an implicitly defined register in the backend (the same happens e.g. for X86 when explicitly giving the -pre-RA-sched=list-td option to llc).

My assumption is that the assert should exclude non-allocatable, implicitly defined registers, which is checked in the attached patch1.
This works fine for me, however on X86 the EFLAGS register is not marked non-allocatable (patch2).
Is this intentional? Our backend handles condition codes pretty much like X86 and I remember I didn't get it to work without defining the allocation_order_end() function in RegisterInfo.td.
Anyway, I have no idea if this solution is ok for the general case, maybe the implicit defs information should rather be put into the SDeps when they are created?

Regards,
Christian

--
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch2_X86eflagsAllocatable.patch
Type: application/octet-stream
Size: 649 bytes
Desc: patch2_X86eflagsAllocatable.patch
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090206/946c9800/attachment.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch1_physregAlloc.patch
Type: application/octet-stream
Size: 1641 bytes
Desc: patch1_physregAlloc.patch
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090206/946c9800/attachment-0001.obj

From echeng at apple.com Fri Feb 6 11:02:39 2009
From: echeng at apple.com (Evan Cheng)
Date: Fri, 6 Feb 2009 09:02:39 -0800
Subject: [LLVMdev] list-td scheduler asserts on targets with implicitly defined registers
In-Reply-To: <57C38DA176A0A34A9B9F3CCCE33D3C4AEF27C76620@FRPAR1CL009.coe.adi.dibcom.com>
References: <57C38DA176A0A34A9B9F3CCCE33D3C4AEF27C76620@FRPAR1CL009.coe.adi.dibcom.com>
Message-ID:

The best fix is to teach this scheduler how to deal with these dependencies. :-)

If you just want a check, I think it's easier to just check the register class's copy cost. -1 means it's extremely expensive to copy registers in the particular register class.

Evan

On Feb 6, 2009, at 2:22 AM, Christian Sayer wrote:

> Hi,
>
> I just switched to the 2.5 release branch and noticed that llc runs
> into the following assert in ScheduleDAGList::ScheduleNodeTopDown()
> using our custom backend:
>
> assert(!I->isAssignedRegDep() &&
> "The list-td scheduler doesn't yet support physreg
> dependencies!");
>
> It turns out that the register dependency concerns the condition
> code register which is modeled as an implicitly defined register in
> the backend (the same happens e.g. for X86 when explicitly
> giving the -pre-RA-sched=list-td option to llc).
>
> My assumption is that the assert should exclude non-allocatable,
> implicitly defined registers, which is checked in the attached
> patch1.
> This works fine for me, however on X86 the EFLAGS register is not > marked non-allocatable (patch2). > Is this intentional? Our backend handles condition codes pretty much > like X86 and I remember I didn't get it to work without defining the > allocation_order_end() function in RegisterInfo.td > Anyway, I have no idea if this solution is ok for the general case, > maybe the implicit defs information should rather be put into the > SDeps when they are created? > > Regards, > Christian > > -- > > > > > > > > please ignore: > > CONFIDENTIAL NOTICE: The contents of this message, including any > attachments, are confidential and are intended solely for the use of > the person or entity to whom the message was addressed. If you are > not the intended recipient of this message, please be advised that > any dissemination, distribution, or use of the contents of this > message is strictly prohibited. If you received this message in > error, please notify the sender. Please also permanently delete all > copies of the original message and any attached documentation. Thank > you. > < > patch2_X86eflagsAllocatable > .patch > > > < > patch1_physregAlloc > .patch>_______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From mrs at apple.com Fri Feb 6 14:15:38 2009 From: mrs at apple.com (Mike Stump) Date: Fri, 6 Feb 2009 12:15:38 -0800 Subject: [LLVMdev] problems building googletest for 2.5 In-Reply-To: References: <8DA45531-786D-40A0-B316-027E0FEA4B3A@nondot.org> <56DE3698-E325-4D9F-A37E-724F0EDF8BE5@apple.com> Message-ID: <0F5834A7-0CB7-4B6A-9856-8F1699427522@apple.com> On Feb 6, 2009, at 1:42 AM, Jay Foad wrote: > It works for me. I can now build LLVM on Cygwin with GCC 3.4.4, > where I used to fall over this problem before. Thanks for testing. 
From tom.primozic at gmail.com Fri Feb 6 17:46:00 2009
From: tom.primozic at gmail.com (Tom Primožič)
Date: Sat, 7 Feb 2009 00:46:00 +0100
Subject: [LLVMdev] Register variables
Message-ID:

> Wouldn't a thread-local global variable solve your problem?
>
> Best regards,
> --Edwin

Are thread-local variables supported on all platforms? If so, then probably yes... both the pointers would probably be kept in the processor cache, which would speed-wise probably be comparable (will it be?) with keeping variables in the registers.

- Tom Primožič

From Micah.Villmow at amd.com Fri Feb 6 18:59:43 2009
From: Micah.Villmow at amd.com (Villmow, Micah)
Date: Fri, 6 Feb 2009 16:59:43 -0800
Subject: [LLVMdev] Patch: More data types
Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C82704D@ssanexmb1.amd.com>

I've patched valuetypes.td/h to add data types that my backend needs to support. There seem to be a lot of assumptions made in other spots of the code that limit the number of data types to 32. I need to add a few more types, but once I go over this limit llvm starts acting wonky. I found all the items that are hard coded to 32 and a section that isn't, but I cannot figure out how to expand it so that there can be up to 64 data types.

The section in question is TargetLowering.h and seems to be these two functions.

    LegalizeAction getTypeAction(MVT VT) const {
      if (VT.isExtended()) {
        if (VT.isVector()) return Expand;
        if (VT.isInteger())
          // First promote to a power-of-two size, then expand if necessary.
          return VT == VT.getRoundIntegerType() ? Expand : Promote;
        assert(0 && "Unsupported extended type!");
        return Legal;
      }
      unsigned I = VT.getSimpleVT();
      assert(I<4*array_lengthof(ValueTypeActions)*sizeof(ValueTypeActions[0]));
      return (LegalizeAction)((ValueTypeActions[I>>4] >> ((2*I) & 31)) & 3);
    }

    void setTypeAction(MVT VT, LegalizeAction Action) {
      unsigned I = VT.getSimpleVT();
      assert(I<4*array_lengthof(ValueTypeActions)*sizeof(ValueTypeActions[0]));
      ValueTypeActions[I>>4] |= Action << ((I*2) & 31);
    }

I am not really sure what is going on here, but would be happy if someone could get it to expand to allow 64 datatypes.

Thanks,

Micah Villmow
Systems Engineer
Advanced Technology & Performance
Advanced Micro Devices Inc.
S1-609 One AMD Place
Sunnyvale, CA. 94085
P: 408-749-3966

-------------- next part --------------
A non-text attachment was scrubbed...
Name: datatypes.diff
Type: application/octet-stream
Size: 10160 bytes
Desc: datatypes.diff
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090206/fc9e4639/attachment-0001.obj

From Micah.Villmow at amd.com Fri Feb 6 19:16:10 2009
From: Micah.Villmow at amd.com (Villmow, Micah)
Date: Fri, 6 Feb 2009 17:16:10 -0800
Subject: [LLVMdev] Patch: More data types
In-Reply-To: <5BA674C5FF7B384A92C2C95D8CC71E1C82704D@ssanexmb1.amd.com>
References: <5BA674C5FF7B384A92C2C95D8CC71E1C82704D@ssanexmb1.amd.com>
Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C827052@ssanexmb1.amd.com>

Forgot to add the patch required for tablegen to work correctly.

________________________________
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Villmow, Micah
Sent: Friday, February 06, 2009 5:00 PM
To: LLVM Developers Mailing List
Subject: [LLVMdev] Patch: More data types

I've patched valuetypes.td/h to add data types that my backend needs to support.
There seem to be a lot of assumptions made in other spots of the code that
limit the number of data types to 32. I need to add a few more types, but
once I go over this limit llvm starts acting wonky. I found all the items
that are hard coded to 32 and a section that isn't, but I cannot figure out
how to expand it so that there can be up to 64 data types.

The section in question is in TargetLowering.h and seems to be these two
functions.

  LegalizeAction getTypeAction(MVT VT) const {
    if (VT.isExtended()) {
      if (VT.isVector()) return Expand;
      if (VT.isInteger())
        // First promote to a power-of-two size, then expand if necessary.
        return VT == VT.getRoundIntegerType() ? Expand : Promote;
      assert(0 && "Unsupported extended type!");
      return Legal;
    }
    unsigned I = VT.getSimpleVT();
    assert(I<4*array_lengthof(ValueTypeActions)*sizeof(ValueTypeActions[0]));
    return (LegalizeAction)((ValueTypeActions[I>>4] >> ((2*I) & 31)) & 3);
  }

  void setTypeAction(MVT VT, LegalizeAction Action) {
    unsigned I = VT.getSimpleVT();
    assert(I<4*array_lengthof(ValueTypeActions)*sizeof(ValueTypeActions[0]));
    ValueTypeActions[I>>4] |= Action << ((I*2) & 31);
  }

I am not really sure what is going on here, but would be happy if someone
could get it to expand to allow 64 datatypes.

Thanks,

Micah Villmow
Systems Engineer
Advanced Technology & Performance
Advanced Micro Devices Inc.
S1-609 One AMD Place
Sunnyvale, CA. 94085
P: 408-749-3966

-------------- next part --------------
A non-text attachment was scrubbed...
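[Editorial note] For readers wondering what the two functions in the message above are doing: they appear to pack one 2-bit LegalizeAction per simple value type into an array of 32-bit words, so `I >> 4` selects the word (16 entries per word) and `(2*I) & 31` is the bit offset inside it. The sketch below is a self-contained illustration of that packing, not LLVM's code. `PackedTypeActions`, the stand-in `LegalizeAction` enum, and the word-count template parameter are hypothetical names, and 32-bit words are an assumption read off the `& 31` and `>> 4`. Under that assumption two words cover 32 type ids and four words cover 64. Unlike the `|=` in setTypeAction, the sketch clears the old field before writing.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for LLVM's LegalizeAction; each value fits in 2 bits.
enum LegalizeAction : std::uint8_t { Legal = 0, Promote = 1, Expand = 2, Custom = 3 };

// Packs one 2-bit action per type id into 32-bit words (16 entries per word).
// NumWords = 2 holds 32 type ids; NumWords = 4 holds 64.
template <std::size_t NumWords>
class PackedTypeActions {
  std::array<std::uint32_t, NumWords> Words{};  // zero-initialised: all Legal

public:
  static constexpr unsigned capacity() {
    return static_cast<unsigned>(16 * NumWords);
  }

  LegalizeAction get(unsigned I) const {
    assert(I < capacity() && "type id out of range");
    // I >> 4 picks the word; (2 * I) & 31 is the bit offset inside it.
    return static_cast<LegalizeAction>((Words[I >> 4] >> ((2 * I) & 31)) & 3u);
  }

  void set(unsigned I, LegalizeAction A) {
    assert(I < capacity() && "type id out of range");
    const unsigned Shift = (2 * I) & 31;
    Words[I >> 4] &= ~(3u << Shift);  // clear the old 2-bit field first
    Words[I >> 4] |= static_cast<std::uint32_t>(A) << Shift;
  }
};
```

With four words, `PackedTypeActions<4> T; T.set(40, Expand);` stores the action for type id 40 in word 2 at bit offset 16. Whether widening the real table is enough depends on the other hard-coded limits mentioned in the message.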
Name: datatypes.diff
Type: application/octet-stream
Size: 13082 bytes
Desc: datatypes.diff
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090206/71ed41de/attachment-0001.obj

From tonic at nondot.org Fri Feb 6 19:42:06 2009
From: tonic at nondot.org (Tanya Lattner)
Date: Fri, 6 Feb 2009 17:42:06 -0800
Subject: [LLVMdev] 2.5 Pre-release1 available for testing
Message-ID:

LLVMers,

The 2.5 pre-release is available for testing:
http://llvm.org/prereleases/2.5/

If you have time, I'd appreciate anyone who can help test the release.
Please do the following:

1) Download/compile llvm source, and either compile llvm-gcc source or use
   llvm-gcc binary (please compile llvm-gcc with fortran if you can).
2) Run make check, send me the testrun.log
3) Run "make TEST=nightly report" and send me the report.nightly.txt
4) Please provide details on what platform you compiled LLVM on, how you
   built LLVM (src == obj, or src != obj), and if you compiled llvm-gcc
   with support for fortran.

Please COMPLETE ALL TESTING BY end of the day on Feb. 12th!

Thanks,
Tanya Lattner

From deeppatel1987 at gmail.com Fri Feb 6 20:02:14 2009
From: deeppatel1987 at gmail.com (Sandeep Patel)
Date: Fri, 6 Feb 2009 18:02:14 -0800
Subject: [LLVMdev] Using CallingConvLower in ARM target
In-Reply-To: <9F8572A5-4F58-492D-A61B-638FC61D42B4@apple.com>
References: <305d6f60812270430xdf1ebb9gf6d99f94215ab66b@mail.gmail.com> <305d6f60901161726l5c93c9dag6f98ca06a420cb31@mail.gmail.com> <9F8572A5-4F58-492D-A61B-638FC61D42B4@apple.com>
Message-ID: <305d6f60902061802q65f813e7p1c75f1fa5a185d32@mail.gmail.com>

I think I've got all the cases handled now, implementing with
CCCustom<"foo"> callbacks into C++. This also fixes a crash when returning
i128. I've also included a small asm constraint fix that was needed to
build newlib.
deep On Mon, Jan 19, 2009 at 10:18 AM, Evan Cheng wrote: > > On Jan 16, 2009, at 5:26 PM, Sandeep Patel wrote: > >> On Sat, Jan 3, 2009 at 11:46 AM, Dan Gohman wrote: >>> >>> One problem with this approach is that since i64 isn't legal, the >>> bitcast would require custom C++ code in the ARM target to >>> handle properly. It might make sense to introduce something >>> like >>> >>> CCIfType<[f64], CCCustom> >>> >>> where CCCustom is a new entity that tells the calling convention >>> code to to let the target do something not easily representable >>> in the tablegen minilanguage. >> >> I am thinking that this requires two changes: add a flag to >> CCValAssign (take a bit from HTP) to indicate isCustom and a way to >> author an arbitrary CCAction by including the source directly in the >> TableGen mini-language. This latter change might want a generic change >> to the TableGen language. For example, the syntax might be like: >> >> class foo : CCCustomAction { >> code <<< EOF >> ....multi-line C++ code goes here that allocates regs & mem and >> sets CCValAssign::isCustom.... >> EOF >> } >> >> Does this seem reasonable? An alternative is for CCCustom to take a >> string that names a function to be called: >> >> CCIfType<[f64], CCCustom<"MyCustomLoweringFunc">> >> >> the function signature for such functions will have to return two >> results: if the CC processing is finished and if it the func succeeded >> or failed: > > I like the second solution better. It seems rather cumbersome to embed > multi-line c++ code in td files. 
> > Evan >> >> >> typedef bool CCCustomFn(unsigned ValNo, MVT ValVT, >> MVT LocVT, CCValAssign::LocInfo LocInfo, >> ISD::ArgFlagsTy ArgFlags, CCState &State, >> bool &result); >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > -------------- next part -------------- A non-text attachment was scrubbed... Name: arm_callingconv.diff Type: application/octet-stream Size: 54281 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090206/bfe5289b/attachment-0001.obj From simmon12 at illinois.edu Fri Feb 6 22:52:15 2009 From: simmon12 at illinois.edu (Patrick Simmons) Date: Fri, 06 Feb 2009 22:52:15 -0600 Subject: [LLVMdev] Problem Running llvm-suite Message-ID: <498D137F.7050806@illinois.edu> Hi, I'm trying to run the tests in llvm-suite, but I've run into trouble. First, I had the llvm-suite checkout in a directory alongside the llvm compiler checkout, but, when I ran "make" from llvm-suite, it complained about there not being a Makefile two levels above it, so I moved llvm-suite into the "test" subdirectory inside the llvm compiler checkout. 
I ran "make" again, but I got this: [simmon12 at maute llvm-suite]$ make make[1]: Entering directory `/home/vadve/simmon12/llvm/llvm/test/llvm-suite/SingleSource' make[2]: Entering directory `/home/vadve/simmon12/llvm/llvm/test/llvm-suite/SingleSource/UnitTests' make[3]: Entering directory `/home/vadve/simmon12/llvm/llvm/test/llvm-suite/SingleSource/UnitTests/Vector' make[4]: Entering directory `/home/vadve/simmon12/llvm/llvm/test/llvm-suite/SingleSource/UnitTests/Vector/SSE' make[4]: *** No rule to make target `Output/sse.expandfft.linked.rbc', needed by `Output/sse.expandfft.linked.bc'. Stop. make[4]: Leaving directory `/home/vadve/simmon12/llvm/llvm/test/llvm-suite/SingleSource/UnitTests/Vector/SSE' make[3]: *** [all] Error 1 make[3]: Leaving directory `/home/vadve/simmon12/llvm/llvm/test/llvm-suite/SingleSource/UnitTests/Vector' make[2]: *** [all] Error 1 make[2]: Leaving directory `/home/vadve/simmon12/llvm/llvm/test/llvm-suite/SingleSource/UnitTests' make[1]: *** [UnitTests/.makeall] Error 2 make[1]: Leaving directory `/home/vadve/simmon12/llvm/llvm/test/llvm-suite/SingleSource' make: *** [SingleSource/.makeall] Error 2 I already ran "./configure" inside llvm-suite, and I built a release build of the LLVM tools in the compiler checkout, so I'm not sure what the problem could be. Has anyone else had this problem and perhaps solved it? --Patrick From dalej at apple.com Fri Feb 6 23:07:36 2009 From: dalej at apple.com (Dale Johannesen) Date: Fri, 6 Feb 2009 21:07:36 -0800 Subject: [LLVMdev] Problem Running llvm-suite In-Reply-To: <498D137F.7050806@illinois.edu> References: <498D137F.7050806@illinois.edu> Message-ID: <5D0D6E75-DB30-4646-8E3B-021C72C2A939@apple.com> On Feb 6, 2009, at 8:52 PM, Patrick Simmons wrote: > Hi, > > I'm trying to run the tests in llvm-suite, but I've run into trouble. 
> First, I had the llvm-suite checkout in a directory alongside the llvm > compiler checkout, but, when I ran "make" from llvm-suite, it > complained > about there not being a Makefile two levels above it, so I moved > llvm-suite into the "test" subdirectory inside the llvm compiler > checkout. I ran "make" again, but I got this: You need to put it under the "projects" subdirectory. Read this: http://llvm.org/docs/TestingGuide.html It is perhaps a little confusing; looks like you stopped reading after the first "test suite" section. Keep going. From jyasskin at google.com Sat Feb 7 02:27:25 2009 From: jyasskin at google.com (Jeffrey Yasskin) Date: Sat, 7 Feb 2009 00:27:25 -0800 Subject: [LLVMdev] 2.5 Pre-release1 available for testing In-Reply-To: References: Message-ID: I'm trying to build the 2.5 prerelease on my MacBook, and I'm getting a bus error in tblgen: $ rm -r * && ../src/configure --prefix=`pwd`/../install && make -j1 VERBOSE=1 ENABLE_OPTIMIZED=0 ... llvm[1]: Building Intrinsics.gen.tmp from Intrinsics.td /Users/jyasskin/src/llvm-2.5/obj/Debug/bin/tblgen -I /Users/jyasskin/src/llvm-2.5/src/lib/VMCore -I /Users/jyasskin/src/llvm-2.5/src/include -I /Users/jyasskin/src/llvm-2.5/src/include -I /Users/jyasskin/src/llvm-2.5/src/lib/Target /Users/jyasskin/src/llvm-2.5/src/include/llvm/Intrinsics.td -o /Users/jyasskin/src/llvm-2.5/obj/lib/VMCore/Debug/Intrinsics.gen.tmp -gen-intrinsic make[1]: *** [/Users/jyasskin/src/llvm-2.5/obj/lib/VMCore/Debug/Intrinsics.gen.tmp] Bus error make: *** [all] Error 1 Looking through the code, I don't see anything wrong. Here's a bunch of maybe-relevant information, but let me know if I should send anything else. 
$ gdb --args /Users/jyasskin/src/llvm-2.5/obj/Debug/bin/tblgen -I /Users/jyasskin/src/llvm-2.5/src/lib/VMCore -I /Users/jyasskin/src/llvm-2.5/src/include -I /Users/jyasskin/src/llvm-2.5/src/include -I /Users/jyasskin/src/llvm-2.5/src/lib/Target /Users/jyasskin/src/llvm-2.5/src/include/llvm/Intrinsics.td -o /Users/jyasskin/src/llvm-2.5/obj/lib/VMCore/Debug/Intrinsics.gen.tmp -gen-intrinsic GNU gdb 6.3.50-20050815 (Apple version gdb-962) (Sat Jul 26 08:14:40 UTC 2008) Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-apple-darwin"...Reading symbols for shared libraries .... done (gdb) run Starting program: /Users/jyasskin/src/llvm-2.5/obj/Debug/bin/tblgen -I /Users/jyasskin/src/llvm-2.5/src/lib/VMCore -I /Users/jyasskin/src/llvm-2.5/src/include -I /Users/jyasskin/src/llvm-2.5/src/include -I /Users/jyasskin/src/llvm-2.5/src/lib/Target /Users/jyasskin/src/llvm-2.5/src/include/llvm/Intrinsics.td -o /Users/jyasskin/src/llvm-2.5/obj/lib/VMCore/Debug/Intrinsics.gen.tmp -gen-intrinsic Reading symbols for shared libraries +++.. done Program received signal EXC_BAD_ACCESS, Could not access memory. 
Reason: KERN_PROTECTION_FAILURE at address: 0x0010eb75 0x963646e1 in __gnu_cxx::__exchange_and_add () (gdb) bt #0 0x963646e1 in __gnu_cxx::__exchange_and_add () #1 0x96354070 in std::string::_Rep::_M_dispose () #2 0x963560a6 in std::string::assign () #3 0x000ecd67 in llvm::cl::initializer::apply > > (this=0xbfffdd5c, O=@0x17c200) at CommandLine.h:281 #4 0x000ecdd4 in llvm::cl::applicator >::opt > > (M=@0xbfffdd5c, O=@0x17c200) at CommandLine.h:706 #5 0x000ecdee in llvm::cl::apply, llvm::cl::opt > > (M=@0xbfffdd5c, O=0x17c200) at CommandLine.h:742 #6 0x000ece85 in llvm::cl::opt >::opt > (this=0x17c200, M0=@0x10eb92, M1=@0xbfffdd64, M2=@0xbfffdd60, M3=@0xbfffdd5c) at CommandLine.h:897 #7 0x0015d7be in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535) at /Users/jyasskin/src/llvm-2.5/src/utils/TableGen/TableGen.cpp:97 #8 0x0015d953 in global constructors keyed to _ZN89_GLOBAL__N__Users_jyasskin_src_llvm_2.5_src_utils_TableGen_TableGen.cpp_00000000_BF75FF056ActionE () at /Users/jyasskin/src/llvm-2.5/src/utils/TableGen/TableGen.cpp:236 #9 0x8fe12e76 in __dyld__ZN16ImageLoaderMachO18doModInitFunctionsERKN11ImageLoader11LinkContextE () #10 0x8fe0e723 in __dyld__ZN11ImageLoader23recursiveInitializationERKNS_11LinkContextEj () #11 0x8fe0e809 in __dyld__ZN11ImageLoader15runInitializersERKNS_11LinkContextE () #12 0x8fe04102 in __dyld__ZN4dyld24initializeMainExecutableEv () #13 0x8fe07b5f in __dyld__ZN4dyld5_mainEPK11mach_headermiPPKcS5_S5_ () #14 0x8fe01872 in __dyld__ZN13dyldbootstrap5startEPK11mach_headeriPPKcl () #15 0x8fe01037 in __dyld__dyld_start () Oddly, I can't look at the contents of an opt until I run (gdb) add-symbol-file /Users/jyasskin/src/llvm-2.5/obj/lib/Support/Debug/CommandLine.o Then I get the following at the crash: (gdb) f 3 (gdb) p O $2 = (class llvm::cl::opt, std::allocator >,false,llvm::cl::parser, std::allocator > > > &) @0x17c200: { = { _vptr$Option = 0x15f2c8, NumOccurrences = 0, Flags = 33, Position = 0, AdditionalVals = 
0, NextRegistered = 0x0, ArgStr = 0x10eb92 "o", HelpStr = 0x10eb82 "Output filename", ValueStr = 0x10eb79 "filename" }, , std::allocator >,false,true>> = { ,std::allocator >> = { _M_dataplus = { > = { <__gnu_cxx::new_allocator> = {}, }, members of std::basic_string,std::allocator >::_Alloc_hider: _M_p = 0xa06ea6e4 "" } }, }, members of llvm::cl::opt, std::allocator >,false,llvm::cl::parser, std::allocator > > >: Parser = { , std::allocator > >> = { = { _vptr$basic_parser_impl = 0x15ec58 }, }, } } This happens both with and without the binaries from llvm-gcc4.2-2.5-x86-darwin9.tar.gz symlinked into my PATH. It also happens in both release and debug builds. $ gcc --version i686-apple-darwin9-gcc-4.0.1 (GCC) 4.0.1 (Apple Inc. build 5488) On Fri, Feb 6, 2009 at 5:42 PM, Tanya Lattner wrote: > LLVMers, > > The 2.5 pre-release is available for testing: > http://llvm.org/prereleases/2.5/ > > If you have time, I'd appreciate anyone who can help test the release. > Please do the following: > > 1) Download/compile llvm source, and either compile llvm-gcc source or use > llvm-gcc binary (please compile llvm-gcc with fortran if you can). > 2) Run make check, send me the testrun.log > 3) Run "make TEST=nightly report" and send me the report.nightly.txt > 4) Please provide details on what platform you compiled LLVM on, how you > built LLMV (src == obj, or src != obj), and if you compiled llvm-gcc with > support for fortran. > > Please COMPLETE ALL TESTING BY end of the day on Feb. 12th! 
>
> Thanks,
> Tanya Lattner
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>

From pdebie at ai.rug.nl Sat Feb 7 10:08:41 2009
From: pdebie at ai.rug.nl (Pieter de Bie)
Date: Sat, 7 Feb 2009 16:08:41 +0000
Subject: [LLVMdev] [PATCH] Use the new URL to BugPoint documentation
Message-ID: <1234022921-3633-1-git-send-email-pdebie@ai.rug.nl>

---
 tools/bugpoint/bugpoint.cpp | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

I came across this while running bugpoint --help, hope you don't mind the git
patch output :)

diff --git a/tools/bugpoint/bugpoint.cpp b/tools/bugpoint/bugpoint.cpp
index 2364675..587077e 100644
--- a/tools/bugpoint/bugpoint.cpp
+++ b/tools/bugpoint/bugpoint.cpp
@@ -67,7 +67,7 @@ int main(int argc, char **argv) {
   llvm_shutdown_obj X; // Call llvm_shutdown() on exit.
   cl::ParseCommandLineOptions(argc, argv,
                               "LLVM automatic testcase reducer. See\nhttp://"
-                              "llvm.org/docs/CommandGuide/bugpoint.html"
+                              "llvm.org/cmds/bugpoint.html"
                               " for more information.\n");
   sys::PrintStackTraceOnErrorSignal();
   sys::SetInterruptFunction(BugpointInterruptFunction);
--
1.6.1.2.458.g9de76

From clattner at apple.com Sat Feb 7 12:56:53 2009
From: clattner at apple.com (Chris Lattner)
Date: Sat, 7 Feb 2009 10:56:53 -0800
Subject: [LLVMdev] [PATCH] Use the new URL to BugPoint documentation
In-Reply-To: <1234022921-3633-1-git-send-email-pdebie@ai.rug.nl>
References: <1234022921-3633-1-git-send-email-pdebie@ai.rug.nl>
Message-ID: <6685499E-7665-4B5B-ADDD-6674554BDCAA@apple.com>

On Feb 7, 2009, at 8:08 AM, Pieter de Bie wrote:
> ---
> tools/bugpoint/bugpoint.cpp | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> I came across this while running bugpoint --help, hope you don't
> mind the git
> patch output :)

Applied, thanks!
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090202/073432.html -Chris From jyasskin at google.com Sat Feb 7 13:23:09 2009 From: jyasskin at google.com (Jeffrey Yasskin) Date: Sat, 7 Feb 2009 11:23:09 -0800 Subject: [LLVMdev] 2.5 Pre-release1 available for testing In-Reply-To: References: Message-ID: I figured it out. I had installed llvm-2.4 through macports and set LD_LIBRARY_PATH, CPPFLAGS, and LDFLAGS to search /opt/local in addition to the normal search paths. Unsetting them let the 2.5 prerelease compile. Sorry for the false alarm. On Sat, Feb 7, 2009 at 12:27 AM, Jeffrey Yasskin wrote: > I'm trying to build the 2.5 prerelease on my MacBook, and I'm getting > a bus error in tblgen: > > $ rm -r * && ../src/configure --prefix=`pwd`/../install && make -j1 > VERBOSE=1 ENABLE_OPTIMIZED=0 > ... > llvm[1]: Building Intrinsics.gen.tmp from Intrinsics.td > /Users/jyasskin/src/llvm-2.5/obj/Debug/bin/tblgen -I > /Users/jyasskin/src/llvm-2.5/src/lib/VMCore -I > /Users/jyasskin/src/llvm-2.5/src/include -I > /Users/jyasskin/src/llvm-2.5/src/include -I > /Users/jyasskin/src/llvm-2.5/src/lib/Target > /Users/jyasskin/src/llvm-2.5/src/include/llvm/Intrinsics.td -o > /Users/jyasskin/src/llvm-2.5/obj/lib/VMCore/Debug/Intrinsics.gen.tmp > -gen-intrinsic > make[1]: *** [/Users/jyasskin/src/llvm-2.5/obj/lib/VMCore/Debug/Intrinsics.gen.tmp] > Bus error > make: *** [all] Error 1 > > Looking through the code, I don't see anything wrong. Here's a bunch > of maybe-relevant information, but let me know if I should send > anything else. 
> > $ gdb --args /Users/jyasskin/src/llvm-2.5/obj/Debug/bin/tblgen -I > /Users/jyasskin/src/llvm-2.5/src/lib/VMCore -I > /Users/jyasskin/src/llvm-2.5/src/include -I > /Users/jyasskin/src/llvm-2.5/src/include -I > /Users/jyasskin/src/llvm-2.5/src/lib/Target > /Users/jyasskin/src/llvm-2.5/src/include/llvm/Intrinsics.td -o > /Users/jyasskin/src/llvm-2.5/obj/lib/VMCore/Debug/Intrinsics.gen.tmp > -gen-intrinsic > GNU gdb 6.3.50-20050815 (Apple version gdb-962) (Sat Jul 26 08:14:40 UTC 2008) > Copyright 2004 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and you are > welcome to change it and/or distribute copies of it under certain conditions. > Type "show copying" to see the conditions. > There is absolutely no warranty for GDB. Type "show warranty" for details. > This GDB was configured as "i386-apple-darwin"...Reading symbols for > shared libraries .... done > > (gdb) run > Starting program: /Users/jyasskin/src/llvm-2.5/obj/Debug/bin/tblgen -I > /Users/jyasskin/src/llvm-2.5/src/lib/VMCore -I > /Users/jyasskin/src/llvm-2.5/src/include -I > /Users/jyasskin/src/llvm-2.5/src/include -I > /Users/jyasskin/src/llvm-2.5/src/lib/Target > /Users/jyasskin/src/llvm-2.5/src/include/llvm/Intrinsics.td -o > /Users/jyasskin/src/llvm-2.5/obj/lib/VMCore/Debug/Intrinsics.gen.tmp > -gen-intrinsic > Reading symbols for shared libraries +++.. done > > Program received signal EXC_BAD_ACCESS, Could not access memory. 
> Reason: KERN_PROTECTION_FAILURE at address: 0x0010eb75 > 0x963646e1 in __gnu_cxx::__exchange_and_add () > (gdb) bt > #0 0x963646e1 in __gnu_cxx::__exchange_and_add () > #1 0x96354070 in std::string::_Rep::_M_dispose () > #2 0x963560a6 in std::string::assign () > #3 0x000ecd67 in llvm::cl::initializer [2]>::apply llvm::cl::parser > > (this=0xbfffdd5c, O=@0x17c200) at > CommandLine.h:281 > #4 0x000ecdd4 in llvm::cl::applicator >>::opt >> > (M=@0xbfffdd5c, O=@0x17c200) at CommandLine.h:706 > #5 0x000ecdee in llvm::cl::apply, > llvm::cl::opt > > > (M=@0xbfffdd5c, O=0x17c200) at CommandLine.h:742 > #6 0x000ece85 in llvm::cl::opt llvm::cl::parser >::opt llvm::cl::value_desc, llvm::cl::initializer > > (this=0x17c200, M0=@0x10eb92, M1=@0xbfffdd64, M2=@0xbfffdd60, > M3=@0xbfffdd5c) at CommandLine.h:897 > #7 0x0015d7be in __static_initialization_and_destruction_0 > (__initialize_p=1, __priority=65535) at > /Users/jyasskin/src/llvm-2.5/src/utils/TableGen/TableGen.cpp:97 > #8 0x0015d953 in global constructors keyed to > _ZN89_GLOBAL__N__Users_jyasskin_src_llvm_2.5_src_utils_TableGen_TableGen.cpp_00000000_BF75FF056ActionE > () at /Users/jyasskin/src/llvm-2.5/src/utils/TableGen/TableGen.cpp:236 > #9 0x8fe12e76 in > __dyld__ZN16ImageLoaderMachO18doModInitFunctionsERKN11ImageLoader11LinkContextE > () > #10 0x8fe0e723 in > __dyld__ZN11ImageLoader23recursiveInitializationERKNS_11LinkContextEj > () > #11 0x8fe0e809 in > __dyld__ZN11ImageLoader15runInitializersERKNS_11LinkContextE () > #12 0x8fe04102 in __dyld__ZN4dyld24initializeMainExecutableEv () > #13 0x8fe07b5f in __dyld__ZN4dyld5_mainEPK11mach_headermiPPKcS5_S5_ () > #14 0x8fe01872 in __dyld__ZN13dyldbootstrap5startEPK11mach_headeriPPKcl () > #15 0x8fe01037 in __dyld__dyld_start () > > > Oddly, I can't look at the contents of an opt until I run > (gdb) add-symbol-file > /Users/jyasskin/src/llvm-2.5/obj/lib/Support/Debug/CommandLine.o > > Then I get the following at the crash: > > (gdb) f 3 > (gdb) p O > $2 = (class 
llvm::cl::opt std::char_traits, std::allocator >>,false,llvm::cl::parser std::char_traits, std::allocator > > > &) @0x17c200: { > = { > _vptr$Option = 0x15f2c8, > NumOccurrences = 0, > Flags = 33, > Position = 0, > AdditionalVals = 0, > NextRegistered = 0x0, > ArgStr = 0x10eb92 "o", > HelpStr = 0x10eb82 "Output filename", > ValueStr = 0x10eb79 "filename" > }, > std::char_traits, std::allocator >,false,true>> = { > ,std::allocator >> = { > _M_dataplus = { > > = { > <__gnu_cxx::new_allocator> = {}, data fields>}, > members of > std::basic_string,std::allocator >>::_Alloc_hider: > _M_p = 0xa06ea6e4 "" > } > }, }, > members of llvm::cl::opt std::char_traits, std::allocator >>,false,llvm::cl::parser std::char_traits, std::allocator > > >: > Parser = { > std::char_traits, std::allocator > >> = { > = { > _vptr$basic_parser_impl = 0x15ec58 > }, }, } > } > > > This happens both with and without the binaries from > llvm-gcc4.2-2.5-x86-darwin9.tar.gz symlinked into my PATH. It also > happens in both release and debug builds. > > $ gcc --version > i686-apple-darwin9-gcc-4.0.1 (GCC) 4.0.1 (Apple Inc. build 5488) > > > > > On Fri, Feb 6, 2009 at 5:42 PM, Tanya Lattner wrote: >> LLVMers, >> >> The 2.5 pre-release is available for testing: >> http://llvm.org/prereleases/2.5/ >> >> If you have time, I'd appreciate anyone who can help test the release. >> Please do the following: >> >> 1) Download/compile llvm source, and either compile llvm-gcc source or use >> llvm-gcc binary (please compile llvm-gcc with fortran if you can). >> 2) Run make check, send me the testrun.log >> 3) Run "make TEST=nightly report" and send me the report.nightly.txt >> 4) Please provide details on what platform you compiled LLVM on, how you >> built LLMV (src == obj, or src != obj), and if you compiled llvm-gcc with >> support for fortran. >> >> Please COMPLETE ALL TESTING BY end of the day on Feb. 12th! 
>> >> Thanks, >> Tanya Lattner >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> >> > From clattner at apple.com Sat Feb 7 16:17:29 2009 From: clattner at apple.com (Chris Lattner) Date: Sat, 7 Feb 2009 14:17:29 -0800 Subject: [LLVMdev] overflow + saturation stuff Message-ID: Edwin was asking about how we should handle PR3328, how we should make GEP respect -fwrapv etc. I wrote up some thoughts here if anyone is interested: http://nondot.org/sabre/LLVMNotes/IntegerOverflow.txt -Chris From zaimoni at zaimoni.com Sat Feb 7 17:38:17 2009 From: zaimoni at zaimoni.com (Kenneth Boyd) Date: Sat, 07 Feb 2009 17:38:17 -0600 Subject: [LLVMdev] 2.5 prerelease: configure script doesn't have an option to disable GoogleTest Message-ID: <498E1B69.4080207@zaimoni.com> I haven't scheduled time to investigate this yet. My guess is that GoogleTest config is getting confused on Microsoft vs POSIX mkdir when going through autoconf: llvm[3]: Compiling gtest-filepath.cc for Release-Asserts build g:\mingw32\bin\../lib/gcc/mingw32/4.2.1-dw2/../../../../include/io.h: In member function 'bool testing::internal::FilePath::CreateFolder() const': g:\mingw32\bin\../lib/gcc/mingw32/4.2.1-dw2/../../../../include/io.h:176: error: too many arguments to function 'int mkdir(const char*)' g:/Testing/llvm-2.5/utils/unittest/googletest/gtest-filepath.cc:277: error: at this point in file make[3]: *** [/Testing/llvm-2.5.optimized-noassert/utils/unittest/googletest/Release-Asserts/gtest-filepath.o] Error 1 This error shows on all four configure lines I use. Kenneth From regehr at cs.utah.edu Sat Feb 7 21:53:11 2009 From: regehr at cs.utah.edu (John Regehr) Date: Sat, 7 Feb 2009 20:53:11 -0700 (MST) Subject: [LLVMdev] overflow + saturation stuff In-Reply-To: References: Message-ID: Sounds ambitious! A comprehensive, efficient trapv would be excellent. 
gcc's implementation seems quite incomplete, for example it fails to trap overflows in the constant folder. John Regehr From isanbard at gmail.com Sun Feb 8 02:32:50 2009 From: isanbard at gmail.com (Bill Wendling) Date: Sun, 8 Feb 2009 00:32:50 -0800 Subject: [LLVMdev] Problem Running llvm-suite In-Reply-To: <498D137F.7050806@illinois.edu> References: <498D137F.7050806@illinois.edu> Message-ID: <95944C26-380E-4459-BFCA-2E3CEC01662C@gmail.com> You will also need to specify where "llvm-gcc" and "llvm-g++" are. So you'll have to build LLVM, then build LLVM-GCC, link the created "gcc" and "g++" to the "llvm-gcc" and "llvm-g++", then reconfigure LLVM with the --with-llvmgccdir pointing to the LLVM-GCC you created. -bw On Feb 6, 2009, at 8:52 PM, Patrick Simmons wrote: > Hi, > > I'm trying to run the tests in llvm-suite, but I've run into trouble. > First, I had the llvm-suite checkout in a directory alongside the llvm > compiler checkout, but, when I ran "make" from llvm-suite, it > complained > about there not being a Makefile two levels above it, so I moved > llvm-suite into the "test" subdirectory inside the llvm compiler > checkout. I ran "make" again, but I got this: > > [simmon12 at maute llvm-suite]$ make > make[1]: Entering directory > `/home/vadve/simmon12/llvm/llvm/test/llvm-suite/SingleSource' > make[2]: Entering directory > `/home/vadve/simmon12/llvm/llvm/test/llvm-suite/SingleSource/ > UnitTests' > make[3]: Entering directory > `/home/vadve/simmon12/llvm/llvm/test/llvm-suite/SingleSource/ > UnitTests/Vector' > make[4]: Entering directory > `/home/vadve/simmon12/llvm/llvm/test/llvm-suite/SingleSource/ > UnitTests/Vector/SSE' > make[4]: *** No rule to make target `Output/sse.expandfft.linked.rbc', > needed by `Output/sse.expandfft.linked.bc'. Stop. 
> make[4]: Leaving directory > `/home/vadve/simmon12/llvm/llvm/test/llvm-suite/SingleSource/ > UnitTests/Vector/SSE' > make[3]: *** [all] Error 1 > make[3]: Leaving directory > `/home/vadve/simmon12/llvm/llvm/test/llvm-suite/SingleSource/ > UnitTests/Vector' > make[2]: *** [all] Error 1 > make[2]: Leaving directory > `/home/vadve/simmon12/llvm/llvm/test/llvm-suite/SingleSource/ > UnitTests' > make[1]: *** [UnitTests/.makeall] Error 2 > make[1]: Leaving directory > `/home/vadve/simmon12/llvm/llvm/test/llvm-suite/SingleSource' > make: *** [SingleSource/.makeall] Error 2 > > I already ran "./configure" inside llvm-suite, and I built a release > build of the LLVM tools in the compiler checkout, so I'm not sure what > the problem could be. Has anyone else had this problem and perhaps > solved it? > > --Patrick > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From jonas.maebe at elis.ugent.be Sun Feb 8 04:59:15 2009 From: jonas.maebe at elis.ugent.be (Jonas Maebe) Date: Sun, 8 Feb 2009 11:59:15 +0100 Subject: [LLVMdev] overflow + saturation stuff In-Reply-To: References: Message-ID: <42F7A90C-63AA-4D99-A39F-6298FE568840@elis.ugent.be> On 07 Feb 2009, at 23:17, Chris Lattner wrote: > Edwin was asking about how we should handle PR3328, how we should make > GEP respect -fwrapv etc. I wrote up some thoughts here if anyone is > interested: > http://nondot.org/sabre/LLVMNotes/IntegerOverflow.txt The proposal suggests to change/split the existing sub/add/mul opcodes. This makes me wonder to what extent it is (currently, or ever) advisable for an external compiler to generate LLVM IR. Is there a plan to stabilise at some point and guarantee backwards compatibility to a certain extent, or should compilers that are not integrated in the LLVM infrastructure always target one particular release of LLVM? 
Jonas

From kasra_n500 at yahoo.com Sun Feb 8 06:49:42 2009
From: kasra_n500 at yahoo.com (Kasra)
Date: Sun, 8 Feb 2009 04:49:42 -0800 (PST)
Subject: [LLVMdev] rol/ror llvm instruction set
In-Reply-To: <498D12DC.80100@mxc.ca>
Message-ID: <956030.79366.qm@web110011.mail.gq1.yahoo.com>

Thanks Nick for the compile. I think the case for rol/ror is closed, since
LLVM optimises the code into rotations anyway.

-- Kasra

From gordonhenriksen at me.com Sun Feb 8 07:41:23 2009
From: gordonhenriksen at me.com (Gordon Henriksen)
Date: Sun, 08 Feb 2009 08:41:23 -0500
Subject: [LLVMdev] overflow + saturation stuff
In-Reply-To: <42F7A90C-63AA-4D99-A39F-6298FE568840@elis.ugent.be>
References: <42F7A90C-63AA-4D99-A39F-6298FE568840@elis.ugent.be>
Message-ID: <5EE27981-2BF1-4CDB-85E7-9BFBBE553806@me.com>

On 2009-02-08, at 05:59, Jonas Maebe wrote:

> The proposal suggests to change/split the existing sub/add/mul
> opcodes. This makes me wonder to what extent it is (currently, or
> ever) advisable for an external compiler to generate LLVM IR. Is
> there a plan to stabilise at some point and guarantee backwards
> compatibility to a certain extent, or should compilers that are not
> integrated in the LLVM infrastructure always target one particular
> release of LLVM?

LLVM does guarantee backwards compatibility with compiled bitcode. The
C++ interfaces are not frozen, so you may need to upgrade code
targeting LLVM when upgrading; reasonable efforts are made to avoid
making this process painful. Of course, what code is contributed to
the project will be maintained through these changes.
Gordon

From jonas.maebe at elis.ugent.be Sun Feb 8 09:11:56 2009
From: jonas.maebe at elis.ugent.be (Jonas Maebe)
Date: Sun, 8 Feb 2009 16:11:56 +0100
Subject: [LLVMdev] overflow + saturation stuff
In-Reply-To: <5EE27981-2BF1-4CDB-85E7-9BFBBE553806@me.com>
References: <42F7A90C-63AA-4D99-A39F-6298FE568840@elis.ugent.be> <5EE27981-2BF1-4CDB-85E7-9BFBBE553806@me.com>
Message-ID: <8B727FAB-FB2B-4E1C-9FAC-99E71EBB2E1E@elis.ugent.be>

On 08 Feb 2009, at 14:41, Gordon Henriksen wrote:

> On 2009-02-08, at 05:59, Jonas Maebe wrote:
>
>> The proposal suggests to change/split the existing sub/add/mul
>> opcodes. This makes me wonder to what extent it is (currently, or
>> ever) advisable for an external compiler to generate LLVM IR. Is
>> there a plan to stabilise at some point and guarantee backwards
>> compatibility to a certain extent, or should compilers that are not
>> integrated in the LLVM infrastructure always target one particular
>> release of LLVM?
>
> LLVM does guarantee backwards compatibility with compiled bitcode. The
> C++ interfaces are not frozen, so you may need to upgrade code
> targeting LLVM when upgrading; reasonable efforts are made to avoid
> making this process painful.

Sorry for being unclear: I meant neither the C++ interface nor compiled
bitcode, but the LLVM IR "assembler" interface (i.e., the .s files
that llvm-gcc generates with "-emit-llvm -S").

> Of course, what code is contributed to
> the project will be maintained through these changes.

In our case, I doubt that would happen since it's a self-hosting
Pascal compiler (with several other code generators besides the
under-development LLVM backend).

But basically, if I understand you correctly: the correct interface
would be compiled bitcode rather than the "assembler" level interface?

Jonas

From gohman at apple.com Sun Feb 8 10:58:37 2009
From: gohman at apple.com (Dan Gohman)
Date: Sun, 8 Feb 2009 08:58:37 -0800
Subject: [LLVMdev] overflow + saturation stuff
In-Reply-To:
References:
Message-ID:

Hi Chris,

Would it be better to split add into multiple opcodes instead of using
SubclassData bits? Compare this:

   switch (I->getOpcode()) {
   case Instruction::Add: {
     switch (cast<AddInstruction>(I)->getOverflowBehavior()) {
     case AddInstruction::Wrapping:
       // ...
     case AddInstruction::UndefinedSigned:
       // ...
     case AddInstruction::UndefinedUnsigned:
       // ...
     }
   }
   }

with this:

   switch (I->getOpcode()) {
   case Instruction::Add:
     // ...
   case Instruction::SAdd_Open:
     // ...
   case Instruction::UAdd_Open:
     // ...
     break;
   }

I'm not sure about the name "Open"; fixed-size integers are "closed"
under wrapping and saturating add, so "open" sort of suggests an
alternative, and is concise.

But regardless, a one-level switch seems more convenient than a
two-level one. It's a little less convenient in the case of code that
wants to handle all the flavors of add the same way, but it still
seems worth it.

Encoding might be a concern, as Sub, Mul, Div, and Rem would all have
variants, but there are plenty of bits in SubclassID, and it doesn't
look like the bitcode representation uses packed opcode fields.

Dan

On Feb 7, 2009, at 2:17 PM, Chris Lattner wrote:

> Edwin was asking about how we should handle PR3328, how we should make
> GEP respect -fwrapv etc.
I wrote up some thoughts here if anyone is > interested: > http://nondot.org/sabre/LLVMNotes/IntegerOverflow.txt > > -Chris > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From nicholas at mxc.ca Sun Feb 8 11:07:29 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Sun, 08 Feb 2009 09:07:29 -0800 Subject: [LLVMdev] overflow + saturation stuff In-Reply-To: <8B727FAB-FB2B-4E1C-9FAC-99E71EBB2E1E@elis.ugent.be> References: <42F7A90C-63AA-4D99-A39F-6298FE568840@elis.ugent.be> <5EE27981-2BF1-4CDB-85E7-9BFBBE553806@me.com> <8B727FAB-FB2B-4E1C-9FAC-99E71EBB2E1E@elis.ugent.be> Message-ID: <498F1151.6070108@mxc.ca> Jonas Maebe wrote: > On 08 Feb 2009, at 14:41, Gordon Henriksen wrote: > >> On 2009-02-08, at 05:59, Jonas Maebe wrote: >> >>> The proposal suggests to change/split the existing sub/add/mul >>> opcodes. This makes me wonder to what extent it is (currently, or >>> ever) advisable for an external compiler to generate LLVM IR. Is >>> there a plan to stabilise at some point and guarantee backwards >>> compatibility to a certain extent, or should compilers that are not >>> integrated in the LLVM infrastructure always target one particular >>> release of LLVM? >> LLVM does guarantee backwards compatibility with compiled bitcode. The >> C++ interfaces are not frozen, so you may need to upgrade code >> targeting LLVM when upgrading; reasonable efforts are made to avoid >> making this process painful. > > Sorry for being unclear: I did not mean the C++ interface nor compiled > bitcode, but the LLVM IR "assembler" interface (i.e., the .s files > that llvm-gcc generats with "-emit-llvm -S"). > >> Of course, what code is contributed to >> the project will be maintained through these changes. 
> > In our case, I doubt that would happen since it's a self-hosting > Pascal compiler (with several other code generators besides the under- > development LLVM backend). > > But basically, if I understand you correctly: the correct interface > would be compiled bitcode rather than the "assembler" level interface? The textual IR (generally .ll files, not .s) are run through an auto-upgrader, the same as bitcode. Once we reach LLVM 3.0, we may break support for 2.x series .ll and .bc files. Of course, we might get to LLVM 3.0 before implementing this feature. What really happens is that once we've changed the .ll/.bc format enough that backwards compatibility is difficult to maintain, we'll declare LLVM 3.0. Nick > > Jonas > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From simmon12 at illinois.edu Sun Feb 8 13:20:53 2009 From: simmon12 at illinois.edu (Patrick Simmons) Date: Sun, 08 Feb 2009 13:20:53 -0600 Subject: [LLVMdev] Problem Running llvm-suite In-Reply-To: <5D0D6E75-DB30-4646-8E3B-021C72C2A939@apple.com> References: <498D137F.7050806@illinois.edu> <5D0D6E75-DB30-4646-8E3B-021C72C2A939@apple.com> Message-ID: <498F3095.3080402@illinois.edu> Dale Johannesen wrote: > On Feb 6, 2009, at 8:52 PM, Patrick Simmons wrote: > > >> Hi, >> >> I'm trying to run the tests in llvm-suite, but I've run into trouble. >> First, I had the llvm-suite checkout in a directory alongside the llvm >> compiler checkout, but, when I ran "make" from llvm-suite, it >> complained >> about there not being a Makefile two levels above it, so I moved >> llvm-suite into the "test" subdirectory inside the llvm compiler >> checkout. I ran "make" again, but I got this: >> > > You need to put it under the "projects" subdirectory. 
Read this: > http://llvm.org/docs/TestingGuide.html > > It is perhaps a little confusing; looks like you stopped reading after > the first "test suite" section. Keep going. > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > Thanks, Dale. I did what you said, and I can run the test suite now. Unfortunately, the tests all fail, with cc1 complaining about unknown debugging options being passed to it. The two possible causes I can think of for this are that I passed "--enable-optimized" when configuring LLVM, and that I used the GCC frontend located in /home/vadve/shared/llvm-gcc-4.2 instead of compiling the GCC frontend myself (I added the bin directory to my PATH). Do you think either of these things could be responsible? Thanks again, --Patrick -- If I'm not here, I've gone out to find myself. If I get back before I return, please keep me here. From simmon12 at illinois.edu Sun Feb 8 13:23:25 2009 From: simmon12 at illinois.edu (Patrick Simmons) Date: Sun, 08 Feb 2009 13:23:25 -0600 Subject: [LLVMdev] Problem Running llvm-suite In-Reply-To: <498F3095.3080402@illinois.edu> References: <498D137F.7050806@illinois.edu> <5D0D6E75-DB30-4646-8E3B-021C72C2A939@apple.com> <498F3095.3080402@illinois.edu> Message-ID: <498F312D.3020505@illinois.edu> Patrick Simmons wrote: > Dale Johannesen wrote: >> On Feb 6, 2009, at 8:52 PM, Patrick Simmons wrote: >> >> >>> Hi, >>> >>> I'm trying to run the tests in llvm-suite, but I've run into trouble. >>> First, I had the llvm-suite checkout in a directory alongside the llvm >>> compiler checkout, but, when I ran "make" from llvm-suite, it >>> complained >>> about there not being a Makefile two levels above it, so I moved >>> llvm-suite into the "test" subdirectory inside the llvm compiler >>> checkout. I ran "make" again, but I got this: >>> >> >> You need to put it under the "projects" subdirectory. 
Read this: >> http://llvm.org/docs/TestingGuide.html >> >> It is perhaps a little confusing; looks like you stopped reading >> after the first "test suite" section. Keep going. >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> > Thanks, Dale. I did what you said, and I can run the test suite now. > Unfortunately, the tests all fail, with cc1 complaining about unknown > debugging options being passed to it. The two possible causes I can > think of for this are that I passed "--enable-optimized" when > configuring LLVM, and that I used the GCC frontend located in > /home/vadve/shared/llvm-gcc-4.2 instead of compiling the GCC frontend > myself (I added the bin directory to my PATH). Do you think either of > these things could be responsible? > > Thanks again, > --Patrick > That should be /home/vadve/shared/llvm-gcc4.2, sorry for the typo. It was correct in my PATH. --Patrick -- If I'm not here, I've gone out to find myself. If I get back before I return, please keep me here. From clattner at apple.com Sun Feb 8 13:25:43 2009 From: clattner at apple.com (Chris Lattner) Date: Sun, 8 Feb 2009 11:25:43 -0800 Subject: [LLVMdev] overflow + saturation stuff In-Reply-To: References: Message-ID: <9A95EE23-B36F-4B62-A6F3-78C410BD0992@apple.com> On Feb 7, 2009, at 7:53 PM, John Regehr wrote: > Sounds ambitious! A comprehensive, efficient trapv would be > excellent. > gcc's implementation seems quite incomplete, for example it fails to > trap > overflows in the constant folder. GCC's implementation has a huge number of problems, and I really don't think that implementing trapv in llvm-gcc would fare much better (fold mangles trees severely). Clang preserves and hands full unmangled source-level ASTs to codegen, so codegen could handle this properly. That said, I don't know of anyone interested in implementing this in the short term. 
-Chris

From clattner at apple.com Sun Feb 8 13:33:39 2009
From: clattner at apple.com (Chris Lattner)
Date: Sun, 8 Feb 2009 11:33:39 -0800
Subject: [LLVMdev] overflow + saturation stuff
In-Reply-To:
References:
Message-ID: <8558F897-EDC6-46E7-8F08-C38C6FB7AFC7@apple.com>

On Feb 8, 2009, at 8:58 AM, Dan Gohman wrote:

> Hi Chris,
>
> Would it be better to split add into multiple opcodes instead of using
> SubclassData bits?

No, I don't think so. The big difference here is that (like type)
"opcode" never changes for an instruction once it is created. I
expect that optimizations would want to play with these (e.g. convert
them to 'undefined' when it can prove overflow never happens) so I
think it is nice to not have it be in the opcode field.

This also interacts with FP rounding mode stuff, which I expect to
handle the same way with FP operations some day.

> Compare this:
>
>   switch (I->getOpcode()) {
>   case Instruction::Add: {
>     switch (cast<AddInstruction>(I)->getOverflowBehavior()) {
>     case AddInstruction::Wrapping:
>       // ...
>     case AddInstruction::UndefinedSigned:
>       // ...
>     case AddInstruction::UndefinedUnsigned:

Sure, that is ugly. However, I think it would be much more common to
look at these in "isa" flavored tests than switches:

   if (isa(X))

is much nicer than:

   if (BinaryOperator *BO = dyn_cast<BinaryOperator>(x))
     if (BO->getOpcode() == blah::Add &&
         BO->getOverflow() == blah::Undefined)

However, we a) already suffer this just for Add, because we don't have
an AddInst class, and b) don't care about the opcode anyway.
IntrinsicInst is a good example of how we don't actually need opcode
bits or concrete classes to make isa "work".

It would be a nice cleanup to add new "pseudo instruction" classes
like IntrinsicInst for all the arithmetic anyway.

If the switch case really does become important, we can just add a
getOpcodeWithSubtypes() method that returns a new flattened enum.
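The flattened-enum idea in the last paragraph could look roughly like the sketch below. It is only an illustration under stated assumptions: `Inst`, `OverflowKind`, `FlatOpcode` and the enumerator names are hypothetical stand-ins invented here (loosely echoing the `SAdd_Open`/`UAdd_Open` names from the earlier message), not LLVM's real classes, and `getOpcodeWithSubtypes()` is the suggested method, not an existing API.

```cpp
// Hypothetical stand-ins for illustration; these are not LLVM's real classes.
enum class Opcode { Add, Sub };
enum class OverflowKind { Wrapping, UndefinedSigned, UndefinedUnsigned };

struct Inst {
  Opcode Op;             // fixed once the instruction is created
  OverflowKind Overflow; // subtype bits; an optimization may refine these
};

// One flattened enumerator per (opcode, overflow kind) pair, laid out in
// the same order as the two enums above.
enum class FlatOpcode {
  Add, SAddOpen, UAddOpen,
  Sub, SSubOpen, USubOpen
};

constexpr int NumOverflowKinds = 3;

// Combine the immutable opcode with the mutable subtype into one value.
FlatOpcode getOpcodeWithSubtypes(const Inst &I) {
  return static_cast<FlatOpcode>(static_cast<int>(I.Op) * NumOverflowKinds +
                                 static_cast<int>(I.Overflow));
}

// A one-level switch over the flattened value.
const char *describe(const Inst &I) {
  switch (getOpcodeWithSubtypes(I)) {
  case FlatOpcode::Add:      return "add (wrapping)";
  case FlatOpcode::SAddOpen: return "add (signed overflow undefined)";
  case FlatOpcode::UAddOpen: return "add (unsigned overflow undefined)";
  case FlatOpcode::Sub:      return "sub (wrapping)";
  case FlatOpcode::SSubOpen: return "sub (signed overflow undefined)";
  case FlatOpcode::USubOpen: return "sub (unsigned overflow undefined)";
  }
  return "unknown";
}
```

The point of the layout is that the opcode field itself stays constant while the subtype can be rewritten in place, yet code that prefers a single big switch still gets one.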
>
-Chris

From schlie at comcast.net Sun Feb 8 19:54:58 2009
From: schlie at comcast.net (Paul Schlie)
Date: Sun, 08 Feb 2009 20:54:58 -0500
Subject: [LLVMdev] overflow + saturation stuff
Message-ID:

Are overflow behavior tags meant to enable the specification of a
particular instruction's required or presumed overflow behavior?

If a required overflow behavior, then it follows that the target must
correspondingly implement the behavior, either natively or emulated?

If a presumed overflow behavior, is the target meant to preferably
implement or emulate the same; or is it merely meant to enable
optimizations which may or may not be representative of the code's
target mapped behavior?

Regardless, if the target is potentially meant to implement the
behavior, it follows that LLVM's assembly level representation must be
able to discriminate between operations having differing semantics
specified?

For example, processors like TI's C6X family DSPs support both
saturating and 2's-comp operations; and although likely less frequently
required, and typically not directly supported by any HLL languages,
being able to specify target neutral in-line assembly 1's-comp
end-around-carry operations can be helpful on occasion, so would be
nice to see as well.

From schlie at comcast.net Sun Feb 8 20:16:18 2009
From: schlie at comcast.net (Paul Schlie)
Date: Sun, 08 Feb 2009 21:16:18 -0500
Subject: [LLVMdev] overflow + saturation stuff
Message-ID:

Further, with respect to the proposed rotate operations: as with the
carry semantics of add etc., all forms of shift/rotate may be specified
with a single shift instruction with a tag specifying the source of the
in-shift bits (being that shifted out, dup, or 0), if an operation's
tag is meant to affect the semantics of the specified operation.
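To make the "single shift instruction with a tag" suggestion concrete, here is a minimal sketch for a 32-bit right shift. Everything named here (`ShiftFill`, `shiftRight`) is hypothetical and invented for illustration, and reading "dup" as "duplicate the top bit" is an assumption about the intended meaning; none of this is part of the proposal or of LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tag naming where the bits shifted in come from.
enum class ShiftFill {
  Zero,      // shift in zeros (logical shift)
  Dup,       // assumed meaning: duplicate the top bit (arithmetic shift)
  ShiftedOut // shift in the bits that were shifted out (rotate)
};

// Right shift of a 32-bit value by n, with 0 <= n < 32.
std::uint32_t shiftRight(std::uint32_t x, unsigned n, ShiftFill fill) {
  assert(n < 32 && "shift amount must be below the bit width");
  if (n == 0)
    return x;
  const std::uint32_t logical = x >> n;
  switch (fill) {
  case ShiftFill::Zero:
    return logical;
  case ShiftFill::Dup: {
    // Mask covering the n vacated high bits, set only if the top bit was set.
    const std::uint32_t high = std::uint32_t{0xFFFFFFFFu} << (32u - n);
    return (x & 0x80000000u) ? (logical | high) : logical;
  }
  case ShiftFill::ShiftedOut:
    // The low n bits of x reappear at the top.
    return logical | (x << (32u - n));
  }
  return logical;
}
```

With `x = 0x00000001` and `n = 1`, the `Zero` and `Dup` fills both yield `0`, while `ShiftedOut` yields `0x80000000`; with `x = 0x80000002` only `Dup` sets the top bit of the result.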
From clattner at apple.com Mon Feb 9 01:04:27 2009
From: clattner at apple.com (Chris Lattner)
Date: Sun, 8 Feb 2009 23:04:27 -0800
Subject: [LLVMdev] overflow + saturation stuff
In-Reply-To:
References:
Message-ID: <9DADE1D8-2482-49CD-96F3-A01807D153E3@apple.com>

On Feb 8, 2009, at 5:54 PM, Paul Schlie wrote:

> Are overflow behavior tags meant to enable the specification of a
> particular instruction's required or presumed overflow behavior?

I'm not sure what you mean. The overflow tags specify what happens if
overflow happens (defined wrapping, defined saturating, or undefined
behavior), not *when* overflow happens.

> If a required overflow behavior, then it follows that the target must
> correspondingly implement the behavior; either natively or emulated?

Yes, if a target doesn't support saturation, it must emulate it. This
is the same as targets that don't support rem natively (e.g. ppc).

> If a presumed overflow behavior, is the target meant to preferably
> implement or emulate the same; or is it merely meant to enable
> optimizations which may or may not be representative of the code's
> target mapped behavior?
> Regardless, if the target is potentially meant to implement the
> behavior; it follows that LLVM's assembly level representation must
> be able to discriminate between operations having differing semantics
> specified?

I don't understand what you mean.

>
> For example processors like TI's C6X family DSP's support both
> saturating and 2's-comp operations; and although likely less
> frequently required, and typically not directly supported by any HLL
> languages, being able to specify target neutral in-line assembly
> 1's-comp end-around-carry operations can be helpful on occasion, so
> would be nice to see as well.

I'm trying to increase the scope of what llvm can reason about and
solve some specific problems, not solve every theoretical problem.
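As an illustration of what "emulate it" can mean for a signed 32-bit add, here is a minimal sketch. The names `Overflow` and `addSigned` are hypothetical, invented for this example; this is not how LLVM lowers anything, just the arithmetic a target without native saturation would have to reproduce.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical tag: what a signed add yields when the exact sum does not fit.
enum class Overflow { Wrapping, Saturating };

std::int32_t addSigned(std::int32_t a, std::int32_t b, Overflow mode) {
  // Unsigned addition wraps modulo 2^32 and is well defined in C++.
  const std::uint32_t wrapped =
      static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b);
  // Converting back is implementation-defined before C++20; mainstream
  // compilers give the two's complement result.
  const std::int32_t result = static_cast<std::int32_t>(wrapped);

  // Signed overflow happened iff both operands have the same sign and the
  // wrapped result has the opposite sign.
  const bool overflowed =
      ((a >= 0) == (b >= 0)) && ((result >= 0) != (a >= 0));

  if (mode == Overflow::Saturating && overflowed)
    return a >= 0 ? std::numeric_limits<std::int32_t>::max()
                  : std::numeric_limits<std::int32_t>::min();
  return result;
}
```

In the wrapping mode the function returns the modular result unchanged; in the saturating mode it clamps to the nearest representable bound. The "undefined" flavor has no counterpart here, since it only licenses optimizations and prescribes no result.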
-Chris

From Christian.Sayer at dibcom.fr Mon Feb 9 03:51:40 2009
From: Christian.Sayer at dibcom.fr (Christian Sayer)
Date: Mon, 9 Feb 2009 10:51:40 +0100
Subject: [LLVMdev] list-td scheduler asserts on targets with implicitly defined registers
In-Reply-To:
References: <57C38DA176A0A34A9B9F3CCCE33D3C4AEF27C76620@FRPAR1CL009.coe.adi.dibcom.com>
Message-ID: <57C38DA176A0A34A9B9F3CCCE33D3C4AEF27C76624@FRPAR1CL009.coe.adi.dibcom.com>

> The best fix is to teach this scheduler how to deal with these
> dependencies. :-)
>
> If you just want a check, I think it's easier to just check register
> class's copy cost. -1 means it's extremely expensive to copy registers
> in the particular register class.

Evan,
I am not sure what you mean by "if you just want a check" - I was
trying to point out that, for example, the following very simple
testcase makes llc -march=x86 -pre-RA-sched=list-td crash:

define i32 @cctest(i32 %a, i32 %b) nounwind readnone {
entry:
        %not. = icmp sge i32 %a, %b             ; [#uses=1]
        %.0 = zext i1 %not. to i32              ; [#uses=1]
        ret i32 %.0
}

The assert() which triggers has been introduced since llvm 2.4.
So I assume that (at least for the time being) the assert condition
has to be made less restrictive. I also assume that this is ok because
the dependency is rather a 'flag' dependency than one to be resolved
by the scheduler.
So the question would be if these assumptions are correct and if it is
safe to exclude implicitly defined physreg dependencies from the
assert the way I proposed. Furthermore, I was thinking that the
exception should not apply to allocatable registers, since the
scheduler probably really cannot handle such. When I realized that X86
EFLAGS is actually allocatable, I was wondering why, but there is
certainly a reason...

Best regards,
Christian

--

please ignore:

CONFIDENTIAL NOTICE: The contents of this message, including any
attachments, are confidential and are intended solely for the use of
the person or entity to whom the message was addressed.
If you are not the intended recipient of this message, please be advised that any dissemination, distribution, or use of the contents of this message is strictly prohibited. If you received this message in error, please notify the sender. Please also permanently delete all copies of the original message and any attached documentation. Thank you. From gohman at apple.com Mon Feb 9 10:20:48 2009 From: gohman at apple.com (Dan Gohman) Date: Mon, 9 Feb 2009 08:20:48 -0800 Subject: [LLVMdev] overflow + saturation stuff In-Reply-To: <8558F897-EDC6-46E7-8F08-C38C6FB7AFC7@apple.com> References: <8558F897-EDC6-46E7-8F08-C38C6FB7AFC7@apple.com> Message-ID: <6AB67E8F-708C-463D-9668-AE0069875FEB@apple.com> On Feb 8, 2009, at 11:33 AM, Chris Lattner wrote: > > On Feb 8, 2009, at 8:58 AM, Dan Gohman wrote: > >> Hi Chris, >> >> Would it be better to split add into multiple opcodes instead of >> using >> SubclassData bits? > > No, I don't think so. The big difference here is that (like type) > "opcode" never changes for an instruction once it is created. I > expect that optimizations would want to play with these (e.g. convert > them to 'undefined' when it can prove overflow never happens) so I > think it is nice to not have it be in the opcode field. Why is this? If SubclassData can be modified, why not SubclassID too? Having it const may help guard against something accidentally changing it to an opcode that would require a different subclass, but it's a private member, so modifications to it could be fairly effectively controlled. I agree that isa/dyn_cast can be quite flexible, but they can handle ranges of opcodes just as well as they can handle opcodes composed from multiple fields. The big-switch idiom is a staple of compiler construction; it would be nice to be able to continue to use it directly. 
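The remark that isa-style tests can handle ranges of opcodes can be sketched as follows. This assumes a hypothetical opcode layout in which the add flavors are contiguous, and uses small stand-in types written for this example (`Instruction`, `AnyAdd`, and a local `isa` template) rather than LLVM's own headers; only the `classof` convention is borrowed from LLVM.

```cpp
// Hypothetical opcode layout with the add flavors kept contiguous.
enum Opcode { Add, SAdd_Open, UAdd_Open, Sub, SSub_Open, USub_Open };

// Minimal stand-in for an instruction; not LLVM's Instruction class.
struct Instruction {
  Opcode Op;
  Opcode getOpcode() const { return Op; }
};

// "Pseudo instruction" class: matches every flavor of add via a range check.
struct AnyAdd {
  static bool classof(const Instruction *I) {
    return I->getOpcode() >= Add && I->getOpcode() <= UAdd_Open;
  }
};

// Simplified isa<> that defers to the target class's classof().
template <typename To> bool isa(const Instruction *I) {
  return To::classof(I);
}
```

Code that wants to treat all adds alike can then write `isa<AnyAdd>(I)`, while a big switch over `getOpcode()` still sees each flavor as its own case.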
Dan

From echeng at apple.com Mon Feb 9 10:54:53 2009
From: echeng at apple.com (Evan Cheng)
Date: Mon, 9 Feb 2009 08:54:53 -0800
Subject: [LLVMdev] Using CallingConvLower in ARM target
In-Reply-To: <305d6f60902061802q65f813e7p1c75f1fa5a185d32@mail.gmail.com>
References: <305d6f60812270430xdf1ebb9gf6d99f94215ab66b@mail.gmail.com> <305d6f60901161726l5c93c9dag6f98ca06a420cb31@mail.gmail.com> <9F8572A5-4F58-492D-A61B-638FC61D42B4@apple.com> <305d6f60902061802q65f813e7p1c75f1fa5a185d32@mail.gmail.com>
Message-ID:

Thanks Sandeep. I did a quick scan, this looks really good. But I do
have a question:

+/// CCCustomFn - This function assigns a location for Val, possibly updating
+/// all args to reflect changes and indicates if it handled it. It must set
+/// isCustom if it handles the arg and returns true.
+typedef bool CCCustomFn(unsigned &ValNo, MVT &ValVT,
+                        MVT &LocVT, CCValAssign::LocInfo &LocInfo,
+                        ISD::ArgFlagsTy &ArgFlags, CCState &State,
+                        bool &result);

Is it necessary to return two bools (the second is returned by
reference in 'result')? I am confused about the semantics of 'result'.

Also, a nitpick:

+  unsigned i;
+  for (i = 0; i < 4; ++i)

The convention we use is:

+  for (unsigned i = 0; i < 4; ++i)

Thanks,

Evan

On Feb 6, 2009, at 6:02 PM, Sandeep Patel wrote:

> I think I've got all the cases handled now, implementing with
> CCCustom<"foo"> callbacks into C++.
>
> This also fixes a crash when returning i128. I've also included a
> small asm constraint fix that was needed to build newlib.
>
> deep
>
> On Mon, Jan 19, 2009 at 10:18 AM, Evan Cheng
> wrote:
>>
>> On Jan 16, 2009, at 5:26 PM, Sandeep Patel wrote:
>>
>>> On Sat, Jan 3, 2009 at 11:46 AM, Dan Gohman
>>> wrote:
>>>>
>>>> One problem with this approach is that since i64 isn't legal, the
>>>> bitcast would require custom C++ code in the ARM target to
>>>> handle properly.
It might make sense to introduce something >>>> like >>>> >>>> CCIfType<[f64], CCCustom> >>>> >>>> where CCCustom is a new entity that tells the calling convention >>>> code to to let the target do something not easily representable >>>> in the tablegen minilanguage. >>> >>> I am thinking that this requires two changes: add a flag to >>> CCValAssign (take a bit from HTP) to indicate isCustom and a way to >>> author an arbitrary CCAction by including the source directly in the >>> TableGen mini-language. This latter change might want a generic >>> change >>> to the TableGen language. For example, the syntax might be like: >>> >>> class foo : CCCustomAction { >>> code <<< EOF >>> ....multi-line C++ code goes here that allocates regs & mem and >>> sets CCValAssign::isCustom.... >>> EOF >>> } >>> >>> Does this seem reasonable? An alternative is for CCCustom to take a >>> string that names a function to be called: >>> >>> CCIfType<[f64], CCCustom<"MyCustomLoweringFunc">> >>> >>> the function signature for such functions will have to return two >>> results: if the CC processing is finished and if it the func >>> succeeded >>> or failed: >> >> I like the second solution better. It seems rather cumbersome to >> embed >> multi-line c++ code in td files. 
>>
>> Evan
>>>
>>>
>>> typedef bool CCCustomFn(unsigned ValNo, MVT ValVT,
>>>                         MVT LocVT, CCValAssign::LocInfo LocInfo,
>>>                         ISD::ArgFlagsTy ArgFlags, CCState &State,
>>>                         bool &result);

From schlie at comcast.net Mon Feb 9 11:18:52 2009
From: schlie at comcast.net (Paul Schlie)
Date: Mon, 09 Feb 2009 12:18:52 -0500
Subject: [LLVMdev] overflow + saturation stuff
Message-ID:

Chris Lattner wrote:
> On Feb 8, 2009, at 5:54 PM, Paul Schlie wrote:
>> Are overflow behavior tags meant to enable the specification of a
>> particular instruction's required or presumed overflow behavior?
>
> I'm not sure what you mean. The overflow tags specify what happens if
> overflow happens (defined wrapping, defined saturating, or undefined
> behavior), not *when* overflow happens.

- Is undefined behavior meant to imply that, if such a condition were
to occur, the undefined behavior is guaranteed not to be expressed
because it will be trapped? Or is it merely assumed that it won't
occur, so that code may be optimized based on this assumption,
regardless of the behavior which may actually be expressed in the
absence of optimization if and when the condition occurred (so that
optimizations may legitimately alter logical program behavior if it is
sensitive to an otherwise expressible undefined behavior), making it
truly just an optimization tag?
>> If a required overflow behavior, then it follows that the target must >> correspondingly implement the behavior; either natively or emulated? > > yes, if a target doesn't support saturation, it must emulate it. This > is the same as targets that doesn't support rem natively (e.g. ppc). - Thanks partially understood; as above, the tags seem to have multiple intended purposes; on one hand to be used by optimizers but not affect the instruction selection process; on the other hand must affect the selection process as below? >> If a presumed overflow behavior, is the target meant to preferably >> implement or emulate the same; or is it merely meant to enable >> optimizations which may or may not be representative of the code's >> target mapped behavior? >> >> Regardless, if the target is potentially meant to implement the >> behavior; it follows that LLVM's assembly level representation must >> be able to discriminate between operations having differing semantics >> specified? > > I don't understand what you mean. - Sorry, merely meant: if an instruction's overflow behavior tag is meant to affect target instruction selection semantics, it would seem necessary to be selectable at the llvm assembly code level (i.e. how does one specify a saturating addition vs. 2's-comp addition instruction semantics at the llvm assembly code level of representation)? >> For example processors like TI's C6X family DSP's support both >> saturating and 2's-comp operations; and although likely less frequently >> required, and typically not directly supported by any HLL languages, >> being able to specify target neutral in-line assembly 1's-comp >> end-around-carry operations can be helpful on occasion, so would be >> nice to see as well. > > I'm trying to increase the scope of what llvm can reason about and > solve some specific problems, not solve every theoretical problem. 
From echeng at apple.com Mon Feb 9 11:23:29 2009 From: echeng at apple.com (Evan Cheng) Date: Mon, 9 Feb 2009 09:23:29 -0800 Subject: [LLVMdev] list-td scheduler asserts on targets with implicitly defined registers In-Reply-To: <57C38DA176A0A34A9B9F3CCCE33D3C4AEF27C76624@FRPAR1CL009.coe.adi.dibcom.com> References: <57C38DA176A0A34A9B9F3CCCE33D3C4AEF27C76620@FRPAR1CL009.coe.adi.dibcom.com> <57C38DA176A0A34A9B9F3CCCE33D3C4AEF27C76624@FRPAR1CL009.coe.adi.dibcom.com> Message-ID: <87FB04BE-FF76-4204-AC44-F3D250781387@apple.com> On Feb 9, 2009, at 1:51 AM, Christian Sayer wrote: >> The best fix is to teach this scheduler how to deal with these >> dependencies. :-) >> >> If you just want a check, I think it's easier to just check register >> class's copy cost. -1 means it's extremely expensive to copy >> registers >> in the particular register class. > > Evan, > I am not sure what you mean by "if you just want a check" - I was > trying to point out that for example the following very simple > testcase makes llc -march=x86 -pre-RA-sched=list-td crash : > > define i32 @cctest(i32 %a, i32 %b) nounwind readnone { > entry: > %not. = icmp sge i32 %a, %b ; [#uses=1] > %.0 = zext i1 %not. to i32 ; [#uses=1] > ret i32 %.0 > } > > The assert() which triggers has been introduced since llvm 2.4. > So I assume that (at least for the time being) the assert condition > has to be made less restrictive. I also assume that this is ok > because the dependency is rather a 'flag' dependency than one to be > resolved by the scheduler. > So the question would be if these assumptions are correct and if it > is safe to exclude implicitly defined physreg dependencies from the > assert the way I proposed. Furthermore, I was thinking that the > exception should not apply on allocatable registers, since the > scheduler propably really cannot handle such. When I realized that > X86 EFLAGS is actually allocatable, I was wondering why, but there > is certainly a reason... 
I was saying that instead of checking whether the physical register is allocatable, you might want to check the CopyCost of the register class (assuming it belongs to only one) is a negative value. That indicates it's extremely expensive to copy the physical register. I think marking EFLAGS non-allocatable is fine. But I don't think it's the right fix. For example, ESP is not allocatable but you can insert a pair of copies (ESP to vreg and then vreg to ESP) to handle the dependency. Evan > > > Best regards, > Christian > > -- > > > > > > > > please ignore: > > CONFIDENTIAL NOTICE: The contents of this message, including any > attachments, are confidential and are intended solely for the use of > the person or entity to whom the message was addressed. If you are > not the intended recipient of this message, please be advised that > any dissemination, distribution, or use of the contents of this > message is strictly prohibited. If you received this message in > error, please notify the sender. Please also permanently delete all > copies of the original message and any attached documentation. Thank > you. > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From gohman at apple.com Mon Feb 9 11:46:04 2009 From: gohman at apple.com (Dan Gohman) Date: Mon, 9 Feb 2009 09:46:04 -0800 Subject: [LLVMdev] list-td scheduler asserts on targets with implicitly defined registers In-Reply-To: <57C38DA176A0A34A9B9F3CCCE33D3C4AEF27C76624@FRPAR1CL009.coe.adi.dibcom.com> References: <57C38DA176A0A34A9B9F3CCCE33D3C4AEF27C76620@FRPAR1CL009.coe.adi.dibcom.com> <57C38DA176A0A34A9B9F3CCCE33D3C4AEF27C76624@FRPAR1CL009.coe.adi.dibcom.com> Message-ID: On Feb 9, 2009, at 1:51 AM, Christian Sayer wrote: >> The best fix is to teach this scheduler how to deal with these >> dependencies. 
:-) >> >> If you just want a check, I think it's easier to just check register >> class's copy cost. -1 means it's extremely expensive to copy >> registers >> in the particular register class. > > Evan, > I am not sure what you mean by "if you just want a check" - I was > trying to point out that for example the following very simple > testcase makes llc -march=x86 -pre-RA-sched=list-td crash : > > define i32 @cctest(i32 %a, i32 %b) nounwind readnone { > entry: > %not. = icmp sge i32 %a, %b ; [#uses=1] > %.0 = zext i1 %not. to i32 ; [#uses=1] > ret i32 %.0 > } > > The assert() which triggers has been introduced since llvm 2.4. > So I assume that (at least for the time being) the assert condition > has to be made less restrictive. The assertion changes potential miscompilations to compiler aborts. It triggers more often than strictly necessary; the above testcase would not have been miscompiled, but slightly more complicated ones would be. > I also assume that this is ok because the dependency is rather a > 'flag' dependency than one to be resolved by the scheduler. > So the question would be if these assumptions are correct and if it > is safe to exclude implicitly defined physreg dependencies from the > assert the way I proposed. Furthermore, I was thinking that the > exception should not apply on allocatable registers, since the > scheduler propably really cannot handle such. When I realized that > X86 EFLAGS is actually allocatable, I was wondering why, but there > is certainly a reason... The basic situation is A defines a value in register R that's read by B, and C clobbers R. Here, C can be scheduled before A and B, or after A and B, but not in between. This can happen with allocatable registers as well as non-allocatable ones. It also can't easily be modeled in a simple dependency graph. CodeGen used to handle this by always using "Flag" dependencies, which effectively require that nothing be scheduled between A and B. 
The Flag approach is simple, and it worked with all the schedulers; however,
it is over-restrictive. A while ago, the list-burr scheduler was enhanced to
support physical register dependencies, and some things were changed to make
use of this instead of the Flag mechanism. Eventually, more things will be
converted over. Patches to add physical register dependency tracking to the
other schedulers would be welcome.

Dan

From baldrick at free.fr Mon Feb 9 12:00:09 2009
From: baldrick at free.fr (Duncan Sands)
Date: Mon, 9 Feb 2009 19:00:09 +0100
Subject: [LLVMdev] 2.5 Pre-release1 available for testing
In-Reply-To:
References:
Message-ID: <200902091900.10593.baldrick@free.fr>

Hi Tanya,

I see the following warnings when building. I'm not sure how to fix any
of them. The last one looks like it might be serious (seems like a job
for Chris).

llvm[1]: Compiling Path.cpp for Release build
In file included from Path.cpp:270:
Unix/Path.inc: In member function 'bool llvm::sys::Path::eraseFromDisk(bool, std::string*) const':
Unix/Path.inc:661: warning: ignoring return value of 'int system(const char*)', declared with attribute warn_unused_result
llvm[1]: Compiling raw_ostream.cpp for Release build
raw_ostream.cpp: In member function 'virtual void llvm::raw_fd_ostream::flush_impl()':
raw_ostream.cpp:245: warning: ignoring return value of 'ssize_t write(int, const void*, size_t)', declared with attribute warn_unused_result
llvm[2]: Compiling LLParser.cpp for Release build
LLParser.cpp: In member function 'bool llvm::LLParser::ParseGlobal(const std::string&, const char*, unsigned int, bool, unsigned int)':
LLParser.cpp:448: warning: 'IsConstant' may be used uninitialized in this function

Ciao,

Duncan.
From jay.foad at gmail.com Mon Feb 9 12:34:43 2009 From: jay.foad at gmail.com (Jay Foad) Date: Mon, 9 Feb 2009 18:34:43 +0000 Subject: [LLVMdev] 2.5 Pre-release1 available for testing In-Reply-To: References: Message-ID: Hi, I've reported a couple of regressions from LLVM 2.4 on test cases from the GCC testsuite: undefined reference to extern inline function http://llvm.org/bugs/show_bug.cgi?id=3517 undefined reference to __compound_literal.* http://llvm.org/bugs/show_bug.cgi?id=3518 Thanks, Jay. From tonic at nondot.org Mon Feb 9 13:10:51 2009 From: tonic at nondot.org (Tanya M. Lattner) Date: Mon, 9 Feb 2009 11:10:51 -0800 (PST) Subject: [LLVMdev] LLVM Release Criteria Message-ID: Hello, I'd like to clarify a few points on how the release process works. Each release must satisfy the following criteria for each supported target: * LLVM-GCC & LLVM must build in both release and debug mode. They must also build srcDir != objDir. LLVM-GCC must build with support for c, c++, fortran, and obj-c/obj-c++ (Mac only). * LLVM-GCC must bootstrap. * "make check" (dejagnu test suite) must pass cleanly. * There are no regressions in llvm-test from the previous release (correctness only). Currently the supported targets are: Mac OS 10.5 x86, Mac OS 10.5 PPC, Linux x86, Mingw x86 (however, this is not subjected to the same release criteria as above). For 2.6, we hope to expand this list once we get the release team in place. Patches are accepted for the 2.5 branch but will only be merged in if they meet the following criteria: 1) During the pre-release1 testing phase, I accept patches that fix known regressions in the llvm-test and dejagnu test suite. Patches may also be accepted that fix regressions or bugs not in those two test suites if they are low risk or fix some extremely critical bug and are approved by the appropriate code owner. However, we try to be extremely conservative and do not accept all patches. Do not expect your patch to be accepted. 
2) After the pre-release1 testing phase ends, I begin creating the next release candidate. I can not accept any additional patches during this time. 3) During pre-release2 testing, only patches that fix regressions in llvm-test or the dejagnu test suite will be merged into the release. 4) If more patches are required to be merged, we have a 3rd round of pre-release testing. If you have any questions, please let me know. Thanks, Tanya From cmdkeen at gmx.de Mon Feb 9 07:44:43 2009 From: cmdkeen at gmx.de (Jan Rehders) Date: Mon, 09 Feb 2009 14:44:43 +0100 Subject: [LLVMdev] Building 64-bit libraries on OS X Message-ID: <20090209134443.192770@gmx.net> Hi, how do I compile LLVM for 64-bit on OS X? I want to get 64-bit libraries which generate x86_64 to link them into a 64-bit application. All my attempts ended up with either 32-bit libraries or errors. My machine is an Intel Xeon quad core, 'sysctl hw.cpu64bit_capable' returns 1 so I think the machine is fine. - './configure && make' yields 32-bit libraries and executables - I've tried various variations of ./configure --host X and ./configure --target X. I am not sure which target triple to use (the getting started guide says "The values of these options must be legal target triples that your GCC compiler supports."). Unfortunately I couldn't figure out which target triples to use. I've tried a few but until now I either get an error when calling configure or I simply end up with 32-bit libraries. For instance "i686_x86-apple-darwin9.0.0" results in "Invalid configuration `i686_x86-apple-darwin9.0.0': machine `i686_x86-apple' not recognized". "x86_64-apple-darwin9.0.0" configures and builds fine but all libraries are 32-bit ("file libLTO.dylib" says "libLTO.dylib: Mach-O dynamically linked shared library i386"). So, how can I build a 64-bit version for OS X? 
(Unfortunately prebuild executables are not an option for me) Jan From mrs at apple.com Mon Feb 9 14:20:35 2009 From: mrs at apple.com (Mike Stump) Date: Mon, 9 Feb 2009 12:20:35 -0800 Subject: [LLVMdev] Building 64-bit libraries on OS X In-Reply-To: <20090209134443.192770@gmx.net> References: <20090209134443.192770@gmx.net> Message-ID: On Feb 9, 2009, at 5:44 AM, Jan Rehders wrote: > - './configure && make' yields 32-bit libraries and executables Something like: CXX="g++ -m64" CC="gcc -m64" configure && make From dalej at apple.com Mon Feb 9 17:02:52 2009 From: dalej at apple.com (Dale Johannesen) Date: Mon, 9 Feb 2009 15:02:52 -0800 Subject: [LLVMdev] Problem Running llvm-suite In-Reply-To: <498F312D.3020505@illinois.edu> References: <498D137F.7050806@illinois.edu> <5D0D6E75-DB30-4646-8E3B-021C72C2A939@apple.com> <498F3095.3080402@illinois.edu> <498F312D.3020505@illinois.edu> Message-ID: On Feb 8, 2009, at 11:23 AMPST, Patrick Simmons wrote: >>> >>> You need to put it under the "projects" subdirectory. Read this: >>> http://llvm.org/docs/TestingGuide.html >>> >>> It is perhaps a little confusing; looks like you stopped reading >>> after the first "test suite" section. Keep going. >>> >>> _______________________________________________ >>> LLVM Developers mailing list >>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>> >> Thanks, Dale. I did what you said, and I can run the test suite now. >> Unfortunately, the tests all fail, with cc1 complaining about unknown >> debugging options being passed to it. The two possible causes I can >> think of for this are that I passed "--enable-optimized" when >> configuring LLVM, and that I used the GCC frontend located in >> /home/vadve/shared/llvm-gcc-4.2 instead of compiling the GCC frontend >> myself (I added the bin directory to my PATH). Do you think either >> of >> these things could be responsible? 
>> >> Thanks again, >> --Patrick >> > That should be /home/vadve/shared/llvm-gcc4.2, sorry for the typo. It > was correct in my PATH. I've never used that FE, I've always built my own. So it seems possible that's the problem. I don't personally use -enable-optimized much, but other people do; that seems unlikely. From wurstgebaeck at googlemail.com Mon Feb 9 18:16:18 2009 From: wurstgebaeck at googlemail.com (Jan Rehders) Date: Tue, 10 Feb 2009 01:16:18 +0100 Subject: [LLVMdev] Building 64-bit libraries on OS X Message-ID: <64EE98AC-C26D-46BC-BE80-42B7C8CD9BC2@gmail.com> Hi, how do I compile LLVM for 64-bit on OS X? I want to get 64-bit libraries which generate x86_64 to link them into a 64-bit application. All my attempts ended up with either 32-bit libraries or errors. My machine is an Intel Xeon quad core, 'sysctl hw.cpu64bit_capable' returns 1 so I think the machine is fine. - './configure && make' yields 32-bit libraries and executables - I've tried various variations of ./configure --host X and ./ configure --target X. I am not sure which target triple to use (the getting started guide says "The values of these options must be legal target triples that your GCC compiler supports."). Unfortunately I couldn't figure out which target triples to use. I've tried a few but until now I either get an error when calling configure or I simply end up with 32-bit libraries. For instance "i686_x86-apple-darwin9.0.0" results in "Invalid configuration `i686_x86-apple-darwin9.0.0': machine `i686_x86-apple' not recognized". "x86_64-apple-darwin9.0.0" configures and builds fine but all libraries are 32-bit ("file libLTO.dylib" says "libLTO.dylib: Mach-O dynamically linked shared library i386"). So, how can I build a 64-bit version for OS X? 
(Unfortunately prebuilt executables are not an option for me)

Jan

From dalej at apple.com Mon Feb 9 18:25:44 2009
From: dalej at apple.com (Dale Johannesen)
Date: Mon, 9 Feb 2009 16:25:44 -0800
Subject: [LLVMdev] Building 64-bit libraries on OS X
In-Reply-To: <64EE98AC-C26D-46BC-BE80-42B7C8CD9BC2@gmail.com>
References: <64EE98AC-C26D-46BC-BE80-42B7C8CD9BC2@gmail.com>
Message-ID: <42930EB1-8728-42EE-AFD8-DE1CD4ADFDBA@apple.com>

To build 64-bit libraries (i.e. so that 'file' shows x86_64), try
'make EXTRA_OPTIONS=-m64'.

Either 32-bit or 64-bit libraries are able to generate code for either
32-bit or 64-bit; try -m32 or -m64 at runtime.

On Feb 9, 2009, at 4:16 PM PST, Jan Rehders wrote:

> Hi,
>
> how do I compile LLVM for 64-bit on OS X? I want to get 64-bit
> libraries which generate x86_64 to link them into a 64-bit
> application. All my attempts ended up with either 32-bit libraries or
> errors. My machine is an Intel Xeon quad core, 'sysctl
> hw.cpu64bit_capable' returns 1 so I think the machine is fine.
>
> - './configure && make' yields 32-bit libraries and executables
> - I've tried various variations of ./configure --host X and ./
> configure --target X. I am not sure which target triple to use (the
> getting started guide says "The values of these options must be legal
> target triples that your GCC compiler supports."). Unfortunately I
> couldn't figure out which target triples to use. I've tried a few but
> until now I either get an error when calling configure or I simply end
> up with 32-bit libraries. For instance "i686_x86-apple-darwin9.0.0"
> results in "Invalid configuration `i686_x86-apple-darwin9.0.0':
> machine `i686_x86-apple' not recognized". "x86_64-apple-darwin9.0.0"
> configures and builds fine but all libraries are 32-bit ("file
> libLTO.dylib" says "libLTO.dylib: Mach-O dynamically linked shared
> library i386").
>
> So, how can I build a 64-bit version for OS X?
(Unfortunately prebuild > executables are not an option for me) > > Jan > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From howarth at bromo.med.uc.edu Mon Feb 9 18:50:26 2009 From: howarth at bromo.med.uc.edu (Jack Howarth) Date: Mon, 9 Feb 2009 19:50:26 -0500 Subject: [LLVMdev] Building 64-bit libraries on OS X In-Reply-To: <42930EB1-8728-42EE-AFD8-DE1CD4ADFDBA@apple.com> References: <64EE98AC-C26D-46BC-BE80-42B7C8CD9BC2@gmail.com> <42930EB1-8728-42EE-AFD8-DE1CD4ADFDBA@apple.com> Message-ID: <20090210005026.GA21781@bromo.med.uc.edu> Does llvm (not llvm-gcc-4.2) understand the x86_64-apple-darwin target? I am not sure if patches for x86_64-apple-darwin as a target are present in gcc 4.2.1. I do know we only added the full multilib support for the x86_64-apple-darwin target in gcc 4.4. Jack On Mon, Feb 09, 2009 at 04:25:44PM -0800, Dale Johannesen wrote: > To build 64-bit libraries (i.e. 'file ' shows x86_64) try > 'make EXTRA_OPTIONS=-m64" > > Either 32-bit or 64-bit libraries are able to generate code for either > 32-bit or 64-bit, try -m32 or -m64 at runtime > > On Feb 9, 2009, at 4:16 PMPST, Jan Rehders wrote: > > > Hi, > > > > how do I compile LLVM for 64-bit on OS X? I want to get 64-bit > > libraries which generate x86_64 to link them into a 64-bit > > application. All my attempts ended up with either 32-bit libraries or > > errors. My machine is an Intel Xeon quad core, 'sysctl > > hw.cpu64bit_capable' returns 1 so I think the machine is fine. > > > > - './configure && make' yields 32-bit libraries and executables > > - I've tried various variations of ./configure --host X and ./ > > configure --target X. I am not sure which target triple to use (the > > getting started guide says "The values of these options must be legal > > target triples that your GCC compiler supports."). 
Unfortunately I > > couldn't figure out which target triples to use. I've tried a few but > > until now I either get an error when calling configure or I simply end > > up with 32-bit libraries. For instance "i686_x86-apple-darwin9.0.0" > > results in "Invalid configuration `i686_x86-apple-darwin9.0.0': > > machine `i686_x86-apple' not recognized". "x86_64-apple-darwin9.0.0" > > configures and builds fine but all libraries are 32-bit ("file > > libLTO.dylib" says "libLTO.dylib: Mach-O dynamically linked shared > > library i386"). > > > > So, how can I build a 64-bit version for OS X? (Unfortunately prebuild > > executables are not an option for me) > > > > Jan > > > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From Micah.Villmow at amd.com Mon Feb 9 19:17:03 2009 From: Micah.Villmow at amd.com (Villmow, Micah) Date: Mon, 9 Feb 2009 17:17:03 -0800 Subject: [LLVMdev] Multiclass patterns Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C827237@ssanexmb1.amd.com> Is there a way to define a multi-class pattern in tablegen? Thanks, Micah Villmow Systems Engineer Advanced Technology & Performance Advanced Micro Devices Inc. S1-609 One AMD Place Sunnyvale, CA. 94085 P: 408-749-3966 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090209/19b2f55b/attachment.html From isanbard at gmail.com Mon Feb 9 19:39:01 2009 From: isanbard at gmail.com (Bill Wendling) Date: Mon, 9 Feb 2009 17:39:01 -0800 Subject: [LLVMdev] Multiclass patterns In-Reply-To: <5BA674C5FF7B384A92C2C95D8CC71E1C827237@ssanexmb1.amd.com> References: <5BA674C5FF7B384A92C2C95D8CC71E1C827237@ssanexmb1.amd.com> Message-ID: <16e5fdf90902091739q7887976fw126f17eaad945ba0@mail.gmail.com> On Mon, Feb 9, 2009 at 5:17 PM, Villmow, Micah wrote: > Is there a way to define a multi-class pattern in tablegen? > Yes. See "multiclass" and "defm" in, say, X86Instr64bit.td, et al. -bw From clattner at apple.com Mon Feb 9 22:49:43 2009 From: clattner at apple.com (Chris Lattner) Date: Mon, 9 Feb 2009 20:49:43 -0800 Subject: [LLVMdev] overflow + saturation stuff In-Reply-To: <6AB67E8F-708C-463D-9668-AE0069875FEB@apple.com> References: <8558F897-EDC6-46E7-8F08-C38C6FB7AFC7@apple.com> <6AB67E8F-708C-463D-9668-AE0069875FEB@apple.com> Message-ID: <95DA3BC4-7A49-4E08-93C7-5AAC9E97F70E@apple.com> On Feb 9, 2009, at 8:20 AM, Dan Gohman wrote: > > On Feb 8, 2009, at 11:33 AM, Chris Lattner wrote: > >> >> On Feb 8, 2009, at 8:58 AM, Dan Gohman wrote: >> >>> Hi Chris, >>> >>> Would it be better to split add into multiple opcodes instead of >>> using >>> SubclassData bits? >> >> No, I don't think so. The big difference here is that (like type) >> "opcode" never changes for an instruction once it is created. I >> expect that optimizations would want to play with these (e.g. convert >> them to 'undefined' when it can prove overflow never happens) so I >> think it is nice to not have it be in the opcode field. > > Why is this? If SubclassData can be modified, why not SubclassID too? 
> Having it const may help guard against something accidentally changing > it to an opcode that would require a different subclass, but it's a > private > member, so modifications to it could be fairly effectively controlled. There is no technical reason, it just provides a more clear API and makes it easier to reason about. For example since SubclassID can't change, you don't have issues where you'd need to change the actual class of the impl. -Chris From Micah.Villmow at amd.com Tue Feb 10 10:27:19 2009 From: Micah.Villmow at amd.com (Villmow, Micah) Date: Tue, 10 Feb 2009 08:27:19 -0800 Subject: [LLVMdev] Multiclass patterns In-Reply-To: <16e5fdf90902091739q7887976fw126f17eaad945ba0@mail.gmail.com> References: <5BA674C5FF7B384A92C2C95D8CC71E1C827237@ssanexmb1.amd.com> <16e5fdf90902091739q7887976fw126f17eaad945ba0@mail.gmail.com> Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C8272C6@ssanexmb1.amd.com> Bill, Sorry if I wasn't clear enough. I wasn't referring to multiclass's that define other classes, but with using patterns inside of a multiclass to reduce redundant code. For example: multiclass IntSubtract { def _i8 : Pat<(sub GPRI8:$src0, GPRI8:$src1), (ADD_i8 GPRI8:$src0, (NEGATE_i8 GPRI8:$src1))>; def _i32 : Pat<(sub GPRI32:$src0, GPRI32:$src1), (ADD_i32 GPRI32:$src0, (NEGATE_i32 GPRI32:$src1))>; } or something similar. I just want to write the pattern once and then have it apply to multiple register types, i.e. a generic pattern rule for many different register classes. Thanks, Micah -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Bill Wendling Sent: Monday, February 09, 2009 5:39 PM To: LLVM Developers Mailing List Subject: Re: [LLVMdev] Multiclass patterns On Mon, Feb 9, 2009 at 5:17 PM, Villmow, Micah wrote: > Is there a way to define a multi-class pattern in tablegen? > Yes. See "multiclass" and "defm" in, say, X86Instr64bit.td, et al. 
-bw _______________________________________________ LLVM Developers mailing list LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From isanbard at gmail.com Tue Feb 10 13:17:07 2009 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 10 Feb 2009 11:17:07 -0800 Subject: [LLVMdev] Multiclass patterns In-Reply-To: <5BA674C5FF7B384A92C2C95D8CC71E1C8272C6@ssanexmb1.amd.com> References: <5BA674C5FF7B384A92C2C95D8CC71E1C827237@ssanexmb1.amd.com> <16e5fdf90902091739q7887976fw126f17eaad945ba0@mail.gmail.com> <5BA674C5FF7B384A92C2C95D8CC71E1C8272C6@ssanexmb1.amd.com> Message-ID: <16e5fdf90902101117w756f1b32t1d9df5e2df83c44@mail.gmail.com> On Tue, Feb 10, 2009 at 8:27 AM, Villmow, Micah wrote: > Bill, > Sorry if I wasn't clear enough. I wasn't referring to multiclass's that > define other classes, but with using patterns inside of a multiclass to > reduce redundant code. > For example: > multiclass IntSubtract > { > def _i8 : Pat<(sub GPRI8:$src0, GPRI8:$src1), > (ADD_i8 GPRI8:$src0, (NEGATE_i8 GPRI8:$src1))>; > def _i32 : Pat<(sub GPRI32:$src0, GPRI32:$src1), > (ADD_i32 GPRI32:$src0, (NEGATE_i32 GPRI32:$src1))>; > } > > or something similar. > I just want to write the pattern once and then have it apply to multiple > register types, i.e. a generic pattern rule for many different register > classes. > Please look at the documentation for "multiclass" and "defm". In the X86InstrSSE.td file, we have this, which looks very similar to what you have above. let Constraints = "$src1 = $dst" in { multiclass basic_sse1_fp_binop_rm opc, string OpcodeStr, SDNode OpNode, Intrinsic F32Int, bit Commutable = 0> { // Scalar operation, reg+reg. def SSrr : SSI { let isCommutable = Commutable; } // Scalar operation, reg+mem. def SSrm : SSI; // etc. 
}
}

// Arithmetic instructions
defm ADD : basic_sse1_fp_binop_rm<0x58, "add", fadd, int_x86_sse_add_ss, 1>;

-bw

From ubub at gmx.net Tue Feb 10 06:51:50 2009
From: ubub at gmx.net (Tobias)
Date: Tue, 10 Feb 2009 13:51:50 +0100
Subject: [LLVMdev] direct calls to inttoptr constants
Message-ID: <30DC19BB5C45458B9F794306745473D6@dev>

I'm compiling code which contains indirect function calls
via their absolute addresses, which are known/fixed at compile-time:

pseudo c code:

int main() {
  int (* f)(int) = (int (*)(int))12345678;
  return (*f)(0);
}

the IR looks like:

define i32 @main() nounwind {
entry:
  %0 = tail call i32 inttoptr (i64 12345678 to i32 (i32)*)(i32 0) nounwind
  ret i32 %0
}

on X86 llc 2.4 compiles this to:

        .text
        .align 16
        .globl main
        .type main,@function
main:
        subl $4, %esp
        movl $0, (%esp)
        movl $12345678, %eax
        call *%eax
        addl $4, %esp
        ret
        .size main, .-main

        .section .note.GNU-stack,"",@progbits

take a look at:

        movl $12345678, %eax
        call *%eax

does anyone know a way to cause llc to call the address directly?
hints where to start patching the codegen are also welcome.

expected assembly:

        call *12345678

best regards
tobias

From vadve at cs.uiuc.edu Tue Feb 10 13:30:06 2009
From: vadve at cs.uiuc.edu (Vikram S. Adve)
Date: Tue, 10 Feb 2009 13:30:06 -0600
Subject: [LLVMdev] OpenCL kernel to bitcode
In-Reply-To:
References:
Message-ID:

I don't think I ever saw a response to this message. RapidMind
reported that they are using OpenCL as well as LLVM but their press
release wasn't clear about whether they do this. I'd be interested in
hearing from Stefanus or anyone else there about how you use OpenCL
and whether it is compiled to LLVM.

--Vikram
Associate Professor, Computer Science
University of Illinois at Urbana-Champaign
http://llvm.org/~vadve

On Feb 2, 2009, at 9:14 AM, Nico wrote:

> Hi,
>
> is there any possibility to compile OpenCL kernels into LLVM-bitcode?
> > Thanx, > Nico > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From stefanus.dutoit at rapidmind.com Tue Feb 10 13:48:47 2009 From: stefanus.dutoit at rapidmind.com (Stefanus Du Toit) Date: Tue, 10 Feb 2009 14:48:47 -0500 Subject: [LLVMdev] OpenCL kernel to bitcode In-Reply-To: References: Message-ID: <7581790D-515F-46B9-8A7B-5180076E2AF7@rapidmind.com> Hi Vikram and Nico, RapidMind is a user today of LLVM, and potentially in the future of OpenCL. We use LLVM for x86 code generation today, and intend to use OpenCL to access GPU devices in the future once vendor implementations of OpenCL are available. There are currently no complete available implementations of OpenCL, but that will change soon. It is very likely that some of these implementations will be based on LLVM and related technologies. OpenCL provides functionality to compile kernels written in OpenCL C (a language derived from C99 used to express computations in OpenCL) to some form of binary code, both at runtime and as a compile time tool. The specification does not say what that binary code should look like. LLVM IR is certainly an option for an OpenCL implementation, but so is native binary code for a specific hardware device, or any other IR or machine code representation. Currently there is no provision of portability of binary code between OpenCL implementations. LLVM does not currently (to my knowledge) contain any functionality specific to OpenCL, but LLVM and Clang would provide a good starting point for (the language part of) an OpenCL implementation. I hope that addresses your questions, Stefanus On 10-Feb-09, at 2:30 PM, Vikram S. Adve wrote: > I don't think I ever saw a response to this message. RapidMind > reported that they are using OpenCL as well as LLVM but their press > release wasn't clear about whether they do this. 
I'd be interested in > hearing from Stefanus or anyone else there about how you use OpenCL > and whether it is compiled to LLVM. > > --Vikram > Associate Professor, Computer Science > University of Illinois at Urbana-Champaign > http://llvm.org/~vadve > > > > On Feb 2, 2009, at 9:14 AM, Nico wrote: > >> Hi, >> >> is there any possibility to compile OpenCL kernels into LLVM-bitcode? >> >> Thanx, >> Nico >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev -- Stefanus Du Toit RapidMind Inc. phone: +1 519 885 5455 x116 -- fax: +1 519 885 1463 From Micah.Villmow at amd.com Tue Feb 10 13:56:44 2009 From: Micah.Villmow at amd.com (Villmow, Micah) Date: Tue, 10 Feb 2009 11:56:44 -0800 Subject: [LLVMdev] Multiclass patterns In-Reply-To: <16e5fdf90902101117w756f1b32t1d9df5e2df83c44@mail.gmail.com> References: <5BA674C5FF7B384A92C2C95D8CC71E1C827237@ssanexmb1.amd.com><16e5fdf90902091739q7887976fw126f17eaad945ba0@mail.gmail.com><5BA674C5FF7B384A92C2C95D8CC71E1C8272C6@ssanexmb1.amd.com> <16e5fdf90902101117w756f1b32t1d9df5e2df83c44@mail.gmail.com> Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C827332@ssanexmb1.amd.com> Bill, Thanks for the tip, but is not what I am looking to do. From my understanding of multiclass is that it creates multiple instructions derived from the Instruction class, which goes in the direction of multiple target-independent nodes into a single target-dependent node. However, what I want is something similar but derived from the Pattern class. This has the opposite affect of taking a single target-independent node and producing multiple target-dependent nodes. 
I could use the standard multiclass, but it has been stated on this list that it is not advised to generate multiple instructions via the text expansion but to use patterns instead. Sorry for the confusion. Micah -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Bill Wendling Sent: Tuesday, February 10, 2009 11:17 AM To: LLVM Developers Mailing List Subject: Re: [LLVMdev] Multiclass patterns On Tue, Feb 10, 2009 at 8:27 AM, Villmow, Micah wrote: > Bill, > Sorry if I wasn't clear enough. I wasn't referring to multiclass's that > define other classes, but with using patterns inside of a multiclass to > reduce redundant code. > For example: > multiclass IntSubtract > { > def _i8 : Pat<(sub GPRI8:$src0, GPRI8:$src1), > (ADD_i8 GPRI8:$src0, (NEGATE_i8 GPRI8:$src1))>; > def _i32 : Pat<(sub GPRI32:$src0, GPRI32:$src1), > (ADD_i32 GPRI32:$src0, (NEGATE_i32 GPRI32:$src1))>; > } > > or something similar. > I just want to write the pattern once and then have it apply to multiple > register types, i.e. a generic pattern rule for many different register > classes. > Please look at the documentation for "multiclass" and "defm". In the X86InstrSSE.td file, we have this, which looks very similar to what you have above. let Constraints = "$src1 = $dst" in { multiclass basic_sse1_fp_binop_rm opc, string OpcodeStr, SDNode OpNode, Intrinsic F32Int, bit Commutable = 0> { // Scalar operation, reg+reg. def SSrr : SSI { let isCommutable = Commutable; } // Scalar operation, reg+mem. def SSrm : SSI; // etc. 
}
}

// Arithmetic instructions
defm ADD : basic_sse1_fp_binop_rm<0x58, "add", fadd, int_x86_sse_add_ss, 1>;

-bw
_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From mrs at apple.com Tue Feb 10 16:03:16 2009
From: mrs at apple.com (Mike Stump)
Date: Tue, 10 Feb 2009 14:03:16 -0800
Subject: [LLVMdev] direct calls to inttoptr constants
In-Reply-To: <30DC19BB5C45458B9F794306745473D6@dev>
References: <30DC19BB5C45458B9F794306745473D6@dev>
Message-ID:

On Feb 10, 2009, at 4:51 AM, Tobias wrote:

> I'm compiling code which contains indirect function calls
> via their absolute addresses, which are known/fixed at compile-time:

> call *%eax
>
> does anyone know a way to cause llc to call the address directly?
> hints where to start patching the codegen are also welcome.

lib/Target/X86/X86InstrInfo.td, search for call and CALL. The problem
is that the instruction only allows for 32-bit relative addresses. The
form:

        call *0x20

isn't always valid as a 32-bit relative address. If you place that
instruction at 1<<60, the destination is more than 1<<32 away. If you
want to say something about your model (say, that all code is in the
first 2GB), then the instruction would be allowed. The problem is, I
suspect there isn't enough model support for this, or you're not using
it.

From mrs at apple.com Tue Feb 10 20:01:28 2009
From: mrs at apple.com (Mike Stump)
Date: Tue, 10 Feb 2009 18:01:28 -0800
Subject: [LLVMdev] new warnings, I think
Message-ID:

new warnings, I think

lib/CodeGen/SelectionDAG/DAGCombiner.cpp: In member function 'llvm::SDValue::DAGCombiner::FindBetterChain(llvm::SDNode*, llvm::SDValue)':
lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6006: warning: 'SrcValueOffset' may be used uninitialized in this function
lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6006: note: 'SrcValueOffset' was declared here
lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6005: warning: 'SrcValue'
may be used uninitialized in this function
lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6005: note: 'SrcValue' was declared here
lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6004: warning: 'Size' may be used uninitialized in this function
lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6004: note: 'Size' was declared here
lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6034: warning: 'OpSrcValueOffset' may be used uninitialized in this function
lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6034: note: 'OpSrcValueOffset' was declared here
lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6033: warning: 'OpSrcValue' may be used uninitialized in this function
lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6033: note: 'OpSrcValue' was declared here
lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6032: warning: 'OpSize' may be used uninitialized in this function
lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6032: note: 'OpSize' was declared here

From aoeullvm at brinckerhoff.org Wed Feb 11 00:28:06 2009
From: aoeullvm at brinckerhoff.org (John Clements)
Date: Tue, 10 Feb 2009 22:28:06 -0800
Subject: [LLVMdev] Suggested change to docs re: double/float constant syntax.
Message-ID: <021F3BAB-F3FD-46D5-8552-7A9C9FA53BBA@brinckerhoff.org>

It appears to me based on my experiments with llvm-as that literal
floating-point constants may be specified in the 64-bit IEEE
hexadecimal format, but not in the 32-bit format.

For instance, this file:

target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-
i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-
f80:128:128"
target triple = "i386-apple-darwin9.5"
@mydouble = constant double 0x432ff973cafa8000
@myfloat = constant float 0x405b126f

signals this error:

uccello:/tmp clements$ llvm-as foo2.s
llvm-as: foo2.s:6:27: floating point constant invalid for type
@myfloat = constant float 0x405b126f
                          ^

If I'm correct in this assumption, I suggest mentioning this in the
language reference.
In the interest of not making work for other people, I'll suggest a
drop-in replacement. Here's what's currently there:

The one non-intuitive notation for constants is the optional
hexadecimal form of floating point constants. For example, the form
'double 0x432ff973cafa8000' is equivalent to (but harder to read than)
'double 4.5e+15'. The only time hexadecimal floating point constants
are required (and the only time that they are generated by the
disassembler) is when a floating point constant must be emitted but it
cannot be represented as a decimal floating point number. For example,
NaN's, infinities, and other special values are represented in their
IEEE hexadecimal format so that assembly and disassembly do not cause
any bits to change in the constants.

I would instead write this:

Floating-point constants may also be specified using the 64-bit IEEE
hexadecimal format. For example, the decimal floating-point form
'double 4.5e+15' may also be written as 'double 0x432ff973cafa8000'.
Certain floating-point values (NaNs, infinities, and other special
values) may only be textually represented in the hexadecimal format.
The disassembler will use the more readable decimal floating-point
form whenever possible.
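To make the two notations concrete, here is a small Python sketch that prints the 64-bit IEEE bit pattern of a value. The helper names are made up for this illustration. The second helper widens a 32-bit float pattern to the 64-bit pattern of the same value, which is presumably what one would have to write for a float constant if llvm-as only accepts the 64-bit form; that last point is an assumption based on the error shown earlier and has not been checked against llvm-as here.

```python
import struct

def double_to_hex(value):
    """Return the IEEE-754 binary64 bit pattern of value as '0x' + 16 hex digits."""
    return "0x" + struct.pack(">d", value).hex()

def float_bits_to_double_hex(bits32):
    """Widen a 32-bit float bit pattern to the 64-bit pattern of the same value."""
    (value,) = struct.unpack(">f", bits32.to_bytes(4, "big"))
    return double_to_hex(value)

print(double_to_hex(4.5e15))                 # 0x432ff973cafa8000
print(float_bits_to_double_hex(0x405B126F))  # 0x400b624de0000000
```

The first line reproduces the 'double 4.5e+15' example from the text; the second shows the 64-bit spelling of the value whose 32-bit pattern is 0x405b126f.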
All the best, John Clements From ubub at gmx.net Wed Feb 11 02:07:26 2009 From: ubub at gmx.net (Tobias) Date: Wed, 11 Feb 2009 09:07:26 +0100 Subject: [LLVMdev] direct calls to inttoptr constants Message-ID: <0A1521D7F5274AAB9FC7AD18516F42DB@dev> I'm compiling code which contains indirect function calls via their absolute addresses, which are known/fixed at compile-time: pseudo c code: int main() { int (* f)(int) = (int (*)(int))12345678; return (*f)(0); } the IR looks like: define i32 @main() nounwind { entry: %0 = tail call i32 inttoptr (i64 12345678 to i32 (i32)*)(i32 0) nounwind ret i32 %0 } on X86 llc 2.4 compiles this to: .text .align 16 .globl main .type main, at function main: subl $4, %esp movl $0, (%esp) movl $12345678, %eax call *%eax addl $4, %esp ret .size main, .-main .section .note.GNU-stack,"", at progbits take a look at: movl $12345678, %eax call *%eax does anyone know a way to cause llc to call the address directly? hints where to start patching the codegen are also welcome. expected assembly: call *12345678 best regards tobias From alex.lavoro.propio at gmail.com Wed Feb 11 06:07:59 2009 From: alex.lavoro.propio at gmail.com (Alex) Date: Wed, 11 Feb 2009 13:07:59 +0100 Subject: [LLVMdev] Eliminate PHI for non-copyable registers Message-ID: <4d77c5f20902110407p12b7bd04i9477673da5affec9@mail.gmail.com> In my hardware there are two special registers cannot be copied but can only be assigned and referenced (read) in the other instruction. They are allocatable also. br i1 %if_cond, label %then, label %else then: %x1 = fptosi float %y1 to i32 br label %endif else: %x2 = fptosi float %y2 to i32 br label %endif endif: %x3 = phi i32 [%x1, %then], [%x2, %else] PNE::LowerAtomiPHINode() fails because TargetInstrInfo::copyRegToReg() doesn't support the copy of this type of register. Most registers of this hardware are f32. These two special register of type i32 are provided to relative index the other f32 registers. 
The value of these i32 registers can only be written by a FP-to-INT conversion instruction. But these two i32 registers are not designed to be copied from one to the other. Alex. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090211/b90c0414/attachment.html From marks at dcs.gla.ac.uk Wed Feb 11 06:47:32 2009 From: marks at dcs.gla.ac.uk (Mark Shannon) Date: Wed, 11 Feb 2009 12:47:32 +0000 Subject: [LLVMdev] direct calls to inttoptr constants[MESSAGE NOT SCANNED] In-Reply-To: <0A1521D7F5274AAB9FC7AD18516F42DB@dev> References: <0A1521D7F5274AAB9FC7AD18516F42DB@dev> Message-ID: <4992C8E4.50804@dcs.gla.ac.uk> Tobias, I'm doing something similar (I use LLVM as part of my JIT compiler) and if I remember correctly, LLVM does the correct thing. I think you need to try changing the i64 value to an i32 value. If that doesn't work you could also try replacing the tail call with a normal call. Mark. Tobias wrote: > I'm compiling code which contains indirect function calls > via their absolute addresses, which are known/fixed at compile-time: > > pseudo c code: > int main() { > int (* f)(int) = (int (*)(int))12345678; > return (*f)(0); > } > > the IR looks like: > define i32 @main() nounwind { > entry: > %0 = tail call i32 inttoptr (i64 12345678 to i32 (i32)*)(i32 0) nounwind > ret i32 %0 > } > > on X86 llc 2.4 compiles this to: > .text > .align 16 > .globl main > .type main, at function > main: > subl $4, %esp > movl $0, (%esp) > movl $12345678, %eax > call *%eax > addl $4, %esp > ret > .size main, .-main > > .section .note.GNU-stack,"", at progbits > > take a look at: > movl $12345678, %eax > call *%eax > > does anyone know a way to cause llc to call the address directly? > hints where to start patching the codegen are also welcome. 
> > expected assembly: > call *12345678 > > best regards > tobias > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From alex.lavoro.propio at gmail.com Wed Feb 11 06:56:17 2009 From: alex.lavoro.propio at gmail.com (Alex) Date: Wed, 11 Feb 2009 13:56:17 +0100 Subject: [LLVMdev] Prevent node from being combined Message-ID: <4d77c5f20902110456g56ecff71yb6d832863cadfb1b@mail.gmail.com> How can I prevent some nodes from being combined in DAGCombine.cpp? Maybe what I want to do below doesn't follow the philosophy of LLVM, but I'd like to know if there is any way to avoid node from being combined. TargetLowering::PerformDAGCombine() is only called if DAGCombiner cannot combine a specific node. It seems that there is no chance to stop it from combining a node. I need the shuffle mask in the machine instruction but sometimes if a vector_shuffle can only return LHS or RHS, it's removed/combined so that I cannot match vector_shuffle in the instruction selector. If the vector_shuffle is combined, I have to write the instruction selector like these: def SUBvv: MyInst<(ins REG:$src0, imm:$mask0, REG:$src1, imm:$mask1), [sub (vector_shuffle REG:$src0, REG:$src0, imm:$mask0), (vector_shuffle REG:$src1, REG:$src1, imm:$mask1)] def SUBrv: MyInst<(ins REG:$src0, REG:$src1, imm:$mask1), [sub REG:$src0, (vector_shuffle REG:$src1, REG:$src1, imm:$mask1)] def SUBvr: MyInst<(ins REG:$src0, imm:$mask0, REG:$src1), [sub (vector_shuffle REG:$src0, REG:$src0, imm:$mask0), REG:$src1)] Otherwise, I can write: def SUB: MyInst<(ins REG:$src0, imm:$mask0, REG:$src1, imm:$mask1), [sub (vector_shuffle REG:$src0, REG:$src0, imm:$mask0), (vector_shuffle REG:$src1, REG:$src1, imm:$mask1)] And processing MachineInstr will be easier since the operand index of writemask is always the same for all instructions. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090211/3dba8a36/attachment.html From ubub at gmx.net Wed Feb 11 07:30:29 2009 From: ubub at gmx.net (Tobias) Date: Wed, 11 Feb 2009 14:30:29 +0100 Subject: [LLVMdev] direct calls to inttoptr constants References: <0A1521D7F5274AAB9FC7AD18516F42DB@dev> <4992C8E4.50804@dcs.gla.ac.uk> Message-ID: Hello Mark, I've followed your advice and changed the IR to: %0 = call i32 inttoptr (i32 12345678 to i32 (i32)*)(i32 0) nounwind the call is still indirect. IMHO llc does not call it directly because the address is neither a globalvalue (JIT) nor a external symbol. That's why it uses a fallback mechanism to call it indirectly assuming the address is not constant and is calculated at runtime. tobias Mark wrote: > I'm doing something similar (I use LLVM as part of my JIT compiler) and > if I remember correctly, LLVM does the correct thing. > > I think you need to try changing the i64 value to an i32 value. > If that doesn't work you could also try replacing the tail call with a > normal call. > > > Mark. From nipun2512 at gmail.com Wed Feb 11 12:42:47 2009 From: nipun2512 at gmail.com (Nipun Arora) Date: Wed, 11 Feb 2009 13:42:47 -0500 Subject: [LLVMdev] Operand, instruction Message-ID: Hi, How can one extract the operand of an instruction in an LLVM pass? Like I can get the opcode bt I'd like to get the operands as well Thanks Nipun -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090211/3a073512/attachment.html From benl at google.com Wed Feb 11 12:36:48 2009 From: benl at google.com (Ben Laurie) Date: Wed, 11 Feb 2009 13:36:48 -0500 Subject: [LLVMdev] Some enhancements to ImmutableSet and FoldingSet Message-ID: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com> I needed these for some work I'm doing in clang... 
-------------- next part -------------- A non-text attachment was scrubbed... Name: set.patch Type: application/octet-stream Size: 1924 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090211/82192816/attachment.obj From dpatel at apple.com Wed Feb 11 12:49:12 2009 From: dpatel at apple.com (Devang Patel) Date: Wed, 11 Feb 2009 10:49:12 -0800 Subject: [LLVMdev] Operand, instruction In-Reply-To: References: Message-ID: On Feb 11, 2009, at 10:42 AM, Nipun Arora wrote: > Hi, > > How can one extract the operand of an instruction in an LLVM pass? > Like I can get the opcode bt I'd like to get the operands as well > getOperand(i). See Instruction.h and User.h - Devang From kremenek at apple.com Wed Feb 11 12:54:05 2009 From: kremenek at apple.com (Ted Kremenek) Date: Wed, 11 Feb 2009 10:54:05 -0800 Subject: [LLVMdev] Some enhancements to ImmutableSet and FoldingSet In-Reply-To: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com> References: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com> Message-ID: On Feb 11, 2009, at 10:36 AM, Ben Laurie wrote: > I needed these for some work I'm doing in clang... > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev Looks good to me. I'll apply these. From isanbard at gmail.com Wed Feb 11 12:54:29 2009 From: isanbard at gmail.com (Bill Wendling) Date: Wed, 11 Feb 2009 10:54:29 -0800 Subject: [LLVMdev] Some enhancements to ImmutableSet and FoldingSet In-Reply-To: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com> References: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com> Message-ID: <16e5fdf90902111054j56864a0bm58379bdabbe1603c@mail.gmail.com> On Wed, Feb 11, 2009 at 10:36 AM, Ben Laurie wrote: > I needed these for some work I'm doing in clang... > Yes sir! At least this message was informative. 
One thing: + int size() const { + int n = 0; + for(iterator i = begin() ; i != end() ; ++n, ++i) + ; + return n; + } + bool empty() const { + return size() == 0; + } empty() here isn't a constant-time method. Can you make it's time complexity O(1)? -bw From kremenek at apple.com Wed Feb 11 12:55:02 2009 From: kremenek at apple.com (Ted Kremenek) Date: Wed, 11 Feb 2009 10:55:02 -0800 Subject: [LLVMdev] Some enhancements to ImmutableSet and FoldingSet In-Reply-To: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com> References: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com> Message-ID: Ben, This patch doesn't apply. Can you update to TOT LLVM first and regenerate the patch? Ted On Feb 11, 2009, at 10:36 AM, Ben Laurie wrote: > I needed these for some work I'm doing in clang... > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From kremenek at apple.com Wed Feb 11 12:57:27 2009 From: kremenek at apple.com (Ted Kremenek) Date: Wed, 11 Feb 2009 10:57:27 -0800 Subject: [LLVMdev] Some enhancements to ImmutableSet and FoldingSet In-Reply-To: <16e5fdf90902111054j56864a0bm58379bdabbe1603c@mail.gmail.com> References: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com> <16e5fdf90902111054j56864a0bm58379bdabbe1603c@mail.gmail.com> Message-ID: <41F43E54-4BFB-4615-BD8D-B4CAFCD92557@apple.com> On Feb 11, 2009, at 10:54 AM, Bill Wendling wrote: > On Wed, Feb 11, 2009 at 10:36 AM, Ben Laurie wrote: >> I needed these for some work I'm doing in clang... >> > Yes sir! At least this message was informative. One thing: > > + int size() const { > + int n = 0; > + for(iterator i = begin() ; i != end() ; ++n, ++i) > + ; > + return n; > + } > + bool empty() const { > + return size() == 0; > + } > > empty() here isn't a constant-time method. Can you make it's time > complexity O(1)? 
>
> -bw

Bill's right; empty can be made constant time. e.g., "return Root == 0";

From jlerouge at apple.com Wed Feb 11 13:27:06 2009
From: jlerouge at apple.com (Julien Lerouge)
Date: Wed, 11 Feb 2009 11:27:06 -0800
Subject: [LLVMdev] [PATCH] llvm/llvm-gcc broken on mingw32
In-Reply-To: <20090127053257.GA31621@pom.apple.com>
References: <20090127053257.GA31621@pom.apple.com>
Message-ID: <20090211192705.GA97043@pom.apple.com>

On Mon, Jan 26, 2009 at 09:32:58PM -0800, Julien Lerouge wrote:
>
> Second issue is that llvm-gcc fails for me with the following error:
>

I filed http://llvm.org/bugs/show_bug.cgi?id=3552 to track this. I found two issues that prevent non bootstrap build of llvm-gcc, the first one was introduced in 61207 and the second one in 61215. I think it affects the 2.5 branch as well.

Thanks,
Julien

--
Julien Lerouge
PGP Key Id: 0xB1964A62
PGP Fingerprint: 392D 4BAD DB8B CE7F 4E5F FA3C 62DB 4AA7 B196 4A62
PGP Public Key from: keyserver.pgp.com

From kremenek at apple.com Wed Feb 11 14:24:48 2009
From: kremenek at apple.com (Ted Kremenek)
Date: Wed, 11 Feb 2009 12:24:48 -0800
Subject: [LLVMdev] Some enhancements to ImmutableSet and FoldingSet
In-Reply-To: <41F43E54-4BFB-4615-BD8D-B4CAFCD92557@apple.com>
References: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com>
 <16e5fdf90902111054j56864a0bm58379bdabbe1603c@mail.gmail.com>
 <41F43E54-4BFB-4615-BD8D-B4CAFCD92557@apple.com>
Message-ID: <8EE805E5-C1CA-4BA2-8B4C-5C11E9DDAA96@apple.com>

Actually, neither of these methods are needed for ImmutableSet. ImmutableSet already has an 'isEmpty()' method and I have never really seen a case where "size()" needs to be explicitly calculated. If you need size() itself, however, this seems like a perfectly valid addition.
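
The complexity point raised in this thread can be sketched on a toy tree-backed set. ToyNode and ToySet below are hypothetical stand-ins invented for illustration; they are not LLVM's ImmutableSet, and only the root-pointer test mirrors the "return Root == 0" suggestion.

```cpp
#include <cassert>

// Hypothetical tree node for a toy set; not LLVM's internal representation.
struct ToyNode {
  int value;
  const ToyNode *left;
  const ToyNode *right;
};

// Hypothetical tree-backed set holding only a root pointer.
struct ToySet {
  const ToyNode *root;  // null when the set holds no elements

  explicit ToySet(const ToyNode *r = nullptr) : root(r) {}

  // O(n): visits every node to count the elements.
  unsigned size() const { return count(root); }

  // O(n): the shape from the patch, which pays for a full traversal.
  bool emptyViaSize() const { return size() == 0; }

  // O(1): only inspects the root pointer.
  bool empty() const { return root == nullptr; }

private:
  static unsigned count(const ToyNode *n) {
    return n ? 1u + count(n->left) + count(n->right) : 0u;
  }
};

// Usage: both emptiness tests agree, but only empty() is constant time.
void demoToySet() {
  ToyNode leaf = {1, nullptr, nullptr};
  ToyNode top = {2, &leaf, nullptr};
  ToySet filled(&top);
  ToySet none;
  assert(filled.size() == 2);
  assert(!filled.empty() && !filled.emptyViaSize());
  assert(none.empty() && none.emptyViaSize());
}
```

The two emptiness checks return the same answer; the difference is only how much of the tree each one touches.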
On Feb 11, 2009, at 10:57 AM, Ted Kremenek wrote: > > On Feb 11, 2009, at 10:54 AM, Bill Wendling wrote: > >> On Wed, Feb 11, 2009 at 10:36 AM, Ben Laurie wrote: >>> I needed these for some work I'm doing in clang... >>> >> Yes sir! At least this message was informative. One thing: >> >> + int size() const { >> + int n = 0; >> + for(iterator i = begin() ; i != end() ; ++n, ++i) >> + ; >> + return n; >> + } >> + bool empty() const { >> + return size() == 0; >> + } >> >> empty() here isn't a constant-time method. Can you make it's time >> complexity O(1)? >> >> -bw > > Bill's right; empty can be made constant time. e.g., "return Root > == 0"; -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090211/5b2664b8/attachment-0001.html From criswell at cs.uiuc.edu Wed Feb 11 14:44:21 2009 From: criswell at cs.uiuc.edu (John Criswell) Date: Wed, 11 Feb 2009 14:44:21 -0600 Subject: [LLVMdev] Operand, instruction In-Reply-To: References: Message-ID: <499338A5.5000906@cs.uiuc.edu> Nipun Arora wrote: > Hi, > > How can one extract the operand of an instruction in an LLVM pass? > Like I can get the opcode bt I'd like to get the operands as well > Use the getOperand() method of class Instruction (which I think is inherited from Value or User or some other LLVM class). It takes a single parameter that is an index specifying which operand to return. The return value is a llvm::Value *, IIRC. If you haven't used it yet, I'd recommend using the LLVM doxygen documentation (http://llvm.org/doxygen/hierarchy.html). I've found it to be an invaluable resource for answering these sorts of questions. In this case, just look up the llvm::Instruction class and see if it has a method that does what you want. If it doesn't, check its parent class, the grandparent class, etc. until you find the method you want. -- John T. 
> Thanks > Nipun > > From kremenek at apple.com Wed Feb 11 14:54:02 2009 From: kremenek at apple.com (Ted Kremenek) Date: Wed, 11 Feb 2009 12:54:02 -0800 Subject: [LLVMdev] Some enhancements to ImmutableSet and FoldingSet In-Reply-To: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com> References: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com> Message-ID: On Feb 11, 2009, at 10:36 AM, Ben Laurie wrote: > I needed these for some work I'm doing in clang... > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev Minor nit: --- include/llvm/ADT/FoldingSet.h (revision 63488) +++ include/llvm/ADT/FoldingSet.h (working copy) @@ -225,6 +225,7 @@ void AddInteger(unsigned long I); void AddInteger(long long I); void AddInteger(unsigned long long I); + void AddBoolean(bool B); void AddString(const std::string &String); void AddString(const char* String); Index: lib/Support/FoldingSet.cpp =================================================================== --- lib/Support/FoldingSet.cpp (revision 63488) +++ lib/Support/FoldingSet.cpp (working copy) @@ -61,6 +61,9 @@ if ((uint64_t)(int)I != I) Bits.push_back(unsigned(I >> 32)); } +void FoldingSetNodeID::AddBoolean(bool B) { + AddInteger(B ? 1 : 0); +} "AddBoolean()" can just be defined inline, since it is so simple. I've committed this change: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090209/073619.html From Micah.Villmow at amd.com Wed Feb 11 14:54:27 2009 From: Micah.Villmow at amd.com (Villmow, Micah) Date: Wed, 11 Feb 2009 12:54:27 -0800 Subject: [LLVMdev] Bug in SelectionDAGBuild.cpp? Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C827513@ssanexmb1.amd.com> I'm hitting a problem in SelectionDAGBuild::visitRet(), mainly: MVT VT = ValueVTs[j]; // FIXME: C calling convention requires the return type to be promoted to // at least 32-bit. 
But this is not necessary for non-C calling // conventions. if (VT.isInteger()) { MVT MinVT = TLI.getRegisterType(MVT::i32); if (VT.bitsLT(MinVT)) VT = MinVT; } This is occurring when VT is a 16bit vector type,<2x i8>. LLVM is then changing it to be a 32bit type and it asserts in : getCopyToParts(DAG, SDValue(RetOp.getNode(), RetOp.getResNo() + j), &Parts[0], NumParts, PartVT, ExtendKind); Here: assert(ValueVT.getVectorElementType() == PartVT && ValueVT.getVectorNumElements() == 1 && "Only trivial vector-to-scalar conversions should get here!"); Because it switched PartVT from a vector type<2xi8> into a scalar integer. Any idea's on how I can get around this constraint? Thanks, Micah Villmow Systems Engineer Advanced Technology & Performance Advanced Micro Devices Inc. S1-609 One AMD Place Sunnyvale, CA. 94085 P: 408-749-3966 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090211/7debfe80/attachment.html From dekruijf at cs.wisc.edu Wed Feb 11 15:13:06 2009 From: dekruijf at cs.wisc.edu (Marc de Kruijf) Date: Wed, 11 Feb 2009 15:13:06 -0600 Subject: [LLVMdev] Unnatural loops with O0 In-Reply-To: <6DFA37C7-C2F9-46B7-8658-6F809AC916CB@nondot.org> References: <200806111527.15386.brandner@complang.tuwien.ac.at> <6DFA37C7-C2F9-46B7-8658-6F809AC916CB@nondot.org> Message-ID: I am reviving this thread because I am seeing the same thing (unnatural loops produced by llvm-gcc), but it is not limited to -O0 -- I am seeing it for -O2 and -O3 as well. Some of my research work is relying on LoopInfo to provide loop information for all loops, but it is missing these loops. Is there any work in the pipeline that aims to fix this? 
Many thanks, Marc On Sat, Jun 21, 2008 at 2:09 PM, Chris Lattner wrote: > > On Jun 11, 2008, at 6:27 AM, Florian Brandner wrote: > > > On Thursday 08 May 2008 18:33:48 Adrian Prantl wrote: > >> we noticed that llvmgcc4.2-2.2 sometimes generates non-natural loops > >> when compiling to bytecode without any optimizations. Apparently what > >> happens is that the loop header is duplicated, which results in two > >> entry points for the loop. > > > > this is actually a problem with the tailduplication pass of llvm. it > > does not > > consider loops at all, and thus duplicates loop headers. the result > > is that > > two paths now lead into the loop --> it is not natural anymore and > > further > > loop optimizations fail. > > I think the patch was reverted because using loopinfo is bad for > taildup to do. The best answer is to actually just remove the tail > dup pass altogether. It is really bad for compile time performance, > and has some other issues. A second best solution is to change it to > do a quick depth first search of the CFG when it starts analyzing a > function and just keep a SmallPtrSet of loop headers. > > -Chris > > > > > > > besides, the tailduplication pass does not invalidate the loopinfo > > analysis, > > as it should do in these cases. > > > > i've attached a minimized version of adrians original testcase. you > > need to > > adjust the tailduplication threshhold to trigger the tailduplication > > for this > > example. > > > > some more tests, using mibench (+some other benchmarks) with our > > llvm-2.1 > > based compiler, showed that in 29 benchmark programs 19 non-natural > > loops > > appear - one single function contained 6 of them alone. > > > > all but 5 of them could be avoided using a simple patch that > > disables tail > > duplication of loop headers - 3 of them in one single function. the > > patch > > applies and compiles with svn trunk, it also works for the small > > testcase, > > but i did not run the testsuites. 
> >
> > florian
> >
> > --
> > Brandner Florian
> >
> > CD Laboratory - Compilation Techniques for Embedded Processors
> > Institut für Computersprachen E185/1
> > Technische Universität Wien
> >
> > Argentinierstraße 8 / 185
> > A-1040 Wien, Austria
> >
> > Tel.: (+431) 58801-58521
> >
> > E-Mail: brandner at complang.tuwien.ac.at
> >
> > <loopheader.patch>

From dalej at apple.com Wed Feb 11 15:37:54 2009
From: dalej at apple.com (Dale Johannesen)
Date: Wed, 11 Feb 2009 13:37:54 -0800
Subject: [LLVMdev] Suggested change to docs re: double/float constant syntax.
In-Reply-To: <021F3BAB-F3FD-46D5-8552-7A9C9FA53BBA@brinckerhoff.org>
References: <021F3BAB-F3FD-46D5-8552-7A9C9FA53BBA@brinckerhoff.org>
Message-ID:

On Feb 10, 2009, at 10:28 PM PST, John Clements wrote:

> It appears to me based on my experiments with llvm-as that literal
> floating-point constants may be specified in the 64-bit IEEE
> hexadecimal format, but not in the 32-bit format. For instance, this
> file:
>
> target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-
> i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-
> f80:128:128"
> target triple = "i386-apple-darwin9.5"
>
> @mydouble = constant double 0x432ff973cafa8000
> @myfloat = constant float 0x405b126f

Correct; constants of type float can be specified, but use the double format. Their values must actually fit in a float with no loss of information (i.e.
(double)(float)doublevalue bitwise== doublevalue), which is checked by the reader. > signals this error: > > uccello:/tmp clements$ llvm-as foo2.s > llvm-as: foo2.s:6:27: floating point constant invalid for type > @myfloat = constant float 0x405b126f > ^ > > If I'm correct in this assumption, I suggest mentioning this in the > language reference. In the interest of not making work for other > people, I'll suggest a drop-in replacement. > > Here's what's currently there: > > The one non-intuitive notation for constants is the optional > hexadecimal form of floating point constants. For example, the form > 'double 0x432ff973cafa8000' is equivalent to (but harder to read than) > 'double 4.5e+15'. The only time hexadecimal floating point constants > are required (and the only time that they are generated by the > disassembler) is when a floating point constant must be emitted but it > cannot be represented as a decimal floating point number. For > example, > NaN's, infinities, and other special values are represented in their > IEEE hexadecimal format so that assembly and disassembly do not cause > any bits to change in the constants. This is somewhat inaccurate anyway; all non-special binary FP constants can be represented exactly as decimal FP (though the reverse is not true), but the representation can be thousands of digits long. The hex form is used in such cases. There's also long double constants that aren't documented. I'll fix it, thanks. > I would instead write this: > > Floating-point constants may also be specified using the 64-bit IEEE > hexadecimal format. For example, the decimal floating-point form > 'double 4.5e+15' may also be written as 'double 0x432ff973cafa8000'. > Certain floating-values (NaNs, infinities, and other special values) > may only be textually represented in the hexadecimal format. The > disassembler will use the more decimal floating-point form whenever > possible. 
> > > All the best, > > John Clements > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From clattner at apple.com Wed Feb 11 18:50:50 2009 From: clattner at apple.com (Chris Lattner) Date: Wed, 11 Feb 2009 16:50:50 -0800 Subject: [LLVMdev] Some enhancements to ImmutableSet and FoldingSet In-Reply-To: <8EE805E5-C1CA-4BA2-8B4C-5C11E9DDAA96@apple.com> References: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com> <16e5fdf90902111054j56864a0bm58379bdabbe1603c@mail.gmail.com> <41F43E54-4BFB-4615-BD8D-B4CAFCD92557@apple.com> <8EE805E5-C1CA-4BA2-8B4C-5C11E9DDAA96@apple.com> Message-ID: <62869F13-8D27-4F2A-BB2D-D028D07BAE60@apple.com> On Feb 11, 2009, at 12:24 PM, Ted Kremenek wrote: > Actually, neither of these methods are needed for ImmutableSet. > ImmutableSet already has an 'isEmpty()' method and I have never > really seen a case where "size()" needs to be explicitly > calculated. If you need size() itself, however, this seems like a > perfectly valid addition. I agree, "size" should also return 'unsigned' not int. -Chris From clattner at apple.com Wed Feb 11 18:51:34 2009 From: clattner at apple.com (Chris Lattner) Date: Wed, 11 Feb 2009 16:51:34 -0800 Subject: [LLVMdev] Unnatural loops with O0 In-Reply-To: References: <200806111527.15386.brandner@complang.tuwien.ac.at> <6DFA37C7-C2F9-46B7-8658-6F809AC916CB@nondot.org> Message-ID: On Feb 11, 2009, at 1:13 PM, Marc de Kruijf wrote: > I am reviving this thread because I am seeing the same thing > (unnatural loops produced by llvm-gcc), but it is not limited to -O0 > -- I am seeing it for -O2 and -O3 as well. > Some of my research work is relying on LoopInfo to provide loop > information for all loops, but it is missing these loops. Is there > any work in the pipeline that aims to fix this? Not that I know of. 
There has been a project on the open projects list to write a pass that converts all loops to natural loops (through code duplication). That would be a nice and self-contained project if anyone is interested. -Chris From clattner at apple.com Wed Feb 11 20:04:11 2009 From: clattner at apple.com (Chris Lattner) Date: Wed, 11 Feb 2009 18:04:11 -0800 Subject: [LLVMdev] Eliminate PHI for non-copyable registers In-Reply-To: <4d77c5f20902110407p12b7bd04i9477673da5affec9@mail.gmail.com> References: <4d77c5f20902110407p12b7bd04i9477673da5affec9@mail.gmail.com> Message-ID: <8E39BC05-CED5-4796-8031-42FA38F80EDC@apple.com> On Feb 11, 2009, at 4:07 AM, Alex wrote: > In my hardware there are two special registers cannot be copied but > can only be assigned and referenced (read) in the other instruction. > They are allocatable also. > > br i1 %if_cond, label %then, label %else > then: > %x1 = fptosi float %y1 to i32 > br label %endif > else: > %x2 = fptosi float %y2 to i32 > br label %endif > endif: > %x3 = phi i32 [%x1, %then], [%x2, %else] > > PNE::LowerAtomiPHINode() fails because > TargetInstrInfo::copyRegToReg() doesn't support the copy of this > type of register. > > Most registers of this hardware are f32. These two special register > of type i32 are provided to relative index the other f32 registers. > The value of these i32 registers can only be written by a FP-to-INT > conversion instruction. But these two i32 registers are not designed > to be copied from one to the other. > This is a very interesting problem. If you have registers like this, they should be non-allocatable (just like 'flags') which means that you don't have to define copy operations for them. 
-Chris From kapilanand2 at gmail.com Wed Feb 11 20:05:39 2009 From: kapilanand2 at gmail.com (kapil anand) Date: Wed, 11 Feb 2009 21:05:39 -0500 Subject: [LLVMdev] DominatorTree Information required in CallGraphPass Message-ID: <9f741d560902111805i2c02e412p32b944518c3aaf9f@mail.gmail.com> Hi all, I am implementing a new pass for LLVM which extends Call Graph SCCPass. I need DominatorTree Information when I get to individual function. I have added AU.addrequired() and AU.addRequired() in getAnalysisUsage() function. But, when I get to the pass, Pass Manager gives following runtime error Unable to schedule 'Dominator Tree Construction' required by '......' assertion "0 && "Unable to schedule pass"" failed: file "/usr/src/llvm-2.4/llvm- 2.4/lib/VMCore/PassManager.cpp", line 1074 Can't I use dominator Tree Information in case of CallGraphSCCPass. I found out from documentation that dominator Info can be used in Module Pass ( http://www.llvm.org/docs/WritingAnLLVMPass.html#getAnalysis ) Thus it seemed feasible that Call GraphSCCPass should also be able to use Dominator Info.Do I need to add some flag that I need lower level passes? Thanks Regards, Kapil From nicholas at mxc.ca Wed Feb 11 20:51:25 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Wed, 11 Feb 2009 18:51:25 -0800 Subject: [LLVMdev] Some enhancements to ImmutableSet and FoldingSet In-Reply-To: <16e5fdf90902111054j56864a0bm58379bdabbe1603c@mail.gmail.com> References: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com> <16e5fdf90902111054j56864a0bm58379bdabbe1603c@mail.gmail.com> Message-ID: <49938EAD.2050506@mxc.ca> Bill Wendling wrote: > On Wed, Feb 11, 2009 at 10:36 AM, Ben Laurie wrote: >> I needed these for some work I'm doing in clang... >> > Yes sir! At least this message was informative. One thing: > > + int size() const { > + int n = 0; > + for(iterator i = begin() ; i != end() ; ++n, ++i) > + ; Please only call end() once. 
We use this pattern a lot in LLVM: for (iterator i = begin(), e = end(); i != e; ++n, ++i) ; But really I think you should just have: return std::distance(begin(), end()); Nick > + return n; > + } > + bool empty() const { > + return size() == 0; > + } > > empty() here isn't a constant-time method. Can you make it's time > complexity O(1)? > > -bw > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From benl at google.com Wed Feb 11 22:14:04 2009 From: benl at google.com (Ben Laurie) Date: Thu, 12 Feb 2009 04:14:04 +0000 Subject: [LLVMdev] Some enhancements to ImmutableSet and FoldingSet In-Reply-To: <8EE805E5-C1CA-4BA2-8B4C-5C11E9DDAA96@apple.com> References: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com> <16e5fdf90902111054j56864a0bm58379bdabbe1603c@mail.gmail.com> <41F43E54-4BFB-4615-BD8D-B4CAFCD92557@apple.com> <8EE805E5-C1CA-4BA2-8B4C-5C11E9DDAA96@apple.com> Message-ID: <1b587cab0902112014g61041207raaf5297e5becdfa@mail.gmail.com> On Wed, Feb 11, 2009 at 8:24 PM, Ted Kremenek wrote: > Actually, neither of these methods are needed for ImmutableSet. > ImmutableSet already has an 'isEmpty()' method and I have never really seen > a case where "size()" needs to be explicitly calculated. If you need size() > itself, however, this seems like a perfectly valid addition. I need to check for size() == 1 (in order to test whether a range set has a single possible value). 
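
The two counting styles suggested earlier in this message (evaluating end() once, or using std::distance) can be sketched as follows. The function names are hypothetical, and std::set is used only as a stand-in container since this is not the ImmutableSet code itself.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>

// Count elements with a hand-written loop that evaluates end() only once.
template <typename Container>
std::size_t countWithCachedEnd(const Container &c) {
  std::size_t n = 0;
  for (typename Container::const_iterator i = c.begin(), e = c.end(); i != e;
       ++i)
    ++n;
  return n;
}

// Count elements by letting the standard library walk the range.
template <typename Container>
std::size_t countWithDistance(const Container &c) {
  return static_cast<std::size_t>(std::distance(c.begin(), c.end()));
}

// Usage: both helpers report the same count for a small std::set.
void checkCounts() {
  std::set<int> values;
  values.insert(3);
  values.insert(1);
  values.insert(2);
  assert(countWithCachedEnd(values) == 3);
  assert(countWithDistance(values) == 3);
}
```

For iterators that are not random access, both versions are still linear in the number of elements; the std::distance form is simply shorter.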
From kremenek at apple.com Wed Feb 11 22:47:35 2009 From: kremenek at apple.com (Ted Kremenek) Date: Wed, 11 Feb 2009 20:47:35 -0800 Subject: [LLVMdev] Some enhancements to ImmutableSet and FoldingSet In-Reply-To: <1b587cab0902112014g61041207raaf5297e5becdfa@mail.gmail.com> References: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com> <16e5fdf90902111054j56864a0bm58379bdabbe1603c@mail.gmail.com> <41F43E54-4BFB-4615-BD8D-B4CAFCD92557@apple.com> <8EE805E5-C1CA-4BA2-8B4C-5C11E9DDAA96@apple.com> <1b587cab0902112014g61041207raaf5297e5becdfa@mail.gmail.com> Message-ID: On Feb 11, 2009, at 8:14 PM, Ben Laurie wrote: > On Wed, Feb 11, 2009 at 8:24 PM, Ted Kremenek > wrote: >> Actually, neither of these methods are needed for ImmutableSet. >> ImmutableSet already has an 'isEmpty()' method and I have never >> really seen >> a case where "size()" needs to be explicitly calculated. If you >> need size() >> itself, however, this seems like a perfectly valid addition. > > I need to check for size() == 1 (in order to test whether a range set > has a single possible value). Ah. If that is the case, we can implement a 'isSingleton()' method with constant time performance (i.e., check if we have a root, and check that the root has no children). Would that work? 
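A minimal sketch of the constant-time check described above, assuming a plain binary tree whose nodes carry Left and Right child pointers. The Node and TreeSet types here are hypothetical stand-ins and are not ImmutableSet's actual internals.

```cpp
#include <cassert>

// Hypothetical tree node: a value plus two child pointers.
struct Node {
  int Value;
  const Node *Left;
  const Node *Right;
};

// Hypothetical set wrapper that only holds the root of the tree.
struct TreeSet {
  const Node *Root;

  // Empty when there is no root at all.
  bool isEmpty() const { return Root == nullptr; }

  // Singleton when there is a root and the root has no children.
  // Both checks are pointer comparisons, so this is O(1).
  bool isSingleton() const {
    return Root != nullptr && Root->Left == nullptr &&
           Root->Right == nullptr;
  }
};
```

Because only the root is inspected, the cost does not depend on how many elements the set holds.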
From benl at google.com Wed Feb 11 22:53:52 2009 From: benl at google.com (Ben Laurie) Date: Thu, 12 Feb 2009 04:53:52 +0000 Subject: [LLVMdev] Some enhancements to ImmutableSet and FoldingSet In-Reply-To: References: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com> <16e5fdf90902111054j56864a0bm58379bdabbe1603c@mail.gmail.com> <41F43E54-4BFB-4615-BD8D-B4CAFCD92557@apple.com> <8EE805E5-C1CA-4BA2-8B4C-5C11E9DDAA96@apple.com> <1b587cab0902112014g61041207raaf5297e5becdfa@mail.gmail.com> Message-ID: <1b587cab0902112053t52d8d21aic48a8c89471731fc@mail.gmail.com> On Thu, Feb 12, 2009 at 4:47 AM, Ted Kremenek wrote: > On Feb 11, 2009, at 8:14 PM, Ben Laurie wrote: > >> On Wed, Feb 11, 2009 at 8:24 PM, Ted Kremenek wrote: >>> >>> Actually, neither of these methods are needed for ImmutableSet. >>> ImmutableSet already has an 'isEmpty()' method and I have never really >>> seen >>> a case where "size()" needs to be explicitly calculated. If you need >>> size() >>> itself, however, this seems like a perfectly valid addition. >> >> I need to check for size() == 1 (in order to test whether a range set >> has a single possible value). > > Ah. If that is the case, we can implement a 'isSingleton()' method with > constant time performance (i.e., check if we have a root, and check that the > root has no children). Would that work? Sure would! 
From kremenek at apple.com Wed Feb 11 23:18:08 2009 From: kremenek at apple.com (Ted Kremenek) Date: Wed, 11 Feb 2009 21:18:08 -0800 Subject: [LLVMdev] Some enhancements to ImmutableSet and FoldingSet In-Reply-To: <1b587cab0902112053t52d8d21aic48a8c89471731fc@mail.gmail.com> References: <1b587cab0902111036p2090e051je2c5b954bf45d02f@mail.gmail.com> <16e5fdf90902111054j56864a0bm58379bdabbe1603c@mail.gmail.com> <41F43E54-4BFB-4615-BD8D-B4CAFCD92557@apple.com> <8EE805E5-C1CA-4BA2-8B4C-5C11E9DDAA96@apple.com> <1b587cab0902112014g61041207raaf5297e5becdfa@mail.gmail.com> <1b587cab0902112053t52d8d21aic48a8c89471731fc@mail.gmail.com> Message-ID: <2A0A4AFE-A9DC-4924-9C63-EC95EA455D3A@apple.com> On Feb 11, 2009, at 8:53 PM, Ben Laurie wrote: > On Thu, Feb 12, 2009 at 4:47 AM, Ted Kremenek > wrote: >> On Feb 11, 2009, at 8:14 PM, Ben Laurie wrote: >> >>> On Wed, Feb 11, 2009 at 8:24 PM, Ted Kremenek >>> wrote: >>>> >>>> Actually, neither of these methods are needed for ImmutableSet. >>>> ImmutableSet already has an 'isEmpty()' method and I have never >>>> really >>>> seen >>>> a case where "size()" needs to be explicitly calculated. If you >>>> need >>>> size() >>>> itself, however, this seems like a perfectly valid addition. >>> >>> I need to check for size() == 1 (in order to test whether a range >>> set >>> has a single possible value). >> >> Ah. If that is the case, we can implement a 'isSingleton()' method >> with >> constant time performance (i.e., check if we have a root, and check >> that the >> root has no children). Would that work? > > Sure would! Hi Ben, I think this should work: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090209/073634.html If you find it isn't doing the right thing just let me know. 
Ted From clattner at apple.com Thu Feb 12 00:16:10 2009 From: clattner at apple.com (Chris Lattner) Date: Wed, 11 Feb 2009 22:16:10 -0800 Subject: [LLVMdev] overflow + saturation stuff In-Reply-To: References: Message-ID: <2109DC07-3696-42CC-89D4-A7A7E6F53735@apple.com> On Feb 9, 2009, at 9:18 AM, Paul Schlie wrote: > Chris Lattner wrote: >> On Feb 8, 2009, at 5:54 PM, Paul Schlie wrote: >>> Are overflow behavior tags meant to enable the specification of a >>> particular instruction's required or presumed overflow behavior? >> >> I'm not sure what you mean. The overflow tags specify what happens >> if >> overflow happens (defined wrapping, defined saturating, or undefined >> behavior), not *when* overflow happens. > > - Is undefined behavior meant to imply that if such condition were to > occur, the undefined behavior will be warranted to not be expressed as > it will be trapped; or merely assumed it won't occur and thereby may > be > optimized based on this assumption, regardless of the behavior which > may > actually be expressed in the absence of optimization if and when the > condition occurred (and thereby optimizations may legitimately alter > logical program behavior if sensitive to an otherwise expressible > undefined behavior), and thereby truly just an optimization tag. Paul, I have a really hard time understanding what you're getting at. Please break down questions into multiple sentences: this entire paragraph is one sentence. "Undefined on overflow" is an assertion to the optimizer that "something" knows that overflow can't happen. This gives it license to optimize based on the assumption that overflow doesn't happen. > > >>> If a required overflow behavior, then it follows that the target >>> must >>> correspondingly implement the behavior; either natively or emulated? >> >> yes, if a target doesn't support saturation, it must emulate it. >> This >> is the same as targets that don't support rem natively (e.g. ppc).
> > - Thanks partially understood; as above, the tags seem to have > multiple > intended purposes; on one hand to be used by optimizers but not > affect the > instruction selection process; on the other hand must affect the > selection > process as below? I'm not sure what you're asking. Please ask short and direct questions. >>> If a presumed overflow behavior, is the target meant to preferably >>> implement or emulate the same; or is it merely meant to enable >>> optimizations which may or may not be representative of the code's >>> target mapped behavior? >>> >>> Regardless, if the target is potentially meant to implement the >>> behavior; it follows that LLVM's assembly level representation must >>> be able to discriminate between operations having differing >>> semantics >>> specified? >> >> I don't understand what you mean. > > - Sorry, merely meant: if an instruction's overflow behavior tag is > meant > to affect target instruction selection semantics, it would seem > necessary > to be selectable at the llvm assembly code level (i.e. how does one > specify > a saturating addition vs. 2's-comp addition instruction semantics at > the > llvm assembly code level of representation)? I'm not sure what level of selection you're talking about here. We don't do selection on LLVM IR, we do it on the SelectionDAG data structure. Are you asking how a target author would map a saturating add to a specific target instruction that does that operation? 
-Chris From alex.lavoro.propio at gmail.com Thu Feb 12 03:41:12 2009 From: alex.lavoro.propio at gmail.com (Alex) Date: Thu, 12 Feb 2009 01:41:12 -0800 (PST) Subject: [LLVMdev] Eliminate PHI for non-copyable registers In-Reply-To: <8E39BC05-CED5-4796-8031-42FA38F80EDC@apple.com> References: <4d77c5f20902110407p12b7bd04i9477673da5affec9@mail.gmail.com> <8E39BC05-CED5-4796-8031-42FA38F80EDC@apple.com> Message-ID: <21972748.post@talk.nabble.com> Chris Lattner-2 wrote: > > > On Feb 11, 2009, at 4:07 AM, Alex wrote: > >> In my hardware there are two special registers that cannot be copied but >> can only be assigned and referenced (read) in other instructions. >> They are allocatable also. >> >> br i1 %if_cond, label %then, label %else >> then: >> %x1 = fptosi float %y1 to i32 >> br label %endif >> else: >> %x2 = fptosi float %y2 to i32 >> br label %endif >> endif: >> %x3 = phi i32 [%x1, %then], [%x2, %else] >> >> PNE::LowerAtomicPHINode() fails because >> TargetInstrInfo::copyRegToReg() doesn't support the copy of this >> type of register. >> >> Most registers of this hardware are f32. These two special registers >> of type i32 are provided for relative indexing of the other f32 registers. >> The value of these i32 registers can only be written by a FP-to-INT >> conversion instruction. But these two i32 registers are not designed >> to be copied from one to the other. >> > > This is a very interesting problem. If you have registers like this, > they should be non-allocatable (just like 'flags') which means that > you don't have to define copy operations for them. > They "should" be non-allocatable if the hardware implements the same number of these i32 registers as the "specification". The input language (which is converted to LLVM IR) may use up to 4 registers but the hardware only has 2. So they must be allocatable, right?
For example, the input uses up to 3 registers INT0, INT1, INT2 (Rx are FP registers):

  fp2int INT0, R0
  fp2int INT1, R1
  fp2int INT2, R2
  add R0, R0, R[INT1+1]
  mul R0, R[INT2+2], R[INT0+1]

Since the hardware doesn't have INT2, the final machine code should look like:

  fp2int INT0, R0
  fp2int INT1, R1
  add R0, R0, R[INT1+1]
  fp2int INT1, R2 <==== rename INT2 to INT1
  mul R0, R[INT1+2], R[INT0+1]

I use the method suggested in "Kaleidoscope: Extending the Language: Mutable Variables" (http://llvm.org/docs/tutorial/LangImpl7.html) and rely on mem2reg to promote these loads to registers. By the way, all registers are non-spillable. -- View this message in context: http://www.nabble.com/Eliminate-PHI-for-non-copyable-registers-tp21953583p21972748.html Sent from the LLVM - Dev mailing list archive at Nabble.com. From jay.foad at gmail.com Thu Feb 12 06:49:44 2009 From: jay.foad at gmail.com (Jay Foad) Date: Thu, 12 Feb 2009 12:49:44 +0000 Subject: [LLVMdev] problems running test suite (-mllvm -disable-llvm-optzns) Message-ID: I'm trying to run some of the test suite using the instructions here: http://llvm.org/docs/TestingGuide.html#quicktestsuite I've built llvm myself, but I'm using pre-built binaries of llvm-gcc (from http://llvm.org/prereleases/2.5/llvm-gcc4.2-2.5-x86-linux-RHEL4.tar.gz). Here's what happens: foad at debian:~/svn/llvm-project/test-suite/trunk$ ./configure --with-llvmgccdir=/home/foad/llvm/llvm-gcc4.2-2.5-x86-linux-RHEL4 --with-llvmobj=/home/foad/llvm/objdir-svn --with-llvmsrc=/home/foad/svn/llvm-project/llvm/trunk [...]
foad at debian:~/svn/llvm-project/test-suite/trunk$ make -C MultiSource/Applications/minisat/ make: Entering directory `/home/foad/svn/llvm-project/test-suite/trunk/MultiSource/Applications/minisat' Compiling Main.cpp to Output/Main.bc cc1plus: warning: unrecognized gcc debugging option: i cc1plus: warning: unrecognized gcc debugging option: s cc1plus: warning: unrecognized gcc debugging option: b cc1plus: warning: unrecognized gcc debugging option: l cc1plus: warning: unrecognized gcc debugging option: e cc1plus: warning: unrecognized gcc debugging option: - cc1plus: warning: unrecognized gcc debugging option: l cc1plus: warning: unrecognized gcc debugging option: l cc1plus: warning: unrecognized gcc debugging option: m cc1plus: warning: unrecognized gcc debugging option: - cc1plus: warning: unrecognized gcc debugging option: o cc1plus: warning: unrecognized gcc debugging option: t cc1plus: warning: unrecognized gcc debugging option: z cc1plus: warning: unrecognized gcc debugging option: n cc1plus: warning: unrecognized gcc debugging option: s cc1plus: Unknown command line argument '-mtune=generic'. 
Try: 'cc1plus --help' make: [Output/Main.bc] Error 1 (ignored) Compiling Solver.cpp to Output/Solver.bc cc1plus: warning: unrecognized gcc debugging option: i cc1plus: warning: unrecognized gcc debugging option: s cc1plus: warning: unrecognized gcc debugging option: b cc1plus: warning: unrecognized gcc debugging option: l cc1plus: warning: unrecognized gcc debugging option: e cc1plus: warning: unrecognized gcc debugging option: - cc1plus: warning: unrecognized gcc debugging option: l cc1plus: warning: unrecognized gcc debugging option: l cc1plus: warning: unrecognized gcc debugging option: m cc1plus: warning: unrecognized gcc debugging option: - cc1plus: warning: unrecognized gcc debugging option: o cc1plus: warning: unrecognized gcc debugging option: t cc1plus: warning: unrecognized gcc debugging option: z cc1plus: warning: unrecognized gcc debugging option: n cc1plus: warning: unrecognized gcc debugging option: s cc1plus: Unknown command line argument '-mtune=generic'. Try: 'cc1plus --help' make: [Output/Solver.bc] Error 1 (ignored) /home/foad/llvm/objdir-svn/Debug/bin/llvm-ld -link-as-library -disable-opt Output/Main.bc Output/Solver.bc -o Output/minisat.linked.rbc llvm-ld: error: Cannot find linker input 'Output/Main.bc' make: [Output/minisat.linked.rbc] Error 1 (ignored) /home/foad/llvm/objdir-svn/Debug/bin/opt -std-compile-opts -info-output-file=/home/foad/svn/llvm-project/test-suite/trunk/MultiSource/Applications/minisat/Output/minisat.linked.bc.info -stats -time-passes Output/minisat.linked.rbc -o Output/minisat.linked.bc -f /home/foad/llvm/objdir-svn/Debug/bin/opt: could not open file make: [Output/minisat.linked.bc] Error 1 (ignored) [... lots more errors like this ...] I guess this is because the test suite is trying to run "llvm-gcc -mllvm -disable-llvm-optzns", which never seems to work, because llvm-gcc mangles the command line before it gets to cc1plus. Is it just me having this problem? How can I fix it? Thanks, Jay. 
From benl at google.com Thu Feb 12 06:59:25 2009 From: benl at google.com (Ben Laurie) Date: Thu, 12 Feb 2009 12:59:25 +0000 Subject: [LLVMdev] Add -> operator to ImmutableSet::iterator Message-ID: <1b587cab0902120459v6a903122x617abcaa13dd152@mail.gmail.com> What it says on the tin... -------------- next part -------------- A non-text attachment was scrubbed... Name: mg.diff Type: application/octet-stream Size: 609 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090212/709956df/attachment.obj From baldrick at free.fr Thu Feb 12 07:28:39 2009 From: baldrick at free.fr (Duncan Sands) Date: Thu, 12 Feb 2009 14:28:39 +0100 Subject: [LLVMdev] problems running test suite (-mllvm -disable-llvm-optzns) In-Reply-To: References: Message-ID: <200902121428.40153.baldrick@free.fr> Hi, > I'm trying to run some of the test suite using the instructions here: > > http://llvm.org/docs/TestingGuide.html#quicktestsuite > > I've built llvm myself, but I'm using pre-built binaries of llvm-gcc > (from http://llvm.org/prereleases/2.5/llvm-gcc4.2-2.5-x86-linux-RHEL4.tar.gz). > > Here's what happens: the llvm testsuite (from svn, right?) uses features that are not available in the 2.5 prerelease candidate. Use the prerelease version of llvm too. Ciao, Duncan. > > foad at debian:~/svn/llvm-project/test-suite/trunk$ ./configure > --with-llvmgccdir=/home/foad/llvm/llvm-gcc4.2-2.5-x86-linux-RHEL4 > --with-llvmobj=/home/foad/llvm/objdir-svn > --with-llvmsrc=/home/foad/svn/llvm-project/llvm/trunk > [...] 
> > Is it just me having this problem? > > How can I fix it? > > Thanks, > Jay. > From anton at korobeynikov.info Thu Feb 12 07:31:10 2009 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Thu, 12 Feb 2009 16:31:10 +0300 Subject: [LLVMdev] problems running test suite (-mllvm -disable-llvm-optzns) In-Reply-To: References: Message-ID: Hello, Jay > I guess this is because the test suite is trying to run "llvm-gcc > -mllvm -disable-llvm-optzns", which never seems to work, because > llvm-gcc mangles the command line before it gets to cc1plus. That's correct. The driver changes the order of the options provided. You need to provide this option to cc1 / cc1plus directly. > Is it just me having this problem? No. > How can I fix it? No idea, but I guess you can either hack on the gcc specs or not use this option at all - this is intended only for internal debugging purposes, so I doubt anyone will be really interested in fixing this. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From baldrick at free.fr Thu Feb 12 08:03:51 2009 From: baldrick at free.fr (Duncan Sands) Date: Thu, 12 Feb 2009 15:03:51 +0100 Subject: [LLVMdev] problems running test suite (-mllvm -disable-llvm-optzns) In-Reply-To: References: Message-ID: <200902121503.51582.baldrick@free.fr> Hi, > > I guess this is because the test suite is trying to run "llvm-gcc > > -mllvm -disable-llvm-optzns", which never seems to work, because > > llvm-gcc mangles the command line before it gets to cc1plus. > That's correct. The driver changes the order of the options provided. > You need to provide this option to cc1 / cc1plus directly Dan fixed this recently in svn IIRC - I don't think it's in the prerelease candidate. > > > Is it just me having this problem? > No.
Yes :) > > How can I fix it? > No idea, but I guess you can either hack on gcc specs. Or don't use > this option at all - this is intended only for internal debugging > purposes, so I doubt anyone will be really interested in fixing this. Ciao, Duncan. From baldrick at free.fr Thu Feb 12 08:57:59 2009 From: baldrick at free.fr (Duncan Sands) Date: Thu, 12 Feb 2009 15:57:59 +0100 Subject: [LLVMdev] 2.5 Pre-release1 available for testing Message-ID: <200902121557.59884.baldrick@free.fr> PS: Forgot to say that this was with objdir = srcdir for llvm; objdir != srcdir for llvm-gcc; languages c,c++,fortran. From dekruijf at cs.wisc.edu Thu Feb 12 08:58:52 2009 From: dekruijf at cs.wisc.edu (Marc de Kruijf) Date: Thu, 12 Feb 2009 08:58:52 -0600 Subject: [LLVMdev] Unnatural loops with O0 In-Reply-To: References: <200806111527.15386.brandner@complang.tuwien.ac.at> <6DFA37C7-C2F9-46B7-8658-6F809AC916CB@nondot.org> Message-ID: Hi Chris, Is there a compelling reason why llvm-gcc does not always produce natural loops. Is it a code size issue, or are there performance implications as well? I am seeing a simple 'while' loop compiled to an unnatural loop, without any gotos, breaks, or continues. What is the reason for this? Marc On Wed, Feb 11, 2009 at 6:51 PM, Chris Lattner wrote: > > On Feb 11, 2009, at 1:13 PM, Marc de Kruijf wrote: > > > I am reviving this thread because I am seeing the same thing > > (unnatural loops produced by llvm-gcc), but it is not limited to -O0 > > -- I am seeing it for -O2 and -O3 as well. > > Some of my research work is relying on LoopInfo to provide loop > > information for all loops, but it is missing these loops. Is there > > any work in the pipeline that aims to fix this? > > Not that I know of. There has been a project on the open projects > list to write a pass that converts all loops to natural loops (through > code duplication). That would be a nice and self-contained project if > anyone is interested. 
> > -Chris > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090212/6364bba2/attachment.html From marks at dcs.gla.ac.uk Thu Feb 12 09:01:32 2009 From: marks at dcs.gla.ac.uk (Mark Shannon) Date: Thu, 12 Feb 2009 15:01:32 +0000 Subject: [LLVMdev] direct calls to inttoptr constants In-Reply-To: References: <0A1521D7F5274AAB9FC7AD18516F42DB@dev> <4992C8E4.50804@dcs.gla.ac.uk> Message-ID: <499439CC.5060307@dcs.gla.ac.uk> Tobias, I've looked into this a bit more. You are right. The confusion arose as I have two versions of my compiler: The ahead-of-time compiler uses symbolic info and does the right thing. The JIT compiler uses runtime addresses (in effect integers) and when I examined the code in the debugger I found that LLVM produces indirect calls, like this:

  mov $0x8153c8c,%eax
  call *%eax

Sadly, however, I have no idea how to fix this :( but I will try and investigate. Do you have any ideas yet? Mark. Tobias wrote: > Hello Mark, > > I've followed your advice and changed the IR to: > %0 = call i32 inttoptr (i32 12345678 to i32 (i32)*)(i32 0) nounwind > the call is still indirect. > > IMHO llc does not call it directly because the address is neither > a globalvalue (JIT) nor an external symbol. > That's why it uses a fallback mechanism to call it indirectly > assuming the address is not constant and is calculated at runtime. > > tobias > > Mark wrote: >> I'm doing something similar (I use LLVM as part of my JIT compiler) and >> if I remember correctly, LLVM does the correct thing. >> >> I think you need to try changing the i64 value to an i32 value. >> If that doesn't work you could also try replacing the tail call with a >> normal call. >> >> >> Mark.
> > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From baldrick at free.fr Thu Feb 12 09:55:48 2009 From: baldrick at free.fr (Duncan Sands) Date: Thu, 12 Feb 2009 16:55:48 +0100 Subject: [LLVMdev] Unnatural loops with O0 In-Reply-To: References: Message-ID: <200902121655.49444.baldrick@free.fr> Hi Marc, > Is there a compelling reason why llvm-gcc does not always produce natural > loops. Is it a code size issue, or are there performance implications as > well? I am seeing a simple 'while' loop compiled to an unnatural loop, > without any gotos, breaks, or continues. What is the reason for this? is it already an unnatural loop when it comes out of the gcc parts of llvm-gcc (you can check this by compiling with: -O0 -emit-llvm)? Or is it llvm itself that creates the unnatural loops? Ciao, Duncan. From baldrick at free.fr Thu Feb 12 08:54:47 2009 From: baldrick at free.fr (Duncan Sands) Date: Thu, 12 Feb 2009 15:54:47 +0100 Subject: [LLVMdev] 2.5 Pre-release1 available for testing In-Reply-To: References: Message-ID: <200902121554.47655.baldrick@free.fr> Hi Tanya, some test results (testrun.log and report.nightly.txt attached). Platform: x86-64-linux (ubuntu), system gcc is gcc-4.3. Release build (= default) for llvm and llvm-gcc; PIC not enabled; in fact the only non-default option was --disable-multilib, which was for llvm-gcc. I had to kill off some of the exception handling tests because they got into some kind of infinite loop. This may be because the gcc-4.3 libgcc unwind code is somehow incompatible with llvm exception handling (PR2998). If this is a problem for the release, I can try to solve it ASAP. Ciao, Duncan. PS: x86-32-linux testresults coming next! -------------- next part -------------- A non-text attachment was scrubbed... 
Name: testrun.log Type: text/x-log Size: 344708 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090212/5aa42b90/attachment-0001.bin
-------------- next part --------------
Program | GCCAS Bytecode LLC compile LLC-BETA compile JIT codegen | GCC CBE LLC LLC-BETA JIT | GCC/CBE GCC/LLC GCC/LLC-BETA LLC/LLC-BETA
MultiSource/Applications/Burg/burg | 0.4000 104232 0.1440 * 0.2880 | 0.00 0.00 0.00 * 0.66 | - - n/a n/a
MultiSource/Applications/ClamAV/clamscan | 4.3242 1354352 1.9801 * 1.1600 | 0.25 0.18 0.25 * 3.49 | 1.39 1.00 n/a n/a
MultiSource/Applications/JM/ldecod/ldecod | 2.1521 612540 0.8600 * 0.6960 | 0.47 0.45 0.49 * 2.34 | 1.04 0.96 n/a n/a
MultiSource/Applications/JM/lencod/lencod | 4.5162 1313180 1.8561 * 1.3960 | 8.16 8.96 8.47 * 14.48 | 0.91 0.96 n/a n/a
MultiSource/Applications/SIBsim4/SIBsim4 | 0.3960 82688 0.1280 * 0.1400 | 4.49 4.86 4.84 * 5.45 | 0.92 0.93 n/a n/a
MultiSource/Applications/SPASS/SPASS | 5.6243 1625108 1.7041 * 1.1120 | 10.84 10.33 10.60 * 15.08 | 1.05 1.02 n/a n/a
MultiSource/Applications/aha/aha | 0.0560 6848 0.0080 * 0.0160 | 2.14 2.80 2.70 * 3.02 | 0.76 0.79 n/a n/a
MultiSource/Applications/d/make_dparser | 1.0240 313340 0.3320 * 0.3200 | 0.03 0.03 0.06 * 1.02 | - - n/a n/a
MultiSource/Applications/hbd/hbd | 0.2120 78108 0.1200 * 0.1000 | 0.00 0.00 0.00 * 0.28 | - - n/a n/a
MultiSource/Applications/hexxagon/hexxagon | 0.1520 42900 0.0520 * 0.0520 | 11.97 10.65 9.45 * 9.14 | 1.12 1.27 n/a n/a
MultiSource/Applications/kimwitu++/kc | 4.4162 1764296 1.6801 * 1.0480 | 0.15 0.22 0.16 * 4.03 | 0.68 0.94 n/a n/a
MultiSource/Applications/lambda-0.1.3/lambda | 0.2000 67428 0.0920 * 0.1200 | 5.52 5.38 5.42 * 6.52 | 1.03 1.02 n/a n/a
MultiSource/Applications/lemon/lemon | 0.2920 99572 0.1240 * 0.1480 | 1.76 1.51 1.83 * 117.05 | 1.17 0.96 n/a n/a
MultiSource/Applications/lua/lua | 1.3760 545616 0.8040 * * | 12.57 * 13.60 * * | n/a 0.92 n/a n/a
MultiSource/Applications/minisat/minisat | 0.2200 44712 0.0600 * * | 9.91 9.26 9.31 * * | 1.07 1.06 n/a n/a
MultiSource/Applications/obsequi/Obsequi | 0.3600 57720 0.1120 * 0.0920 | 2.41 2.90 2.92 * 3.23 | 0.83 0.83 n/a n/a
MultiSource/Applications/oggenc/oggenc | 1.3120 806736 0.4200 * 0.3120 | 0.18 0.32 0.17 * 1.17 | 0.56 1.06 n/a n/a
MultiSource/Applications/sgefa/sgefa | 0.1000 16504 0.0120 * 0.0280 | 0.64 0.74 0.88 * 1.00 | 0.86 0.73 n/a n/a
MultiSource/Applications/siod/siod | 0.7320 350024 0.6800 * 0.2640 | 3.84 3.52 3.57 * 4.28 | 1.09 1.08 n/a n/a
MultiSource/Applications/spiff/spiff | 0.1680 54732 0.0680 * * | 0.68 0.79 0.69 * * | 0.86 0.99 n/a n/a
MultiSource/Applications/sqlite3/sqlite3 | * * * * * | 0.00 * * * * | n/a n/a n/a n/a
MultiSource/Applications/treecc/treecc | 0.6960 341008 0.5400 * 0.1000 | 0.00 0.00 0.00 * 0.31 | - - n/a n/a
MultiSource/Applications/viterbi/viterbi | 0.0440 5020 0.0080 * 0.0080 | 5.44 5.40 11.88 * 11.67 | 1.01 0.46 n/a n/a
MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000 | 2.0841 396104 0.5840 * * | 4.51 5.75 5.59 * * | 0.78 0.81 n/a n/a
MultiSource/Benchmarks/ASC_Sequoia/AMGmk/AMGmk | 0.1320 12756 0.0200 * 0.0280 | 17.57 18.12 18.25 * 18.43 | 0.97 0.96 n/a n/a
MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk | 0.0280 6248 0.0080 * 0.0120 | 8.42 8.72 9.83 * 9.70 | 0.97 0.86 n/a n/a
MultiSource/Benchmarks/ASC_Sequoia/IRSmk/IRSmk | 0.0200 5312 0.0080 * 0.0120 | 9.34 7.00 6.90 * 7.05 | 1.33 1.35 n/a n/a
MultiSource/Benchmarks/BitBench/drop3/drop3 | 0.0200 3380 0.0040 * 0.0120 | 0.51 0.58 0.45 * 0.52 | 0.88 1.13 n/a n/a
MultiSource/Benchmarks/BitBench/five11/five11 | 0.0120 2732 0.0040 * 0.0080 | 3.42 3.37 3.43 * 3.68 | 1.01 1.00 n/a n/a
MultiSource/Benchmarks/BitBench/uudecode/uudecode | 0.0080 2628 0.0040 * 0.0040 | 0.10 0.12 0.12 * 0.18 | 0.83 0.83 n/a n/a
MultiSource/Benchmarks/BitBench/uuencode/uuencode | 0.0080 2700 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.02 | - - n/a n/a
MultiSource/Benchmarks/Fhourstones-3.1/fhourstones3.1 | 0.0440 6984 0.0120 * 0.0160 | 1.64 1.69 1.66 * 2.08 | 0.97 0.99 n/a n/a
MultiSource/Benchmarks/Fhourstones/fhourstones | 0.0320 10160 0.0160 * 0.0200 | 1.30 1.40 1.44 * 1.48 | 0.93 0.90 n/a n/a
MultiSource/Benchmarks/FreeBench/analyzer/analyzer | 0.0360 9168 0.0200 * 0.0360 | 0.09 0.15 0.11 * 0.20 | - - n/a n/a
MultiSource/Benchmarks/FreeBench/distray/distray | 0.0280 8056 0.0120 * 0.0080 | 0.20 0.22 0.15 * 0.19 | 0.91 1.33 n/a n/a
MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow | 0.0600 17632 0.0360 * 0.0159 | 0.32 0.38 0.35 * 0.45 | 0.84 0.91 n/a n/a
MultiSource/Benchmarks/FreeBench/mason/mason | 0.0160 5832 0.0040 * 0.0040 | 0.43 0.21 0.19 * 0.21 | 2.05 2.26 n/a n/a
MultiSource/Benchmarks/FreeBench/neural/neural | 0.0520 8448 0.0040 * 0.0280 | 0.23 0.24 0.24 * 0.37 | 0.96 0.96 n/a n/a
MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2 | 0.0280 10200 0.0200 * 0.0480 | 0.25 0.32 0.26 * 0.37 | 0.78 0.96 n/a n/a
MultiSource/Benchmarks/FreeBench/pifft/pifft | 0.2160 52460 0.1040 * 0.0840 | 0.19 0.13 0.15 * 0.40 | 1.46 1.27 n/a n/a
MultiSource/Benchmarks/MallocBench/cfrac/cfrac | 0.1960 92460 0.1120 * 0.1080 | 1.70 1.67 1.72 * 1.99 | 1.02 0.99 n/a n/a
MultiSource/Benchmarks/MallocBench/espresso/espresso | 1.2200 393660 0.5920 * 0.4360 | 0.58 0.58 0.66 * 1.77 | 1.00 0.88 n/a n/a
MultiSource/Benchmarks/MallocBench/gs/gs | 1.0880 431608 * * * | 0.00 * * * * | n/a n/a n/a n/a
MultiSource/Benchmarks/McCat/01-qbsort/qbsort | 0.0080 3404 0.0080 * 0.0040 | 0.07 0.06 0.06 * 0.15 | - - n/a n/a
MultiSource/Benchmarks/McCat/03-testtrie/testtrie | 0.0160 3344 0.0120 * 0.0080 | 0.01 0.00 0.01 * 0.03 | - - n/a n/a
MultiSource/Benchmarks/McCat/04-bisect/bisect | 0.0160 3880 0.0080 * 0.0080 | 0.10 0.20 0.21 * 0.14 | 0.50 0.48 n/a n/a
MultiSource/Benchmarks/McCat/05-eks/eks | 0.1040 6624 0.0080 * 0.0120 | 0.00 0.00 0.00 * 0.04 | - - n/a n/a
MultiSource/Benchmarks/McCat/08-main/main | 0.0960 11020 0.0079 * 0.0239 | 0.04 0.02 0.02 * 0.10 | - - n/a n/a
MultiSource/Benchmarks/McCat/09-vor/vor | 0.0840 23308 0.0480 * 0.0560 | 0.13 0.16 0.19 * 0.24 | 0.81 0.68 n/a n/a
MultiSource/Benchmarks/McCat/12-IOtest/iotest | 0.0160 2408 0.0040 * 0.0040 | 0.30 0.21 0.20 * 0.28 | 1.43 1.50 n/a n/a
MultiSource/Benchmarks/McCat/15-trie/trie | 0.0160 3104 0.0120 * 0.0120 | 0.00 0.00 0.00 * 0.02 | - - n/a n/a
MultiSource/Benchmarks/McCat/17-bintr/bintr | 0.0040 3468 0.0120 * 0.0039 | 0.09 0.10 0.11 * 0.13 | - - n/a n/a
MultiSource/Benchmarks/McCat/18-imp/imp | 0.0600 19364 0.0400 * 0.0360 | 0.06 0.07 0.10 * 0.18 | - - n/a n/a
MultiSource/Benchmarks/MiBench/automotive-basicmath/automotive-basicmath | 0.0240 4916 0.0120 * 0.0080 | 0.46 0.47 0.45 * 0.52 | 0.98 1.02 n/a n/a
MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount | 0.0160 3476 0.0040 * 0.0040 | 0.11 0.12 0.16 * 0.17 | 0.92 0.69 n/a n/a
MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan | 0.1960 52076 0.0640 * 0.0720 | 0.05 0.06 0.06 * 0.26 | - - n/a n/a
MultiSource/Benchmarks/MiBench/consumer-jpeg/consumer-jpeg | 1.0520 208284 0.2720 * 0.1320 | 0.01 0.00 0.01 * 0.40 | - - n/a n/a
MultiSource/Benchmarks/MiBench/consumer-lame/consumer-lame | 0.9920 312056 0.3440 * 0.2600 | 0.29 0.30 0.29 * 1.09 | 0.97 1.00 n/a n/a
MultiSource/Benchmarks/MiBench/consumer-typeset/consumer-typeset | 5.0483 1355260 1.9601 * 1.6041 | 0.30 0.20 0.20 * 4.49 | 1.50 1.50 n/a n/a
MultiSource/Benchmarks/MiBench/network-dijkstra/network-dijkstra | 0.0160 3244 0.0080 * 0.0080 | 0.06 0.04 0.05 * 0.08 | - - n/a n/a
MultiSource/Benchmarks/MiBench/network-patricia/network-patricia | 0.0160 4132 0.0040 * 0.0080 | 0.21 0.15 0.17 * 0.22 | 1.40 1.24 n/a n/a
MultiSource/Benchmarks/MiBench/office-ispell/office-ispell | 0.4160 121944 0.2160 * 0.0400 | 0.00 0.00 0.00 * 0.17 | - - n/a n/a
MultiSource/Benchmarks/MiBench/office-stringsearch/office-stringsearch | 0.0280 13004 0.0000 * 0.0040 | 0.00 0.00 0.00 * 0.01 | - - n/a n/a
MultiSource/Benchmarks/MiBench/security-blowfish/security-blowfish | 0.0800 30600 0.0480 * 0.0120 | 0.00 0.00 0.00 * 0.04 | - - n/a n/a
MultiSource/Benchmarks/MiBench/security-rijndael/security-rijndael | 0.1680 47948 0.0400 * 0.0240 | 0.03 0.04 0.04 * 0.11 | - - n/a n/a
MultiSource/Benchmarks/MiBench/security-sha/security-sha | 0.0080 4836 0.0080 * 0.0160 | 0.04 0.02 0.03 * 0.11 | - - n/a n/a
MultiSource/Benchmarks/MiBench/telecomm-CRC32/telecomm-CRC32 | 0.0120 3136 0.0040 * 0.0000 | 0.31 0.29 0.27 * 0.31 | 1.07 1.15 n/a n/a
MultiSource/Benchmarks/MiBench/telecomm-FFT/telecomm-fft | 0.0200 4696 0.0120 * 0.0160 | 0.06 0.05 0.08 * 0.09 | - - n/a n/a
MultiSource/Benchmarks/MiBench/telecomm-adpcm/telecomm-adpcm | 0.0200 2328 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
MultiSource/Benchmarks/MiBench/telecomm-gsm/telecomm-gsm | 0.2720 78528 0.0800 * 0.0640 | 0.21 0.22 0.26 * 0.44 | 0.95 0.81 n/a n/a
MultiSource/Benchmarks/NPB-serial/is/is | 0.0160 5840 * * * | 0.00 * * * * | n/a n/a n/a n/a
MultiSource/Benchmarks/Olden/bh/bh | 0.0680 14716 0.0360 * 0.0240 | 1.84 1.61 1.59 * 1.63 | 1.14 1.16 n/a n/a
MultiSource/Benchmarks/Olden/bisort/bisort | 0.0160 3512 0.0040 * 0.0120 | 0.96 0.93 1.03 * 1.28 | 1.03 0.93 n/a n/a
MultiSource/Benchmarks/Olden/em3d/em3d | 0.0160 6084 0.0120 * 0.0159 | 3.44 4.99 3.60 * 3.78 | 0.69 0.96 n/a n/a
MultiSource/Benchmarks/Olden/health/health | 0.0280 7352 0.0080 * 0.0120 | 0.70 0.71 0.74 * 0.79 | 0.99 0.95 n/a n/a
MultiSource/Benchmarks/Olden/mst/mst | 0.0120 4384 0.0080 * 0.0040 | 0.16 0.13 0.14 * 0.20 | 1.23 1.14 n/a n/a
MultiSource/Benchmarks/Olden/perimeter/perimeter | 0.0200 12920 0.0360 * 0.0360 | 0.36 0.38 0.37 * 0.45 | 0.95 0.97 n/a n/a
MultiSource/Benchmarks/Olden/power/power | 0.0240 8324 0.0080 * 0.0240 | 2.23 1.87 1.62 * 1.76 | 1.19 1.38 n/a n/a
MultiSource/Benchmarks/Olden/treeadd/treeadd | 0.0040 1760 0.0040 * 0.0000 | 6.83 0.47 0.49 * 0.51 | 14.53 13.94 n/a n/a
MultiSource/Benchmarks/Olden/tsp/tsp | 0.0240 6864 0.0080 * 0.0160 | 1.41 1.51 1.24 * 2.07 | 0.93 1.14 n/a n/a
MultiSource/Benchmarks/Olden/voronoi/voronoi | 0.0680 14196 0.0120 * 0.0239 | 0.53
0.44 0.41 * 0.48 | 1.20 1.29 n/a n/a MultiSource/Benchmarks/OptimizerEval/optimizer-eval | 0.0760 28744 0.0240 * 0.0520 | 103.62 111.14 110.89 * 102.05 | 0.93 0.93 n/a n/a MultiSource/Benchmarks/Prolangs-C++/NP/np | 0.0000 1164 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C++/city/city | 0.0520 16204 0.0360 * 0.0120 | 0.01 0.00 0.00 * 0.04 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C++/deriv1/deriv1 | 0.0080 7044 0.0000 * 0.0040 | 0.00 0.00 0.00 * 0.02 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C++/deriv2/deriv2 | 0.0320 8464 0.0120 * 0.0080 | 0.00 0.00 0.00 * 0.02 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C++/employ/employ | 0.0400 13052 0.0320 * 0.0119 | 0.01 0.00 0.02 * 0.04 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C++/family/family | 0.0080 3356 0.0000 * 0.0080 | 0.00 0.00 0.00 * 0.01 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C++/fsm/fsm | 0.0080 1972 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C++/garage/garage | 0.0160 6272 0.0040 * 0.0200 | 0.00 0.00 0.00 * 0.04 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C++/life/life | 0.0240 6168 0.0000 * 0.0040 | 1.78 1.61 1.79 * 1.87 | 1.11 0.99 n/a n/a MultiSource/Benchmarks/Prolangs-C++/objects/objects | 0.0840 10068 0.0159 * 0.0000 | 0.00 0.00 0.00 * 0.02 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C++/ocean/ocean | 0.0360 8724 0.0080 * 0.0080 | 0.14 0.13 0.14 * 0.23 | 1.08 1.00 n/a n/a MultiSource/Benchmarks/Prolangs-C++/office/office | 0.0120 5276 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.01 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C++/primes/primes | 0.0080 1560 0.0000 * 0.0000 | 0.41 0.40 0.40 * 0.46 | 1.02 1.02 n/a n/a MultiSource/Benchmarks/Prolangs-C++/shapes/shapes | 0.0360 12276 0.0159 * 0.0159 | 0.00 0.00 0.00 * 0.04 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C++/simul/simul | 0.0320 4160 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.02 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C++/trees/trees | 0.0520 
11176 0.0200 * 0.0080 | 0.00 0.00 0.00 * 0.04 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C++/vcirc/vcirc | 0.0200 1796 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C/TimberWolfMC/timberwolfmc | 1.5400 529912 0.7800 * 0.0920 | 0.00 0.00 0.00 * 0.39 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C/agrep/agrep | 0.3000 86440 0.1880 * 0.0480 | 0.00 0.00 0.00 * 0.16 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C/allroots/allroots | 0.0120 3108 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.01 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C/archie-client/archie | 0.1400 47064 0.0680 * 0.0160 | 0.00 0.00 0.00 * 0.02 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C/assembler/assembler | 0.1840 60336 0.1000 * 0.0680 | 0.00 0.00 0.00 * 0.21 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C/bison/mybison | 0.3320 110756 0.1920 * 0.2520 | 0.00 0.00 0.00 * 0.70 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C/cdecl/cdecl | 0.0840 48424 0.0520 * 0.0760 | 0.00 0.00 0.00 * 0.22 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C/compiler/compiler | 0.1080 36612 0.1240 * 0.0119 | 0.00 0.00 0.00 * 0.04 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C/fixoutput/fixoutput | 0.0080 5964 0.0080 * 0.0120 | 0.00 0.00 0.00 * 0.04 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C/football/football | 0.2960 83784 0.1480 * 0.0080 | 0.00 0.00 0.00 * 0.02 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C/gnugo/gnugo | 0.0960 35168 0.0720 * 0.0200 | 0.00 0.00 0.00 * 0.08 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C/loader/loader | 0.1120 28788 0.0440 * 0.0360 | 0.00 0.00 0.00 * 0.09 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C/plot2fig/plot2fig | 0.0320 14084 0.0200 * 0.0720 | 0.00 0.00 0.00 * 0.18 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C/simulator/simulator | 0.2600 62760 0.1320 * 0.0080 | 0.00 0.00 0.00 * 0.04 | - - n/a n/a MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail | 0.0880 40688 0.1080 * 0.0520 | 0.00 0.00 0.00 * 0.19 | - - n/a n/a 
MultiSource/Benchmarks/Prolangs-C/unix-tbl/unix-tbl | 0.2280 75964 0.1560 * 0.0120 | 0.00 0.00 0.00 * 0.02 | - - n/a n/a MultiSource/Benchmarks/Ptrdist/anagram/anagram | 0.0240 7184 0.0040 * 0.0120 | 1.81 1.40 1.33 * 1.44 | 1.29 1.36 n/a n/a MultiSource/Benchmarks/Ptrdist/bc/bc | 0.3240 123592 0.2400 * 0.2040 | 0.83 0.80 0.80 * 1.35 | 1.04 1.04 n/a n/a MultiSource/Benchmarks/Ptrdist/ft/ft | 0.0520 6996 0.0160 * 0.0080 | 1.29 1.34 1.30 * 1.42 | 0.96 0.99 n/a n/a MultiSource/Benchmarks/Ptrdist/ks/ks | 0.0440 11604 0.0199 * 0.0240 | 1.98 2.27 2.33 * 2.13 | 0.87 0.85 n/a n/a MultiSource/Benchmarks/Ptrdist/yacr2/yacr2 | 0.1480 43404 0.0880 * 0.0600 | 0.92 0.98 1.24 * 1.51 | 0.94 0.74 n/a n/a MultiSource/Benchmarks/SciMark2-C/scimark2 | 0.0680 13920 0.0320 * 0.0240 | 25.91 29.52 27.68 * 27.88 | 0.88 0.94 n/a n/a MultiSource/Benchmarks/Trimaran/enc-3des/enc-3des | 0.1640 22428 0.0560 * 0.0520 | 2.01 2.05 2.21 * 2.06 | 0.98 0.91 n/a n/a MultiSource/Benchmarks/Trimaran/enc-md5/enc-md5 | 0.0480 7588 0.0120 * 0.0200 | 2.30 2.32 2.12 * 2.08 | 0.99 1.08 n/a n/a MultiSource/Benchmarks/Trimaran/enc-pc1/enc-pc1 | 0.0240 4072 0.0080 * 0.0040 | 0.80 0.38 0.78 * 0.89 | 2.11 1.03 n/a n/a MultiSource/Benchmarks/Trimaran/enc-rc4/enc-rc4 | 0.0120 2284 0.0000 * 0.0040 | 1.25 1.38 1.59 * 1.46 | 0.91 0.79 n/a n/a MultiSource/Benchmarks/Trimaran/netbench-crc/netbench-crc | 0.0120 32640 0.0040 * * | 0.91 1.09 1.07 * * | 0.83 0.85 n/a n/a MultiSource/Benchmarks/Trimaran/netbench-url/netbench-url | 0.0320 39964 0.0039 * 0.0160 | 3.39 3.15 3.33 * 3.45 | 1.08 1.02 n/a n/a MultiSource/Benchmarks/VersaBench/8b10b/8b10b | 0.0040 2052 0.0040 * 0.0040 | 8.86 5.80 5.47 * 5.39 | 1.53 1.62 n/a n/a MultiSource/Benchmarks/VersaBench/beamformer/beamformer | 0.0120 4924 0.0040 * 0.0080 | 1.36 2.28 1.73 * 1.78 | 0.60 0.79 n/a n/a MultiSource/Benchmarks/VersaBench/bmm/bmm | 0.0160 2664 0.0000 * 0.0040 | 2.23 2.40 2.29 * 2.91 | 0.93 0.97 n/a n/a MultiSource/Benchmarks/VersaBench/dbms/dbms | 0.0920 35756 0.0640 
* 0.0520 | 2.15 2.26 2.05 * 2.50 | 0.95 1.05 n/a n/a MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes | 0.0600 9324 0.0080 * 0.0040 | 2.95 2.82 2.83 * 3.00 | 1.05 1.04 n/a n/a MultiSource/Benchmarks/llubenchmark/llu | 0.0080 3428 0.0080 * 0.0040 | 7.89 8.33 8.22 * 9.28 | 0.95 0.96 n/a n/a MultiSource/Benchmarks/mediabench/adpcm/rawcaudio/rawcaudio | 0.0080 2284 0.0040 * 0.0000 | 0.00 0.00 0.00 * 0.01 | - - n/a n/a MultiSource/Benchmarks/mediabench/adpcm/rawdaudio/rawdaudio | 0.0040 2328 0.0000 * 0.0040 | 0.00 0.00 0.00 * 0.01 | - - n/a n/a MultiSource/Benchmarks/mediabench/g721/g721encode/encode | 0.0600 12768 0.0280 * 0.0320 | 0.06 0.05 0.18 * 0.16 | - - n/a n/a MultiSource/Benchmarks/mediabench/gsm/toast/toast | 0.2600 78528 0.0800 * 0.0600 | 0.08 0.03 0.03 * 0.18 | - - n/a n/a MultiSource/Benchmarks/mediabench/jpeg/jpeg-6a/cjpeg | 1.1080 165044 0.2120 * 0.1560 | 0.00 0.00 0.00 * 0.46 | - - n/a n/a MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode | 0.3680 96980 0.1640 * 0.1320 | 0.01 0.02 0.02 * 0.43 | - - n/a n/a MultiSource/Benchmarks/sim/sim | 0.0840 27636 0.0280 * 0.0480 | 5.39 5.61 5.31 * 5.46 | 0.96 1.02 n/a n/a MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4 | 6.7364 2382900 1.8241 * 1.5560 | 0.55 * 0.73 * 5.28 | n/a 0.75 n/a n/a SingleSource/Benchmarks/Adobe-C++/functionobjects | 0.1560 35352 0.0400 * 0.0520 | 3.92 5.01 3.55 * 3.91 | 0.78 1.10 n/a n/a SingleSource/Benchmarks/Adobe-C++/loop_unroll | 0.8440 295780 0.4600 * 0.4640 | 2.48 1.84 1.91 * 3.08 | 1.35 1.30 n/a n/a SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding | 0.8120 181520 0.2960 * 0.3200 | 1.17 1.08 1.89 * 3.49 | 1.08 0.62 n/a n/a SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant | 0.5320 132424 0.2000 * 0.1800 | 2.98 2.36 3.16 * 3.91 | 1.26 0.94 n/a n/a SingleSource/Benchmarks/Adobe-C++/stepanov_abstraction | 0.2080 36836 0.0480 * 0.0600 | 5.03 7.41 5.04 * 5.29 | 0.68 1.00 n/a n/a SingleSource/Benchmarks/Adobe-C++/stepanov_vector | 0.2800 39688 0.0560 * 
0.0560 | 3.07 5.50 3.06 * 3.05 | 0.56 1.00 n/a n/a SingleSource/Benchmarks/BenchmarkGame/fannkuch | 0.0120 2040 0.0040 * 0.0040 | 4.04 4.01 3.72 * 3.53 | 1.01 1.09 n/a n/a SingleSource/Benchmarks/BenchmarkGame/fasta | 0.0080 3044 0.0040 * 0.0040 | 1.14 1.56 1.18 * 1.23 | 0.73 0.97 n/a n/a SingleSource/Benchmarks/BenchmarkGame/n-body | 0.0040 2772 0.0040 * 0.0120 | 1.25 1.18 1.37 * 1.49 | 1.06 0.91 n/a n/a SingleSource/Benchmarks/BenchmarkGame/nsieve-bits | 0.0040 1128 0.0000 * 0.0000 | 1.20 1.05 1.12 * 0.91 | 1.14 1.07 n/a n/a SingleSource/Benchmarks/BenchmarkGame/partialsums | 0.0120 1648 0.0000 * 0.0080 | 0.81 0.67 0.83 * 0.85 | 1.21 0.98 n/a n/a SingleSource/Benchmarks/BenchmarkGame/recursive | 0.0120 3656 0.0080 * 0.0080 | 0.74 1.07 0.98 * 1.17 | 0.69 0.76 n/a n/a SingleSource/Benchmarks/BenchmarkGame/spectral-norm | 0.0040 1872 0.0080 * 0.0000 | 2.16 2.20 1.41 * 1.37 | 0.98 1.53 n/a n/a SingleSource/Benchmarks/CoyoteBench/almabench | 0.0200 9648 0.0040 * 0.0000 | 11.14 10.66 17.16 * 17.81 | 1.05 0.65 n/a n/a SingleSource/Benchmarks/CoyoteBench/fftbench | 0.0440 14728 0.0119 * 0.0280 | 2.34 2.12 2.17 * 2.26 | 1.10 1.08 n/a n/a SingleSource/Benchmarks/CoyoteBench/huffbench | 0.0200 6260 0.0160 * 0.0080 | 19.41 20.18 17.45 * 18.86 | 0.96 1.11 n/a n/a SingleSource/Benchmarks/CoyoteBench/lpbench | 0.0280 3720 0.0080 * 0.0040 | 12.16 12.55 12.02 * 12.29 | 0.97 1.01 n/a n/a SingleSource/Benchmarks/Dhrystone/dry | 0.0120 1276 0.0000 * 0.0040 | 4.67 0.86 0.71 * 1.03 | 5.43 6.58 n/a n/a SingleSource/Benchmarks/Dhrystone/fldry | 0.0160 1496 0.0000 * 0.0040 | 4.67 1.53 0.78 * 1.10 | 3.05 5.99 n/a n/a SingleSource/Benchmarks/McGill/chomp | 0.0400 6480 0.0160 * 0.0120 | 1.29 1.05 1.40 * 1.46 | 1.23 0.92 n/a n/a SingleSource/Benchmarks/McGill/exptree | 0.0240 4504 0.0080 * 0.0120 | 0.00 0.00 0.00 * 0.02 | - - n/a n/a SingleSource/Benchmarks/McGill/misr | 0.0080 2724 0.0040 * 0.0040 | 0.32 0.26 0.34 * 0.30 | 1.23 0.94 n/a n/a SingleSource/Benchmarks/McGill/queens | 0.0000 
3128 0.0040 * 0.0080 | 2.84 2.55 3.03 * 2.88 | 1.11 0.94 n/a n/a SingleSource/Benchmarks/Misc-C++/bigfib | 0.1000 24736 0.0600 * * | 0.57 * 0.65 * * | n/a 0.88 n/a n/a SingleSource/Benchmarks/Misc-C++/mandel-text | 0.0000 1100 0.0040 * 0.0000 | 2.53 3.32 2.02 * 2.05 | 0.76 1.25 n/a n/a SingleSource/Benchmarks/Misc-C++/oopack_v1p8 | 0.0160 9516 0.0080 * 0.0120 | 0.19 0.30 0.22 * 0.25 | 0.63 0.86 n/a n/a SingleSource/Benchmarks/Misc-C++/ray | 0.0360 11196 0.0120 * 0.0119 | 5.81 5.95 4.00 * 4.02 | 0.98 1.45 n/a n/a SingleSource/Benchmarks/Misc-C++/sphereflake | 0.0400 7696 0.0080 * 0.0040 | 2.88 4.31 3.67 * 3.88 | 0.67 0.78 n/a n/a SingleSource/Benchmarks/Misc-C++/stepanov_container | 0.2600 46176 0.0720 * 0.0520 | 5.61 6.38 5.75 * 6.54 | 0.88 0.98 n/a n/a SingleSource/Benchmarks/Misc-C++/stepanov_v1p2 | 0.0640 10084 0.0080 * 0.0080 | 8.01 8.08 8.20 * 8.21 | 0.99 0.98 n/a n/a SingleSource/Benchmarks/Misc/ReedSolomon | 0.0800 8708 0.0240 * 0.0320 | 6.64 7.42 7.68 * 8.81 | 0.89 0.86 n/a n/a SingleSource/Benchmarks/Misc/fbench | 0.0120 5052 0.0080 * 0.0040 | 2.20 2.30 2.33 * 2.35 | 0.96 0.94 n/a n/a SingleSource/Benchmarks/Misc/ffbench | 0.0120 3836 0.0040 * 0.0080 | 1.22 1.13 1.18 * 1.12 | 1.08 1.03 n/a n/a SingleSource/Benchmarks/Misc/flops | 0.0320 6336 0.0080 * 0.0120 | 9.26 9.61 11.07 * 10.87 | 0.96 0.84 n/a n/a SingleSource/Benchmarks/Misc/flops-1 | 0.0000 1148 0.0040 * 0.0000 | 2.93 2.87 2.63 * 2.59 | 1.02 1.11 n/a n/a SingleSource/Benchmarks/Misc/flops-2 | 0.0040 1268 0.0000 * 0.0000 | 1.52 1.65 1.59 * 1.52 | 0.92 0.96 n/a n/a SingleSource/Benchmarks/Misc/flops-3 | 0.0000 1172 0.0000 * 0.0040 | 2.56 2.72 3.01 * 3.34 | 0.94 0.85 n/a n/a SingleSource/Benchmarks/Misc/flops-4 | 0.0040 1148 0.0000 * 0.0000 | 1.06 1.08 1.30 * 1.31 | 0.98 0.82 n/a n/a SingleSource/Benchmarks/Misc/flops-5 | 0.0040 1272 0.0000 * 0.0040 | 4.12 4.30 4.45 * 4.34 | 0.96 0.93 n/a n/a SingleSource/Benchmarks/Misc/flops-6 | 0.0040 1272 0.0000 * 0.0000 | 2.08 2.16 5.02 * 5.01 | 0.96 0.41 n/a n/a 
SingleSource/Benchmarks/Misc/flops-7 | 0.0000 1072 0.0000 * 0.0000 | 3.88 3.56 3.74 * 4.50 | 1.09 1.04 n/a n/a SingleSource/Benchmarks/Misc/flops-8 | 0.0080 1276 0.0040 * 0.0040 | 3.84 4.21 2.25 * 2.34 | 0.91 1.71 n/a n/a SingleSource/Benchmarks/Misc/himenobmtxpa | 0.0360 7880 0.0160 * 0.0080 | 1.50 2.09 1.89 * 1.93 | 0.72 0.79 n/a n/a SingleSource/Benchmarks/Misc/mandel | 0.0000 1072 0.0000 * 0.0040 | 0.99 0.88 0.93 * 0.93 | 1.12 1.06 n/a n/a SingleSource/Benchmarks/Misc/oourafft | 0.0520 10124 0.0120 * 0.0120 | 5.90 7.97 8.47 * 7.81 | 0.74 0.70 n/a n/a SingleSource/Benchmarks/Misc/perlin | 0.0200 4348 0.0040 * 0.0080 | 6.20 6.43 6.79 * 7.01 | 0.96 0.91 n/a n/a SingleSource/Benchmarks/Misc/pi | 0.0080 1096 0.0000 * 0.0000 | 0.81 0.75 0.80 * 0.89 | 1.08 1.01 n/a n/a SingleSource/Benchmarks/Misc/richards_benchmark | 0.0240 5288 0.0080 * 0.0040 | 1.12 1.19 1.17 * 1.41 | 0.94 0.96 n/a n/a SingleSource/Benchmarks/Misc/whetstone | 0.0200 3068 0.0040 * 0.0080 | 1.70 1.80 1.54 * 1.55 | 0.94 1.10 n/a n/a SingleSource/Benchmarks/Shootout-C++/ackermann | 0.0040 3444 0.0000 * 0.0000 | 0.76 0.97 1.02 * 1.08 | 0.78 0.75 n/a n/a SingleSource/Benchmarks/Shootout-C++/ary | 0.0280 4280 0.0040 * 0.0080 | 0.12 0.16 0.11 * 0.14 | 0.75 1.09 n/a n/a SingleSource/Benchmarks/Shootout-C++/ary2 | 0.0200 4736 0.0040 * 0.0040 | 0.11 0.12 0.12 * 0.14 | 0.92 0.92 n/a n/a SingleSource/Benchmarks/Shootout-C++/ary3 | 0.0160 4488 0.0040 * 0.0080 | 5.10 5.33 5.17 * 5.07 | 0.96 0.99 n/a n/a SingleSource/Benchmarks/Shootout-C++/except | 0.0120 4404 * * * | 0.27 * * * * | n/a n/a n/a n/a SingleSource/Benchmarks/Shootout-C++/fibo | 0.0040 3088 0.0000 * 0.0000 | 2.76 0.66 0.53 * 0.53 | 4.18 5.21 n/a n/a SingleSource/Benchmarks/Shootout-C++/hash | 0.0600 10768 0.0120 * 0.0320 | 0.68 0.70 0.69 * 0.82 | 0.97 0.99 n/a n/a SingleSource/Benchmarks/Shootout-C++/hash2 | 0.0760 14524 0.0080 * 0.0160 | 3.93 4.74 4.27 * 4.46 | 0.83 0.92 n/a n/a SingleSource/Benchmarks/Shootout-C++/heapsort | 0.0040 3008 0.0040 * 
0.0000 | 3.97 5.69 4.47 * 4.55 | 0.70 0.89 n/a n/a SingleSource/Benchmarks/Shootout-C++/hello | 0.0000 2740 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Benchmarks/Shootout-C++/lists | 0.0400 6356 0.0080 * 0.0079 | 8.87 8.86 8.66 * 8.77 | 1.00 1.02 n/a n/a SingleSource/Benchmarks/Shootout-C++/lists1 | 0.0480 7548 0.0120 * 0.0040 | 0.50 0.49 0.52 * 0.57 | 1.02 0.96 n/a n/a SingleSource/Benchmarks/Shootout-C++/matrix | 0.0120 4544 0.0040 * 0.0000 | 3.31 3.37 3.37 * 3.37 | 0.98 0.98 n/a n/a SingleSource/Benchmarks/Shootout-C++/methcall | 0.0080 4740 0.0040 * 0.0000 | 8.97 6.01 6.16 * 7.93 | 1.49 1.46 n/a n/a SingleSource/Benchmarks/Shootout-C++/moments | 0.0440 9252 0.0040 * 0.0079 | 0.18 0.19 0.17 * 0.19 | 0.95 1.06 n/a n/a SingleSource/Benchmarks/Shootout-C++/nestedloop | 0.0040 3548 0.0000 * 0.0040 | 11.47 0.00 0.18 * 0.27 | - 63.72 n/a n/a SingleSource/Benchmarks/Shootout-C++/objinst | 0.0160 5000 0.0080 * 0.0039 | 8.79 8.71 8.87 * 9.05 | 1.01 0.99 n/a n/a SingleSource/Benchmarks/Shootout-C++/random | 0.0040 3036 0.0040 * 0.0000 | 4.75 4.15 4.45 * 4.51 | 1.14 1.07 n/a n/a SingleSource/Benchmarks/Shootout-C++/reversefile | 0.0400 10380 0.0080 * 0.0040 | 0.00 0.00 0.00 * 0.02 | - - n/a n/a SingleSource/Benchmarks/Shootout-C++/sieve | 0.0360 5684 0.0080 * 0.0120 | 2.29 2.48 2.56 * 2.56 | 0.92 0.89 n/a n/a SingleSource/Benchmarks/Shootout-C++/spellcheck | 0.0440 14652 0.0080 * 0.0040 | 0.00 0.00 0.00 * 0.02 | - - n/a n/a SingleSource/Benchmarks/Shootout-C++/strcat | 0.0040 3532 0.0000 * 0.0040 | 0.17 0.18 0.20 * 0.20 | 0.94 0.85 n/a n/a SingleSource/Benchmarks/Shootout-C++/sumcol | 0.0000 3212 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Benchmarks/Shootout-C++/wc | 0.0080 3484 0.0040 * 0.0000 | 0.00 0.00 0.00 * 0.01 | - - n/a n/a SingleSource/Benchmarks/Shootout-C++/wordfreq | 0.1040 19508 0.0240 * 0.0040 | 0.00 0.00 0.00 * 0.02 | - - n/a n/a SingleSource/Benchmarks/Shootout/ackermann | 0.0040 1180 0.0040 * 0.0080 | 0.00 
0.01 0.01 * 0.02 | - - n/a n/a SingleSource/Benchmarks/Shootout/ary3 | 0.0040 984 0.0040 * 0.0000 | 5.02 5.13 5.17 * 5.11 | 0.98 0.97 n/a n/a SingleSource/Benchmarks/Shootout/fib2 | 0.0000 924 0.0000 * 0.0040 | 2.66 0.58 0.52 * 0.60 | 4.59 5.12 n/a n/a SingleSource/Benchmarks/Shootout/hash | 0.0160 2764 0.0040 * * | 1.63 1.70 1.70 * * | 0.96 0.96 n/a n/a SingleSource/Benchmarks/Shootout/heapsort | 0.0040 1364 0.0000 * 0.0040 | 4.12 4.88 4.60 * 4.53 | 0.84 0.90 n/a n/a SingleSource/Benchmarks/Shootout/hello | 0.0000 560 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Benchmarks/Shootout/lists | 0.0200 3176 0.0080 * 0.0080 | 6.17 6.37 6.91 * 7.72 | 0.97 0.89 n/a n/a SingleSource/Benchmarks/Shootout/matrix | 0.0200 2924 0.0040 * 0.0040 | 2.61 3.93 3.93 * 3.78 | 0.66 0.66 n/a n/a SingleSource/Benchmarks/Shootout/methcall | 0.0040 1532 0.0040 * 0.0040 | 4.53 4.47 4.31 * 6.22 | 1.01 1.05 n/a n/a SingleSource/Benchmarks/Shootout/nestedloop | 0.0040 1400 0.0040 * 0.0040 | 7.85 0.01 0.22 * 0.18 | - 35.68 n/a n/a SingleSource/Benchmarks/Shootout/objinst | 0.0080 1664 0.0080 * 0.0000 | 8.18 8.05 7.91 * 7.98 | 1.02 1.03 n/a n/a SingleSource/Benchmarks/Shootout/random | 0.0080 784 0.0000 * 0.0000 | 4.21 4.38 4.50 * 4.45 | 0.96 0.94 n/a n/a SingleSource/Benchmarks/Shootout/sieve | 0.0040 1204 0.0040 * 0.0000 | 7.16 6.76 6.34 * 6.51 | 1.06 1.13 n/a n/a SingleSource/Benchmarks/Shootout/strcat | 0.0040 1236 0.0000 * 0.0000 | 0.15 0.14 0.17 * 0.16 | 1.07 0.88 n/a n/a SingleSource/Benchmarks/Stanford/Bubblesort | 0.0160 1240 0.0000 * 0.0040 | 0.05 0.04 0.10 * 0.07 | - - n/a n/a SingleSource/Benchmarks/Stanford/IntMM | 0.0040 1464 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.01 | - - n/a n/a SingleSource/Benchmarks/Stanford/Oscar | 0.0200 2748 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.01 | - - n/a n/a SingleSource/Benchmarks/Stanford/Perm | 0.0040 2940 0.0040 * 0.0000 | 0.02 0.03 0.04 * 0.05 | - - n/a n/a SingleSource/Benchmarks/Stanford/Puzzle | 0.0400 4408 0.0000 * 0.0080 | 
0.20 0.19 0.28 * 0.25 | 1.05 0.71 n/a n/a SingleSource/Benchmarks/Stanford/Queens | 0.0080 1992 0.0040 * 0.0040 | 0.03 0.05 0.04 * 0.05 | - - n/a n/a SingleSource/Benchmarks/Stanford/Quicksort | 0.0120 1648 0.0040 * 0.0000 | 0.04 0.05 0.07 * 0.05 | - - n/a n/a SingleSource/Benchmarks/Stanford/RealMM | 0.0120 1492 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Benchmarks/Stanford/Towers | 0.0160 3716 0.0080 * 0.0120 | 0.01 0.02 0.02 * 0.04 | - - n/a n/a SingleSource/Benchmarks/Stanford/Treesort | 0.0200 2256 0.0040 * 0.0040 | 0.10 0.09 0.13 * 0.11 | - 0.77 n/a n/a SingleSource/Regression/C++/2003-05-14-array-init | 0.0040 560 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C++/2003-05-14-expr_stmt | 0.0000 476 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C++/2003-06-08-BaseType | 0.0040 488 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C++/2003-06-08-VirtualFunctions | 0.0000 612 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C++/2003-06-13-Crasher | 0.0000 448 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C++/2003-08-20-EnumSizeProblem | 0.0000 448 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C++/2003-09-29-NonPODsByValue | 0.0000 552 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C++/2008-01-29-ParamAliasesReturn | 0.0000 752 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C++/BuiltinTypeInfo | 0.0000 704 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C++/EH/ConditionalExpr | 0.0120 1144 0.0040 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C++/EH/ctor_dtor_count | 0.0040 1008 * * * | 0.00 * * * * | n/a n/a n/a n/a SingleSource/Regression/C++/EH/ctor_dtor_count-2 | 0.0080 2084 0.0040 * * | 0.00 * 0.00 * * | n/a - n/a n/a 
SingleSource/Regression/C++/EH/dead_try_block | 0.0000 532 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C++/EH/exception_spec_test | 0.0000 2664 * * * | 0.00 * * * * | n/a n/a n/a n/a SingleSource/Regression/C++/EH/function_try_block | 0.0080 3960 * * * | 0.00 * * * * | n/a n/a n/a n/a SingleSource/Regression/C++/EH/simple_rethrow | 0.0040 1604 * * * | 0.00 * * * * | n/a n/a n/a n/a SingleSource/Regression/C++/EH/simple_throw | 0.0040 1096 0.0000 * * | 0.00 * 0.00 * * | n/a - n/a n/a SingleSource/Regression/C++/EH/throw_rethrow_test | 0.0080 2148 0.0040 * * | 0.00 * 0.00 * * | n/a - n/a n/a SingleSource/Regression/C++/global_ctor | 0.0040 1292 0.0040 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C++/global_type | 0.0000 448 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C++/ofstream_ctor | 0.0000 2540 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C++/pointer_member | 0.0000 588 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C++/pointer_method | 0.0040 1364 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C++/short_circuit_dtor | 0.0000 784 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/2003-05-14-initialize-string | 0.0000 660 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/2003-05-21-BitfieldHandling | 0.0040 596 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/2003-05-21-UnionBitfields | 0.0000 560 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/2003-05-21-UnionTest | 0.0040 792 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/2003-05-22-LocalTypeTest | 0.0040 604 0.0000 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/2003-05-22-VarSizeArray | 0.0000 560 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a 
n/a SingleSource/Regression/C/2003-05-23-TransparentUnion | 0.0000 556 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/2003-06-16-InvalidInitializer | 0.0000 448 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/2003-06-16-VolatileError | 0.0000 448 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/2003-10-12-GlobalVarInitializers | 0.0000 644 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/2004-02-03-AggregateCopy | 0.0000 568 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/2004-03-15-IndirectGoto | 0.0040 816 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/2004-08-12-InlinerAndAllocas | 0.0000 816 0.0040 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/2005-05-06-LongLongSignedShift | 0.0000 588 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/2008-01-07-LongDouble | 0.0040 576 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/ConstructorDestructorAttributes | 0.0040 748 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/DuffsDevice | 0.0080 1072 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/PR1386 | 0.0040 728 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/PR491 | 0.0000 448 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/PR640 | 0.0040 2080 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/badidx | 0.0000 876 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/bigstack | 0.0080 2288 0.0000 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/callargs | 0.0000 904 0.0000 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/casts | 0.0040 3492 0.0040 * 0.0040 | 0.00 0.00 
0.00 * 0.01 | - - n/a n/a SingleSource/Regression/C/globalrefs | 0.0000 1008 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/matrixTranspose | 0.0040 1196 0.0040 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/pointer_arithmetic | 0.0000 480 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/sumarray | 0.0040 872 0.0040 * 0.0000 | 0.00 0.00 0.00 * 0.01 | - - n/a n/a SingleSource/Regression/C/sumarray2d | 0.0000 1008 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/sumarraymalloc | 0.0080 1144 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/test_indvars | 0.0080 1384 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/Regression/C/testtrace | 0.0000 1124 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/UnitTests/2002-04-17-PrintfChar | 0.0040 560 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.01 | - - n/a n/a SingleSource/UnitTests/2002-05-02-ArgumentTest | 0.0000 608 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/UnitTests/2002-05-02-CastTest | 0.0040 1400 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/UnitTests/2002-05-02-CastTest1 | 0.0000 588 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/UnitTests/2002-05-02-CastTest2 | 0.0000 680 0.0040 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/UnitTests/2002-05-02-CastTest3 | 0.0000 664 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/UnitTests/2002-05-02-ManyArguments | 0.0000 676 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/UnitTests/2002-05-03-NotTest | 0.0000 644 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/UnitTests/2002-05-19-DivTest | 0.0000 672 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a SingleSource/UnitTests/2002-08-02-CastTest | 0.0040 564 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a 
SingleSource/UnitTests/2002-08-02-CastTest2 | 0.0000 592 0.0000 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2002-08-19-CodegenBug | 0.0000 552 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2002-10-09-ArrayResolution | 0.0000 628 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2002-10-12-StructureArgs | 0.0040 616 0.0040 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2002-10-12-StructureArgsSimple | 0.0000 584 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2002-10-13-BadLoad | 0.0040 556 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2002-12-13-MishaTest | 0.0080 568 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2003-04-22-Switch | 0.0040 784 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2003-05-02-DependentPHI | 0.0000 844 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2003-05-07-VarArgs | 0.0120 2812 0.0040 * 0.0080 | 0.00 0.00 0.00 * 0.02 | - - n/a n/a
SingleSource/UnitTests/2003-05-12-MinIntProblem | 0.0000 552 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2003-05-14-AtExit | 0.0000 676 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2003-05-26-Shorts | 0.0000 1952 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2003-05-31-CastToBool | 0.0040 904 0.0000 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2003-05-31-LongShifts | 0.0040 876 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2003-07-06-IntOverflow | 0.0000 792 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2003-07-08-BitOpsTest | 0.0000 572 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2003-07-09-LoadShorts | 0.0040 1476 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2003-07-09-SignedArgs | 0.0040 1648 0.0080 * 0.0000 | 0.00 0.00 0.00 * 0.01 | - - n/a n/a
SingleSource/UnitTests/2003-07-10-SignConversions | 0.0000 768 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2003-08-05-CastFPToUint | 0.0000 700 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2003-08-11-VaListArg | 0.0080 3088 0.0080 * 0.0000 | 0.00 0.00 0.00 * 0.01 | - - n/a n/a
SingleSource/UnitTests/2003-08-20-FoldBug | 0.0000 552 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2003-09-18-BitFieldTest | 0.0000 592 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2003-10-13-SwitchTest | 0.0000 632 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2003-10-29-ScalarReplBug | 0.0040 624 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2004-02-02-NegativeZero | 0.0000 628 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2004-06-20-StaticBitfieldInit | 0.0040 600 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2004-11-28-GlobalBoolLayout | 0.0000 844 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2005-05-11-Popcount-ffs-fls | 0.0160 2092 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.01 | - - n/a n/a
SingleSource/UnitTests/2005-05-12-Int64ToFP | 0.0000 664 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2005-05-13-SDivTwo | 0.0040 644 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2005-07-15-Bitfield-ABI | 0.0000 580 0.0000 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2005-07-17-INT-To-FP | 0.0000 872 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2005-11-29-LongSwitch | 0.0000 564 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2006-01-23-UnionInit | 0.0040 2088 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2006-01-29-SimpleIndirectCall | 0.0080 784 0.0040 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2006-02-04-DivRem | 0.0040 660 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2006-12-01-float_varg | 0.0040 640 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.01 | - - n/a n/a
SingleSource/UnitTests/2006-12-04-DynAllocAndRestore | 0.0040 672 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2006-12-07-Compare64BitConstant | 0.0000 572 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2006-12-11-LoadConstants | 0.0640 2684 0.0080 * 0.0080 | 0.00 0.00 0.00 * 0.03 | - - n/a n/a
SingleSource/UnitTests/2007-01-04-KNR-Args | 0.0000 648 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2007-03-02-VaCopy | 0.0040 920 0.0000 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2007-04-10-BitfieldTest | 0.0040 584 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2007-04-25-weak | 0.0000 504 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2008-04-18-LoopBug | 0.0080 1004 0.0040 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2008-04-20-LoopBug2 | 0.0040 1008 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/2008-07-13-InlineSetjmp | 0.0000 828 0.0000 * 0.0040 | 0.00 0.00 0.00 * 0.02 | - - n/a n/a
SingleSource/UnitTests/AtomicOps | 0.0040 728 0.0040 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/FloatPrecision | 0.0000 596 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/SignlessTypes/cast | 0.0120 2480 0.0040 * 0.0080 | 0.01 0.01 0.01 * 0.04 | - - n/a n/a
SingleSource/UnitTests/SignlessTypes/cast-bug | 0.0040 616 0.0000 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/SignlessTypes/cast2 | 0.0000 568 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/SignlessTypes/ccc | 0.0040 720 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/SignlessTypes/div | 0.0040 1276 0.0000 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/SignlessTypes/factor | 0.0080 952 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/SignlessTypes/rem | 0.0080 9112 0.0200 * 0.0200 | 0.00 0.00 0.00 * 0.06 | - - n/a n/a
SingleSource/UnitTests/SignlessTypes/shr | 0.0040 1408 0.0040 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/StructModifyTest | 0.0000 672 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/TestLoop | 0.0000 628 0.0000 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/Threads/tls | 0.0000 872 0.0000 * * | 0.00 0.00 0.00 * * | - - n/a n/a
SingleSource/UnitTests/Vector/build | 0.0000 812 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/Vector/build2 | 0.0080 1212 0.0000 * 0.0080 | 1.55 1.49 1.47 * 1.57 | 1.04 1.05 n/a n/a
SingleSource/UnitTests/Vector/divides | 0.0040 1036 0.0040 * 0.0000 | 0.00 * 0.00 * 0.00 | n/a - n/a n/a
SingleSource/UnitTests/Vector/multiplies | 0.0040 1708 0.0040 * 0.0040 | 0.72 0.66 1.84 * 1.87 | 1.09 0.39 n/a n/a
SingleSource/UnitTests/Vector/simple | 0.0040 1360 0.0040 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/Vector/sumarray | 0.0000 904 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/Vector/sumarray-dbl | 0.0040 944 0.0000 * 0.0040 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a
SingleSource/UnitTests/printargs | 0.0000 700 0.0000 * 0.0000 | 0.00 0.00 0.00 * 0.00 | - - n/a n/a

From jk500500 at yahoo.com Thu Feb 12 10:34:54 2009
From: jk500500 at yahoo.com (Jeff Kuskin)
Date: Thu, 12 Feb 2009 08:34:54 -0800 (PST)
Subject: [LLVMdev] fastcc, tail calls, and gcc
Message-ID: <408979.78931.qm@web53812.mail.re2.yahoo.com>

Two related questions.

This is with LLVM 2.4 doing a JIT compile to x86-64. (I generate LLVM
IR using an IRBuilder instance, compile/optimize, and then call
getPointerToFunction() to get a "native" function pointer.)

(1) My reading of various mailing list messages seems to indicate
that a function marked as using the "fastcc" calling convention
("CallingConv::Fast") cannot be called directly from GCC-generated
code (n.b. -- standalone gcc, not llvm-gcc) because the fastcc calling
convention is, in general, incompatible with GCC (which I assume uses
the "CallingConv::C" calling convention).

Correct? If not, how do I call an LLVM JIT-generated fastcc function
from a function statically compiled by GCC?

(2) Why does the x86-64 JIT backend generate a "ret $0x8" instruction
to return from a function that is (a) marked as fastcc
(CallingConv::Fast); but (b) takes no arguments and returns 'void'?

The function type is this:

    std::vector<const Type*> args; /* empty */
    FunctionType *ft = FunctionType::get(Type::VoidTy, args, false);

The fastcc generated code ends with this:

    c20800 ret $0x8

However, if I instead mark the very same function to use the usual
CallingConv::C calling convention, then the generated code ends with
this:

    c3 ret

I assume the "ret 0x8" is meant to be the "callee pops args" portion
of the fastcc convention, but in this case the function has no
arguments (nor a return value), so why should 8 bytes be popped from
the stack on return?

Thanks for any help.
-- Jeff Kuskin From clattner at apple.com Thu Feb 12 11:14:14 2009 From: clattner at apple.com (Chris Lattner) Date: Thu, 12 Feb 2009 09:14:14 -0800 Subject: [LLVMdev] Eliminate PHI for non-copyable registers In-Reply-To: <21972748.post@talk.nabble.com> References: <4d77c5f20902110407p12b7bd04i9477673da5affec9@mail.gmail.com> <8E39BC05-CED5-4796-8031-42FA38F80EDC@apple.com> <21972748.post@talk.nabble.com> Message-ID: <83956F80-D316-4852-A673-9D450E974C79@apple.com> On Feb 12, 2009, at 1:41 AM, [Alex] wrote: > They "should" be non-allocatable if the hardware implements the same > number > of these i32 registers as the "specification". The input language > (which is > converted to LLVM IR) may use up to 4 registers but the hardware > only has 2. > So they must be allocatable, right? To be allocatable, the code generator must be able to emit copies into and out of the registers and must be able to spill them, even if it means going through another temporary register class. -Chris From clattner at apple.com Thu Feb 12 11:15:30 2009 From: clattner at apple.com (Chris Lattner) Date: Thu, 12 Feb 2009 09:15:30 -0800 Subject: [LLVMdev] Add -> operator to ImmutableSet::iterator In-Reply-To: <1b587cab0902120459v6a903122x617abcaa13dd152@mail.gmail.com> References: <1b587cab0902120459v6a903122x617abcaa13dd152@mail.gmail.com> Message-ID: On Feb 12, 2009, at 4:59 AM, Ben Laurie wrote: > What it says on the tin... Looks great, applied thanks! http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090209/073658.html -Chris From llucax at gmail.com Thu Feb 12 11:21:38 2009 From: llucax at gmail.com (Leandro Lucarella) Date: Thu, 12 Feb 2009 15:21:38 -0200 Subject: [LLVMdev] Is boost supposed to work with llvm-g++? 
Message-ID: <20090212172136.GN1718@burns.springfield.home> Hello, I'm trying to compile a program using boost (Spirit) but I get a llvm-g++ crash: http://pastebin.lugmen.org.ar/4666 I'm using debian packages for both boost and LLVM: libboost1.35-dev 1.35.0-5 llvm-gcc-4.2 2.2-1 (BTW, this compiles just fine, using a *lot* of memory, but fine, using g++ 4.3.x) -- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- Ya ni el cielo me quiere, ya ni la muerte me visita Ya ni el sol me calienta, ya ni el viento me acaricia From clattner at apple.com Thu Feb 12 11:43:11 2009 From: clattner at apple.com (Chris Lattner) Date: Thu, 12 Feb 2009 09:43:11 -0800 Subject: [LLVMdev] Is boost supposed to work with llvm-g++? In-Reply-To: <20090212172136.GN1718@burns.springfield.home> References: <20090212172136.GN1718@burns.springfield.home> Message-ID: On Feb 12, 2009, at 9:21 AM, Leandro Lucarella wrote: > Hello, I'm trying to compile a program using boost (Spirit) but I get > a llvm-g++ crash: > http://pastebin.lugmen.org.ar/4666 > > I'm using debian packages for both boost and LLVM: > libboost1.35-dev 1.35.0-5 > llvm-gcc-4.2 2.2-1 > > (BTW, this compiles just fine, using a *lot* of memory, but fine, > using > g++ 4.3.x) Please try upgrading to LLVM SVN mainline or the 2.5 prerelease, 2.2 is really old. 
-Chris From clattner at apple.com Thu Feb 12 11:51:39 2009 From: clattner at apple.com (Chris Lattner) Date: Thu, 12 Feb 2009 09:51:39 -0800 Subject: [LLVMdev] Unnatural loops with O0 In-Reply-To: <200902121655.49444.baldrick@free.fr> References: <200902121655.49444.baldrick@free.fr> Message-ID: <77B9BBBA-B468-4903-9A06-95330B4D27FD@apple.com> On Feb 12, 2009, at 7:55 AM, Duncan Sands wrote: > Hi Marc, > >> Is there a compelling reason why llvm-gcc does not always produce >> natural >> loops. Is it a code size issue, or are there performance >> implications as >> well? I am seeing a simple 'while' loop compiled to an unnatural >> loop, >> without any gotos, breaks, or continues. What is the reason for >> this? > > is it already an unnatural loop when it comes out of the gcc parts of > llvm-gcc (you can check this by compiling with: -O0 -emit-llvm)? Or > is it llvm itself that creates the unnatural loops? Right. There is little we can do about the -O0 -emit-llvm code: this is a literal translation of what the GCC front-end gives us. If some *optimizer* is turning reducible loops into non-reducible control flow, then that is a completely different matter and I would consider that to be a serious bug. If the gcc front-end is doing this to you, you can try out clang, which should not. -Chris From jay.foad at gmail.com Thu Feb 12 11:57:56 2009 From: jay.foad at gmail.com (Jay Foad) Date: Thu, 12 Feb 2009 17:57:56 +0000 Subject: [LLVMdev] problems running test suite (-mllvm -disable-llvm-optzns) In-Reply-To: <200902121428.40153.baldrick@free.fr> References: <200902121428.40153.baldrick@free.fr> Message-ID: >> I'm trying to run some of the test suite using the instructions here: >> >> http://llvm.org/docs/TestingGuide.html#quicktestsuite >> >> I've built llvm myself, but I'm using pre-built binaries of llvm-gcc >> (from http://llvm.org/prereleases/2.5/llvm-gcc4.2-2.5-x86-linux-RHEL4.tar.gz). >> >> Here's what happens: > > the llvm testsuite (from svn, right?) 
uses features that are not > available in the 2.5 prerelease candidate. Use the prerelease > version of llvm too. Thanks. I switched to using the testsuite from the prerelease candidate and I've got it working now. For the record, there was another problem that frustrated me: the testsuite couldn't find cc1plus, because when I configured llvm I hadn't specified --with-llvmgccdir, so llvm's Makefile.config didn't define LLVMGCCLIBEXEC. But when I configured the testsuite I *had* specified --with-llvmgccdir, so it had all the information it needed to find llvm-gcc. Thanks, Jay. From dpatel at apple.com Thu Feb 12 12:33:19 2009 From: dpatel at apple.com (Devang Patel) Date: Thu, 12 Feb 2009 10:33:19 -0800 Subject: [LLVMdev] DominatorTree Information required in CallGraphPass In-Reply-To: <9f741d560902111805i2c02e412p32b944518c3aaf9f@mail.gmail.com> References: <9f741d560902111805i2c02e412p32b944518c3aaf9f@mail.gmail.com> Message-ID: <539AB34D-E7AB-48F4-8ACE-04307E2FC06A@apple.com> On Feb 11, 2009, at 6:05 PM, kapil anand wrote: > Hi all, > > I am implementing a new pass for LLVM which extends Call Graph > SCCPass. I need DominatorTree Information when I get to individual > function. I have added AU.addrequired() and > AU.addRequired() in getAnalysisUsage() function. > > But, when I get to the pass, Pass Manager gives following runtime > error > > > Unable to schedule 'Dominator Tree Construction' required by '......' > > assertion "0 && "Unable to schedule pass"" failed: file "/usr/src/ > llvm-2.4/llvm- > 2.4/lib/VMCore/PassManager.cpp", line 1074 > > > Can't I use dominator Tree Information in case of CallGraphSCCPass. I > found out from documentation that dominator Info can be used in Module > Pass > > ( http://www.llvm.org/docs/WritingAnLLVMPass.html#getAnalysis ) > > Thus it seemed feasible that Call GraphSCCPass should also be able to > use Dominator Info.Do I need to add some flag that I need lower level > passes? Pl. file a bugzilla report. 
Call graph pass manager is not implementing addLowerLevelRequiredPass() to support this. If you copy the same method from MPPassManager (see Pasmanager.cpp) then it should work. If it works, please send us patch. Thanks, > > > Thanks > > Regards, > Kapil > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev - Devang From tonic at nondot.org Thu Feb 12 16:16:59 2009 From: tonic at nondot.org (Tanya M. Lattner) Date: Thu, 12 Feb 2009 14:16:59 -0800 (PST) Subject: [LLVMdev] Reminder: 2.5 pre-release1 testing ends today! Message-ID: Just a reminder that 2.5 pre-release1 testing ends today (11:59PM PST). Please have your testing completed and the results sent to me. Thanks! -Tanya From gohman at apple.com Thu Feb 12 16:22:25 2009 From: gohman at apple.com (Dan Gohman) Date: Thu, 12 Feb 2009 14:22:25 -0800 Subject: [LLVMdev] problems running test suite (-mllvm -disable-llvm-optzns) In-Reply-To: <200902121503.51582.baldrick@free.fr> References: <200902121503.51582.baldrick@free.fr> Message-ID: <5A9AED67-56F3-45F3-9F03-3EF05C905C9F@apple.com> On Feb 12, 2009, at 6:03 AM, Duncan Sands wrote: > Hi, > >>> I guess this is because the test suite is trying to run "llvm-gcc >>> -mllvm -disable-llvm-optzns", which never seems to work, because >>> llvm-gcc mangles the command line before it gets to cc1plus. >> That's correct. The driver changes the order of the options provided. >> You need to provided this option to cc1 / cc1plus directly > > Dan fixed this recently in svn IIRC - I don't think it's in the > prerelease candidate. Yes. Until recently -mllvm-disable-llvm-optzns only accidentally worked on Darwin. It should be generally usable in trunk. If you're testing the 2.5 branch, please use the 2.5 branch of the test-suite. 
Dan From kapilanand2 at gmail.com Thu Feb 12 16:46:27 2009 From: kapilanand2 at gmail.com (kapil anand) Date: Thu, 12 Feb 2009 17:46:27 -0500 Subject: [LLVMdev] DominatorTree Information required in CallGraphPass In-Reply-To: <539AB34D-E7AB-48F4-8ACE-04307E2FC06A@apple.com> References: <9f741d560902111805i2c02e412p32b944518c3aaf9f@mail.gmail.com> <539AB34D-E7AB-48F4-8ACE-04307E2FC06A@apple.com> Message-ID: <9f741d560902121446h7f083b1bo4bc6452026a8c41b@mail.gmail.com> I have modified my pass to extend Module Pass instead of Call Graph SCCPass. I insert getAnalyses and use this explicitly instead of extending CallGraphSCCPass. So, the above change need not be made to get my pass work....( as of now) On Thu, Feb 12, 2009 at 1:33 PM, Devang Patel wrote: > > On Feb 11, 2009, at 6:05 PM, kapil anand wrote: > > > Hi all, > > > > I am implementing a new pass for LLVM which extends Call Graph > > SCCPass. I need DominatorTree Information when I get to individual > > function. I have added AU.addrequired() and > > AU.addRequired() in getAnalysisUsage() function. > > > > > But, when I get to the pass, Pass Manager gives following runtime > > error > > > > > > Unable to schedule 'Dominator Tree Construction' required by '......' > > > > assertion "0 && "Unable to schedule pass"" failed: file "/usr/src/ > > llvm-2.4/llvm- > > 2.4/lib/VMCore/PassManager.cpp", line 1074 > > > > > > Can't I use dominator Tree Information in case of CallGraphSCCPass. I > > found out from documentation that dominator Info can be used in Module > > Pass > > > > ( http://www.llvm.org/docs/WritingAnLLVMPass.html#getAnalysis ) > > > > Thus it seemed feasible that Call GraphSCCPass should also be able to > > use Dominator Info.Do I need to add some flag that I need lower level > > passes? > > Pl. file a bugzilla report. Call graph pass manager is not > implementing addLowerLevelRequiredPass() to support this. If you copy > the same method from MPPassManager (see Pasmanager.cpp) then it should > work. 
If it works, please send us patch. > > Thanks, > > > > > > Thanks > > > > Regards, > > Kapil > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > - > Devang > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090212/f3b6e9ab/attachment.html From Dr.Graef at t-online.de Thu Feb 12 17:23:19 2009 From: Dr.Graef at t-online.de (Albert Graef) Date: Fri, 13 Feb 2009 00:23:19 +0100 Subject: [LLVMdev] fastcc, tail calls, and gcc In-Reply-To: <408979.78931.qm@web53812.mail.re2.yahoo.com> References: <408979.78931.qm@web53812.mail.re2.yahoo.com> Message-ID: <4994AF67.6060403@t-online.de> Jeff Kuskin wrote: > Correct? If not, how do I call a LLVM JIT-generated fastcc function > from a function statically compiled by GCC? Well, you can always generate a little wrapper function with C calling convention which just calls the fastcc function. -- Dr. Albert Gr"af Dept. of Music-Informatics, University of Mainz, Germany Email: Dr.Graef at t-online.de, ag at muwiinfa.geschichte.uni-mainz.de WWW: http://www.musikinformatik.uni-mainz.de/ag From echristo at apple.com Thu Feb 12 17:28:28 2009 From: echristo at apple.com (Eric Christopher) Date: Thu, 12 Feb 2009 15:28:28 -0800 Subject: [LLVMdev] fastcc, tail calls, and gcc In-Reply-To: <4994AF67.6060403@t-online.de> References: <408979.78931.qm@web53812.mail.re2.yahoo.com> <4994AF67.6060403@t-online.de> Message-ID: <620C0F92-E35A-4743-A352-E98A2D59A9E7@apple.com> On Feb 12, 2009, at 3:23 PM, Albert Graef wrote: > Jeff Kuskin wrote: >> Correct? 
If not, how do I call a LLVM JIT-generated fastcc function
>> from a function statically compiled by GCC?
>
> Well, you can always generate a little wrapper function with C calling
> convention which just calls the fastcc function.

You can do a quick bit of assembly code to make sure that the
arguments are in the right registers for the call.

-eric

From arnold.schwaighofer at gmail.com Thu Feb 12 17:56:42 2009
From: arnold.schwaighofer at gmail.com (Arnold Schwaighofer)
Date: Fri, 13 Feb 2009 00:56:42 +0100
Subject: [LLVMdev] fastcc, tail calls, and gcc
In-Reply-To: <408979.78931.qm@web53812.mail.re2.yahoo.com>
References: <408979.78931.qm@web53812.mail.re2.yahoo.com>
Message-ID:

On Thu, Feb 12, 2009 at 5:34 PM, Jeff Kuskin wrote:
> Two related questions.
> (2) Why does the x86-64 JIT backend generate a "ret $0x8" instruction
> to return from a fastcc function that is (a) marked as fastcc
> (CallingConv::Fast); but (b) takes no arguments and returns 'void'?
> fastcc generated code ends with this:
> c20800 ret $0x8
> I assume the "ret 0x8" is meant to be the "callee pops args" portion
> of the fastcc convention, but in this case the function has no
> arguments (nor a return value), so why should 8 bytes be popped from
> the stack on return?

If I remember correctly, this has to do with stack alignment and tail
calls. Note that to support tail calls between functions that have an
arbitrary number of arguments, the stack pointer of the caller of the
tail-calling function is modified. E.g., if foo(i64) tail calls bar(),
the stack pointer of foo's caller would be adjusted by 8 bytes, which
could result in a misaligned stack (assuming a platform alignment of
16) on entry to the function bar. Hence, when tailcallopt is enabled,
the size occupied by arguments is rounded up so that such a
misalignment can't happen.
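
The rounding described above can be illustrated with a small sketch. This is
not LLVM's code: the helper name alignedArgAreaSize and the exact rounding rule
are assumptions, chosen so that the argument area plus the return-address slot
is always a multiple of the stack alignment. Under that assumption, a function
with no arguments on x86-64 (alignment 16, 8-byte slot) gets an 8-byte argument
area, which is consistent with the "ret $0x8" reported in the question.

```cpp
#include <cstdint>

// Hypothetical helper (an assumption for illustration, not an LLVM API).
// Rounds the byte size of the argument area up to the next value of the
// form align*n - slot, so that argument area + return-address slot is a
// multiple of `align`.
constexpr std::uint64_t alignedArgAreaSize(std::uint64_t argBytes,
                                           std::uint64_t align,
                                           std::uint64_t slot) {
  return ((argBytes + slot + align - 1) / align) * align - slot;
}

// x86-64: no arguments still yields an 8-byte area (popped by "ret $0x8").
static_assert(alignedArgAreaSize(0, 16, 8) == 8, "zero-argument case");
// One i64 argument already fits the 8-byte area, so nothing is added.
static_assert(alignedArgAreaSize(8, 16, 8) == 8, "one i64 argument");
```

With such a rule every fastcc function pops a size of the same shape, so a
tail call between functions with different argument counts moves the stack
pointer only by multiples of the alignment.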
From Micah.Villmow at amd.com Thu Feb 12 18:53:17 2009
From: Micah.Villmow at amd.com (Villmow, Micah)
Date: Thu, 12 Feb 2009 16:53:17 -0800
Subject: [LLVMdev] 16bit loads being promoted to 32bit?
Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C827795@ssanexmb1.amd.com>

I have the following function:

define void @test_fc_0_kernel(i16 signext %x, i16 signext %y, i16 addrspace(11)* %input, i32 addrspace(11)* %result) {
entry:
  %call = tail call i32 @get_id(i32 0) ; [#uses=2]
  %cmp = icmp slt i16 %x, %y ; [#uses=1]
  br i1 %cmp, label %if.then, label %if.end

if.end: ; preds = %entry
  ret void

if.then: ; preds = %entry
  %arrayidx = getelementptr i32 addrspace(11)* %result, i32 %call ; [#uses=1]
  %arrayidx7 = getelementptr i16 addrspace(11)* %input, i32 %call ; [#uses=1]
  %tmp8 = load i16 addrspace(11)* %arrayidx7 ; [#uses=1]
  %conv9 = sext i16 %tmp8 to i32 ; [#uses=1]
  store i32 %conv9, i32 addrspace(11)* %arrayidx
  ret void
}

This function should read a value from a memory location in input and
write it to result, promoting it from i16 to i32, which seems simple
enough. The problem I am having is that somewhere along the line the
16-bit load is being promoted to a 32-bit load, and then the lower 16
bits are sign-extended with a shl 16 followed by a shr 16.

The problem with this is that:

1) I'm limited to 32-bit aligned loads, and LLVM is assuming a
16-bit/8-bit alignment.

2) I have special functions that need to be called when a load of a
sub-32-bit data type occurs; they handle the alignment constraints and
then do the shifting and masking of the data.

So my questions are:

What bits do I need to flip to stop this optimization from occurring?

What do I need to set to get the loads/stores generated correctly, if
not the datalayout string in my TargetMachine class?

Thanks,

Micah Villmow
Systems Engineer
Advanced Technology & Performance
Advanced Micro Devices Inc.
S1-609 One AMD Place
Sunnyvale, CA. 94085
P: 408-749-3966
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090212/510e8534/attachment.html

From eli.friedman at gmail.com Thu Feb 12 19:52:09 2009
From: eli.friedman at gmail.com (Eli Friedman)
Date: Thu, 12 Feb 2009 17:52:09 -0800
Subject: [LLVMdev] 16bit loads being promoted to 32bit?
In-Reply-To: <5BA674C5FF7B384A92C2C95D8CC71E1C827795@ssanexmb1.amd.com>
References: <5BA674C5FF7B384A92C2C95D8CC71E1C827795@ssanexmb1.amd.com>
Message-ID:

On Thu, Feb 12, 2009 at 4:53 PM, Villmow, Micah wrote:
> The
> problem that I am having is somewhere along the line the 16bit load is being
> promoted to a 32bit load

For the given testcase, that's clearly illegal. Either there's a
serious bug in LLVM, or you're misinterpreting the meaning of the DAG.
Are you sure you aren't seeing a sign-extending load? If you don't
want to bother supporting extending loads, you can use
setLoadExtAction to make Legalize take care of it.

> 1) I'm limited to 32bit aligned loads and llvm is assuming a
> 16bit/8bit alignment

You shouldn't be seeing any unaligned loads post-Legalize unless you
explicitly ask for them by setting allowUnalignedMemoryAccesses to
true.
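
For reference, the shift pair mentioned in the question can be sketched in
plain C++. This is only an illustration and not code from LLVM; the function
names are hypothetical, and it assumes the "shr 16" is an arithmetic shift. It
shows that a 32-bit value followed by shl 16 and an arithmetic shr 16 gives
the same result as sign-extending its low 16 bits, which is what a
sign-extending 16-bit load produces.

```cpp
#include <cstdint>

// Sign-extend the low 16 bits of a 32-bit word with the shift pair from the
// question: shift left by 16, then arithmetic shift right by 16.
// Note: converting an out-of-range unsigned value to int32_t and
// right-shifting a negative value are implementation-defined before C++20,
// but behave as two's-complement / arithmetic shift on mainstream compilers.
inline std::int32_t sext16ViaShifts(std::uint32_t word) {
  return static_cast<std::int32_t>(word << 16) >> 16;
}

// The same value expressed as a narrowing cast to int16_t followed by an
// implicit widening back to int32_t.
inline std::int32_t sext16ViaCast(std::uint32_t word) {
  return static_cast<std::int16_t>(word & 0xFFFFu);
}
```

Both helpers ignore the upper 16 bits of the input, so they agree for every
32-bit word on such platforms.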
-Eli From deeppatel1987 at gmail.com Thu Feb 12 20:21:16 2009 From: deeppatel1987 at gmail.com (Sandeep Patel) Date: Thu, 12 Feb 2009 18:21:16 -0800 Subject: [LLVMdev] Using CallingConvLower in ARM target In-Reply-To: References: <305d6f60812270430xdf1ebb9gf6d99f94215ab66b@mail.gmail.com> <305d6f60901161726l5c93c9dag6f98ca06a420cb31@mail.gmail.com> <9F8572A5-4F58-492D-A61B-638FC61D42B4@apple.com> <305d6f60902061802q65f813e7p1c75f1fa5a185d32@mail.gmail.com> Message-ID: <305d6f60902121821u586a5ecat65c81d462d139e1d@mail.gmail.com> Although it's not generally needed for ARM's use of CCCustom, I return two bools to handle the four possible outcomes to keep the mechanism flexible: * if CCCustomFn handled the arg or not * if CCCustomFn wants to end processing of the arg or not I placed the "unsigned i" outside those loops because i is used after the loop. If there's a better index search pattern, I'd be happy to change it. Attached is an updated patch against HEAD that has DebugLoc changes. I also split out the ARMAsmPrinter fix into it's own patch. deep On Mon, Feb 9, 2009 at 8:54 AM, Evan Cheng wrote: > Thanks Sandeep. I did a quick scan, this looks really good. But I do > have a question: > > +/// CCCustomFn - This function assigns a location for Val, possibly > updating > +/// all args to reflect changes and indicates if it handled it. It > must set > +/// isCustom if it handles the arg and returns true. > +typedef bool CCCustomFn(unsigned &ValNo, MVT &ValVT, > + MVT &LocVT, CCValAssign::LocInfo &LocInfo, > + ISD::ArgFlagsTy &ArgFlags, CCState &State, > + bool &result); > > Is it necessary to return two bools (the second is returned by > reference in 'result')? I am confused about the semantics of 'result'. 
> > Also, a nitpick: > > + unsigned i; > + for (i = 0; i < 4; ++i) > > The convention we use is: > > + for (unsigned i = 0; i < 4; ++i) > > Thanks, > > Evan > > On Feb 6, 2009, at 6:02 PM, Sandeep Patel wrote: > >> I think I've got all the cases handled now, implementing with >> CCCustom<"foo"> callbacks into C++. >> >> This also fixes a crash when returning i128. I've also included a >> small asm constraint fix that was needed to build newlib. >> >> deep >> >> On Mon, Jan 19, 2009 at 10:18 AM, Evan Cheng >> wrote: >>> >>> On Jan 16, 2009, at 5:26 PM, Sandeep Patel wrote: >>> >>>> On Sat, Jan 3, 2009 at 11:46 AM, Dan Gohman >>>> wrote: >>>>> >>>>> One problem with this approach is that since i64 isn't legal, the >>>>> bitcast would require custom C++ code in the ARM target to >>>>> handle properly. It might make sense to introduce something >>>>> like >>>>> >>>>> CCIfType<[f64], CCCustom> >>>>> >>>>> where CCCustom is a new entity that tells the calling convention >>>>> code to to let the target do something not easily representable >>>>> in the tablegen minilanguage. >>>> >>>> I am thinking that this requires two changes: add a flag to >>>> CCValAssign (take a bit from HTP) to indicate isCustom and a way to >>>> author an arbitrary CCAction by including the source directly in the >>>> TableGen mini-language. This latter change might want a generic >>>> change >>>> to the TableGen language. For example, the syntax might be like: >>>> >>>> class foo : CCCustomAction { >>>> code <<< EOF >>>> ....multi-line C++ code goes here that allocates regs & mem and >>>> sets CCValAssign::isCustom.... >>>> EOF >>>> } >>>> >>>> Does this seem reasonable? 
An alternative is for CCCustom to take a >>>> string that names a function to be called: >>>> >>>> CCIfType<[f64], CCCustom<"MyCustomLoweringFunc">> >>>> >>>> the function signature for such functions will have to return two >>>> results: if the CC processing is finished and if it the func >>>> succeeded >>>> or failed: >>> >>> I like the second solution better. It seems rather cumbersome to >>> embed >>> multi-line c++ code in td files. >>> >>> Evan >>>> >>>> >>>> typedef bool CCCustomFn(unsigned ValNo, MVT ValVT, >>>> MVT LocVT, CCValAssign::LocInfo LocInfo, >>>> ISD::ArgFlagsTy ArgFlags, CCState &State, >>>> bool &result); >>>> _______________________________________________ >>>> LLVM Developers mailing list >>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>> >>> _______________________________________________ >>> LLVM Developers mailing list >>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > -------------- next part -------------- A non-text attachment was scrubbed... Name: arm_callingconv.diff Type: application/octet-stream Size: 55366 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090212/31616686/attachment-0002.obj -------------- next part -------------- A non-text attachment was scrubbed... 
Name: arm_fixes.diff Type: application/octet-stream Size: 589 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090212/31616686/attachment-0003.obj From jon at ffconsultancy.com Fri Feb 13 01:26:33 2009 From: jon at ffconsultancy.com (Jon Harrop) Date: Fri, 13 Feb 2009 07:26:33 +0000 Subject: [LLVMdev] fastcc, tail calls, and gcc In-Reply-To: <408979.78931.qm@web53812.mail.re2.yahoo.com> References: <408979.78931.qm@web53812.mail.re2.yahoo.com> Message-ID: <200902130726.33873.jon@ffconsultancy.com> On Thursday 12 February 2009 16:34:54 Jeff Kuskin wrote: > Two related questions. > > This is with LLVM 2.4 doing a JIT compile to x86-64. (I generate LLVM > IR using an IRBuilder instance, compile/optimize, and then call > getPointerToFunction() to get a "native" function pointer.) > > > (1) My reading of various mailing list messages seems to indicate > that a function marked as using the "fastcc" calling convention > ("CallingConv::Fast") cannot be called directly from GCC-generated > code (n.b. -- standalone gcc, not llvm-gcc) because the fastcc calling > convention is, in general, incompatible with GCC (which I assume uses > the "CallingConv::C" calling convention). > > Correct? Yes. > If not, how do I call a LLVM JIT-generated fastcc function > from a function statically compiled by GCC? You also JIT compile a shim function that is exposed with the C calling convention but contains a fastcc call to your internal function. Note that you may also need to rejig argument passing with things like sret. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. 
http://www.ffconsultancy.com/?e From wangtielei at icst.pku.edu.cn Fri Feb 13 01:25:12 2009 From: wangtielei at icst.pku.edu.cn (Tielei Wang) Date: Fri, 13 Feb 2009 15:25:12 +0800 Subject: [LLVMdev] llvm-gcc4.2-2.4 build failure in /gcc/java/lang.c Message-ID: <49952058.6080900@icst.pku.edu.cn> Hi, every body, I get stuck when trying to build llvm-gcc4.2-2.4 on x86_64 Linux with GCC-4.3.3. I meet this error: make[3]: Entering directory `/home/wangtielei/TOOLS/llvm/llvm-gcc-obj/gcc' /home/wangtielei/TOOLS/llvm/llvm-gcc-obj/./prev-gcc/xgcc -B/home/wangtielei/TOOLS/llvm/llvm-gcc-obj/./prev-gcc/ -B/usr/local/x86_64-unknown-linux-gnu/bin/ -c -g -O2 -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wmissing-format-attribute -DHAVE_CONFIG_H -I. -Ijava -I../../llvm-gcc4.2-2.4.source/gcc -I../../llvm-gcc4.2-2.4.source/gcc/java -I../../llvm-gcc4.2-2.4.source/gcc/../include -I../../llvm-gcc4.2-2.4.source/gcc/../libcpp/include -I../../llvm-gcc4.2-2.4.source/gcc/../libdecnumber -I../libdecnumber -I/home/wangtielei/TOOLS/llvm/llvm-obj//include -I/home/wangtielei/TOOLS/llvm/llvm-2.4/include -DENABLE_LLVM -I/home/wangtielei/TOOLS/llvm/llvm-obj/../llvm-2.4/include -D_DEBUG -D_GNU_SOURCE -D__STDC_LIMIT_MACROS ../../llvm-gcc4.2-2.4.source/gcc/java/lang.c -o java/lang.o ../../llvm-gcc4.2-2.4.source/gcc/java/lang.c: In function ?java_init?: ../../llvm-gcc4.2-2.4.source/gcc/java/lang.c:378: error: ?force_align_functions_log? undeclared (first use in this function) ../../llvm-gcc4.2-2.4.source/gcc/java/lang.c:378: error: (Each undeclared identifier is reported only once ../../llvm-gcc4.2-2.4.source/gcc/java/lang.c:378: error: for each function it appears in.) 
make[3]: *** [java/lang.o] Error 1
make[3]: Leaving directory `/home/wangtielei/TOOLS/llvm/llvm-gcc-obj/gcc'
make[2]: *** [all-stage2-gcc] Error 2
make[2]: Leaving directory `/home/wangtielei/TOOLS/llvm/llvm-gcc-obj'
make[1]: *** [stage2-bubble] Error 2
make[1]: Leaving directory `/home/wangtielei/TOOLS/llvm/llvm-gcc-obj'
make: *** [all] Error 2

Anybody can help me?

From clattner at apple.com Fri Feb 13 01:33:06 2009
From: clattner at apple.com (Chris Lattner)
Date: Thu, 12 Feb 2009 23:33:06 -0800
Subject: [LLVMdev] llvm-gcc4.2-2.4 build failure in /gcc/java/lang.c
In-Reply-To: <49952058.6080900@icst.pku.edu.cn>
References: <49952058.6080900@icst.pku.edu.cn>
Message-ID: <65A457DE-E98A-4B02-B7B5-7C1D13AA0BF7@apple.com>

On Feb 12, 2009, at 11:25 PM, Tielei Wang wrote:

> Hi, every body,
>
> I get stuck when trying to build llvm-gcc4.2-2.4 on x86_64 Linux with
> GCC-4.3.3. I meet this error:

Make sure to follow the README.LLVM file in the llvm-gcc distro.
llvm-gcc doesn't support gcj yet at all.

-Chris

From wangtielei at icst.pku.edu.cn Fri Feb 13 01:58:45 2009
From: wangtielei at icst.pku.edu.cn (Tielei Wang)
Date: Fri, 13 Feb 2009 15:58:45 +0800
Subject: [LLVMdev] llvm-gcc4.2-2.4 build failure in /gcc/java/lang.c
In-Reply-To: <65A457DE-E98A-4B02-B7B5-7C1D13AA0BF7@apple.com>
References: <49952058.6080900@icst.pku.edu.cn> <65A457DE-E98A-4B02-B7B5-7C1D13AA0BF7@apple.com>
Message-ID: <49952835.3020801@icst.pku.edu.cn>

I did follow the README.LLVM.

I configure like:
../llvm-gcc4.2-2.4.source/configure --program-prefix=llvm-
--enable-llvm=/home/wangtielei/TOOLS/llvm/llvm-obj/
--enable-language=c,c++ --disable-jit --disable-multilib

The error information is

/home/wangtielei/TOOLS/llvm/llvm-gcc-obj/./prev-gcc/xgcc -B/home/wangtielei/TOOLS/llvm/llvm-gcc-obj/./prev-gcc/ -B/usr/local/x86_64-unknown-linux-gnu/bin/ -c -g -O2 -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wmissing-format-attribute -DHAVE_CONFIG_H -I. -Ijava -I../../llvm-gcc4.2-2.4.source/gcc -I../../llvm-gcc4.2-2.4.source/gcc/java -I../../llvm-gcc4.2-2.4.source/gcc/../include -I../../llvm-gcc4.2-2.4.source/gcc/../libcpp/include -I../../llvm-gcc4.2-2.4.source/gcc/../libdecnumber -I../libdecnumber -I/home/wangtielei/TOOLS/llvm/llvm-obj//include -I/home/wangtielei/TOOLS/llvm/llvm-2.4/include -DENABLE_LLVM -I/home/wangtielei/TOOLS/llvm/llvm-obj/../llvm-2.4/include -D_DEBUG -D_GNU_SOURCE -D__STDC_LIMIT_MACROS ../../llvm-gcc4.2-2.4.source/gcc/java/lang.c -o java/lang.o
../../llvm-gcc4.2-2.4.source/gcc/java/lang.c: In function 'java_init':
../../llvm-gcc4.2-2.4.source/gcc/java/lang.c:378: error: 'force_align_functions_log' undeclared (first use in this function)

But I do not understand why xgcc is invoked to compile java.c file.
Could you give me more hints?

Chris Lattner wrote:
>
> On Feb 12, 2009, at 11:25 PM, Tielei Wang wrote:
>
>> Hi, every body,
>>
>> I get stuck when trying to build llvm-gcc4.2-2.4 on x86_64 Linux with
>> GCC-4.3.3. I meet this error:
>
> Make sure to follow the README.LLVM file in the llvm-gcc distro.
> llvm-gcc doesn't support gcj yet at all.
>
> -Chris
>

From edwintorok at gmail.com Fri Feb 13 02:27:33 2009
From: edwintorok at gmail.com (Török Edwin)
Date: Fri, 13 Feb 2009 10:27:33 +0200
Subject: [LLVMdev] llvm-gcc4.2-2.4 build failure in /gcc/java/lang.c
In-Reply-To: <49952835.3020801@icst.pku.edu.cn>
References: <49952058.6080900@icst.pku.edu.cn> <65A457DE-E98A-4B02-B7B5-7C1D13AA0BF7@apple.com> <49952835.3020801@icst.pku.edu.cn>
Message-ID: <49952EF5.9030009@gmail.com>

On 2009-02-13 09:58, Tielei Wang wrote:
> I did follow the README.LLVM.
>
> I configure like:
> ../llvm-gcc4.2-2.4.source/configure --program-prefix=llvm-
> --enable-llvm=/home/wangtielei/TOOLS/llvm/llvm-obj/
> --enable-language=c,c++ --disable-jit --disable-multilib
>

Typo? It is --enable-languages, not --enable-language

Best regards,
--Edwin

From anton at korobeynikov.info Fri Feb 13 02:46:28 2009
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Fri, 13 Feb 2009 11:46:28 +0300
Subject: [LLVMdev] llvm-gcc4.2-2.4 build failure in /gcc/java/lang.c
In-Reply-To: <49952835.3020801@icst.pku.edu.cn>
References: <49952058.6080900@icst.pku.edu.cn> <65A457DE-E98A-4B02-B7B5-7C1D13AA0BF7@apple.com> <49952835.3020801@icst.pku.edu.cn>
Message-ID: <6A365DB8-23EB-402E-A34B-E3E557E14C01@korobeynikov.info>

> I did follow the README.LLVM.
No, you didn't

> I configure like:
> ../llvm-gcc4.2-2.4.source/configure --program-prefix=llvm-
> --enable-llvm=/home/wangtielei/TOOLS/llvm/llvm-obj/
> --enable-language=c,c++ --disable-jit --disable-multilib
Note: "languages", not "language"

---
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From alex.lavoro.propio at gmail.com Fri Feb 13 03:09:33 2009
From: alex.lavoro.propio at gmail.com (Alex)
Date: Fri, 13 Feb 2009 10:09:33 +0100
Subject: [LLVMdev] #ifdef in TableGen
Message-ID: <4d77c5f20902130109u789f66e7lc5750e0da1e7e255@mail.gmail.com>

Is there something similar to #ifdef ...
#endif in C supported in TableGen *.td files?

Alex.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090213/6324555e/attachment.html

From baldrick at free.fr Fri Feb 13 03:22:28 2009
From: baldrick at free.fr (Duncan Sands)
Date: Fri, 13 Feb 2009 10:22:28 +0100
Subject: [LLVMdev] 16bit loads being promoted to 32bit?
In-Reply-To: <5BA674C5FF7B384A92C2C95D8CC71E1C827795@ssanexmb1.amd.com>
References: <5BA674C5FF7B384A92C2C95D8CC71E1C827795@ssanexmb1.amd.com>
Message-ID: <200902131022.28639.baldrick@free.fr>

Hi,

> This function should read from a memory location from input and write to
> result with a promotion from i16 to i32, which seems simple enough. The
> problem that I am having is somewhere along the line the 16bit load is
> being promoted to a 32bit load and then the lower 16 bits are being sign
> extended away with a shl 16 followed by a shr 16.

most likely it is being turned into an i32 extending load (extended from
i16). You have to read the SDag node dumps carefully to notice this.
Being an extending load means that only 16 bits will actually be loaded
(only two bytes of memory touched), but the result will be an i32.
Hard to say anything more without details.

Ciao,

Duncan.

From marks at dcs.gla.ac.uk Fri Feb 13 04:11:46 2009
From: marks at dcs.gla.ac.uk (Mark Shannon)
Date: Fri, 13 Feb 2009 10:11:46 +0000
Subject: [LLVMdev] fastcc, tail calls, and gcc
In-Reply-To: <4994AF67.6060403@t-online.de>
References: <408979.78931.qm@web53812.mail.re2.yahoo.com> <4994AF67.6060403@t-online.de>
Message-ID: <49954762.3070500@dcs.gla.ac.uk>

Albert Graef wrote:
> Jeff Kuskin wrote:
>> Correct? If not, how do I call a LLVM JIT-generated fastcc function
>> from a function statically compiled by GCC?
>
> Well, you can always generate a little wrapper function with C calling
> convention which just calls the fastcc function.
>
I use the fastcall convention all the time.
LLVM-jitted code calling GCC-compile code and vice-versa.

This works for x86 (32 bit):

void* llvm_jit_compile();
typedef __attribute__((fastcall)) int (*func_ptr)(int p1, int p2);

int g(void) {
    func_ptr f = (func_ptr)llvm_jit_compile();
    return f(1, 2);
}

Mark

From anton at korobeynikov.info Fri Feb 13 05:01:00 2009
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Fri, 13 Feb 2009 14:01:00 +0300
Subject: [LLVMdev] fastcc, tail calls, and gcc
In-Reply-To: <49954762.3070500@dcs.gla.ac.uk>
References: <408979.78931.qm@web53812.mail.re2.yahoo.com> <4994AF67.6060403@t-online.de> <49954762.3070500@dcs.gla.ac.uk>
Message-ID:

> I use the fastcall convention all the time.
> LLVM-jitted code calling GCC-compile code and vice-versa.
fastcall != fastcc.

There are more or less definite rules for fastcall CC. fastcc,
oppositely, has no such rules. The only rule is "as fast as possible".
This means, that such functions cannot be exposed 'to public'. CC
details for such functions can be changed at any time (as it was
already 3 or 4 times).

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From jay.foad at gmail.com Fri Feb 13 06:45:46 2009
From: jay.foad at gmail.com (Jay Foad)
Date: Fri, 13 Feb 2009 12:45:46 +0000
Subject: [LLVMdev] loop passes vs call graph
Message-ID:

I'm looking at bug 3367.

If I run:

$ opt b.bc -inline -loop-rotate -loop-unswitch -debug-pass=Executions

... it eventually crashes in the inliner, because the call graph isn't
up to date. (NB if you want to reproduce this you'll have to apply my
patch from bug 3367 first.)

The reason the call graph isn't up to date is that -loop-unswitch has
changed a function and not updated the call graph. But that seems OK,
because -loop-unswitch's getAnalysisUsage() method doesn't claim to
preserve the call graph.

So are loop passed *required* to preserved the call graph, in the same
way that CallGraphSCC passes are?

Or should the pass manager take care of rebuilding the call graph
before calling the inliner on an SCC whose functions have been
changed? I don't see any evidence of this happening.


I've attached the full output from -debug-pass=Executions
-debug-only=inline. You can see that the loop pass manager modifies
function readClause(), and then the inliner decides to inline
readClause() into parse_DIMACS_main(), but I don't think the call
graph is being rebuilt in between those two points.

Any ideas?

Thanks,
Jay.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: executions
Type: application/octet-stream
Size: 34889 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090213/785ae7f8/attachment-0001.obj

From baldrick at free.fr Fri Feb 13 07:39:48 2009
From: baldrick at free.fr (Duncan Sands)
Date: Fri, 13 Feb 2009 14:39:48 +0100
Subject: [LLVMdev] loop passes vs call graph
In-Reply-To:
References:
Message-ID: <200902131439.48939.baldrick@free.fr>

Hi,

> I'm looking at bug 3367.
>
> If I run:
>
> $ opt b.bc -inline -loop-rotate -loop-unswitch -debug-pass=Executions
>
> ... it eventually crashes in the inliner, because the call graph isn't
> up to date. (NB if you want to reproduce this you'll have to apply my
> patch from bug 3367 first.)
>
> The reason the call graph isn't up to date is that -loop-unswitch has
> changed a function and not updated the call graph. But that seems OK,
> because -loop-unswitch's getAnalysisUsage() method doesn't claim to
> preserve the call graph.

given the callgraph F -> G, the pass manager currently does the following:
run inliner on G, run loop passes on G, run inliner on F, run loop
passes on F. Presumably what is happening is this: the loop passes change
the functions that G calls (but don't update the callgraph). Now the
inliner visits F and decides to inline G into F. When it does this, it
presumably merges the callgraph info for G (i.e. what G calls) into that of
F.
But this info is wrong, so F ends up having invalid callgraph info which
at some point causes trouble.

I think what should happen is: if a SCC pass (eg: inline) is followed
by function passes that preserve the callgraph, then it should schedule
them together like above. However if the SCC pass is followed by a
function pass that does not preserve the callgraph then it should be
scheduled entirely after the SCC pass.

For example, imagine
-inline -fpass -loop-unswitch,
where fpass is a function pass that preserves the callgraph. Then the
pass manager should do:

run -inline on G
run -fpass on G
run -inline on F
run -fpass on F
run -loop-unswitch on G
run -loop-unswitch on F.

Just my opinion of course.

Ciao,

Duncan.

> So are loop passed *required* to preserved the call graph, in the same
> way that CallGraphSCC passes are?
>
> Or should the pass manager take care of rebuilding the call graph
> before calling the inliner on an SCC whose functions have been
> changed? I don't see any evidence of this happening.
>
>
> I've attached the full output from -debug-pass=Executions
> -debug-only=inline. You can see that the loop pass manager modifies
> function readClause(), and then the inliner decides to inline
> readClause() into parse_DIMACS_main(), but I don't think the call
> graph is being rebuilt in between those two points.
>
> Any ideas?
>
> Thanks,
> Jay.
>

From jay.foad at gmail.com Fri Feb 13 07:52:41 2009
From: jay.foad at gmail.com (Jay Foad)
Date: Fri, 13 Feb 2009 13:52:41 +0000
Subject: [LLVMdev] loop passes vs call graph
In-Reply-To: <200902131439.48939.baldrick@free.fr>
References: <200902131439.48939.baldrick@free.fr>
Message-ID:

> given the callgraph F -> G, the pass manager currently does the following:
> run inliner on G, run loop passes on G, run inliner on F, run loop
> passes on F. Presumably what is happening is this: the loop passes change
> the functions that G calls (but don't update the callgraph).
Now the
> inliner visits F and decides to inline G into F. When it does this, it
> presumably merges the callgraph info for G (i.e. what G calls) into that of
> F. But this info is wrong, so F ends up having invalid callgraph info which
> at some point causes trouble.

Yes, exactly!

> I think what should happen is: if a SCC pass (eg: inline) is followed
> by function passes that preserve the callgraph, then it should schedule
> them together like above. However if the SCC pass is followed by a
> function pass that does not preserve the callgraph then it should be
> scheduled entirely after the SCC pass.

Sounds good, but it's a bit outside my area of expertise.

I think it would be nice to have a call graph verifier, that checks
that the call graph is complete and correct whenever the pass manager
thinks it ought to be. Maybe it could be part of -verify - I'm not
sure how these things work.

Thanks,
Jay.

From jk500500 at yahoo.com Fri Feb 13 08:55:08 2009
From: jk500500 at yahoo.com (Jeff Kuskin)
Date: Fri, 13 Feb 2009 06:55:08 -0800 (PST)
Subject: [LLVMdev] fastcc, tail calls, and gcc
In-Reply-To: <620C0F92-E35A-4743-A352-E98A2D59A9E7@apple.com>
Message-ID: <299013.60378.qm@web53808.mail.re2.yahoo.com>

--- On Thu, 2/12/09, Eric Christopher wrote:

> From: Eric Christopher
> Subject: Re: [LLVMdev] fastcc, tail calls, and gcc
> To: "LLVM Developers Mailing List"
> Date: Thursday, February 12, 2009, 6:28 PM
> On Feb 12, 2009, at 3:23 PM, Albert Graef wrote:
>
> > Jeff Kuskin wrote:
> >> Correct? If not, how do I call a LLVM JIT-generated fastcc function
> >> from a function statically compiled by GCC?
> >
> > Well, you can always generate a little wrapper function with C calling
> > convention which just calls the fastcc function.
>
> You can do a quick bit of assembly code to make sure that the
> arguments are in the right registers for the call.

I am trying this now. Thanks to all for the suggestions.

-- Jeff

From jk500500 at yahoo.com Fri Feb 13 08:57:59 2009
From: jk500500 at yahoo.com (Jeff Kuskin)
Date: Fri, 13 Feb 2009 06:57:59 -0800 (PST)
Subject: [LLVMdev] fastcc, tail calls, and gcc
In-Reply-To:
Message-ID: <607245.67522.qm@web53808.mail.re2.yahoo.com>

--- On Thu, 2/12/09, Arnold Schwaighofer wrote:

> From: Arnold Schwaighofer
> Subject: Re: [LLVMdev] fastcc, tail calls, and gcc
> To: "LLVM Developers Mailing List"
> Date: Thursday, February 12, 2009, 6:56 PM
> On Thu, Feb 12, 2009 at 5:34 PM, Jeff Kuskin wrote:
> > Two related questions.
>
> > (2) Why does the x86-64 JIT backend generate a "ret $0x8" instruction
> > to return from a fastcc function that is (a) marked as fastcc
> > (CallingConv::Fast); but (b) takes no arguments and returns 'void'?
> fastcc generated code ends with this:
>
> > c20800 ret $0x8
>
> > I assume the "ret 0x8" is meant to be the "callee pops args" portion
> > of the fastcc convention, but in this case the function has no
> > arguments (nor a return value), so why should 8 bytes be popped from
> > the stack on return?
> If i remember correctly this has to do with stack alignment and tail
> calls. Note that to support tail calls between functions that have an
> arbitrary number of arguments the stack pointer of the caller of the
> tail calling function is modified.
> e.g if foo(i64) tail calls bar() the stack pointer of foo's caller
> would be adjusted by 8 bytes which could result in a misaligned stack
> (assuming a platform alignment of 16) on entry to the function bar.
> Hence when tailcallopt is enabled the size occupied by arguments is
> rounded up such that such a misalignment cant happen.

Hmmm. I think I understand, but I don't see how the "ret 8" is correct
for a function that has no arguments. In this case, it seems to me that
the "size occupied by the arguments" is zero, and should remain zero
even after rounding up.

Perhaps I misunderstand.
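
[Editorial sketch, not part of the original message.] One possible reading of the rounding Arnold describes, offered purely as an assumption and not confirmed anywhere in this thread: if the argument area is sized so that the arguments plus the 8-byte return-address slot together form a multiple of the 16-byte stack alignment, then zero bytes of arguments round up to 8, which would match the "ret $0x8" quoted above. The helper name and its parameters are hypothetical and are not an LLVM API.

```cpp
#include <cassert>

// Hypothetical helper (NOT an LLVM API): returns the smallest size that is
// >= argBytes and for which (size + slotSize) is a multiple of stackAlign.
// slotSize models the return address pushed by the call instruction.
// Assumes 0 < slotSize <= stackAlign.
unsigned alignedArgAreaSize(unsigned argBytes, unsigned stackAlign,
                            unsigned slotSize) {
    unsigned target = stackAlign - slotSize;  // wanted value of size % stackAlign
    unsigned rem = argBytes % stackAlign;
    if (rem <= target)
        return argBytes + (target - rem);
    // Already past the wanted residue: move up to the next aligned block.
    return argBytes + (stackAlign - rem) + target;
}
```

Under that assumption, alignedArgAreaSize(0, 16, 8) yields 8 and alignedArgAreaSize(16, 16, 8) yields 24, so even a function with no arguments would pop 8 bytes on return.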
Let me try to generate an actual testcase with real C and x86 asm code.

Thanks.

-- Jeff

> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From Chareos at gmx.de Fri Feb 13 09:00:43 2009
From: Chareos at gmx.de (Ralf Karrenberg)
Date: Fri, 13 Feb 2009 16:00:43 +0100
Subject: [LLVMdev] Incremental SSA update
Message-ID: <49958B1B.1050905@gmx.de>

Hi,

does LLVM have a mechanism to automatically update SSA form, e.g. after
insertion of additional definitions of a variable? This would
recursively traverse the dominance tree of all uses of the definition
backwards and insert phi-functions where ever they are needed.

http://portal.acm.org/citation.cfm?id=277656&dl=GUIDE, (Paragraph 4.5)
provides an algorithm for such an incremental SSA update which I would
implement if nothing similar(ly efficient) already exists.


By now I have implemented both a Control Dependence Analysis and an
If-Conversion pass. Both passes are not largely tested yet and probably
will not match any desired quality of code.
Yet I would like to contribute them at some point, but I am unsure about
requirements etc. and would appreciate some hints on how to proceed :).

Regards,
Ralf

From criswell at cs.uiuc.edu Fri Feb 13 09:55:12 2009
From: criswell at cs.uiuc.edu (John Criswell)
Date: Fri, 13 Feb 2009 09:55:12 -0600
Subject: [LLVMdev] Incremental SSA update
In-Reply-To: <49958B1B.1050905@gmx.de>
References: <49958B1B.1050905@gmx.de>
Message-ID: <499597E0.1010900@cs.uiuc.edu>

Ralf Karrenberg wrote:
> Hi,
>
> does LLVM have a mechanism to automatically update SSA form, e.g. after
> insertion of additional definitions of a variable? This would
> recursively traverse the dominance tree of all uses of the definition
> backwards and insert phi-functions where ever they are needed.
>
I'm not sure what you mean. LLVM virtual registers are *always* in SSA form.
I don't think it's possible to generate non-SSA code (although it is
possible to generate ill-formed IR, such as a def that does not dominate
all of its uses).

For example, if you create a new LLVM instruction in your transform,
then it will have a unique name and be a unique definition. There is no
need to convert it into SSA form because it already is in SSA form.

Perhaps what you are asking is whether an alloca which is used to create
a variable can be promoted into an SSA virtual register. The answer is
yes, and there is already a pass (mem2reg) that will do this for you.
Front-ends often create variables as stack-allocated memory objects
(because stack memory can be loaded from/stored to multiple times) and
then let mem2reg promote the alloca'ed memory into SSA virtual registers
if possible. That's the only non-SSA to SSA conversion that happens in
LLVM, as far as I know.

> http://portal.acm.org/citation.cfm?id=277656&dl=GUIDE, (Paragraph 4.5)
> provides an algorithm for such an incremental SSA update which I would
> implement if nothing similar(ly efficient) already exists.
>
>
> By now I have implemented both a Control Dependence Analysis and an
> If-Conversion pass. Both passes are not largely tested yet and probably
> will not match any desired quality of code.
> Yet I would like to contribute them at some point, but I am unsure about
> requirements etc. and would appreciate some hints on how to proceed :).
>
Very cool.

Code contribution policies are at
http://llvm.org/docs/DeveloperPolicy.html. If I understand it
correctly, you submit a patch for review either to llvmdev or the
llvm-commits list, someone with commit access reviews it, you iterate
through requested code changes as necessary, and then they commit.
You'll also need at least one testcase to show that your code works.

-- John T.

> Regards,
> Ralf
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

From clattner at apple.com Fri Feb 13 10:27:54 2009
From: clattner at apple.com (Chris Lattner)
Date: Fri, 13 Feb 2009 08:27:54 -0800
Subject: [LLVMdev] #ifdef in TableGen
In-Reply-To: <4d77c5f20902130109u789f66e7lc5750e0da1e7e255@mail.gmail.com>
References: <4d77c5f20902130109u789f66e7lc5750e0da1e7e255@mail.gmail.com>
Message-ID:

On Feb 13, 2009, at 1:09 AM, Alex wrote:

>
> Is there something similar to #ifdef ... #endif in C supported in
> TableGen *.td files?

Not as such, what are you trying to do?

-Chris

From Micah.Villmow at amd.com Fri Feb 13 11:21:50 2009
From: Micah.Villmow at amd.com (Villmow, Micah)
Date: Fri, 13 Feb 2009 09:21:50 -0800
Subject: [LLVMdev] 16bit loads being promoted to 32bit?
In-Reply-To:
References: <5BA674C5FF7B384A92C2C95D8CC71E1C827795@ssanexmb1.amd.com>
Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C82783A@ssanexmb1.amd.com>

Eli,
Thanks that worked. ;)

Micah

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Eli Friedman
Sent: Thursday, February 12, 2009 5:52 PM
To: LLVM Developers Mailing List
Subject: Re: [LLVMdev] 16bit loads being promoted to 32bit?

On Thu, Feb 12, 2009 at 4:53 PM, Villmow, Micah wrote:
> The
> problem that I am having is somewhere along the line the 16bit load is being
> promoted to a 32bit load

For the given testcase, that's clearly illegal. Either there's a
serious bug in LLVM, or you're misinterpreting the meaning of the DAG.
Are you sure you aren't seeing a sign-extending load? If you don't
want to bother supporting extending loads, you can use
setLoadExtAction to make Legalize take care of it.
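
[Editorial sketch, not part of the original message.] To illustrate the distinction drawn in this thread, here is a small standalone C++ model of a sign-extending 16-bit load next to the "32-bit load, shl 16, shr 16" sequence from the original question. This is not LLVM code: the function names are made up for illustration, and little-endian byte order is assumed. Both functions produce the same 32-bit value, but the extending load reads only two bytes of memory.

```cpp
#include <cassert>
#include <cstdint>

// Model of an i16 -> i32 sign-extending load: touches only p[0] and p[1].
std::int32_t sextLoad16(const unsigned char* p) {
    std::int32_t v = p[0] | (p[1] << 8);   // assemble the 16-bit value
    if (v & 0x8000) v -= 0x10000;          // replicate the sign bit upward
    return v;
}

// Model of the sequence from the question: a full 32-bit load (touches four
// bytes), then shl 16 followed by an arithmetic shr 16.
std::int32_t load32ShlShr16(const unsigned char* p) {
    std::uint32_t wide = static_cast<std::uint32_t>(p[0]) |
                         (static_cast<std::uint32_t>(p[1]) << 8) |
                         (static_cast<std::uint32_t>(p[2]) << 16) |
                         (static_cast<std::uint32_t>(p[3]) << 24);
    std::uint32_t shifted = wide << 16;    // shl 16 discards the upper half
    // Arithmetic shr 16, written portably: logical shift plus sign fix-up.
    std::int32_t v = static_cast<std::int32_t>(shifted >> 16);
    if (v & 0x8000) v -= 0x10000;
    return v;
}
```

For the bytes FE FF 34 12 both functions return -2; the difference that matters to a target with alignment or access-width restrictions is how many bytes each one reads.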

> 1) I'm limited to 32bit aligned loads and llvm is assuming a
> 16bit/8bit alignment

You shouldn't be seeing any unaligned loads post-Legalize unless you
explicitly ask for them by setting allowUnalignedMemoryAccesses to
true.

-Eli
_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From alex.lavoro.propio at gmail.com Fri Feb 13 11:47:52 2009
From: alex.lavoro.propio at gmail.com (Alex)
Date: Fri, 13 Feb 2009 18:47:52 +0100
Subject: [LLVMdev] Modeling GPU vector registers, again (with my implementation)
Message-ID: <4d77c5f20902130947s349dcab6r74e2057dd18161@mail.gmail.com>

It seems to me that LLVM sub-registers do not fit the following hardware
architecture.

All instructions of the hardware are vector instructions. All registers
contain 4 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w.

Most instructions write more than one element in this way:

mul r0.xyw, r1, r2
add r0.z, r3, r4
sub r5, r0, r1

Notice that the four elements of r0 are written by two different
instructions.

My question is how should I model these sub-registers. If I treat each
component as a register, and do the register allocation individually, it
seems very difficult to merge the scalar operations back into one vector
operation.

// each %reg is a sub-register
// r1, r2, r3, r4 here are virtual register number

mul %reg1024, r1, r2 // x
mul %reg1025, r1, r2 // y
mul %reg1026, r1, r2 // z

add %reg1027, r3, r4 // w

sub %reg1028, %reg1024, r1
sub %reg1029, %reg1025, r1
sub %reg1030, %reg1026, r1
sub %reg1031, %reg1027, r1

So I decided to model each 4-element register as one Register in *.td file.

Here are the details.

Since all the 4 elements of a vector register occupy the same 'alloca',
during the conversion of shader assembly to LLVM IR, I check if a vector
register is written (to different elements) by different instructions.
When the second write happens, I generate a shufflevector to multiplex the
existing value and the new value, and store the result of shufflevector.

Input assembly language:

mul r0.xy, r1, r2
add r0.zw, r3, r4
sub r5, r0, r1

is converted to LLVM IR:

%r0 = alloca <4 x float>
%mul_1 = mul <4 x float> %r1, %r2
store <4 x float> %mul_1, <4 x float>* %r0
...
%add_1 = add <4 x float> %r3, %r4
; a store does not immediately happen here
%load_1 = load <4 x float>* %r0

; select the first two elements from the existing value,
; the last two elements from the newly generated value
%merge_1 = shufflevector <4 x float> %load_1,
                         <4 x float> %add_1,
                         <4 x i32> < i32 0, i32 1, i32 6, i32 7 >

; store the multiplexed value
store <4 x float> %merge_1, <4 x float>* %r0

After mem2reg:

%mul_1 = mul <4 x float> %r1, %r2
%add_1 = add <4 x float> %r3, %r4
%merge_1 = shufflevector <4 x float> %mul_1,
                         <4 x float> %add_1,
                         <4 x i32> < i32 0, i32 1, i32 6, i32 7 >

After instruction selection:

MUL %reg1024, %reg1025, %reg1026
ADD %reg1027, %reg1028, %reg1029
MERGE %reg1030, %reg1024, "xy", %reg1027, "zw"

The 'shufflevector' is selected to a MERGE instruction by the default LLVM
instruction selector. The hardware doesn't have this instruction. I have a
*pre*-register allocation FunctionPass to remember:

The physical register allocated to the destination register of MERGE
(%reg1030) should replace the physical registers allocated to the
destination registers of MUL (%reg1024) and ADD (%reg1027).

In this way I ensure MUL and ADD write to the same physical register. This
replacement is done in the other FunctionPass *after* register allocation.

MUL and ADD have an 'OptionalDefOperand' writemask. By default the
writemask is "xyzw" (all elements are written).
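
[Editorial sketch, not part of the original message.] The relationship between the shufflevector mask and the two MERGE writemasks described above can be modelled in a few lines of standalone C++. This is an illustration only, not code from the poster's backend: the type and function names are invented, and it assumes a lane-preserving mask, meaning result lane i always takes lane i of one of the two sources (mask indices 0..3 name the first source, 4..7 the second, as in shufflevector).

```cpp
#include <array>
#include <cassert>
#include <string>

// Writemasks for the two instructions feeding a MERGE (hypothetical type).
struct WriteMasks {
    std::string first;   // components kept from the first source vector
    std::string second;  // components kept from the second source vector
};

// Splits a 4-wide shuffle mask into per-source writemask strings.
WriteMasks splitMergeMask(const std::array<int, 4>& mask) {
    static const char names[4] = {'x', 'y', 'z', 'w'};
    WriteMasks wm;
    for (int lane = 0; lane < 4; ++lane) {
        int idx = mask[lane];
        // Only lane-preserving masks can be expressed as plain writemasks.
        assert(idx >= 0 && idx < 8 && idx % 4 == lane);
        if (idx < 4)
            wm.first += names[lane];
        else
            wm.second += names[lane];
    }
    return wm;
}
```

For the mask < 0, 1, 6, 7 > used in the IR above, this yields "xy" for the first source and "zw" for the second, which are the two writemask operands shown on the MERGE instruction.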

// 0xF == all elements are written by default
def WRITEMASK : OptionalDefOperand {...}

def MUL : MyInst<(outs REG4X32:$dst),
                 (ins REG4X32:$src0, REG4X32:$src1, WRITEMASK:$wm),

In the said post-register-allocation FunctionPass, in addition to
replacing the destination registers as described before, the writemask
($wm) of each instruction is also replaced with the writemask operands of
MERGE. So:

MUL %R0, %R1, %R2, "xyzw"
ADD %R5, %R3, %R4, "xyzw"
MERGE %R6, %R0, "xy", %R5, "zw"
==>
MUL %R6, %R1, %R2, "xy" // "xy" comes from MERGE operand 2
ADD %R6, %R3, %R4, "zw"
// MERGE %R6, %R0, "xy", %R5, "zw" <== REMOVED

Final machine code:

MUL r6.xy, r1, r2
ADD r6.zw, r3, r4
SUB r8, r6, r1

I don't feel very comfortable with these two very ad-hoc FunctionPasses.

Alex.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090213/ec84f395/attachment.html

From nlewycky at google.com Fri Feb 13 12:23:37 2009
From: nlewycky at google.com (Nick Lewycky)
Date: Fri, 13 Feb 2009 10:23:37 -0800
Subject: [LLVMdev] making libraries depend on external libraries?
Message-ID:

How can I specify that LLVMInterpreter depends on 'libffi' in the Makefile?
Modifying LD.Flags in Interpreter/Makefile doesn't help since llvm-config
doesn't pick up on that, causing a linker error when building lli. I'd like
"llvm-config --libs interpreter" to return -lffi along with the LLVM
libraries it lists.

You'd think there would be an example of this already, but I looked and
didn't find one.

Nick
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090213/3346b107/attachment.html

From evan.cheng at apple.com Fri Feb 13 14:33:39 2009
From: evan.cheng at apple.com (Evan Cheng)
Date: Fri, 13 Feb 2009 12:33:39 -0800
Subject: [LLVMdev] Using CallingConvLower in ARM target
In-Reply-To: <305d6f60902121821u586a5ecat65c81d462d139e1d@mail.gmail.com>
References: <305d6f60812270430xdf1ebb9gf6d99f94215ab66b@mail.gmail.com> <305d6f60901161726l5c93c9dag6f98ca06a420cb31@mail.gmail.com> <9F8572A5-4F58-492D-A61B-638FC61D42B4@apple.com> <305d6f60902061802q65f813e7p1c75f1fa5a185d32@mail.gmail.com> <305d6f60902121821u586a5ecat65c81d462d139e1d@mail.gmail.com>
Message-ID:

On Feb 12, 2009, at 6:21 PM, Sandeep Patel wrote:

> Although it's not generally needed for ARM's use of CCCustom, I return
> two bools to handle the four possible outcomes to keep the mechanism
> flexible:
>
> * if CCCustomFn handled the arg or not
> * if CCCustomFn wants to end processing of the arg or not

+/// CCCustomFn - This function assigns a location for Val, possibly updating
+/// all args to reflect changes and indicates if it handled it. It must set
+/// isCustom if it handles the arg and returns true.
+typedef bool CCCustomFn(unsigned &ValNo, MVT &ValVT,
+                        MVT &LocVT, CCValAssign::LocInfo &LocInfo,
+                        ISD::ArgFlagsTy &ArgFlags, CCState &State,
+                        bool &result);

Is "result" what you refer to as "isCustom" in the comments?

Sorry, I am still confused. You mean it could return true but set
'result' to false? That means it has handled the argument but it would
not process any more arguments? What scenario do you envision that
this will be useful? I'd rather keep it simple.

>
>
> I placed the "unsigned i" outside those loops because i is used after
> the loop. If there's a better index search pattern, I'd be happy to
> change it.

Ok.

One more nitpick:
+/// CCCustom - calls a custom arg handling function

Please capitalize "calls" and end with a period.

Thanks,

Evan

>
>
> Attached is an updated patch against HEAD that has DebugLoc changes. I
> also split out the ARMAsmPrinter fix into it's own patch.
>
> deep
>
> On Mon, Feb 9, 2009 at 8:54 AM, Evan Cheng wrote:
>> Thanks Sandeep. I did a quick scan, this looks really good. But I do
>> have a question:
>>
>> +/// CCCustomFn - This function assigns a location for Val, possibly updating
>> +/// all args to reflect changes and indicates if it handled it. It must set
>> +/// isCustom if it handles the arg and returns true.
>> +typedef bool CCCustomFn(unsigned &ValNo, MVT &ValVT,
>> + MVT &LocVT, CCValAssign::LocInfo &LocInfo,
>> + ISD::ArgFlagsTy &ArgFlags, CCState &State,
>> + bool &result);
>>
>> Is it necessary to return two bools (the second is returned by
>> reference in 'result')? I am confused about the semantics of
>> 'result'.
>>
>> Also, a nitpick:
>>
>> + unsigned i;
>> + for (i = 0; i < 4; ++i)
>>
>> The convention we use is:
>>
>> + for (unsigned i = 0; i < 4; ++i)
>>
>> Thanks,
>>
>> Evan
>>
>> On Feb 6, 2009, at 6:02 PM, Sandeep Patel wrote:
>>
>>> I think I've got all the cases handled now, implementing with
>>> CCCustom<"foo"> callbacks into C++.
>>>
>>> This also fixes a crash when returning i128. I've also included a
>>> small asm constraint fix that was needed to build newlib.
>>>
>>> deep
>>>
>>> On Mon, Jan 19, 2009 at 10:18 AM, Evan Cheng wrote:
>>>>
>>>> On Jan 16, 2009, at 5:26 PM, Sandeep Patel wrote:
>>>>
>>>>> On Sat, Jan 3, 2009 at 11:46 AM, Dan Gohman wrote:
>>>>>>
>>>>>> One problem with this approach is that since i64 isn't legal, the
>>>>>> bitcast would require custom C++ code in the ARM target to
>>>>>> handle properly. It might make sense to introduce something
>>>>>> like
>>>>>>
>>>>>> CCIfType<[f64], CCCustom>
>>>>>>
>>>>>> where CCCustom is a new entity that tells the calling convention
>>>>>> code to to let the target do something not easily representable
>>>>>> in the tablegen minilanguage.
>>>>>
>>>>> I am thinking that this requires two changes: add a flag to
>>>>> CCValAssign (take a bit from HTP) to indicate isCustom and a way to
>>>>> author an arbitrary CCAction by including the source directly in the
>>>>> TableGen mini-language. This latter change might want a generic change
>>>>> to the TableGen language. For example, the syntax might be like:
>>>>>
>>>>> class foo : CCCustomAction {
>>>>> code <<< EOF
>>>>> ....multi-line C++ code goes here that allocates regs & mem and
>>>>> sets CCValAssign::isCustom....
>>>>> EOF
>>>>> }
>>>>>
>>>>> Does this seem reasonable? An alternative is for CCCustom to take a
>>>>> string that names a function to be called:
>>>>>
>>>>> CCIfType<[f64], CCCustom<"MyCustomLoweringFunc">>
>>>>>
>>>>> the function signature for such functions will have to return two
>>>>> results: if the CC processing is finished and if it the func succeeded
>>>>> or failed:
>>>>
>>>> I like the second solution better. It seems rather cumbersome to embed
>>>> multi-line c++ code in td files.
>>>> >>>> Evan >>>>> >>>>> >>>>> typedef bool CCCustomFn(unsigned ValNo, MVT ValVT, >>>>> MVT LocVT, CCValAssign::LocInfo LocInfo, >>>>> ISD::ArgFlagsTy ArgFlags, CCState &State, >>>>> bool &result); >>>>> _______________________________________________ >>>>> LLVM Developers mailing list >>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>> >>>> _______________________________________________ >>>> LLVM Developers mailing list >>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>> >>> < >>> arm_callingconv.diff>_______________________________________________ >>> LLVM Developers mailing list >>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> > < > arm_callingconv > .diff>_______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From bharadwajy at gmail.com Fri Feb 13 16:18:32 2009 From: bharadwajy at gmail.com (S. Bharadwaj Yadavalli) Date: Fri, 13 Feb 2009 17:18:32 -0500 Subject: [LLVMdev] Cross compiling GCC 4.2 build errors Message-ID: I get the following assertion failure during my attempt to build an x86_64->ARM cross compiler. /./gcc/xgcc -B/./gcc/ -B/arm-none-linux-gnueabi/bin/ -B/arm-none-linux-gnueabi/lib/ -isystem /arm-none-linux-gnueabi/include -isystem /arm-none-linux-gnueabi/sys-include -O2 -O2 -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fPIC -g -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -I. -I. -I/gcc -I/gcc/. 
-I/gcc/../include -I/gcc/../libcpp/include -I/gcc/../libdecnumber -I../libdecnumber -I/llvm/include -I/llvm/lib/CodeGen/RegisterScavenging.cpp:273: void llvm::RegScavenger::forward(): Assertion `(isReserved(Reg) || isUnused(Reg) || IsImpDef || isImplicitlyDefined(Reg) || isLiveInButUnusedBefore(Reg, MI, MBB, TRI, MRI)) && "Re-defining a live register!"' failed. ../../../../src/llvm-gcc-4.2/gcc/libgcc2.c:1914: internal compiler error: Aborted Please submit a full bug report, with preprocessed source if appropriate. See for instructions. Can some one please tell me what the problem is? Here is some relevant info: LLVM and llvm-gcc-4.2 source rev 64487 $ svn info Path: . URL: http://llvm.org/svn/llvm-project/llvm/trunk Repository Root: http://llvm.org/svn/llvm-project Repository UUID: 91177308-0d34-0410-b5e6-96231b3b80d8 Revision: 64487 Node Kind: directory Schedule: normal Last Changed Author: djg Last Changed Rev: 64468 Last Changed Date: 2009-02-13 12:45:12 -0500 (Fri, 13 Feb 2009) $ svn info Path: . URL: http://llvm.org/svn/llvm-project/llvm-gcc-4.2/trunk Repository Root: http://llvm.org/svn/llvm-project Repository UUID: 91177308-0d34-0410-b5e6-96231b3b80d8 Revision: 64487 Node Kind: directory Schedule: normal Last Changed Author: baldrick Last Changed Rev: 64234 Last Changed Date: 2009-02-10 15:43:26 -0500 (Tue, 10 Feb 2009) LLVM configured and successfully built as : /configure --with-llvmgccdir=/llvm-gcc-4.2 --enable-optimized --enable-jit --prefix=/configure --prefix=/llvm --disable-multilib --target=arm-none-linux-gnueabi --with-sysroot= --enable-languages=c,c++ Thanks, Bharadwaj -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090213/6b051df5/attachment.html

From deeppatel1987 at gmail.com Fri Feb 13 16:20:36 2009
From: deeppatel1987 at gmail.com (Sandeep Patel)
Date: Fri, 13 Feb 2009 14:20:36 -0800
Subject: [LLVMdev] Using CallingConvLower in ARM target
In-Reply-To:
References: <305d6f60812270430xdf1ebb9gf6d99f94215ab66b@mail.gmail.com>
 <305d6f60901161726l5c93c9dag6f98ca06a420cb31@mail.gmail.com>
 <9F8572A5-4F58-492D-A61B-638FC61D42B4@apple.com>
 <305d6f60902061802q65f813e7p1c75f1fa5a185d32@mail.gmail.com>
 <305d6f60902121821u586a5ecat65c81d462d139e1d@mail.gmail.com>
Message-ID: <305d6f60902131420g51a7f35ajf8eba18b6710951e@mail.gmail.com>

On Fri, Feb 13, 2009 at 12:33 PM, Evan Cheng wrote:
>
> On Feb 12, 2009, at 6:21 PM, Sandeep Patel wrote:
>
>> Although it's not generally needed for ARM's use of CCCustom, I return
>> two bools to handle the four possible outcomes to keep the mechanism
>> flexible:
>>
>> * if CCCustomFn handled the arg or not
>> * if CCCustomFn wants to end processing of the arg or not
>
> +/// CCCustomFn - This function assigns a location for Val, possibly updating
> +/// all args to reflect changes and indicates if it handled it. It must set
> +/// isCustom if it handles the arg and returns true.
> +typedef bool CCCustomFn(unsigned &ValNo, MVT &ValVT,
> +                        MVT &LocVT, CCValAssign::LocInfo &LocInfo,
> +                        ISD::ArgFlagsTy &ArgFlags, CCState &State,
> +                        bool &result);
>
> Is "result" what you refer to as "isCustom" in the comments?
>
> Sorry, I am still confused. You mean it could return true but set
> 'result' to false? That means it has handled the argument but it would
> not process any more arguments? What scenario do you envision that
> this will be useful? I'd rather keep it simple.

As you note, there are three actual legitimate cases (of the four combos):

1. The CCCustomFn wants the arg handling to proceed. This might be
   used akin to CCPromoteToType.
2. The CCCustomFn entirely handled the arg. This might be used akin to
   CCAssignToReg.
3. The CCCustomFn tried to handle the arg, but failed.

These results are conveyed in the following ways:

1. The CCCustomFn returns false, &result is not used.
2. The CCCustomFn returns true, &result is false.
3. The CCCustomFn returns true, &result is true.

I tried to keep these CCCustomFns looking like TableGen generated
code. Suggestions of how to reorganize these results are welcome. :-)
Perhaps better comments around the typedef for CCCustomFn would
suffice?

The isCustom flag is simply a means for this machinery to tell
the TargetLowering functions to process this arg specially. It may not
always be possible for the TargetLowering functions to determine that
the arg needs special handling after all the changes made by the
CCCustomFn or CCPromoteToType and other transformations.

>> I placed the "unsigned i" outside those loops because i is used after
>> the loop. If there's a better index search pattern, I'd be happy to
>> change it.
>
> Ok.
>
> One more nitpick:
>
> +/// CCCustom - calls a custom arg handling function
>
> Please capitalize "calls" and end with a period.

Once we settle on the result handling changes, I'll submit an update
with this change.

> Thanks,
>
> Evan
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From evan.cheng at apple.com Fri Feb 13 16:34:48 2009
From: evan.cheng at apple.com (Evan Cheng)
Date: Fri, 13 Feb 2009 14:34:48 -0800
Subject: [LLVMdev] Using CallingConvLower in ARM target
In-Reply-To: <305d6f60902131420g51a7f35ajf8eba18b6710951e@mail.gmail.com>
References: <305d6f60812270430xdf1ebb9gf6d99f94215ab66b@mail.gmail.com>
 <305d6f60901161726l5c93c9dag6f98ca06a420cb31@mail.gmail.com>
 <9F8572A5-4F58-492D-A61B-638FC61D42B4@apple.com>
 <305d6f60902061802q65f813e7p1c75f1fa5a185d32@mail.gmail.com>
 <305d6f60902121821u586a5ecat65c81d462d139e1d@mail.gmail.com>
 <305d6f60902131420g51a7f35ajf8eba18b6710951e@mail.gmail.com>
Message-ID:

On Feb 13, 2009, at 2:20 PM, Sandeep Patel wrote:

> On Fri, Feb 13, 2009 at 12:33 PM, Evan Cheng wrote:
>>
>> On Feb 12, 2009, at 6:21 PM, Sandeep Patel wrote:
>>
>>> Although it's not generally needed for ARM's use of CCCustom, I return
>>> two bools to handle the four possible outcomes to keep the mechanism
>>> flexible:
>>>
>>> * if CCCustomFn handled the arg or not
>>> * if CCCustomFn wants to end processing of the arg or not
>>
>> +/// CCCustomFn - This function assigns a location for Val, possibly updating
>> +/// all args to reflect changes and indicates if it handled it. It must set
>> +/// isCustom if it handles the arg and returns true.
>> +typedef bool CCCustomFn(unsigned &ValNo, MVT &ValVT,
>> +                        MVT &LocVT, CCValAssign::LocInfo &LocInfo,
>> +                        ISD::ArgFlagsTy &ArgFlags, CCState &State,
>> +                        bool &result);
>>
>> Is "result" what you refer to as "isCustom" in the comments?
>>
>> Sorry, I am still confused. You mean it could return true but set
>> 'result' to false? That means it has handled the argument but it would
>> not process any more arguments?
>> What scenario do you envision that
>> this will be useful? I'd rather keep it simple.
>
> As you note, there are three actual legitimate cases (of the four
> combos):
>
> 1. The CCCustomFn wants the arg handling to proceed. This might be
>    used akin to CCPromoteToType.
> 2. The CCCustomFn entirely handled the arg. This might be used akin to
>    CCAssignToReg.
> 3. The CCCustomFn tried to handle the arg, but failed.
>
> These results are conveyed in the following ways:
>
> 1. The CCCustomFn returns false, &result is not used.
> 2. The CCCustomFn returns true, &result is false.
> 3. The CCCustomFn returns true, &result is true.

I don't think we want to support #1. If the target wants to add custom
code to handle an argument, it should be responsible for outputting
legal code. Is there an actual need to support #1?

Evan

> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From evan.cheng at apple.com Fri Feb 13 16:46:43 2009
From: evan.cheng at apple.com (Evan Cheng)
Date: Fri, 13 Feb 2009 14:46:43 -0800
Subject: [LLVMdev] Cross compiling GCC 4.2 build errors
In-Reply-To:
References:
Message-ID: <0D3A4426-FA99-4711-BD8D-EDF90D0E33A1@apple.com>

On Feb 13, 2009, at 2:18 PM, S.
Bharadwaj Yadavalli wrote:

>
> I get the following assertion failure during my attempt to build an
> x86_64->ARM cross compiler.
>
> /./gcc/xgcc -B/./gcc/ -B/arm-none-linux-gnueabi/bin/
>  -B/arm-none-linux-gnueabi/lib/ -isystem /arm-none-linux-gnueabi/include
>  -isystem /arm-none-linux-gnueabi/sys-include -O2 -O2 -g -O2 -DIN_GCC
>  -DCROSS_DIRECTORY_STRUCTURE -W -Wall -Wwrite-strings -Wstrict-prototypes
>  -Wmissing-prototypes -Wold-style-definition -isystem ./include -fPIC -g
>  -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -I. -I. -I/gcc
>  -I/gcc/. -I/gcc/../include -I/gcc/../libcpp/include -I/gcc/../libdecnumber
>  -I../libdecnumber -I/llvm/include -I -DHIDE_EXPORTS
>  -c ../../../../src/llvm-gcc-4.2/gcc/libgcc2.c -o libgcc/./_muldc3.o
> cc1: /llvm/lib/CodeGen/RegisterScavenging.cpp:273: void
> llvm::RegScavenger::forward(): Assertion `(isReserved(Reg) ||
> isUnused(Reg) || IsImpDef || isImplicitlyDefined(Reg) ||
> isLiveInButUnusedBefore(Reg, MI, MBB, TRI, MRI)) && "Re-defining a
> live register!"' failed.
> ../../../../src/llvm-gcc-4.2/gcc/libgcc2.c:1914: internal compiler
> error: Aborted
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See for instructions.

Please add -emit-llvm to produce a bitcode file and attach it to a
bugzilla report.

Thanks,

Evan

>
>
> Can someone please tell me what the problem is?
>
> Here is some relevant info:
>
> LLVM and llvm-gcc-4.2 source rev 64487
> $ svn info
> Path: .
> URL: http://llvm.org/svn/llvm-project/llvm/trunk
> Repository Root: http://llvm.org/svn/llvm-project
> Repository UUID: 91177308-0d34-0410-b5e6-96231b3b80d8
> Revision: 64487
> Node Kind: directory
> Schedule: normal
> Last Changed Author: djg
> Last Changed Rev: 64468
> Last Changed Date: 2009-02-13 12:45:12 -0500 (Fri, 13 Feb 2009)
>
> $ svn info
> Path: .
> URL: http://llvm.org/svn/llvm-project/llvm-gcc-4.2/trunk
> Repository Root: http://llvm.org/svn/llvm-project
> Repository UUID: 91177308-0d34-0410-b5e6-96231b3b80d8
> Revision: 64487
> Node Kind: directory
> Schedule: normal
> Last Changed Author: baldrick
> Last Changed Rev: 64234
> Last Changed Date: 2009-02-10 15:43:26 -0500 (Tue, 10 Feb 2009)
>
> LLVM configured and successfully built as :
> /configure --with-llvmgccdir=/llvm-gcc-4.2 --enable-optimized
>  --enable-jit --prefix= --target=arm-unknown-linux-gnueabi
>
> llvm-gcc-4.2 configured as :
> /configure --prefix= --program-prefix=llvm-x86_64-arm --enable-llvm=/llvm
>  --disable-multilib --target=arm-none-linux-gnueabi --with-sysroot=
>  --enable-languages=c,c++
>
>
> Thanks,
>
> Bharadwaj
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090213/8657f8ed/attachment.html

From evan.cheng at apple.com Fri Feb 13 17:05:09 2009
From: evan.cheng at apple.com (Evan Cheng)
Date: Fri, 13 Feb 2009 15:05:09 -0800
Subject: [LLVMdev] Modeling GPU vector registers, again (with my implementation)
In-Reply-To: <4d77c5f20902130947s349dcab6r74e2057dd18161@mail.gmail.com>
References: <4d77c5f20902130947s349dcab6r74e2057dd18161@mail.gmail.com>
Message-ID: <8CE3FC1A-1211-48E4-A319-AAC6FF346039@apple.com>

On Feb 13, 2009, at 9:47 AM, Alex wrote:

> It seems to me that LLVM sub-register is not for the following
> hardware architecture.
>
> All instructions of a hardware are vector instructions. All registers
> contain 4 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w.
>
> Most instructions write more than one element in this way:
>
> mul r0.xyw, r1, r2
> add r0.z, r3, r4
> sub r5, r0, r1
>
> Notice that the four elements of r0 are written by two different
> instructions.
>
> My question is how should I model these sub-registers. If I treat each
> component as a register, and do the register allocation individually,
> it seems very difficult to merge the scalar operations back into one
> vector operation.

Well, how many possible permutations are there? Is it possible to
model each case as a separate physical register?

Evan

> // each %reg is a sub-register
> // r1, r2, r3, r4 here are virtual register numbers
>
> mul %reg1024, r1, r2 // x
> mul %reg1025, r1, r2 // y
> mul %reg1026, r1, r2 // z
>
> add %reg1027, r3, r4 // w
>
> sub %reg1028, %reg1024, r1
> sub %reg1029, %reg1025, r1
> sub %reg1030, %reg1026, r1
> sub %reg1031, %reg1027, r1
>
> So I decided to model each 4-element register as one Register in
> *.td file.
>
> Here are the details.
>
> Since all the 4 elements of a vector register occupy the same 'alloca',
> during the conversion of shader assembly to LLVM IR, I check if a vector
> register is written (to different elements) by different instructions.
> When the second write happens, I generate a shufflevector to multiplex
> the existing value and the new value, and store the result of
> shufflevector.
>
> Input assembly language:
> mul r0.xy, r1, r2
> add r0.zw, r3, r4
> sub r5, r0, r1
>
> is converted to LLVM IR:
>
> %r0 = alloca <4 x float>
> %mul_1 = mul <4 x float> %r1, %r2
> store <4 x float> %mul_1, <4 x float>* %r0
> ...
> %add_1 = add <4 x float> %r3, %r4
> ; a store does not immediately happen here
> %load_1 = load <4 x float>* %r0
>
> ; select the first two elements from the existing value,
> ; the last two elements from the newly generated value
> %merge_1 = shufflevector <4 x float> %load_1,
>                          <4 x float> %add_1,
>                          <4 x i32> < i32 0, i32 1, i32 6, i32 7 >
>
> ; store the multiplexed value
> store <4 x float> %merge_1, <4 x float>* %r0
>
>
> After mem2reg:
>
> %mul_1 = mul <4 x float> %r1, %r2
> %add_1 = add <4 x float> %r3, %r4
> %merge_1 = shufflevector <4 x float> %mul_1,
>                          <4 x float> %add_1,
>                          <4 x i32> < i32 0, i32 1, i32 6, i32 7 >
>
>
> After instruction selection:
>
> MUL %reg1024, %reg1025, %reg1026
> ADD %reg1027, %reg1028, %reg1029
> MERGE %reg1030, %reg1024, "xy", %reg1027, "zw"
>
> The 'shufflevector' is selected to a MERGE instruction by the default
> LLVM instruction selector. The hardware doesn't have this instruction.
> I have a *pre*-register allocation FunctionPass to remember:
>
> The physical register allocated to the destination register of MERGE
> (%reg1030) should replace the physical registers allocated to the
> destination registers of MUL (%reg1024) and ADD (%reg1027).
>
> In this way I ensure MUL and ADD write to the same physical register.
> This replacement is done in the other FunctionPass *after* register
> allocation.
>
> MUL and ADD have an 'OptionalDefOperand' writemask. By default the
> writemask is "xyzw" (all elements are written).
>
> // 0xF == all elements are written by default
> def WRITEMASK : OptionalDefOperand (i32 0xF))>
> {...}
>
> def MUL : MyInst<(outs REG4X32:$dst),
>                  (ins REG4X32:$src0, REG4X32:$src1, WRITEMASK:$wm),
>
> In the said post-register-allocation FunctionPass, in addition to
> replacing the destination registers as described before, the writemask
> ($wm) of each instruction is also replaced with the writemask operands
> of MERGE.
> So:
>
> MUL %R0, %R1, %R2, "xyzw"
> ADD %R5, %R3, %R4, "xyzw"
> MERGE %R6, %R0, "xy", %R5, "zw"
>
> ==>
>
> MUL %R6, %R1, %R2, "xy" // "xy" comes from MERGE operand 2
> ADD %R6, %R3, %R4, "zw"
> // MERGE %R6, %R0, "xy", %R5, "zw" <== REMOVED
>
> Final machine code:
>
> MUL r6.xy, r1, r2
> ADD r6.zw, r3, r4
> SUB r8, r6, r1
>
> I don't feel very comfortable with these two very ad-hoc FunctionPasses.
>
> Alex.
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090213/0dc19c01/attachment-0001.html

From evan.cheng at apple.com Fri Feb 13 17:06:42 2009
From: evan.cheng at apple.com (Evan Cheng)
Date: Fri, 13 Feb 2009 15:06:42 -0800
Subject: [LLVMdev] Prevent node from being combined
In-Reply-To: <4d77c5f20902110456g56ecff71yb6d832863cadfb1b@mail.gmail.com>
References: <4d77c5f20902110456g56ecff71yb6d832863cadfb1b@mail.gmail.com>
Message-ID:

The only way to prevent an SDNode from being combined is to custom
lower it into a target-specific node. But the DAG combiner can run
before legalization as well.

Evan

On Feb 11, 2009, at 4:56 AM, Alex wrote:

> How can I prevent some nodes from being combined in DAGCombine.cpp?
>
> Maybe what I want to do below doesn't follow the philosophy of LLVM,
> but I'd like to know if there is any way to keep a node from being
> combined. TargetLowering::PerformDAGCombine() is only called if
> DAGCombiner cannot combine a specific node. It seems that there is
> no chance to stop it from combining a node.
>
> I need the shuffle mask in the machine instruction but sometimes if
> a vector_shuffle can only return LHS or RHS, it's removed/combined
> so that I cannot match vector_shuffle in the instruction selector.
>
> If the vector_shuffle is combined, I have to write the instruction
> selector like these:
>
> def SUBvv: MyInst<(ins REG:$src0, imm:$mask0, REG:$src1, imm:$mask1),
>   [sub (vector_shuffle REG:$src0, REG:$src0, imm:$mask0),
>        (vector_shuffle REG:$src1, REG:$src1, imm:$mask1)]
>
> def SUBrv: MyInst<(ins REG:$src0, REG:$src1, imm:$mask1),
>   [sub REG:$src0,
>        (vector_shuffle REG:$src1, REG:$src1, imm:$mask1)]
>
> def SUBvr: MyInst<(ins REG:$src0, imm:$mask0, REG:$src1),
>   [sub (vector_shuffle REG:$src0, REG:$src0, imm:$mask0),
>        REG:$src1]
>
> Otherwise, I can write:
>
> def SUB: MyInst<(ins REG:$src0, imm:$mask0, REG:$src1, imm:$mask1),
>   [sub (vector_shuffle REG:$src0, REG:$src0, imm:$mask0),
>        (vector_shuffle REG:$src1, REG:$src1, imm:$mask1)]
>
> And processing MachineInstr will be easier since the operand index
> of writemask is always the same for all instructions.
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From simmon12 at illinois.edu Fri Feb 13 17:21:16 2009
From: simmon12 at illinois.edu (Patrick Simmons)
Date: Fri, 13 Feb 2009 17:21:16 -0600
Subject: [LLVMdev] Problem Running llvm-suite
In-Reply-To:
References: <498D137F.7050806@illinois.edu>
 <5D0D6E75-DB30-4646-8E3B-021C72C2A939@apple.com>
 <498F3095.3080402@illinois.edu>
 <498F312D.3020505@illinois.edu>
Message-ID: <4996006C.1050503@illinois.edu>

Dale Johannesen wrote:
> I've never used that FE, I've always built my own. So it seems
> possible that's the problem.
> I don't personally use -enable-optimized much, but other people do;
> that seems unlikely.
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

I downloaded the 2.4 release snapshot of llvm, llvm-gcc, and llvm-test,
and the test suite runs correctly on that version, which solves my
problem. Thanks for helping me with this.

--Patrick

--
If I'm not here, I've gone out to find myself. If I get back before I
return, please keep me here.

From deeppatel1987 at gmail.com Fri Feb 13 18:25:27 2009
From: deeppatel1987 at gmail.com (Sandeep Patel)
Date: Fri, 13 Feb 2009 16:25:27 -0800
Subject: [LLVMdev] Using CallingConvLower in ARM target
In-Reply-To:
References: <305d6f60812270430xdf1ebb9gf6d99f94215ab66b@mail.gmail.com>
 <305d6f60901161726l5c93c9dag6f98ca06a420cb31@mail.gmail.com>
 <9F8572A5-4F58-492D-A61B-638FC61D42B4@apple.com>
 <305d6f60902061802q65f813e7p1c75f1fa5a185d32@mail.gmail.com>
 <305d6f60902121821u586a5ecat65c81d462d139e1d@mail.gmail.com>
 <305d6f60902131420g51a7f35ajf8eba18b6710951e@mail.gmail.com>
Message-ID: <305d6f60902131625k460a4518k7fae41005d379d16@mail.gmail.com>

ARMTargetLowering doesn't need case #1, but it seemed like you and Dan
wanted a more generic way to inject C++ code into the process, so I
tried to make the mechanism a bit more general.

deep

On Fri, Feb 13, 2009 at 2:34 PM, Evan Cheng wrote:
>
> On Feb 13, 2009, at 2:20 PM, Sandeep Patel wrote:
>
>> On Fri, Feb 13, 2009 at 12:33 PM, Evan Cheng wrote:
>>>
>>> On Feb 12, 2009, at 6:21 PM, Sandeep Patel wrote:
>>>
>>>> Although it's not generally needed for ARM's use of CCCustom, I return
>>>> two bools to handle the four possible outcomes to keep the mechanism
>>>> flexible:
>>>>
>>>> * if CCCustomFn handled the arg or not
>>>> * if CCCustomFn wants to end processing of the arg or not
>>>
>>> +/// CCCustomFn - This function assigns a location for Val, possibly updating
>>> +/// all args to reflect changes and indicates if it handled it.
It >>> must set >>> +/// isCustom if it handles the arg and returns true. >>> +typedef bool CCCustomFn(unsigned &ValNo, MVT &ValVT, >>> + MVT &LocVT, CCValAssign::LocInfo &LocInfo, >>> + ISD::ArgFlagsTy &ArgFlags, CCState &State, >>> + bool &result); >>> >>> Is "result" what you refer to as "isCustom" in the comments? >>> >>> Sorry, I am still confused. You mean it could return true but set >>> 'result' to false? That means it has handled the argument but it >>> would >>> not process any more arguments? What scenario do you envision that >>> this will be useful? I'd rather keep it simple. >> >> As you note there are three actual legitimate cases (of the four >> combos): >> >> 1. The CCCustomFn wants the arg handling to proceed. This might be >> used akin to CCPromoteToType. >> 2. The CCCustomFn entirely handled the arg. This might be used akin to >> CCAssignToReg. >> 3. The CCCustomFn tried to handle the arg, but failed. >> >> these results are conveyed the following ways: >> >> 1. The CCCustomFn returns false, &result is not used. >> 2. The CCCustomFn returns true, &result is false; >> 3. The CCCustomFn returns true, &result is true. > > I don't think we want to support #1. If the target want to add custom > code to handle an argument, if should be responsible for outputting > legal code. Is there an actual need to support #1? > > Evan > >> >> >> I tried to keep these CCCustomFns looking like TableGen generated >> code. Suggestions of how to reorganize these results are welcome. :-) >> Perhaps better comments around the typedef for CCCustomFn would >> suffice? >> >> The isCustom flag is simply a means for this machinery to convey to >> the TargetLowering functions to process this arg specially. It may not >> always be possible for the TargetLowering functions to determine that >> the arg needs special handling after all the changes made by the >> CCCustomFn or CCPromoteToType and other transformations. 
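
The three outcomes listed above, and the single-bool alternative raised
in this thread, can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the code from the attached
patch: Arg, demoCustom1, demoCustom2, analyzeTwoResult and
analyzeSingleResult are hypothetical names, and the real CCCustomFn
parameters (ValNo, ValVT, LocVT, LocInfo, ArgFlags, State) are collapsed
into one small struct. The drivers follow the convention of
TableGen-generated CC functions, which return false once an arg has a
location and true when nothing matched.

```cpp
#include <cassert>

// Hypothetical stand-in for one argument being assigned a location.
// The real CCCustomFn takes ValNo, ValVT, LocVT, LocInfo, ArgFlags and
// a CCState; they are collapsed into this struct for illustration only.
struct Arg {
  int kind;       // 0 = not custom, 1 = custom and assignable,
                  // 2 = custom but cannot be assigned
  bool assigned;  // set once some rule has given the arg a location
};

// Two-result form discussed in the thread: the return value says whether
// processing of this arg stops here; 'result' carries the verdict if so.
typedef bool CustomFn2(Arg &A, bool &result);

// Single-result form: the return value says whether the arg was handled.
typedef bool CustomFn1(Arg &A);

bool demoCustom2(Arg &A, bool &result) {
  if (A.kind == 0)
    return false;            // case 1: let the remaining rules run
  if (A.kind == 1) {
    A.assigned = true;
    result = false;          // case 2: handled, report success
    return true;
  }
  result = true;             // case 3: tried to handle it, but failed
  return true;
}

bool demoCustom1(Arg &A) {
  if (A.kind != 1)
    return false;            // could not handle the arg
  A.assigned = true;
  return true;               // handled
}

// Returns false when the arg got a location, true when nothing matched.
bool analyzeTwoResult(Arg &A, CustomFn2 *Fn) {
  bool result = false;
  if (Fn(A, result))
    return result;           // cases 2 and 3: stop and report the verdict
  A.assigned = true;         // case 1: stand-in for a later rule
  return false;
}

// Assumption: no further rules follow the custom one in this sketch.
bool analyzeSingleResult(Arg &A, CustomFn1 *Fn) {
  if (Fn(A))
    return false;            // handled
  return true;               // nothing matched
}
```

With the single-bool form the caller only needs to distinguish
"handled" from "not handled", which is why case #1 drops out once no
client needs it.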
>>
>>>> I placed the "unsigned i" outside those loops because i is used after
>>>> the loop. If there's a better index search pattern, I'd be happy to
>>>> change it.
>>>
>>> Ok.
>>>
>>> One more nitpick:
>>>
>>> +/// CCCustom - calls a custom arg handling function
>>>
>>> Please capitalize "calls" and end with a period.
>>
>> Once we settle on the result handling changes, I'll submit an update
>> with this change.
>>
>>> Thanks,
>>>
>>> Evan
>>>
>>>>
>>>>
>>>> Attached is an updated patch against HEAD that has DebugLoc changes. I
>>>> also split out the ARMAsmPrinter fix into its own patch.
>>>>
>>>> deep
>>>>
>>>> On Mon, Feb 9, 2009 at 8:54 AM, Evan Cheng wrote:
>>>>> Thanks Sandeep. I did a quick scan, this looks really good. But I do
>>>>> have a question:
>>>>>
>>>>> +/// CCCustomFn - This function assigns a location for Val, possibly updating
>>>>> +/// all args to reflect changes and indicates if it handled it. It must set
>>>>> +/// isCustom if it handles the arg and returns true.
>>>>> +typedef bool CCCustomFn(unsigned &ValNo, MVT &ValVT,
>>>>> +                        MVT &LocVT, CCValAssign::LocInfo &LocInfo,
>>>>> +                        ISD::ArgFlagsTy &ArgFlags, CCState &State,
>>>>> +                        bool &result);
>>>>>
>>>>> Is it necessary to return two bools (the second is returned by
>>>>> reference in 'result')? I am confused about the semantics of
>>>>> 'result'.
>>>>>
>>>>> Also, a nitpick:
>>>>>
>>>>> + unsigned i;
>>>>> + for (i = 0; i < 4; ++i)
>>>>>
>>>>> The convention we use is:
>>>>>
>>>>> + for (unsigned i = 0; i < 4; ++i)
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Evan
>>>>>
>>>>> On Feb 6, 2009, at 6:02 PM, Sandeep Patel wrote:
>>>>>
>>>>>> I think I've got all the cases handled now, implementing with
>>>>>> CCCustom<"foo"> callbacks into C++.
>>>>>>
>>>>>> This also fixes a crash when returning i128. I've also included a
>>>>>> small asm constraint fix that was needed to build newlib.
>>>>>>
>>>>>> deep
>>>>>>
>>>>>> On Mon, Jan 19, 2009 at 10:18 AM, Evan Cheng wrote:
>>>>>>>
>>>>>>> On Jan 16, 2009, at 5:26 PM, Sandeep Patel wrote:
>>>>>>>
>>>>>>>> On Sat, Jan 3, 2009 at 11:46 AM, Dan Gohman wrote:
>>>>>>>>>
>>>>>>>>> One problem with this approach is that since i64 isn't legal, the
>>>>>>>>> bitcast would require custom C++ code in the ARM target to
>>>>>>>>> handle properly. It might make sense to introduce something like
>>>>>>>>>
>>>>>>>>> CCIfType<[f64], CCCustom>
>>>>>>>>>
>>>>>>>>> where CCCustom is a new entity that tells the calling convention
>>>>>>>>> code to let the target do something not easily representable
>>>>>>>>> in the tablegen minilanguage.
>>>>>>>>
>>>>>>>> I am thinking that this requires two changes: add a flag to
>>>>>>>> CCValAssign (take a bit from HTP) to indicate isCustom and a way to
>>>>>>>> author an arbitrary CCAction by including the source directly in the
>>>>>>>> TableGen mini-language. This latter change might want a generic change
>>>>>>>> to the TableGen language. For example, the syntax might be like:
>>>>>>>>
>>>>>>>> class foo : CCCustomAction {
>>>>>>>>   code <<< EOF
>>>>>>>>   ....multi-line C++ code goes here that allocates regs & mem and
>>>>>>>>   sets CCValAssign::isCustom....
>>>>>>>>   EOF
>>>>>>>> }
>>>>>>>>
>>>>>>>> Does this seem reasonable? An alternative is for CCCustom to take a
>>>>>>>> string that names a function to be called:
>>>>>>>>
>>>>>>>> CCIfType<[f64], CCCustom<"MyCustomLoweringFunc">>
>>>>>>>>
>>>>>>>> the function signature for such functions will have to return two
>>>>>>>> results: if the CC processing is finished and if the func succeeded
>>>>>>>> or failed:
>>>>>>>
>>>>>>> I like the second solution better. It seems rather cumbersome to embed
>>>>>>> multi-line c++ code in td files.
>>>>>>>
>>>>>>> Evan
>>>>>>>>
>>>>>>>>
>>>>>>>> typedef bool CCCustomFn(unsigned ValNo, MVT ValVT,
>>>>>>>>                         MVT LocVT, CCValAssign::LocInfo LocInfo,
>>>>>>>>                         ISD::ArgFlagsTy ArgFlags, CCState &State,
>>>>>>>>                         bool &result);

From evan.cheng at apple.com Fri Feb 13 19:47:20 2009
From: evan.cheng at apple.com (Evan Cheng)
Date: Fri, 13 Feb 2009 17:47:20 -0800
Subject: [LLVMdev] Using CallingConvLower in ARM target
In-Reply-To: <305d6f60902131625k460a4518k7fae41005d379d16@mail.gmail.com>
References: <305d6f60812270430xdf1ebb9gf6d99f94215ab66b@mail.gmail.com>
 <305d6f60901161726l5c93c9dag6f98ca06a420cb31@mail.gmail.com>
 <9F8572A5-4F58-492D-A61B-638FC61D42B4@apple.com>
 <305d6f60902061802q65f813e7p1c75f1fa5a185d32@mail.gmail.com>
 <305d6f60902121821u586a5ecat65c81d462d139e1d@mail.gmail.com>
 <305d6f60902131420g51a7f35ajf8eba18b6710951e@mail.gmail.com>
 <305d6f60902131625k460a4518k7fae41005d379d16@mail.gmail.com>
Message-ID: <05C6EB7F-36AC-4B18-A32D-36A71530BBAB@apple.com>

On Feb 13, 2009, at 4:25 PM, Sandeep Patel wrote:

> ARMTargetLowering doesn't need case #1, but it seemed like you and Dan
> wanted a more generic way to inject C++ code into the process so I
> tried to make the mechanism a bit more general.

Ok. Since ARM doesn't need it and it's the only client, I'd much
rather have CCCustomFn just return a single bool indicating whether it
can handle the arg. Would that be ok?

Thanks,

Evan

From deeppatel1987 at gmail.com Fri Feb 13 20:41:09 2009
From: deeppatel1987 at gmail.com (Sandeep Patel)
Date: Fri, 13 Feb 2009 18:41:09 -0800
Subject: [LLVMdev] Using CallingConvLower in ARM target
In-Reply-To: <05C6EB7F-36AC-4B18-A32D-36A71530BBAB@apple.com>
References: <305d6f60812270430xdf1ebb9gf6d99f94215ab66b@mail.gmail.com>
 <9F8572A5-4F58-492D-A61B-638FC61D42B4@apple.com>
 <305d6f60902061802q65f813e7p1c75f1fa5a185d32@mail.gmail.com>
 <305d6f60902121821u586a5ecat65c81d462d139e1d@mail.gmail.com>
 <305d6f60902131420g51a7f35ajf8eba18b6710951e@mail.gmail.com>
 <305d6f60902131625k460a4518k7fae41005d379d16@mail.gmail.com>
 <05C6EB7F-36AC-4B18-A32D-36A71530BBAB@apple.com>
Message-ID: <305d6f60902131841p354431e6pa92dd9df14bc5555@mail.gmail.com>

Sure. Updated patches attached.

deep

On Fri, Feb 13, 2009 at 5:47 PM, Evan Cheng wrote:
>
> On Feb 13, 2009, at 4:25 PM, Sandeep Patel wrote:
>
>> ARMTargetLowering doesn't need case #1, but it seemed like you and Dan
>> wanted a more generic way to inject C++ code into the process so I
>> tried to make the mechanism a bit more general.
>
> Ok. Since ARM doesn't need it and it's the only client, I'd much
> rather have CCCustomFn just return a single bool indicating whether it
> can handle the arg. Would that be ok?
>
> Thanks,
>
> Evan

-------------- next part --------------
A non-text attachment was scrubbed...
Name: arm_callingconv.diff
Type: application/octet-stream
Size: 54710 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090213/8cc9cfc6/attachment-0002.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: arm_fixes.diff
Type: application/octet-stream
Size: 589 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090213/8cc9cfc6/attachment-0003.obj

From deeppatel1987 at gmail.com Fri Feb 13 22:27:22 2009
From: deeppatel1987 at gmail.com (Sandeep Patel)
Date: Fri, 13 Feb 2009 20:27:22 -0800
Subject: [LLVMdev] Using CallingConvLower in ARM target
In-Reply-To: <305d6f60902131841p354431e6pa92dd9df14bc5555@mail.gmail.com>
References: <305d6f60812270430xdf1ebb9gf6d99f94215ab66b@mail.gmail.com>
 <305d6f60902061802q65f813e7p1c75f1fa5a185d32@mail.gmail.com>
 <305d6f60902121821u586a5ecat65c81d462d139e1d@mail.gmail.com>
 <305d6f60902131420g51a7f35ajf8eba18b6710951e@mail.gmail.com>
 <305d6f60902131625k460a4518k7fae41005d379d16@mail.gmail.com>
 <05C6EB7F-36AC-4B18-A32D-36A71530BBAB@apple.com>
 <305d6f60902131841p354431e6pa92dd9df14bc5555@mail.gmail.com>
Message-ID: <305d6f60902132027j3cc822dfw1dc817bb4a3f67b5@mail.gmail.com>

Sorry left a small bit of cruft in ARMCallingConv.td. A corrected patch
is attached.

deep

On Fri, Feb 13, 2009 at 6:41 PM, Sandeep Patel wrote:
> Sure. Updated patches attached.
>
> deep
>
> On Fri, Feb 13, 2009 at 5:47 PM, Evan Cheng wrote:
>>
>> On Feb 13, 2009, at 4:25 PM, Sandeep Patel wrote:
>>
>>> ARMTargetLowering doesn't need case #1, but it seemed like you and Dan
>>> wanted a more generic way to inject C++ code into the process so I
>>> tried to make the mechanism a bit more general.
>>
>> Ok. Since ARM doesn't need it and it's the only client, I'd much
>> rather have CCCustomFn just return a single bool indicating whether it
>> can handle the arg. Would that be ok?
>>
>> Thanks,
>>
>> Evan
>>
>>>
>>>
>>> deep
>>>
>>> On Fri, Feb 13, 2009 at 2:34 PM, Evan Cheng wrote:
>>>>
>>>> On Feb 13, 2009, at 2:20 PM, Sandeep Patel wrote:
>>>>
>>>>> On Fri, Feb 13, 2009 at 12:33 PM, Evan Cheng wrote:
>>>>>>
>>>>>> On Feb 12, 2009, at 6:21 PM, Sandeep Patel wrote:
>>>>>>
>>>>>>> Although it's not generally needed for ARM's use of CCCustom, I return
>>>>>>> two bools to handle the four possible outcomes to keep the mechanism
>>>>>>> flexible:
>>>>>>>
>>>>>>> * if CCCustomFn handled the arg or not
>>>>>>> * if CCCustomFn wants to end processing of the arg or not
>>>>>>
>>>>>> +/// CCCustomFn - This function assigns a location for Val, possibly updating
>>>>>> +/// all args to reflect changes and indicates if it handled it. It must set
>>>>>> +/// isCustom if it handles the arg and returns true.
>>>>>> +typedef bool CCCustomFn(unsigned &ValNo, MVT &ValVT,
>>>>>> +                        MVT &LocVT, CCValAssign::LocInfo &LocInfo,
>>>>>> +                        ISD::ArgFlagsTy &ArgFlags, CCState &State,
>>>>>> +                        bool &result);
>>>>>>
>>>>>> Is "result" what you refer to as "isCustom" in the comments?
>>>>>>
>>>>>> Sorry, I am still confused. You mean it could return true but set
>>>>>> 'result' to false? That means it has handled the argument but it would
>>>>>> not process any more arguments? What scenario do you envision that
>>>>>> this will be useful? I'd rather keep it simple.
>>>>>
>>>>> As you note there are three actual legitimate cases (of the four
>>>>> combos):
>>>>>
>>>>> 1. The CCCustomFn wants the arg handling to proceed. This might be
>>>>>    used akin to CCPromoteToType.
>>>>> 2. The CCCustomFn entirely handled the arg. This might be used akin to
>>>>>    CCAssignToReg.
>>>>> 3. The CCCustomFn tried to handle the arg, but failed.
>>>>>
>>>>> these results are conveyed the following ways:
>>>>>
>>>>> 1. The CCCustomFn returns false, &result is not used.
>>>>> 2. The CCCustomFn returns true, &result is false;
>>>>> 3. The CCCustomFn returns true, &result is true.
>>>>
>>>> I don't think we want to support #1. If the target wants to add custom
>>>> code to handle an argument, it should be responsible for outputting
>>>> legal code. Is there an actual need to support #1?
>>>>
>>>> Evan
>>>>
>>>>>
>>>>>
>>>>> I tried to keep these CCCustomFns looking like TableGen generated
>>>>> code. Suggestions of how to reorganize these results are welcome. :-)
>>>>> Perhaps better comments around the typedef for CCCustomFn would
>>>>> suffice?
>>>>>
>>>>> The isCustom flag is simply a means for this machinery to convey to
>>>>> the TargetLowering functions to process this arg specially. It may not
>>>>> always be possible for the TargetLowering functions to determine that
>>>>> the arg needs special handling after all the changes made by the
>>>>> CCCustomFn or CCPromoteToType and other transformations.
>>>>>
>>>>>>> I placed the "unsigned i" outside those loops because i is used after
>>>>>>> the loop. If there's a better index search pattern, I'd be happy to
>>>>>>> change it.
>>>>>>
>>>>>> Ok.
>>>>>>
>>>>>> One more nitpick:
>>>>>>
>>>>>> +/// CCCustom - calls a custom arg handling function
>>>>>>
>>>>>> Please capitalize "calls" and end with a period.
>>>>>
>>>>> Once we settle on the result handling changes, I'll submit an update
>>>>> with this change.
>>>>> >>>>>> Thanks, >>>>>> >>>>>> Evan >>>>>> >>>>>>> >>>>>>> >>>>>>> Attached is an updated patch against HEAD that has DebugLoc >>>>>>> changes. I >>>>>>> also split out the ARMAsmPrinter fix into it's own patch. >>>>>>> >>>>>>> deep >>>>>>> >>>>>>> On Mon, Feb 9, 2009 at 8:54 AM, Evan Cheng >>>>>>> wrote: >>>>>>>> Thanks Sandeep. I did a quick scan, this looks really good. But I >>>>>>>> do >>>>>>>> have a question: >>>>>>>> >>>>>>>> +/// CCCustomFn - This function assigns a location for Val, >>>>>>>> possibly >>>>>>>> updating >>>>>>>> +/// all args to reflect changes and indicates if it handled >>>>>>>> it. It >>>>>>>> must set >>>>>>>> +/// isCustom if it handles the arg and returns true. >>>>>>>> +typedef bool CCCustomFn(unsigned &ValNo, MVT &ValVT, >>>>>>>> + MVT &LocVT, CCValAssign::LocInfo >>>>>>>> &LocInfo, >>>>>>>> + ISD::ArgFlagsTy &ArgFlags, CCState >>>>>>>> &State, >>>>>>>> + bool &result); >>>>>>>> >>>>>>>> Is it necessary to return two bools (the second is returned by >>>>>>>> reference in 'result')? I am confused about the semantics of >>>>>>>> 'result'. >>>>>>>> >>>>>>>> Also, a nitpick: >>>>>>>> >>>>>>>> + unsigned i; >>>>>>>> + for (i = 0; i < 4; ++i) >>>>>>>> >>>>>>>> The convention we use is: >>>>>>>> >>>>>>>> + for (unsigned i = 0; i < 4; ++i) >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Evan >>>>>>>> >>>>>>>> On Feb 6, 2009, at 6:02 PM, Sandeep Patel wrote: >>>>>>>> >>>>>>>>> I think I've got all the cases handled now, implementing with >>>>>>>>> CCCustom<"foo"> callbacks into C++. >>>>>>>>> >>>>>>>>> This also fixes a crash when returning i128. I've also >>>>>>>>> included a >>>>>>>>> small asm constraint fix that was needed to build newlib. 
>>>>>>>>> >>>>>>>>> deep >>>>>>>>> >>>>>>>>> On Mon, Jan 19, 2009 at 10:18 AM, Evan Cheng >>>>>>>>> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> On Jan 16, 2009, at 5:26 PM, Sandeep Patel wrote: >>>>>>>>>> >>>>>>>>>>> On Sat, Jan 3, 2009 at 11:46 AM, Dan Gohman >>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> One problem with this approach is that since i64 isn't legal, >>>>>>>>>>>> the >>>>>>>>>>>> bitcast would require custom C++ code in the ARM target to >>>>>>>>>>>> handle properly. It might make sense to introduce something >>>>>>>>>>>> like >>>>>>>>>>>> >>>>>>>>>>>> CCIfType<[f64], CCCustom> >>>>>>>>>>>> >>>>>>>>>>>> where CCCustom is a new entity that tells the calling >>>>>>>>>>>> convention >>>>>>>>>>>> code to to let the target do something not easily >>>>>>>>>>>> representable >>>>>>>>>>>> in the tablegen minilanguage. >>>>>>>>>>> >>>>>>>>>>> I am thinking that this requires two changes: add a flag to >>>>>>>>>>> CCValAssign (take a bit from HTP) to indicate isCustom and a >>>>>>>>>>> way >>>>>>>>>>> to >>>>>>>>>>> author an arbitrary CCAction by including the source >>>>>>>>>>> directly in >>>>>>>>>>> the >>>>>>>>>>> TableGen mini-language. This latter change might want a >>>>>>>>>>> generic >>>>>>>>>>> change >>>>>>>>>>> to the TableGen language. For example, the syntax might be >>>>>>>>>>> like: >>>>>>>>>>> >>>>>>>>>>> class foo : CCCustomAction { >>>>>>>>>>> code <<< EOF >>>>>>>>>>> ....multi-line C++ code goes here that allocates regs & mem >>>>>>>>>>> and >>>>>>>>>>> sets CCValAssign::isCustom.... >>>>>>>>>>> EOF >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> Does this seem reasonable? 
An alternative is for CCCustom to >>>>>>>>>>> take a >>>>>>>>>>> string that names a function to be called: >>>>>>>>>>> >>>>>>>>>>> CCIfType<[f64], CCCustom<"MyCustomLoweringFunc">> >>>>>>>>>>> >>>>>>>>>>> the function signature for such functions will have to return >>>>>>>>>>> two >>>>>>>>>>> results: if the CC processing is finished and if it the func >>>>>>>>>>> succeeded >>>>>>>>>>> or failed: >>>>>>>>>> >>>>>>>>>> I like the second solution better. It seems rather cumbersome >>>>>>>>>> to >>>>>>>>>> embed >>>>>>>>>> multi-line c++ code in td files. >>>>>>>>>> >>>>>>>>>> Evan >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> typedef bool CCCustomFn(unsigned ValNo, MVT ValVT, >>>>>>>>>>> MVT LocVT, CCValAssign::LocInfo LocInfo, >>>>>>>>>>> ISD::ArgFlagsTy ArgFlags, CCState &State, >>>>>>>>>>> bool &result); >>>>>>>>>>> _______________________________________________ >>>>>>>>>>> LLVM Developers mailing list >>>>>>>>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
-------------- next part -------------- A non-text attachment was scrubbed... Name: arm_callingconv.diff Type: application/octet-stream Size: 54394 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090213/387f69fd/attachment-0002.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: arm_fixes.diff Type: application/octet-stream Size: 589 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090213/387f69fd/attachment-0003.obj From Viktar.Zviarovich at brunel.ac.uk Sat Feb 14 03:39:45 2009 From: Viktar.Zviarovich at brunel.ac.uk (Viktar Zviarovich) Date: Sat, 14 Feb 2009 09:39:45 -0000 Subject: [LLVMdev] problem with execution engine on windows Message-ID: Dear llvm-dev, First of all I'd like to thank LLVM developers for doing a great job!
I am trying to use the LLVM libraries on Windows using the Visual Studio environment and everything works very smoothly until it comes to the execution engine. I construct a module and obtain a pointer to the function in it using ExecutionEngine::getPointerToFunction but calling this function (after appropriate cast) causes the following error: Unhandled exception at 0x0122a5bc in sampl.exe: 0xC000001E: An attempt was made to execute an invalid lock sequence. The function I am trying to call is trivial (took it from the tutorial just for testing purposes) and has C calling conventions. The execution engine returned by ExecutionEngine::create is Interpreter. Am I doing something wrong or is it a bug? And another question: is JIT supported on Windows? Thank you in advance, Viktar From ofv at wanadoo.es Sat Feb 14 15:11:52 2009 From: ofv at wanadoo.es (Óscar Fuentes) Date: Sat, 14 Feb 2009 22:11:52 +0100 Subject: [LLVMdev] problem with execution engine on windows References: Message-ID: "Viktar Zviarovich" writes: [snip] > The function I am trying to call is trivial (took it from the tutorial > just for testing purposes) and has C calling conventions. The > execution engine returned by ExecutionEngine::create is Interpreter. > > Am I doing something wrong or is it a bug? Are you using CMake for generating the Visual Studio project files? If the answer is yes, please check http://www.llvm.org/docs/CMake.html especially the last section of the page. > And another question: is JIT supported on Windows? Yes. -- Oscar From Viktar.Zviarovich at brunel.ac.uk Sat Feb 14 15:47:11 2009 From: Viktar.Zviarovich at brunel.ac.uk (Viktar Zviarovich) Date: Sat, 14 Feb 2009 21:47:11 -0000 Subject: [LLVMdev] problem with execution engine on windows References: Message-ID: Yes, I am using CMake and adding the "/INCLUDE:_X86TargetMachineModule" compiler option, as described in that document, partially solved the problem, thank you.
After some debugging and browsing the LLVM headers I finally solved it by including "llvm/ExecutionEngine/JIT.h". It would be helpful to document this as well. Viktar -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu on behalf of Óscar Fuentes Sent: Sat 14/02/2009 21:11 To: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] problem with execution engine on windows "Viktar Zviarovich" writes: [snip] > The function I am trying to call is trivial (took it from the tutorial > just for testing purposes) and has C calling conventions. The > execution engine returned by ExecutionEngine::create is Interpreter. > > Am I doing something wrong or is it a bug? Are you using CMake for generating the Visual Studio project files? If the answer is yes, please check http://www.llvm.org/docs/CMake.html especially the last section of the page. > And another question: is JIT supported on Windows? Yes. -- Oscar From nicholas at mxc.ca Sat Feb 14 23:05:05 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Sat, 14 Feb 2009 21:05:05 -0800 Subject: [LLVMdev] making libraries depend on external libraries? In-Reply-To: References: Message-ID: <4997A281.1060708@mxc.ca> Nick Lewycky wrote: > How can I specify that LLVMInterpreter depends on 'libffi' in the > Makefile? Modifying LD.Flags in Interpreter/Makefile doesn't help since > llvm-config doesn't pick up on that, causing a linker error when > building lli. I'd like "llvm-config --libs interpreter" return -lffi > along with the LLVM libraries it lists. > > You'd think there would be an example of this already, but I looked and > didn't find one. So I dug through the code and it looks like we really don't support this.
What we do have is "llvm-config --ldflags" which lists all the system libraries we use, regardless of whether the tool you're compiling would really need them. The easiest thing for me to do would be to add -lffi in there. Sound reasonable for now? Nick PS. I looked into what it would take to do this properly. "llvm-config --libs" just rattles off the list of dependencies calculated by GenLibDeps, which in turn figures it out by running 'nm' over each llvm built library and seeing which library defines what symbol and builds a graph of using lib -> defining lib. I tried adding a switch to ask it to consider certain system libraries as defining some symbols, in the hopes that the rest of the calculation should just work as normal. It turns out that a) we need some way to actually find the system library in order to run nm on it and Autoconf won't give us that. I used 'gcc -print-file-name=libffi.so'. b) Makefile.rules makes the assumption that every library printed by llvm-config --libs is something that LLVM will build, and adds them as make dependencies for the tool being built. This is where I gave up. -------------- next part -------------- A non-text attachment was scrubbed... Name: llvm-config-ffi.patch Type: text/x-diff Size: 7157 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090214/8a14e0b1/attachment.bin From regehr at cs.utah.edu Sun Feb 15 22:31:11 2009 From: regehr at cs.utah.edu (John Regehr) Date: Sun, 15 Feb 2009 21:31:11 -0700 Subject: [LLVMdev] PredicateSimplifier questions Message-ID: <4998EC0F.9010906@cs.utah.edu> PredicateSimplifier is a pretty interesting pass, but it doesn't look like opt invokes it at any standard -Ox level, and so I assume that llvm-gcc also does not use this pass? If that is right, I'm curious about why this is the case -- does it simply not provide enough code speedup to compensate for the increase in compile time? 
Also, a colleague and I (we both teach advanced compiler courses) would like to have a relatively easy way for students to try out various abstract interpretation techniques on LLVM code. Writing an abstract interpreter from scratch has too much overhead. Of the LLVM passes that we know of, PredicateSimplifier seems the best starting point for making a more generic abstract interpreter into which collections of transfer functions could be plugged. Does that seem right or can anyone suggest a better starting point? It would be pretty cool, for example, to see what the Octagon domain could learn about LLVM programs. John Regehr From nicholas at mxc.ca Sun Feb 15 22:54:50 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Sun, 15 Feb 2009 20:54:50 -0800 Subject: [LLVMdev] PredicateSimplifier questions In-Reply-To: <4998EC0F.9010906@cs.utah.edu> References: <4998EC0F.9010906@cs.utah.edu> Message-ID: <4998F19A.2090300@mxc.ca> Hi John, John Regehr wrote: > PredicateSimplifier is a pretty interesting pass, but it doesn't look > like opt invokes it at any standard -Ox level, and so I assume that > llvm-gcc also does not use this pass? If that is right, I'm curious > about why this is the case -- does it simply not provide enough code > speedup to compensate for the increase in compile time? I wrote predsimplify as I was learning about compiler theory. It's pretty dumb in the sense that it spends lots of time analysing things that will never be used, and despite being the slowest pass in LLVM I haven't seen it improve run-times in a nightly test. > Also, a colleague and I (we both teach advanced compiler courses) would > like to have a relatively easy way for students to try out various > abstract interpretation techniques on LLVM code. Writing an abstract > interpreter from scratch has too much overhead. 
Of the LLVM passes that > we know of, PredicateSimplifier seems the best starting point for making > a more generic abstract interpreter into which collections of transfer > functions could be plugged. Does that seem right or can anyone suggest > a better starting point? It would be pretty cool, for example, to see > what the Octagon domain could learn about LLVM programs. Predsimplify is believed to have bugs (it results in miscompiled programs) and certainly isn't efficient (it was written before much of include/ADT). Finally, predsimplify is likely to go away once I or someone else writes a proper VRP pass. The only other starting points I can suggest are the concrete interpreter in lib/ExecutionEngine/Interpreter and the sparse conditional propagation framework in Analysis/SparsePropagation.h. If you decide that predsimplify is useful for you despite my warnings, then I would be very happy to hear that my time working on it wasn't all wasted! Nick Lewycky From regehr at cs.utah.edu Sun Feb 15 23:59:20 2009 From: regehr at cs.utah.edu (John Regehr) Date: Sun, 15 Feb 2009 22:59:20 -0700 (MST) Subject: [LLVMdev] PredicateSimplifier questions In-Reply-To: <4998F19A.2090300@mxc.ca> References: <4998EC0F.9010906@cs.utah.edu> <4998F19A.2090300@mxc.ca> Message-ID: Thanks for the answers Nick! > I wrote predsimplify as I was learning about compiler theory. It's > pretty dumb in the sense that it spends lots of time analysing things > that will never be used, and despite being the slowest pass in LLVM I > haven't seen it improve run-times in a nightly test. Interesting. The application I have in mind here is some TinyOS code where we've added lots of array bounds checks. The vast majority of these in fact cannot fail but LLVM and GCC are not smart enough to see this. It seemed like predsimplify was probably the right pass for this, but I haven't studied its success rate yet. The problem isn't execution cost but rather code size. 
This is for the msp430 platform which has too little flash memory. The code bloat really is a showstopper here: a number of existing applications overflow the available memory when we add checking, and cannot be run! (This is using gcc, we do not yet have an LLVM port to msp430.) So the "pluggable domain" thing (an idea we had good luck with at the source level using CIL) is not just for compiler class, but also for solving this real problem. In addition to calling out to standard abstract domains I want to for example play with shelling out to a heavyweight decision procedure. This should make many spurious bounds checks go away. Greg Morrisett reports good success with this approach in the Cyclone compiler. Inspection of embedded codes shows that most array accesses can be proved to be in bounds pretty easily-- though these proofs are apparently out of reach of -O2 type optimizers. > Predsimplify is believed to have bugs (it results in miscompiled > programs) and certainly isn't efficient (it was written before much of > include/ADT). Finally, predsimplify is likely to go away once I or > someone else writes a proper VRP pass. Well I do have a good way to find integer miscompilations... but don't want to waste everyone's time if this pass is on the way out. John Regehr From regehr at cs.utah.edu Mon Feb 16 00:08:16 2009 From: regehr at cs.utah.edu (John Regehr) Date: Sun, 15 Feb 2009 23:08:16 -0700 (MST) Subject: [LLVMdev] PredicateSimplifier questions In-Reply-To: <4998F19A.2090300@mxc.ca> References: <4998EC0F.9010906@cs.utah.edu> <4998F19A.2090300@mxc.ca> Message-ID: > Predsimplify is believed to have bugs (it results in miscompiled > programs) and certainly isn't efficient (it was written before much of > include/ADT). Finally, predsimplify is likely to go away once I or > someone else writes a proper VRP pass. 
Whoever does this, I strongly encourage looking into using (or at least providing optional support for) the Apron library: http://apron.cri.ensmp.fr/library/ No sense reinventing these wheels. John Regehr From clattner at apple.com Mon Feb 16 00:33:01 2009 From: clattner at apple.com (Chris Lattner) Date: Sun, 15 Feb 2009 22:33:01 -0800 Subject: [LLVMdev] PredicateSimplifier questions In-Reply-To: References: <4998EC0F.9010906@cs.utah.edu> <4998F19A.2090300@mxc.ca> Message-ID: On Feb 15, 2009, at 10:08 PM, John Regehr wrote: >> Predsimplify is believed to have bugs (it results in miscompiled >> programs) and certainly isn't efficient (it was written before much >> of >> include/ADT). Finally, predsimplify is likely to go away once I or >> someone else writes a proper VRP pass. > > Whoever does this, I strongly encourage looking into using (or at > least > providing optional support for) the Apron library: > > http://apron.cri.ensmp.fr/library/ > > No sense reinventing these wheels. In my experience, starting with a very simple and very cheap approach will get most of the benefit. For those who really want to eliminate every check possible, a more expensive approach can be used on top of it. If someone is interested in array bound check elimination, I'd suggest starting with the ABCD (array bounds check elimination on demand) paper. I believe that vmkit would hugely benefit from this as well. -Chris From baldrick at free.fr Mon Feb 16 02:51:09 2009 From: baldrick at free.fr (Duncan Sands) Date: Mon, 16 Feb 2009 09:51:09 +0100 Subject: [LLVMdev] PredicateSimplifier questions In-Reply-To: References: <4998EC0F.9010906@cs.utah.edu> Message-ID: <200902160951.09772.baldrick@free.fr> Hi Chris, > In my experience, starting with a very simple and very cheap approach > will get most of the benefit. For those who really want to eliminate > every check possible, a more expensive approach can be used on top of > it.
If someone is interested in array bound check elimination, I'd > suggest starting with the ABCD (array bounds check elimination on > demand) paper. I believe that vmkit would hugely benefit from this as > well. Ada would also benefit I think: all array accesses are checked. Ciao, Duncan. From alex.lavoro.propio at gmail.com Mon Feb 16 04:32:39 2009 From: alex.lavoro.propio at gmail.com (Alex) Date: Mon, 16 Feb 2009 02:32:39 -0800 (PST) Subject: [LLVMdev] Modeling GPU vector registers, again (with my implementation) In-Reply-To: <8CE3FC1A-1211-48E4-A319-AAC6FF346039@apple.com> References: <4d77c5f20902130947s349dcab6r74e2057dd18161@mail.gmail.com> <8CE3FC1A-1211-48E4-A319-AAC6FF346039@apple.com> Message-ID: <22034856.post@talk.nabble.com> Evan Cheng-2 wrote: > > Well, how many possible permutations are there? Is it possible to > model each case as a separate physical register? > > Evan > I don't think so. There are 4x4x4x4 = 256 permutations. For example:

* xyzw: default
* zxyw
* yyyy: splat

Even if I can model each of these 256 cases as a separate physical register, how can I model the use of r0.xyzw in the following example:

// dp4 = dot product 4-element
dp4 r0.x, r1, r2
dp4 r0.y, r3, r4
dp4 r0.z, r5, r6
dp4 r0.w, r7, r8
sub r5, r0.xyzw, r6

-- View this message in context: http://www.nabble.com/Modeling-GPU-vector-registers%2C-again-%28with-my-implementation%29-tp22001613p22034856.html Sent from the LLVM - Dev mailing list archive at Nabble.com. From regehr at cs.utah.edu Mon Feb 16 10:00:05 2009 From: regehr at cs.utah.edu (John Regehr) Date: Mon, 16 Feb 2009 09:00:05 -0700 (MST) Subject: [LLVMdev] PredicateSimplifier questions In-Reply-To: References: <4998EC0F.9010906@cs.utah.edu> <4998F19A.2090300@mxc.ca> Message-ID: Chris do you have a sense for how the definedness of signed overflow in LLVM would play out in the context of bounds check elimination? That is, would it cause lots of failure to eliminate checks that could be seen to be unnecessary at the C level?
John On Sun, 15 Feb 2009, Chris Lattner wrote: > > On Feb 15, 2009, at 10:08 PM, John Regehr wrote: > >>> Predsimplify is believed to have bugs (it results in miscompiled >>> programs) and certainly isn't efficient (it was written before much >>> of >>> include/ADT). Finally, predsimplify is likely to go away once I or >>> someone else writes a proper VRP pass. >> >> Whoever does this, I strongly encourage looking into using (or at >> least >> providing optional support for) the Apron library: >> >> http://apron.cri.ensmp.fr/library/ >> >> No sense reinventing these wheels. > > In my experience, starting with a very simple and very cheap approach > will get most of the benefit. For those who really want to eliminate > every check possible, a more expensive approach can be used on top of > it. If someone is interested in array bound check elimination, I'd > suggest starting with the ABCD (array bounds check elimination on > demand) paper. I believe that vmkit would hugely benefit from this as > well. > > -Chris From alex.lavoro.propio at gmail.com Mon Feb 16 10:03:22 2009 From: alex.lavoro.propio at gmail.com (Alex) Date: Mon, 16 Feb 2009 08:03:22 -0800 (PST) Subject: [LLVMdev] Eliminate PHI for non-copyable registers In-Reply-To: <83956F80-D316-4852-A673-9D450E974C79@apple.com> References: <4d77c5f20902110407p12b7bd04i9477673da5affec9@mail.gmail.com> <8E39BC05-CED5-4796-8031-42FA38F80EDC@apple.com> <21972748.post@talk.nabble.com> <83956F80-D316-4852-A673-9D450E974C79@apple.com> Message-ID: <22040006.post@talk.nabble.com> Chris Lattner-2 wrote: > > and out of the registers and must be able to spill them, even if it > means going through another temporary register class. > But what if it cannot even be copied to another temporary register class?
The values of these i32 registers can only be used as the index of another register class, but the value of the index itself cannot be read. Usually the program can be generated using only 2 of these i32 index registers, but the problem is that LLVM requires them to be copyable if there is a PHI node. -- View this message in context: http://www.nabble.com/Eliminate-PHI-for-non-copyable-registers-tp21953583p22040006.html Sent from the LLVM - Dev mailing list archive at Nabble.com. From clattner at apple.com Mon Feb 16 10:50:19 2009 From: clattner at apple.com (Chris Lattner) Date: Mon, 16 Feb 2009 08:50:19 -0800 Subject: [LLVMdev] Eliminate PHI for non-copyable registers In-Reply-To: <22040006.post@talk.nabble.com> References: <4d77c5f20902110407p12b7bd04i9477673da5affec9@mail.gmail.com> <8E39BC05-CED5-4796-8031-42FA38F80EDC@apple.com> <21972748.post@talk.nabble.com> <83956F80-D316-4852-A673-9D450E974C79@apple.com> <22040006.post@talk.nabble.com> Message-ID: On Feb 16, 2009, at 8:03 AM, [Alex] wrote: > > > Chris Lattner-2 wrote: >> >> and out of the registers and must be able to spill them, even if it >> means going through another temporary register class. >> > > But what if it cannot even be copied to another temporary register > class? Then they are not allowed to be allocatable. -Chris From clattner at apple.com Mon Feb 16 10:58:06 2009 From: clattner at apple.com (Chris Lattner) Date: Mon, 16 Feb 2009 08:58:06 -0800 Subject: [LLVMdev] PredicateSimplifier questions In-Reply-To: References: <4998EC0F.9010906@cs.utah.edu> <4998F19A.2090300@mxc.ca> Message-ID: On Feb 16, 2009, at 8:00 AM, John Regehr wrote: > Chris do you have a sense for how the definedness of signed overflow > in > LLVM would play out in the context of bounds check elimination? > That is, > would it cause lots of failure to eliminate checks that could be > seen to > be unnecessary at the C level? That is an interesting question, and there are several related issues.
The possibility of undefined behavior or behavior that programmers don't expect in C code often leads to "security checks" that end up not doing anything. For example, see things like: http://www.kb.cert.org/vuls/id/162289 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=8537 etc. In general, I think that if you *really* care about code security in C, bounds checks are not enough. You need to use various techniques to reduce the amount of undefined behavior in C, such as compiling with -fwrapv. A project that I'd like to tackle eventually in Clang is to have direct support for this by emitting code that zero-initializes variables by default, *automatically* inserts bounds checks where it can, inserts code to check that shift amounts are in range, etc. -Chris From czoccolo at gmail.com Mon Feb 16 11:24:06 2009 From: czoccolo at gmail.com (Corrado Zoccolo) Date: Mon, 16 Feb 2009 18:24:06 +0100 Subject: [LLVMdev] LLVM C bindings Message-ID: <4e5e476b0902160924m6012bb46m6ff39e037041c242@mail.gmail.com> Hi, I find the C bindings for LLVM very useful, since they allow LLVM to be invoked from a wider set of software tools (e.g. llvm-py uses them to export llvm to python). Unfortunately, it seems that the llvm-c interface lacks major functionality, e.g. getting a pointer to a jit-ted function. It is easy to write a small c++ wrapper to expose the functionality that one wants (llvm-py does it for the many optimization passes not available through the c binding), but I wonder if it is possible to have a simpler & cleaner solution, i.e. automatically generating C bindings from c++ classes. Or am I missing some obvious way to have the aforementioned functionality in C without writing wrappers? Thanks, Corrado -- __________________________________________________________________________ dott.
Corrado Zoccolo mailto:czoccolo at gmail.com PhD - Department of Computer Science - University of Pisa, Italy -------------------------------------------------------------------------- From Micah.Villmow at amd.com Mon Feb 16 11:24:18 2009 From: Micah.Villmow at amd.com (Villmow, Micah) Date: Mon, 16 Feb 2009 09:24:18 -0800 Subject: [LLVMdev] Modeling GPU vector registers, again (with my implementation) In-Reply-To: <22034856.post@talk.nabble.com> References: <4d77c5f20902130947s349dcab6r74e2057dd18161@mail.gmail.com><8CE3FC1A-1211-48E4-A319-AAC6FF346039@apple.com> <22034856.post@talk.nabble.com> Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C827A08@ssanexmb1.amd.com> Alex, From my experience working with GPU vector registers, there is no support for swizzles in the manner that you would normally code them, and in my case I have 6^4 permutations on src registers and 24 combinations in the dst registers. The way that I ended up handling this was to have different register classes for 1, 2, 3 and 4 component vectors. This made the generic cases very simple but still made swizzling fairly difficult. In order to get swizzling to work you only need to handle three SDNodes, insert_vector_elt, extract_vector_elt and build_vector, while expanding the rest. For those three nodes I then custom lowered them to a target specific node with an extra integer constant per register that would encode the swizzle mask in 32 bits. The correct swizzles can then be generated in the asm printer by decoding the integer constant.
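A minimal sketch of the encode/decode step described above, assuming a hypothetical layout of four 3-bit selectors packed into one 32-bit immediate. The selector set (x, y, z, w, 0, 1) is inferred from the "6^4" source combinations mentioned; the names Sel, encodeSwizzle and decodeSwizzle are illustrative only and are not part of LLVM or of the backend discussed here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical selector set: four components plus the constants 0 and 1.
enum Sel : uint32_t { X = 0, Y, Z, W, Zero, One };

// Pack four 3-bit selectors into one 32-bit immediate (lane 0 in the low bits).
// This is the value a custom-lowered node would carry as its extra operand.
uint32_t encodeSwizzle(Sel a, Sel b, Sel c, Sel d) {
  const Sel lanes[4] = {a, b, c, d};
  uint32_t mask = 0;
  for (unsigned i = 0; i < 4; ++i) {
    assert(lanes[i] <= One && "selector out of range");
    mask |= static_cast<uint32_t>(lanes[i]) << (3 * i);
  }
  return mask;
}

// Decode the immediate back into the textual suffix an asm printer could emit.
std::string decodeSwizzle(uint32_t mask) {
  static const char names[] = {'x', 'y', 'z', 'w', '0', '1'};
  std::string out = ".";
  for (unsigned i = 0; i < 4; ++i) {
    uint32_t sel = (mask >> (3 * i)) & 0x7u;
    assert(sel <= One && "corrupt swizzle immediate");
    out += names[sel];
  }
  return out;
}
```

With this layout, encodeSwizzle(X, Zero, Zero, Zero) decodes to ".x000", the form used for operands such as r0.x000. Only 12 of the 32 bits are used, so a real backend would have room for a destination write mask as well; that part is left out here.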
This does require having extra moves, but your example below would end up being something like the following:

dp4 r100, r1, r2
mov r0.x, r100 (float4 => float1 extract_vector_elt)
dp4 r101, r4, r5
mov r3.x, r101 (float4 => float1 extract_vector_elt)
iadd r6.xy__, r0.x000, r3.0x00 (float1 + float1 => float2 build_vector)
dp4 r7.x, r8, r9
dp4 r10.x, r11, r12
iadd r13.xy__, r7.x000, r10.0x00 (float1 + float1 => float2 build_vector)
iadd r14, r13.xy00, r6.00xy (float2 + float2 => float4 build_vector)
sub r15, r14, r9

It's not as compact and neat, but it works, and the move instructions will get optimized away by the lower-level GPU compiler. Hope this helps, Micah -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of [Alex] Sent: Monday, February 16, 2009 2:33 AM To: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Modeling GPU vector registers, again (with my implementation) Evan Cheng-2 wrote: > > Well, how many possible permutations are there? Is it possible to > model each case as a separate physical register? > > Evan > I don't think so. There are 4x4x4x4 = 256 permutations. For example:

* xyzw: default
* zxyw
* yyyy: splat

Even if I can model each of these 256 cases as a separate physical register, how can I model the use of r0.xyzw in the following example:

// dp4 = dot product 4-element
dp4 r0.x, r1, r2
dp4 r0.y, r3, r4
dp4 r0.z, r5, r6
dp4 r0.w, r7, r8
sub r5, r0.xyzw, r6

-- View this message in context: http://www.nabble.com/Modeling-GPU-vector-registers%2C-again-%28with-my-implementation%29-tp22001613p22034856.html Sent from the LLVM - Dev mailing list archive at Nabble.com.
From gordonhenriksen at me.com Mon Feb 16 12:39:02 2009 From: gordonhenriksen at me.com (Gordon Henriksen) Date: Mon, 16 Feb 2009 13:39:02 -0500 Subject: [LLVMdev] LLVM C bindings In-Reply-To: <4e5e476b0902160924m6012bb46m6ff39e037041c242@mail.gmail.com> References: <4e5e476b0902160924m6012bb46m6ff39e037041c242@mail.gmail.com> Message-ID: On Feb 16, 2009, at 12:24, Corrado Zoccolo wrote: > I find the C bindings for LLVM very useful, since they allow LLVM to > be invoked from a wider set of software tools (e.g. llvm-py uses them > to export llvm to python). > Unfortunately, it seems that the llvm-c interface lacks major > functionality, e.g. getting a pointer to a jit-ted function. Enhancements to the bindings are welcome. If you have a patch to add desired functionality, please submit it. > It is easy to write a small c++ wrapper to expose the functionality > that one wants (llvm-py does it for the many optimization passes not > available through the c binding), but I wonder if it is possible to have > a simpler & cleaner solution, i.e. automatically generating C > bindings from c++ classes. If you have a successful generated binding, you're welcome to publish and/or submit it. I think a degree of editorial discretion in the bindings is advisable, but completeness might outweigh that consideration. > Or am I missing some obvious way to have the aforementioned > functionality in C without writing wrappers? Nope. --
Gordon

From evan.cheng at apple.com Mon Feb 16 13:00:14 2009
From: evan.cheng at apple.com (Evan Cheng)
Date: Mon, 16 Feb 2009 11:00:14 -0800
Subject: [LLVMdev] Using CallingConvLower in ARM target
In-Reply-To: <305d6f60902132027j3cc822dfw1dc817bb4a3f67b5@mail.gmail.com>
References: <305d6f60812270430xdf1ebb9gf6d99f94215ab66b@mail.gmail.com> <305d6f60902061802q65f813e7p1c75f1fa5a185d32@mail.gmail.com> <305d6f60902121821u586a5ecat65c81d462d139e1d@mail.gmail.com> <305d6f60902131420g51a7f35ajf8eba18b6710951e@mail.gmail.com> <305d6f60902131625k460a4518k7fae41005d379d16@mail.gmail.com> <05C6EB7F-36AC-4B18-A32D-36A71530BBAB@apple.com> <305d6f60902131841p354431e6pa92dd9df14bc5555@mail.gmail.com> <305d6f60902132027j3cc822dfw1dc817bb4a3f67b5@mail.gmail.com>
Message-ID: <90A38A0E-3C84-4765-A521-A2ED66261056@apple.com>

Thanks. More questions :-)

 /// Information about how the value is assigned.
- LocInfo HTP : 7;
+ LocInfo HTP : 6;

Do you know why this change is needed? Are we running out of bits?

- NeededStackSize = 4;
- break;
- case MVT::i64:
- case MVT::f64:
- if (firstGPR < 3)
-   NeededGPRs = 2;
- else if (firstGPR == 3) {
-   NeededGPRs = 1;
-   NeededStackSize = 4;
- } else
-   NeededStackSize = 8;
+ State.addLoc(CCValAssign::getCustomMem(ValNo, ValVT,
+                                        State.AllocateStack(4, 4),
+                                        MVT::i32, LocInfo));
+ return true; // we handled it

Your change isn't handling the "NeededStackSize = 8" case.

+
+ static const unsigned HiRegList[] = { ARM::R0, ARM::R2 };
+ static const unsigned LoRegList[] = { ARM::R1, ARM::R3 };
+
+ if (unsigned Reg = State.AllocateReg(HiRegList, LoRegList, 2)) {
+   unsigned i;
+   for (i = 0; i < 2; ++i)
+     if (HiRegList[i] == Reg)
+       break;
+
+   State.addLoc(CCValAssign::getCustomReg(ValNo, ValVT, Reg,
+                                          MVT::i32, LocInfo));
+   State.addLoc(CCValAssign::getCustomReg(ValNo, ValVT, LoRegList[i],
+                                          MVT::i32, LocInfo));

Since 'i' is used after the loop, please choose a better variable name. Actually, is the loop necessary? We know the low register is always one after the high register. Perhaps you can use ARMRegisterInfo::getRegisterNumbering(Reg), add one to it, and then look up the register enum with a new function (something like getRegFromRegisterNum(RegNo, ValVT)).

The patch is looking good. I need to run it through some more tests. Unfortunately the ARM target is a bit broken right now. I hope to fix it today.

Thanks,

Evan

On Feb 13, 2009, at 8:27 PM, Sandeep Patel wrote:

> Sorry, left a small bit of cruft in ARMCallingConv.td. A corrected
> patch is attached.
>
> deep
>
> On Fri, Feb 13, 2009 at 6:41 PM, Sandeep Patel
> wrote:
>> Sure. Updated patches attached.
>>
>> deep
>>
>> On Fri, Feb 13, 2009 at 5:47 PM, Evan Cheng
>> wrote:
>>>
>>> On Feb 13, 2009, at 4:25 PM, Sandeep Patel wrote:
>>>
>>>> ARMTargetLowering doesn't need case #1, but it seemed like you and Dan
>>>> wanted a more generic way to inject C++ code into the process so I
>>>> tried to make the mechanism a bit more general.
>>>
>>> Ok. Since ARM doesn't need it and it's the only client, I'd much
>>> rather have CCCustomFn just return a single bool indicating whether it
>>> can handle the arg. Would that be ok?
>>> >>> Thanks, >>> >>> Evan >>> >>>> >>>> >>>> deep >>>> >>>> On Fri, Feb 13, 2009 at 2:34 PM, Evan Cheng >>>> wrote: >>>>> >>>>> On Feb 13, 2009, at 2:20 PM, Sandeep Patel wrote: >>>>> >>>>>> On Fri, Feb 13, 2009 at 12:33 PM, Evan Cheng >>>>> > >>>>>> wrote: >>>>>>> >>>>>>> On Feb 12, 2009, at 6:21 PM, Sandeep Patel wrote: >>>>>>> >>>>>>>> Although it's not generally needed for ARM's use of CCCustom, I >>>>>>>> return >>>>>>>> two bools to handle the four possible outcomes to keep the >>>>>>>> mechanism >>>>>>>> flexible: >>>>>>>> >>>>>>>> * if CCCustomFn handled the arg or not >>>>>>>> * if CCCustomFn wants to end processing of the arg or not >>>>>>> >>>>>>> +/// CCCustomFn - This function assigns a location for Val, >>>>>>> possibly >>>>>>> updating >>>>>>> +/// all args to reflect changes and indicates if it handled >>>>>>> it. It >>>>>>> must set >>>>>>> +/// isCustom if it handles the arg and returns true. >>>>>>> +typedef bool CCCustomFn(unsigned &ValNo, MVT &ValVT, >>>>>>> + MVT &LocVT, CCValAssign::LocInfo >>>>>>> &LocInfo, >>>>>>> + ISD::ArgFlagsTy &ArgFlags, CCState >>>>>>> &State, >>>>>>> + bool &result); >>>>>>> >>>>>>> Is "result" what you refer to as "isCustom" in the comments? >>>>>>> >>>>>>> Sorry, I am still confused. You mean it could return true but >>>>>>> set >>>>>>> 'result' to false? That means it has handled the argument but it >>>>>>> would >>>>>>> not process any more arguments? What scenario do you envision >>>>>>> that >>>>>>> this will be useful? I'd rather keep it simple. >>>>>> >>>>>> As you note there are three actual legitimate cases (of the four >>>>>> combos): >>>>>> >>>>>> 1. The CCCustomFn wants the arg handling to proceed. This might >>>>>> be >>>>>> used akin to CCPromoteToType. >>>>>> 2. The CCCustomFn entirely handled the arg. This might be used >>>>>> akin to >>>>>> CCAssignToReg. >>>>>> 3. The CCCustomFn tried to handle the arg, but failed. >>>>>> >>>>>> these results are conveyed the following ways: >>>>>> >>>>>> 1. 
The CCCustomFn returns false, &result is not used. >>>>>> 2. The CCCustomFn returns true, &result is false; >>>>>> 3. The CCCustomFn returns true, &result is true. >>>>> >>>>> I don't think we want to support #1. If the target want to add >>>>> custom >>>>> code to handle an argument, if should be responsible for >>>>> outputting >>>>> legal code. Is there an actual need to support #1? >>>>> >>>>> Evan >>>>> >>>>>> >>>>>> >>>>>> I tried to keep these CCCustomFns looking like TableGen generated >>>>>> code. Suggestions of how to reorganize these results are >>>>>> welcome. :-) >>>>>> Perhaps better comments around the typedef for CCCustomFn would >>>>>> suffice? >>>>>> >>>>>> The isCustom flag is simply a means for this machinery to >>>>>> convey to >>>>>> the TargetLowering functions to process this arg specially. It >>>>>> may >>>>>> not >>>>>> always be possible for the TargetLowering functions to determine >>>>>> that >>>>>> the arg needs special handling after all the changes made by the >>>>>> CCCustomFn or CCPromoteToType and other transformations. >>>>>> >>>>>>>> I placed the "unsigned i" outside those loops because i is used >>>>>>>> after >>>>>>>> the loop. If there's a better index search pattern, I'd be >>>>>>>> happy >>>>>>>> to >>>>>>>> change it. >>>>>>> >>>>>>> Ok. >>>>>>> >>>>>>> One more nitpick: >>>>>>> >>>>>>> +/// CCCustom - calls a custom arg handling function >>>>>>> >>>>>>> Please capitalize "calls" and end with a period. >>>>>> >>>>>> Once we settle on the result handling changes, I'll submit an >>>>>> update >>>>>> with this change. >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Evan >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Attached is an updated patch against HEAD that has DebugLoc >>>>>>>> changes. I >>>>>>>> also split out the ARMAsmPrinter fix into it's own patch. >>>>>>>> >>>>>>>> deep >>>>>>>> >>>>>>>> On Mon, Feb 9, 2009 at 8:54 AM, Evan Cheng >>>>>>>> wrote: >>>>>>>>> Thanks Sandeep. I did a quick scan, this looks really good. 
>>>>>>>>> But I >>>>>>>>> do >>>>>>>>> have a question: >>>>>>>>> >>>>>>>>> +/// CCCustomFn - This function assigns a location for Val, >>>>>>>>> possibly >>>>>>>>> updating >>>>>>>>> +/// all args to reflect changes and indicates if it handled >>>>>>>>> it. It >>>>>>>>> must set >>>>>>>>> +/// isCustom if it handles the arg and returns true. >>>>>>>>> +typedef bool CCCustomFn(unsigned &ValNo, MVT &ValVT, >>>>>>>>> + MVT &LocVT, CCValAssign::LocInfo >>>>>>>>> &LocInfo, >>>>>>>>> + ISD::ArgFlagsTy &ArgFlags, CCState >>>>>>>>> &State, >>>>>>>>> + bool &result); >>>>>>>>> >>>>>>>>> Is it necessary to return two bools (the second is returned by >>>>>>>>> reference in 'result')? I am confused about the semantics of >>>>>>>>> 'result'. >>>>>>>>> >>>>>>>>> Also, a nitpick: >>>>>>>>> >>>>>>>>> + unsigned i; >>>>>>>>> + for (i = 0; i < 4; ++i) >>>>>>>>> >>>>>>>>> The convention we use is: >>>>>>>>> >>>>>>>>> + for (unsigned i = 0; i < 4; ++i) >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Evan >>>>>>>>> >>>>>>>>> On Feb 6, 2009, at 6:02 PM, Sandeep Patel wrote: >>>>>>>>> >>>>>>>>>> I think I've got all the cases handled now, implementing with >>>>>>>>>> CCCustom<"foo"> callbacks into C++. >>>>>>>>>> >>>>>>>>>> This also fixes a crash when returning i128. I've also >>>>>>>>>> included a >>>>>>>>>> small asm constraint fix that was needed to build newlib. >>>>>>>>>> >>>>>>>>>> deep >>>>>>>>>> >>>>>>>>>> On Mon, Jan 19, 2009 at 10:18 AM, Evan Cheng >>>>>>>>>> >>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> On Jan 16, 2009, at 5:26 PM, Sandeep Patel wrote: >>>>>>>>>>> >>>>>>>>>>>> On Sat, Jan 3, 2009 at 11:46 AM, Dan Gohman >>>>>>>>>>> > >>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> One problem with this approach is that since i64 isn't >>>>>>>>>>>>> legal, >>>>>>>>>>>>> the >>>>>>>>>>>>> bitcast would require custom C++ code in the ARM target to >>>>>>>>>>>>> handle properly. 
It might make sense to introduce >>>>>>>>>>>>> something >>>>>>>>>>>>> like >>>>>>>>>>>>> >>>>>>>>>>>>> CCIfType<[f64], CCCustom> >>>>>>>>>>>>> >>>>>>>>>>>>> where CCCustom is a new entity that tells the calling >>>>>>>>>>>>> convention >>>>>>>>>>>>> code to to let the target do something not easily >>>>>>>>>>>>> representable >>>>>>>>>>>>> in the tablegen minilanguage. >>>>>>>>>>>> >>>>>>>>>>>> I am thinking that this requires two changes: add a flag to >>>>>>>>>>>> CCValAssign (take a bit from HTP) to indicate isCustom >>>>>>>>>>>> and a >>>>>>>>>>>> way >>>>>>>>>>>> to >>>>>>>>>>>> author an arbitrary CCAction by including the source >>>>>>>>>>>> directly in >>>>>>>>>>>> the >>>>>>>>>>>> TableGen mini-language. This latter change might want a >>>>>>>>>>>> generic >>>>>>>>>>>> change >>>>>>>>>>>> to the TableGen language. For example, the syntax might be >>>>>>>>>>>> like: >>>>>>>>>>>> >>>>>>>>>>>> class foo : CCCustomAction { >>>>>>>>>>>> code <<< EOF >>>>>>>>>>>> ....multi-line C++ code goes here that allocates regs & mem >>>>>>>>>>>> and >>>>>>>>>>>> sets CCValAssign::isCustom.... >>>>>>>>>>>> EOF >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> Does this seem reasonable? An alternative is for CCCustom >>>>>>>>>>>> to >>>>>>>>>>>> take a >>>>>>>>>>>> string that names a function to be called: >>>>>>>>>>>> >>>>>>>>>>>> CCIfType<[f64], CCCustom<"MyCustomLoweringFunc">> >>>>>>>>>>>> >>>>>>>>>>>> the function signature for such functions will have to >>>>>>>>>>>> return >>>>>>>>>>>> two >>>>>>>>>>>> results: if the CC processing is finished and if it the >>>>>>>>>>>> func >>>>>>>>>>>> succeeded >>>>>>>>>>>> or failed: >>>>>>>>>>> >>>>>>>>>>> I like the second solution better. It seems rather >>>>>>>>>>> cumbersome >>>>>>>>>>> to >>>>>>>>>>> embed >>>>>>>>>>> multi-line c++ code in td files. 
>>>>>>>>>>> >>>>>>>>>>> Evan >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> typedef bool CCCustomFn(unsigned ValNo, MVT ValVT, >>>>>>>>>>>> MVT LocVT, CCValAssign::LocInfo LocInfo, >>>>>>>>>>>> ISD::ArgFlagsTy ArgFlags, CCState &State, >>>>>>>>>>>> bool &result); >>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>> LLVM Developers mailing list >>>>>>>>>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>>>>>>>>> >>>>>>>>>>> _______________________________________________ >>>>>>>>>>> LLVM Developers mailing list >>>>>>>>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>>>>>>>>> >>>>>>>>>> < >>>>>>>>>> arm_callingconv >>>>>>>>>> .diff>_______________________________________________ >>>>>>>>>> LLVM Developers mailing list >>>>>>>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> LLVM Developers mailing list >>>>>>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>>>>>>> >>>>>>>> < >>>>>>>> arm_callingconv >>>>>>>> .diff >>>>>>>>> < >>>>>>>>> arm_fixes.diff>_______________________________________________ >>>>>>>> LLVM Developers mailing list >>>>>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>>>>> >>>>>>> _______________________________________________ >>>>>>> LLVM Developers mailing list >>>>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>>>>> >>>>>> _______________________________________________ >>>>>> LLVM Developers mailing list >>>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>>> >>>>> _______________________________________________ 
>>>>> LLVM Developers mailing list >>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>>> >>>> _______________________________________________ >>>> LLVM Developers mailing list >>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>> >>> _______________________________________________ >>> LLVM Developers mailing list >>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>> >> > < > arm_callingconv > .diff>_______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From lennart at augustsson.net Mon Feb 16 13:09:00 2009 From: lennart at augustsson.net (Lennart Augustsson) Date: Mon, 16 Feb 2009 20:09:00 +0100 Subject: [LLVMdev] LLVM C bindings In-Reply-To: <4e5e476b0902160924m6012bb46m6ff39e037041c242@mail.gmail.com> References: <4e5e476b0902160924m6012bb46m6ff39e037041c242@mail.gmail.com> Message-ID: The API function that gives you a pointer to the jit-ed function will be available in the next release of LLVM. You're not the only only one who's missed it. But I wish the Pass Manager API was more complete. I might do that if I have a some time left over. -- Lennart On Mon, Feb 16, 2009 at 6:24 PM, Corrado Zoccolo wrote: > Hi, > I find the C bindings for LLVM very useful, since they allow to invoke > LLVM from a wider set of software tools (i.e. llvm-py uses it to > export llvm to python). > Unfortunately, it seems that llvm-c interface lacks major > functionalities, e.g. getting a pointer to a jit-ted function. > It is easy to write a small c++ wrapper to expose the functionality > that one wants (llvm-py does it for the many optimization passes not > available through c binding), but I wonder if it is possible to have a > simpler & cleaner solution, i.e. 
automatically generating C bindings > from c++ classes. > Or am I missing some obvious way to have the aforementioned > functionality in C without writing wrappers? > > Thanks, > Corrado > > -- > __________________________________________________________________________ > > dott. Corrado Zoccolo mailto:czoccolo at gmail.com > PhD - Department of Computer Science - University of Pisa, Italy > -------------------------------------------------------------------------- > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From dpatel at apple.com Mon Feb 16 13:45:24 2009 From: dpatel at apple.com (Devang Patel) Date: Mon, 16 Feb 2009 11:45:24 -0800 Subject: [LLVMdev] loop passes vs call graph In-Reply-To: <200902131439.48939.baldrick@free.fr> References: <200902131439.48939.baldrick@free.fr> Message-ID: <64016EBB-2FD9-460B-9DEC-DF6FAC85D44B@apple.com> On Feb 13, 2009, at 5:39 AM, Duncan Sands wrote: > Hi, > >> I'm looking at bug 3367. >> >> If I run: >> >> $ opt b.bc -inline -loop-rotate -loop-unswitch -debug-pass=Executions >> >> ... it eventually crashes in the inliner, because the call graph >> isn't >> up to date. (NB if you want to reproduce this you'll have to apply my >> patch from bug 3367 first.) >> >> The reason the call graph isn't up to date is that -loop-unswitch has >> changed a function and not updated the call graph. But that seems OK, >> because -loop-unswitch's getAnalysisUsage() method doesn't claim to >> preserve the call graph. > > given the callgraph F -> G, the pass manager currently does the > following: > run inliner on G, run loop passes on G, run inliner on F, run loop > passes on F. Presumably what is happening is this: the loop passes > change > the functions that G calls (but don't update the callgraph). Now the > inliner visits F and decides to inline G into F. 
When it does this, > it > presumably merges the callgraph info for G (i.e. what G calls) into > that of > F. But this info is wrong, so F ends up having invalid callgraph > info which > at some point causes trouble. > > I think what should happen is: if a SCC pass (eg: inline) is followed > by function passes that preserve the callgraph, then it should > schedule > them together like above. However if the SCC pass is followed by a > function pass that does not preserve the callgraph then it should be > scheduled entirely after the SCC pass. > > For example, imagine -inline -fpass -loop-unswitch, where fpass is a > function pass that preserves the callgraph. Then the pass manager > should do: > > run -inline on G > run -fpass on G > run -inline on F > run -fpass on F > run -loop-unswitch on G > run -loop-unswitch on F. This will defeat the goal of applying loop transformations before inlining leaf functions. Note, Loop transformations are not aware of call graph. They do not claim to preserve call graph. However, loop passes are run by a loop pass manager (LPPassManager) which is itself a function pass. The pass manager is not doing the right thing here because LPPassManager is incorrectly claiming to preserve call graph. The right approach is to teach LPPassManager to really preserve call graph. - Devang > > > Just my opinion of course. > > Ciao, > > Duncan. > From wurstgebaeck at googlemail.com Mon Feb 16 13:58:06 2009 From: wurstgebaeck at googlemail.com (Jan Rehders) Date: Mon, 16 Feb 2009 20:58:06 +0100 Subject: [LLVMdev] Invalid call generated on 64-bit linux when calling native C function from IR Message-ID: <73cfec680902161158m9706f12t8333309aa8999a60@mail.gmail.com> Hi, when I try to generate LLVM-IR which calls back to native C functions the jit compiler generates invalid code on 64-bit linux. The same code works fine on 32-bit linux, 32-bit OS X and 64-bit OS X. A reproduction case is attached to this mail. 
It is a simple modification of the "How to use jit" example that adds a call to a native function.

I am currently using EE->addGlobalMapping to register the native functions. This appears to be necessary because the native functions will be part of a .so/.dylib, so the JIT will not find them using dlsym on Linux, and we would prefer to strip all symbols.

I'm using LLVM 2.4, which I compiled using EXTRA_OPTIONS="-m64/-m32 -fPIC".

Below you can see the code generated by the JIT on different platforms. On 64-bit Linux the 'call' instruction uses a completely wrong address. This invalid address appears to be related to the address of the call instruction itself (i.e. it is always a 'close' address). To my knowledge the call instruction's address argument is a relative address, encoded as a signed 32-bit displacement, and thus the target needs to be within 2^31 bytes of the instruction. It looks like the code generator should generate a call through a function pointer in this situation and fails to handle this.

Am I doing something wrong in my code or is this an LLVM bug?

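
A side note on the arithmetic, as a hedged standalone C++ sketch rather than anything taken from LLVM: the x86-64 near call (opcode E8) encodes a signed 32-bit displacement relative to the address of the next instruction. The function names fitsInRel32 and truncatedCallTarget and the constant kCallLength are invented for this example. Feeding in the addresses from the Linux 64-bit dump in this message shows that the target is out of reach, and that truncating the displacement to 32 bits yields exactly the bogus address gdb prints. That is consistent with a truncated relative call, though it does not by itself prove what the JIT did.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Length in bytes of the x86 near call "E8 rel32" (1 opcode + 4 displacement).
constexpr std::uint64_t kCallLength = 5;

// True if a direct call located at callSite can reach target using a signed
// 32-bit displacement measured from the end of the call instruction.
inline bool fitsInRel32(std::uint64_t callSite, std::uint64_t target) {
  const std::uint64_t next = callSite + kCallLength;
  const std::int64_t disp = static_cast<std::int64_t>(target - next);
  return disp >= std::numeric_limits<std::int32_t>::min() &&
         disp <= std::numeric_limits<std::int32_t>::max();
}

// Address the CPU would really transfer to if the 64-bit displacement were
// silently truncated to its low 32 bits and then sign-extended again.
inline std::uint64_t truncatedCallTarget(std::uint64_t callSite,
                                         std::uint64_t target) {
  const std::uint64_t next = callSite + kCallLength;
  const std::uint32_t low = static_cast<std::uint32_t>(target - next);
  const std::int64_t disp = static_cast<std::int32_t>(low); // sign-extend
  return next + static_cast<std::uint64_t>(disp);
}
```

For the JIT-compiled call at 0x2b7184072039 with addone at 0x406018, fitsInRel32 returns false and truncatedCallTarget returns 0x2b7200406018. For the natively compiled call at 0x406043 the displacement is only -0x30, so the direct call is fine.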
Jan

Linux 64-bit:
(gdb) print addone_addr
$1 = (void *) 0x406018
(gdb) x/10i foo_addr
0x2b7184072030: sub $0x8,%rsp
0x2b7184072034: mov $0x14,%edi
0x2b7184072039: callq 0x2b7200406018 <--- absolutely not ok
0x2b718407203e: add $0x8,%rsp
0x2b7184072042: retq
(gdb) x/10i nfoo_addr
0x40603a: push %rbp
0x40603b: mov %rsp,%rbp
0x40603e: mov $0x1e,%edi
0x406043: callq 0x406018 <--- ok
0x406048: leaveq
0x406049: retq

OS X 64-bit:
(gdb) print addone_addr
$1 = (void *) 0x100000d56
(gdb) x/10i foo_addr
0x102080030: sub $0x8,%rsp
0x102080034: mov $0x14,%edi
0x102080039: callq 0x100000d56 <--- ok
0x10208003e: add $0x8,%rsp
0x102080042: retq
(gdb) x/10i nfoo_addr
0x100000d78: push %rbp
0x100000d79: mov %rsp,%rbp
0x100000d7c: mov $0x1e,%edi
0x100000d81: callq 0x100000d56 <--- ok
0x100000d86: leaveq
0x100000d87: retq

Linux 32-bit:
(gdb) print addone_addr
$1 = (void *) 0x805aa74
(gdb) x/10i foo_addr
0xc26020: sub $0x4,%esp
0xc26023: movl $0x14,(%esp)
0xc2602a: call 0x805aa74 <--- ok
0xc2602f: add $0x4,%esp
0xc26032: ret
(gdb) x/10i nfoo_addr
0x805aa8c: push %ebp
0x805aa8d: mov %esp,%ebp
0x805aa8f: push $0x1e
0x805aa91: call 0x805aa74 <--- ok
0x805aa96: add $0x4,%esp
0x805aa99: leave
0x805aa9a: ret

OS X 32-bit:
(gdb) print addone_addr
$1 = (void *) 0x1e62
(gdb) x/10i foo_addr
0x2080020: sub $0xc,%esp
0x2080023: movl $0x14,(%esp)
0x208002a: call 0x1e62 <--- ok
0x208002f: add $0xc,%esp
0x2080032: ret
(gdb) x/10i nfoo_addr
0x1e80: push %ebp
0x1e81: mov %esp,%ebp
0x1e83: sub $0x18,%esp
0x1e86: movl $0x1e,(%esp)
0x1e8d: call 0x1e62 <--- ok
0x1e92: leave
0x1e93: ret

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090216/6bcebc16/attachment-0001.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvmtest.cpp Type: text/x-c++src Size: 5733 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090216/6bcebc16/attachment-0001.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: makefile Type: application/octet-stream Size: 881 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090216/6bcebc16/attachment-0001.obj From llvm at assumetheposition.nl Mon Feb 16 14:04:38 2009 From: llvm at assumetheposition.nl (Paul Melis) Date: Mon, 16 Feb 2009 21:04:38 +0100 Subject: [LLVMdev] LLVM C bindings In-Reply-To: <4e5e476b0902160924m6012bb46m6ff39e037041c242@mail.gmail.com> References: <4e5e476b0902160924m6012bb46m6ff39e037041c242@mail.gmail.com> Message-ID: <4999C6D6.9010306@assumetheposition.nl> Corrado Zoccolo wrote: > Unfortunately, it seems that llvm-c interface lacks major > functionalities, e.g. getting a pointer to a jit-ted function. > It is easy to write a small c++ wrapper to expose the functionality > that one wants (llvm-py does it for the many optimization passes not > available through c binding), but I wonder if it is possible to have a > simpler & cleaner solution, i.e. automatically generating C bindings > from c++ classes. > SWIG (www.swig.org) recently added a C output mode, that is capable of generating a C API for a C++ one. It was a Summer of Code project, so I'm not sure how mature it is. The docs are here: http://swig.svn.sourceforge.net/viewvc/swig/branches/gsoc2008-maciekd/Doc/Manual/C.html Regards, Paul From ofv at wanadoo.es Mon Feb 16 15:03:41 2009 From: ofv at wanadoo.es (=?windows-1252?Q?=D3scar_Fuentes?=) Date: Mon, 16 Feb 2009 22:03:41 +0100 Subject: [LLVMdev] Invalid call generated on 64-bit linux when calling native C function from IR References: <73cfec680902161158m9706f12t8333309aa8999a60@mail.gmail.com> Message-ID: Jan Rehders writes: [snip > I am currently using EE->addGlobalMapping to register the native functions. 
> This appears to be nessecary because the native functions will be part of a > .so/.dylib so the jit will not find them using dlsym on linux and we would > prefer to stripp all symbols. > > I'm using LLVM 2.4, which I compiled using EXTRA_OPTIONS="-m64/-m32 -fPIC". Seems related to this, which last time I checked was fixed on svn: http://llvm.org/bugs/show_bug.cgi?id=2920 [snip] -- Oscar From jon at ffconsultancy.com Mon Feb 16 15:18:48 2009 From: jon at ffconsultancy.com (Jon Harrop) Date: Mon, 16 Feb 2009 21:18:48 +0000 Subject: [LLVMdev] LLVM C bindings In-Reply-To: <4999C6D6.9010306@assumetheposition.nl> References: <4e5e476b0902160924m6012bb46m6ff39e037041c242@mail.gmail.com> <4999C6D6.9010306@assumetheposition.nl> Message-ID: <200902162118.48335.jon@ffconsultancy.com> On Monday 16 February 2009 20:04:38 Paul Melis wrote: > Corrado Zoccolo wrote: > > Unfortunately, it seems that llvm-c interface lacks major > > functionalities, e.g. getting a pointer to a jit-ted function. Yes. I similarly found that tail calls, sret and parts of first-class structs are not usable from OCaml and much of the functionality was not implemented in the C API in LLVM 2.4. > SWIG (www.swig.org) recently added a C output mode, that is capable of > generating a C API for a C++ one. > It was a Summer of Code project, so I'm not sure how mature it is. The > docs are here: > http://swig.svn.sourceforge.net/viewvc/swig/branches/gsoc2008-maciekd/Doc/M >anual/C.html The generation of this FFI code should certainly be automated. However, if the necessary tools are not yet stable perhaps it would be wise to consider looser bindings such as XML-RPC? I assume there are tools that can examine a C++ API from headers in order to create an XML-RPC server automatically? An XML-RPC API would be trivial to use and extend from languages like OCaml and Python and the interface code should not require any maintenance at all. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. 
http://www.ffconsultancy.com/?e

From alex.lavoro.propio at gmail.com Mon Feb 16 18:37:11 2009
From: alex.lavoro.propio at gmail.com (Alex)
Date: Mon, 16 Feb 2009 16:37:11 -0800 (PST)
Subject: [LLVMdev] #ifdef in TableGen
In-Reply-To:
References: <4d77c5f20902130109u789f66e7lc5750e0da1e7e255@mail.gmail.com>
Message-ID: <22048862.post@talk.nabble.com>

Chris Lattner-2 wrote:
>
> Not as such, what are you trying to do?
>

For example, sometimes I want some patterns to be matched by one machine instruction and sometimes by two or more. I am doing the development for an experimental compiler and nothing is settled yet. I have source code in the backend that looks for some specific opcodes, and it is conditionally compiled (controlled by #ifdef..#endif).

--
View this message in context: http://www.nabble.com/-ifdef-in-TableGen-tp21992728p22048862.html
Sent from the LLVM - Dev mailing list archive at Nabble.com.

From clattner at apple.com Mon Feb 16 18:52:25 2009
From: clattner at apple.com (Chris Lattner)
Date: Mon, 16 Feb 2009 16:52:25 -0800
Subject: [LLVMdev] #ifdef in TableGen
In-Reply-To: <22048862.post@talk.nabble.com>
References: <4d77c5f20902130109u789f66e7lc5750e0da1e7e255@mail.gmail.com> <22048862.post@talk.nabble.com>
Message-ID:

On Feb 16, 2009, at 4:37 PM, [Alex] wrote:

>
>
> Chris Lattner-2 wrote:
>>
>> Not as such, what are you trying to do?
>>
>
> For example, sometimes I want some patterns to be matched by one machine
> instruction and sometimes two or more. I am doing the development for an
> experimental compiler and nothing is sure. I have source code in the backend
> looking for some specific opcode and they are conditionally compiled
> (controlled by #ifdef..#endif).

Your best bet, then, is to run cpp over the .td files before you run them through tblgen. For mainline development, we don't want to support this.

-Chris

From zhangzhengjian at gmail.com Mon Feb 16 19:21:55 2009
From: zhangzhengjian at gmail.com (zhengjian zhang)
Date: Tue, 17 Feb 2009 09:21:55 +0800
Subject: [LLVMdev] sjlj-exceptions handlying
Message-ID: <8e3538210902161721t480acc63wc9e546d1e6dc15dd@mail.gmail.com>

The LLVM backend doesn't support sjlj-exception handling, but it does support DWARF exception handling. My question is: if I want to change the LLVM backend to support it, how should I do that? Can anyone give me some clues?

Best regards,
zhangzw

From kotha.aparna at gmail.com Mon Feb 16 20:12:17 2009
From: kotha.aparna at gmail.com (aparna kotha)
Date: Mon, 16 Feb 2009 21:12:17 -0500
Subject: [LLVMdev] FP128Ty
Message-ID: <326a2f490902161812p6a2cb3cfy297e768276818ca0@mail.gmail.com>

I am new to LLVM and am stuck on a problem. I am trying to initialize a Value* of type fp128 having the value 0.

I am using the following construct:

ConstantFP::get(APFloat(APInt(128,0,false)));

This is returning a double instead of a float and I am confused.

Thanks a lot for your help.

--
-- Aparna
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090216/7e7b438f/attachment.html

From clattner at apple.com Mon Feb 16 20:36:12 2009
From: clattner at apple.com (Chris Lattner)
Date: Mon, 16 Feb 2009 18:36:12 -0800
Subject: [LLVMdev] FP128Ty
In-Reply-To: <326a2f490902161812p6a2cb3cfy297e768276818ca0@mail.gmail.com>
References: <326a2f490902161812p6a2cb3cfy297e768276818ca0@mail.gmail.com>
Message-ID:

On Feb 16, 2009, at 6:12 PM, aparna kotha wrote:

> I am new to llvm and am stuck up with a problem.
> I am trying to initialize a Value* of type fp128 having the value 0
>
> I am using the following construct
>
> ConstantFP::get(APFloat(APInt(128,0,false)));
>
> This is returning a double instead of a float and I am confused.
>
> Thanks a lot for your help.

FP128Ty is stubbed out, but completely untested and not supported by any targets yet.

-Chris From raad_7007 at yahoo.com Tue Feb 17 02:46:13 2009 From: raad_7007 at yahoo.com (RAAD B) Date: Tue, 17 Feb 2009 00:46:13 -0800 (PST) Subject: [LLVMdev] information-transfer between analysis-pases References: <326a2f490902161812p6a2cb3cfy297e768276818ca0@mail.gmail.com> Message-ID: <505270.88665.qm@web33106.mail.mud.yahoo.com> Hello together, I have seen that the analysis-results are stored in llvm-IR as annotations. For example <; preds = %entry> in basicBlock level and <; [#uses=2]> for a variable. Is there any documentation about annotations? Regards Raad -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090217/28764018/attachment.html From baldrick at free.fr Tue Feb 17 02:50:29 2009 From: baldrick at free.fr (Duncan Sands) Date: Tue, 17 Feb 2009 09:50:29 +0100 Subject: [LLVMdev] sjlj-exceptions handlying In-Reply-To: <8e3538210902161721t480acc63wc9e546d1e6dc15dd@mail.gmail.com> References: <8e3538210902161721t480acc63wc9e546d1e6dc15dd@mail.gmail.com> Message-ID: <200902170950.29978.baldrick@free.fr> On Tuesday 17 February 2009 02:21:55 zhengjian zhang wrote: > in llvm-backend did't support sjlj-exceptions handlying,but support > dwarf-excceptions handlying, > my question is: if i want change llvm-backend to support, how should i Do ? > anyone can give some clue? It's hard to say - I'm not sure anyone here knows how gcc handles sj/lj style exceptions, or has a good idea of what would be involved to get LLVM support. There's some documentation in gcc, see gcc/ada/raise-gcc.c, line 227 and onwards. Ciao, Duncan. 
From eli.friedman at gmail.com Tue Feb 17 02:52:58 2009 From: eli.friedman at gmail.com (Eli Friedman) Date: Tue, 17 Feb 2009 00:52:58 -0800 Subject: [LLVMdev] information-transfer between analysis-pases In-Reply-To: <505270.88665.qm@web33106.mail.mud.yahoo.com> References: <326a2f490902161812p6a2cb3cfy297e768276818ca0@mail.gmail.com> <505270.88665.qm@web33106.mail.mud.yahoo.com> Message-ID: On Tue, Feb 17, 2009 at 12:46 AM, RAAD B wrote: > Hello together, > > I have seen that the analysis-results are stored in llvm-IR as annotations. > For example <; preds = %entry> in basicBlock level and <; [#uses=2]> > for a variable. > > Is there any documentation about annotations? That isn't really analysis; it's just some easy-to-extract information the disassembler uses to make the output more readable. Those notes are treated as comments by the assembler. -Eli From jon at ffconsultancy.com Tue Feb 17 03:07:46 2009 From: jon at ffconsultancy.com (Jon Harrop) Date: Tue, 17 Feb 2009 09:07:46 +0000 Subject: [LLVMdev] Pure external functions Message-ID: <200902170907.46651.jon@ffconsultancy.com> Lennart Augustsson mentioned on his blog that he got substantial performance improvements by conveying to LLVM when external functions (e.g. tanh) were pure. How is this done? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From raad_7007 at yahoo.com Tue Feb 17 03:05:06 2009 From: raad_7007 at yahoo.com (RAAD B) Date: Tue, 17 Feb 2009 01:05:06 -0800 (PST) Subject: [LLVMdev] information-transfer between analysis-pases References: <326a2f490902161812p6a2cb3cfy297e768276818ca0@mail.gmail.com> <505270.88665.qm@web33106.mail.mud.yahoo.com> Message-ID: <446545.2634.qm@web33101.mail.mud.yahoo.com> Ok, but what about a pass like alias-analysis. How are the results stored, so that other passes can use them. 
-Raad ________________________________ From: Eli Friedman To: LLVM Developers Mailing List Sent: Tuesday, February 17, 2009 9:52:58 AM Subject: Re: [LLVMdev] information-transfer between analysis-pases On Tue, Feb 17, 2009 at 12:46 AM, RAAD B wrote: > Hello together, > > I have seen that the analysis-results are stored in llvm-IR as annotations. > For example <; preds = %entry> in basicBlock level and <; [#uses=2]> > for a variable. > > Is there any documentation about annotations? That isn't really analysis; it's just some easy-to-extract information the disassembler uses to make the output more readable. Those notes are treated as comments by the assembler. -Eli _______________________________________________ LLVM Developers mailing list LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090217/6d012fc6/attachment.html From baldrick at free.fr Tue Feb 17 03:10:07 2009 From: baldrick at free.fr (Duncan Sands) Date: Tue, 17 Feb 2009 10:10:07 +0100 Subject: [LLVMdev] information-transfer between analysis-pases In-Reply-To: <505270.88665.qm@web33106.mail.mud.yahoo.com> References: <326a2f490902161812p6a2cb3cfy297e768276818ca0@mail.gmail.com> <505270.88665.qm@web33106.mail.mud.yahoo.com> Message-ID: <200902171010.07518.baldrick@free.fr> Hi, > I have seen that the analysis-results are stored in llvm-IR as annotations. no they are not. > For example <; preds = %entry> in basicBlock level and <; [#uses=2]> for a variable. > > Is there any documentation about annotations? These are just comments - you can delete them and it won't matter. Ciao, Duncan. 
From baldrick at free.fr Tue Feb 17 03:46:07 2009 From: baldrick at free.fr (Duncan Sands) Date: Tue, 17 Feb 2009 10:46:07 +0100 Subject: [LLVMdev] Pure external functions In-Reply-To: <200902170907.46651.jon@ffconsultancy.com> References: <200902170907.46651.jon@ffconsultancy.com> Message-ID: <200902171046.07734.baldrick@free.fr> Hi, > Lennart Augustsson mentioned on his blog that he got substantial performance > improvements by conveying to LLVM when external functions (e.g. tanh) were > pure. first note that tanh is not pure, because the result depends on the current floating point rounding mode. However, if you are willing to sacrifice complete numerical correctness, you can give llvm-gcc the -ffast-math flag and, voila!, tanh becomes pure. Ciao, Duncan. From jon at ffconsultancy.com Tue Feb 17 04:37:28 2009 From: jon at ffconsultancy.com (Jon Harrop) Date: Tue, 17 Feb 2009 10:37:28 +0000 Subject: [LLVMdev] Pure external functions In-Reply-To: <200902171046.07734.baldrick@free.fr> References: <200902170907.46651.jon@ffconsultancy.com> <200902171046.07734.baldrick@free.fr> Message-ID: <200902171037.28653.jon@ffconsultancy.com> On Tuesday 17 February 2009 09:46:07 Duncan Sands wrote: > Hi, > > > Lennart Augustsson mentioned on his blog that he got substantial > > performance improvements by conveying to LLVM when external functions > > (e.g. tanh) were pure. > > first note that tanh is not pure, because the result depends on the current > floating point rounding mode. Ugh. > However, if you are willing to sacrifice > complete numerical correctness, you can give llvm-gcc the -ffast-math flag > and, voila!, tanh becomes pure. How do you do the equivalent from the JIT? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. 
http://www.ffconsultancy.com/?e From jay.foad at gmail.com Tue Feb 17 04:59:19 2009 From: jay.foad at gmail.com (Jay Foad) Date: Tue, 17 Feb 2009 10:59:19 +0000 Subject: [LLVMdev] loop passes vs call graph In-Reply-To: <64016EBB-2FD9-460B-9DEC-DF6FAC85D44B@apple.com> References: <200902131439.48939.baldrick@free.fr> <64016EBB-2FD9-460B-9DEC-DF6FAC85D44B@apple.com> Message-ID: > This will defeat the goal of applying loop transformations before > inlining leaf functions. Note, Loop transformations are not aware of > call graph. They do not claim to preserve call graph. However, loop > passes are run by a loop pass manager (LPPassManager) which is itself > a function pass. The pass manager is not doing the right thing here > because LPPassManager is incorrectly claiming to preserve call graph. > The right approach is to teach LPPassManager to really preserve call > graph. I've raised a bug to track this issue: http://llvm.org/bugs/show_bug.cgi?id=3601 Thanks, Jay. From baldrick at free.fr Tue Feb 17 05:08:08 2009 From: baldrick at free.fr (Duncan Sands) Date: Tue, 17 Feb 2009 12:08:08 +0100 Subject: [LLVMdev] Pure external functions In-Reply-To: <200902171037.28653.jon@ffconsultancy.com> References: <200902170907.46651.jon@ffconsultancy.com> <200902171046.07734.baldrick@free.fr> <200902171037.28653.jon@ffconsultancy.com> Message-ID: <200902171208.08157.baldrick@free.fr> > How do you do the equivalent from the JIT? In the IR, give tanh the readnone attribute. If you want it to pay attention to the floating point rounding mode, give it the readonly attribute. Best performance, ignores rounding mode: declare double @tanh(double) nounwind readnone Pays attention to rounding mode; optimizers can still do something though: declare double @tanh(double) nounwind readonly Ciao, Duncan. 
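For code that is compiled from C or C++ rather than built through the JIT, a roughly analogous hint can be written in the source. The sketch below assumes a GCC-compatible compiler. The attribute spellings are GCC's; the wrapper names tanh_const, tanh_pure and twice_tanh are made up for illustration; and the mapping of "const" to readnone and "pure" to readonly is how such front ends are generally understood to behave, not something stated in this thread.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical wrappers (names invented for this sketch).
// 'const': the result depends only on the argument; no memory is read or
// written. Roughly the source-level counterpart of the IR attribute readnone.
__attribute__((const)) double tanh_const(double x);

// 'pure': the function may read global state (for example the rounding mode)
// but has no side effects. Roughly the counterpart of readonly.
__attribute__((pure)) double tanh_pure(double x);

double tanh_const(double x) { return std::tanh(x); }
double tanh_pure(double x) { return std::tanh(x); }

// With the 'const' promise the optimizer is allowed to merge the two
// identical calls into one; without it, both calls have to be emitted.
double twice_tanh(double x) { return tanh_const(x) + tanh_const(x); }
```

As in the IR case, the stronger promise is only safe if the program never changes the floating point rounding mode around these calls.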
From lennart at augustsson.net Tue Feb 17 05:12:12 2009 From: lennart at augustsson.net (Lennart Augustsson) Date: Tue, 17 Feb 2009 11:12:12 +0000 Subject: [LLVMdev] Pure external functions In-Reply-To: <200902171037.28653.jon@ffconsultancy.com> References: <200902170907.46651.jon@ffconsultancy.com> <200902171046.07734.baldrick@free.fr> <200902171037.28653.jon@ffconsultancy.com> Message-ID: You need to set the readnone attribute. I set it on the call instruction. -- Lennart On Tue, Feb 17, 2009 at 10:37 AM, Jon Harrop wrote: > On Tuesday 17 February 2009 09:46:07 Duncan Sands wrote: >> Hi, >> >> > Lennart Augustsson mentioned on his blog that he got substantial >> > performance improvements by conveying to LLVM when external functions >> > (e.g. tanh) were pure. >> >> first note that tanh is not pure, because the result depends on the current >> floating point rounding mode. > > Ugh. > >> However, if you are willing to sacrifice >> complete numerical correctness, you can give llvm-gcc the -ffast-math flag >> and, voila!, tanh becomes pure. > > How do you do the equivalent from the JIT? > > -- > Dr Jon Harrop, Flying Frog Consultancy Ltd. > http://www.ffconsultancy.com/?e > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From criswell at cs.uiuc.edu Tue Feb 17 09:05:02 2009 From: criswell at cs.uiuc.edu (John Criswell) Date: Tue, 17 Feb 2009 09:05:02 -0600 Subject: [LLVMdev] information-transfer between analysis-pases In-Reply-To: <446545.2634.qm@web33101.mail.mud.yahoo.com> References: <326a2f490902161812p6a2cb3cfy297e768276818ca0@mail.gmail.com> <505270.88665.qm@web33106.mail.mud.yahoo.com> <446545.2634.qm@web33101.mail.mud.yahoo.com> Message-ID: <499AD21E.7010004@cs.uiuc.edu> RAAD B wrote: > Ok, but what about a pass like alias-analysis. > How are the results stored, so that other passes can use them. 
> Currently, most/all analysis information is stored in memory; the analysis passes do not encode their results into the LLVM IR. There is a new annotation intrinsic for encoding arbitrary information in the IR, but to the best of my knowledge, no analysis pass uses it to encode its results directly in the LLVM IR. Instead, during execution of an LLVM-based tool (e.g. the opt tool), a piece of code called the Pass Manager schedules analysis and transformation passes for execution. LLVM passes are C++ objects. When a transform pass needs analysis results, it asks the Pass Manager for a pointer to the C++ object for the corresponding analysis pass; it then calls methods on this pointer to query the information it requires. So, all analysis information is passed around in memory between passes. For more information, I'd suggest reading the "Writing an LLVM Pass" document at http://llvm.org/docs/WritingAnLLVMPass.html. -- John T. > -Raad > > ________________________________ > From: Eli Friedman > To: LLVM Developers Mailing List > Sent: Tuesday, February 17, 2009 9:52:58 AM > Subject: Re: [LLVMdev] information-transfer between analysis-pases > > On Tue, Feb 17, 2009 at 12:46 AM, RAAD B > wrote: > >> Hello together, >> >> I have seen that the analysis-results are stored in llvm-IR as annotations. >> For example <; preds = %entry> in basicBlock level and <; [#uses=2]> >> for a variable. >> >> Is there any documentation about annotations? >> > > That isn't really analysis; it's just some easy-to-extract information > the disassembler uses to make the output more readable. Those notes > are treated as comments by the assembler. 
> > -Eli > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > From zapster at zapster.cc Tue Feb 17 09:36:31 2009 From: zapster at zapster.cc (Josef Eisl) Date: Tue, 17 Feb 2009 16:36:31 +0100 Subject: [LLVMdev] Function Attributes in LLVM Message-ID: <499AD97F.9060705@zapster.cc> Hello, I was wondering if there is a way to add more, maybe target-dependent, function attributes? I think in certain circumstances they are a good way to give the compiler more information about a function. For example, GCC supports attributes to mark an interrupt function, which is very useful for some low-level targets. As far as I know, function attributes are GCC-specific, or am I wrong? Is there a Standard-C way to add this kind of meta information to a function? Now the LLVM-specific questions :): Is there currently a way that special function attributes are passed to the back end? A solution would be to add custom function attributes to the LLVM IR, but changing a core system in order to support a feature of a new device seems not to be the best way IMO. Additionally, the front end must be changed too. I think it would be a pretty nice feature if a target could specify special function attributes, or am I totally missing the point? Thanks in advance! BR Josef From mrs at apple.com Tue Feb 17 10:30:51 2009 From: mrs at apple.com (Mike Stump) Date: Tue, 17 Feb 2009 08:30:51 -0800 Subject: [LLVMdev] sjlj-exceptions handlying In-Reply-To: <200902170950.29978.baldrick@free.fr> References: <8e3538210902161721t480acc63wc9e546d1e6dc15dd@mail.gmail.com> <200902170950.29978.baldrick@free.fr> Message-ID: On Feb 17, 2009, at 12:50 AM, Duncan Sands wrote: > I'm not sure anyone here knows how gcc handles sj/lj style exceptions, Or, reworded slightly, some people here wrote it.
:-) From anton at korobeynikov.info Tue Feb 17 10:35:01 2009 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Tue, 17 Feb 2009 19:35:01 +0300 Subject: [LLVMdev] Function Attributes in LLVM In-Reply-To: <499AD97F.9060705@zapster.cc> References: <499AD97F.9060705@zapster.cc> Message-ID: <4FB46C3B-8D4A-4DAE-97CB-025D91C0233D@korobeynikov.info> Hello, Josef > useful for some low level targets. As far as I know function > attributes > are GCC specific or am I wrong? That's correct > Is there a Standard-C way to add this > kind of meta information to a function? Well... You can store function pointers into some array and add any extra information your like. > I think it would be a pretty nice feature if a target could specify > special function attributes or am I totally missing the point? Look for annotation attribute. --- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From simmon12 at illinois.edu Tue Feb 17 10:57:53 2009 From: simmon12 at illinois.edu (Patrick Simmons) Date: Tue, 17 Feb 2009 10:57:53 -0600 Subject: [LLVMdev] InstCount Message-ID: <499AEC91.5070706@illinois.edu> Hello, I'm trying to print the instruction count of a bytecode file using the 2.4 release of LLVM. I found llvm-2.4/lib/Analysis/Instcount.cpp but I'm not sure what to do with it. I also looked at llvm-bcanalyzer. The documentation says this command is supposed to print the instruction count in the summary, but it doesn't seem to be doing so. Does anyone know what I should be doing? 
--Patrick From baldrick at free.fr Tue Feb 17 10:59:43 2009 From: baldrick at free.fr (Duncan Sands) Date: Tue, 17 Feb 2009 17:59:43 +0100 Subject: [LLVMdev] sjlj-exceptions handlying In-Reply-To: References: <8e3538210902161721t480acc63wc9e546d1e6dc15dd@mail.gmail.com> <200902170950.29978.baldrick@free.fr> Message-ID: <200902171759.43476.baldrick@free.fr> > > I'm not sure anyone here knows how gcc handles sj/lj style exceptions, > > Or, reworded slightly, some people here wrote it. :-) Excellent! To handle dwarf eh, LLVM has an intrinsic to get hold of an exception object (eh.exception) and an intrinsic for matching the exception against a list of typeinfo objects (eh.selector). These get morphed into calls to the gcc unwinder lib by the code generator. Can sj/lj follow a similar scheme? Thanks, Duncan. From mrs at apple.com Tue Feb 17 11:04:43 2009 From: mrs at apple.com (Mike Stump) Date: Tue, 17 Feb 2009 09:04:43 -0800 Subject: [LLVMdev] sjlj-exceptions handlying In-Reply-To: <200902171759.43476.baldrick@free.fr> References: <8e3538210902161721t480acc63wc9e546d1e6dc15dd@mail.gmail.com> <200902170950.29978.baldrick@free.fr> <200902171759.43476.baldrick@free.fr> Message-ID: On Feb 17, 2009, at 8:59 AM, Duncan Sands wrote: > Excellent! To handle dwarf eh, LLVM has an intrinsic to get hold of > an exception object (eh.exception) and an intrinsic for matching the > exception against a list of typeinfo objects (eh.selector). These > get morphed into calls to the gcc unwinder lib by the code generator. > Can sj/lj follow a similar scheme? Don't see why not, though, these aren't sufficient. 
From criswell at cs.uiuc.edu Tue Feb 17 11:05:04 2009 From: criswell at cs.uiuc.edu (John Criswell) Date: Tue, 17 Feb 2009 11:05:04 -0600 Subject: [LLVMdev] InstCount In-Reply-To: <499AEC91.5070706@illinois.edu> References: <499AEC91.5070706@illinois.edu> Message-ID: <499AEE40.1010206@cs.uiuc.edu> Patrick Simmons wrote: > Hello, > > I'm trying to print the instruction count of a bytecode file using the > 2.4 release of LLVM. I found llvm-2.4/lib/Analysis/Instcount.cpp but > I'm not sure what to do with it. I also looked at llvm-bcanalyzer. The > documentation says this command is supposed to print the instruction > count in the summary, but it doesn't seem to be doing so. > > Does anyone know what I should be doing? > Looking at the LLVM 2.5 version, it prints out the count of the number of instructions as part of its statistics. To run it on a bitcode file, you should be able to do the following: opt -stats -analyze -instcount The -stats option tells opt to print statistics collected by each pass. The -analyze option tells opt that it is only doing analysis and not any transformation. The -instcount is the command line option to run the code in InstCount.cpp (notice the RegisterPass line in the source code; this assigns a command line option to the pass for use in opt). -- John T. 
> --Patrick > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From edwintorok at gmail.com Tue Feb 17 11:16:49 2009 From: edwintorok at gmail.com (=?ISO-8859-1?Q?T=F6r=F6k_Edwin?=) Date: Tue, 17 Feb 2009 19:16:49 +0200 Subject: [LLVMdev] sjlj-exceptions handlying In-Reply-To: References: <8e3538210902161721t480acc63wc9e546d1e6dc15dd@mail.gmail.com> <200902170950.29978.baldrick@free.fr> <200902171759.43476.baldrick@free.fr> Message-ID: <499AF101.7020900@gmail.com> On 2009-02-17 19:04, Mike Stump wrote: > On Feb 17, 2009, at 8:59 AM, Duncan Sands wrote: > >> Excellent! To handle dwarf eh, LLVM has an intrinsic to get hold of >> an exception object (eh.exception) and an intrinsic for matching the >> exception against a list of typeinfo objects (eh.selector). These >> get morphed into calls to the gcc unwinder lib by the code generator. >> Can sj/lj follow a similar scheme? >> > > Don't see why not, though, these aren't sufficient. What about the -lowerinvoke pass? Is it incomplete? Best regards, --Edwin From simmon12 at illinois.edu Tue Feb 17 11:23:52 2009 From: simmon12 at illinois.edu (Patrick Simmons) Date: Tue, 17 Feb 2009 11:23:52 -0600 Subject: [LLVMdev] InstCount In-Reply-To: <499AEE40.1010206@cs.uiuc.edu> References: <499AEC91.5070706@illinois.edu> <499AEE40.1010206@cs.uiuc.edu> Message-ID: <499AF2A8.7040800@illinois.edu> John Criswell wrote: > Patrick Simmons wrote: > >> Hello, >> >> I'm trying to print the instruction count of a bytecode file using the >> 2.4 release of LLVM. I found llvm-2.4/lib/Analysis/Instcount.cpp but >> I'm not sure what to do with it. I also looked at llvm-bcanalyzer. The >> documentation says this command is supposed to print the instruction >> count in the summary, but it doesn't seem to be doing so. >> >> Does anyone know what I should be doing? 
>> >> > Looking at the LLVM 2.5 version, it prints out the count of the number > of instructions as part of its statistics. > > To run it on a bitcode file, you should be able to do the following: > > opt -stats -analyze -instcount > > The -stats option tells opt to print statistics collected by each pass. > The -analyze option tells opt that it is only doing analysis and not any > transformation. The -instcount is the command line option to run the > code in InstCount.cpp (notice the RegisterPass line in the source code; > this assigns a command line option to the pass for use in opt). > > -- John T. > > It works; thanks! --Patrick From npjohnso at cs.princeton.edu Tue Feb 17 11:26:16 2009 From: npjohnso at cs.princeton.edu (Nick Johnson) Date: Tue, 17 Feb 2009 12:26:16 -0500 Subject: [LLVMdev] Function Attributes in LLVM In-Reply-To: <4FB46C3B-8D4A-4DAE-97CB-025D91C0233D@korobeynikov.info> References: <499AD97F.9060705@zapster.cc> <4FB46C3B-8D4A-4DAE-97CB-025D91C0233D@korobeynikov.info> Message-ID: I too have been seeking something like this. To the best of my understanding, the annotation attributes let you attach only text attributes, and only at the function scope, but not to individual basic blocks or instructions. There are intrinsic functions for the purposes of annotating values, but these cannot be used to annotate branch or store instructions (as they produce no value). Additionally, these are only overloaded on integer types, and so one cannot annotate a floating point or aggregate value. And although one can easily use an auxiliary data structure to hold additional attributes for basic blocks or instructions, it would be ideal if such annotations could be de/serialized to bitcode files. Your input is appreciated, Nick On Tue, Feb 17, 2009 at 11:35 AM, Anton Korobeynikov wrote: > Hello, Josef > >> useful for some low level targets. As far as I know function >> attributes >> are GCC specific or am I wrong?
> That's correct > >> Is there a Standard-C way to add this >> kind of meta information to a function? > Well... You can store function pointers into some array and add any > extra information your like. > >> I think it would be a pretty nice feature if a target could specify >> special function attributes or am I totally missing the point? > Look for annotation attribute. > > --- > With best regards, Anton Korobeynikov > Faculty of Mathematics and Mechanics, Saint Petersburg State University > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > -- Nick Johnson From baldrick at free.fr Tue Feb 17 11:52:09 2009 From: baldrick at free.fr (Duncan Sands) Date: Tue, 17 Feb 2009 18:52:09 +0100 Subject: [LLVMdev] sjlj-exceptions handlying In-Reply-To: References: <8e3538210902161721t480acc63wc9e546d1e6dc15dd@mail.gmail.com> <200902171759.43476.baldrick@free.fr> Message-ID: <200902171852.09513.baldrick@free.fr> On Tuesday 17 February 2009 18:04:43 Mike Stump wrote: > On Feb 17, 2009, at 8:59 AM, Duncan Sands wrote: > > Excellent! To handle dwarf eh, LLVM has an intrinsic to get hold of > > an exception object (eh.exception) and an intrinsic for matching the > > exception against a list of typeinfo objects (eh.selector). These > > get morphed into calls to the gcc unwinder lib by the code generator. > > Can sj/lj follow a similar scheme? > > Don't see why not, though, these aren't sufficient. What else is needed? Want to give a quick rundown on sj/lj eh and how it differs from dwarf? Ciao, Duncan. 
From dpatel at apple.com Tue Feb 17 12:35:56 2009 From: dpatel at apple.com (Devang Patel) Date: Tue, 17 Feb 2009 10:35:56 -0800 Subject: [LLVMdev] Function Attributes in LLVM In-Reply-To: <499AD97F.9060705@zapster.cc> References: <499AD97F.9060705@zapster.cc> Message-ID: On Feb 17, 2009, at 7:36 AM, Josef Eisl wrote: > Now the LLVM specific questions :): Is there currently a way that > special function attributes are passed to the back end? One such example is "optsize". See http://llvm.org/docs/LangRef.html#fnattrs > A solution would > be to add custom function attributes to the LLVM IR but changing a > core > system in order to support a feature of a new device seems not to be > the > best way IMO. Additionally the front end must be changed too. > I think it would be a pretty nice feature if a target could specify > special function attributes or am I totally missing the point? - Devang From dalej at apple.com Tue Feb 17 12:56:59 2009 From: dalej at apple.com (Dale Johannesen) Date: Tue, 17 Feb 2009 10:56:59 -0800 Subject: [LLVMdev] FP128Ty In-Reply-To: References: <326a2f490902161812p6a2cb3cfy297e768276818ca0@mail.gmail.com> Message-ID: <9350549B-55E0-4E59-BFDB-B71081D74868@apple.com> On Feb 16, 2009, at 6:36 PMPST, Chris Lattner wrote: > > On Feb 16, 2009, at 6:12 PM, aparna kotha wrote: > >> I am new to llvm and am stuck up with a problem. >> I am trying to initialize a Value* of type fp128 having the value 0 >> >> I am using the following construct >> >> ConstantFP::get(APFloat(APInt(128,0,false))); >> >> This is returning a double instead of a float and I am confused. >> >> Thanks a lot for your help. > > FP128Ty is stubbed out, but completely untested and not supported by > any targets yet. More specifically, the compile-time arithmetic stuff *should* Just Work, although nobody has tried it; the conversions of FP128 to and from other formats are not done. They should not be difficult to do by copying & modifying existing code. 
To do specifically what you're trying to do, add the FP128 case to APFloat::initFromAPInt. The code that reads and writes the IR, and writes assembler output, also needs to be added. (The Float in APFloat means "floating point", not the C float type.) From dalej at apple.com Tue Feb 17 13:22:05 2009 From: dalej at apple.com (Dale Johannesen) Date: Tue, 17 Feb 2009 11:22:05 -0800 Subject: [LLVMdev] FP128Ty In-Reply-To: <9350549B-55E0-4E59-BFDB-B71081D74868@apple.com> References: <326a2f490902161812p6a2cb3cfy297e768276818ca0@mail.gmail.com> <9350549B-55E0-4E59-BFDB-B71081D74868@apple.com> Message-ID: On Feb 17, 2009, at 10:56 AMPST, Dale Johannesen wrote: > > On Feb 16, 2009, at 6:36 PMPST, Chris Lattner wrote: > >> >> On Feb 16, 2009, at 6:12 PM, aparna kotha wrote: >> >>> I am new to llvm and am stuck up with a problem. >>> I am trying to initialize a Value* of type fp128 having the value 0 >>> >>> I am using the following construct >>> >>> ConstantFP::get(APFloat(APInt(128,0,false))); Also, the 3rd parameter should be "true" for Fp128. This call is getting you PowerPC 128-bit format, which is two doubles stuck together; that's probably why it looks like a double to you. This format has lots of weird numeric properties, and you don't want to use it unless you're targeting PowerPC. >>> This is returning a double instead of a float and I am confused. >>> >>> Thanks a lot for your help. >> >> FP128Ty is stubbed out, but completely untested and not supported by >> any targets yet. > > More specifically, the compile-time arithmetic stuff *should* Just > Work, although nobody has tried it; the conversions of FP128 to and > from other formats are not done. They should not be difficult to do > by copying & modifying existing code. To do specifically what > you're trying to do, add the FP128 case to APFloat::initFromAPInt. > The code that reads and writes the IR, and writes assembler output, > also needs to be added. 
> > (The Float in APFloat means "floating point", not the C float type.) From mrs at apple.com Tue Feb 17 14:22:05 2009 From: mrs at apple.com (Mike Stump) Date: Tue, 17 Feb 2009 12:22:05 -0800 Subject: [LLVMdev] sjlj-exceptions handlying In-Reply-To: <200902171852.09513.baldrick@free.fr> References: <8e3538210902161721t480acc63wc9e546d1e6dc15dd@mail.gmail.com> <200902171759.43476.baldrick@free.fr> <200902171852.09513.baldrick@free.fr> Message-ID: On Feb 17, 2009, at 9:52 AM, Duncan Sands wrote: > On Tuesday 17 February 2009 18:04:43 Mike Stump wrote: >> On Feb 17, 2009, at 8:59 AM, Duncan Sands wrote: >>> Excellent! To handle dwarf eh, LLVM has an intrinsic to get hold of >>> an exception object (eh.exception) and an intrinsic for matching the >>> exception against a list of typeinfo objects (eh.selector). These >>> get morphed into calls to the gcc unwinder lib by the code >>> generator. >>> Can sj/lj follow a similar scheme? >> >> Don't see why not, though, these aren't sufficient. > > What else is needed? Want to give a quick rundown on sj/lj eh and > how it differs from dwarf? From a practical perspective, compile up g++.mike/eh6.C and see if it matches gcc codegen on a sjlj platform. From there, run the testsuite with eh\*.C and see if they all work. This should go a long way to pointing out deficiencies, if any. I've not been watching carefully how llvm-gcc is wired into llvm with respect to EH, to know if you guys are reusing gcc to do the codegen, or if these are being passed on down to llvm for codegen. clang of course has a slightly different answer here. From robert at muth.org Tue Feb 17 16:04:04 2009 From: robert at muth.org (robert muth) Date: Tue, 17 Feb 2009 17:04:04 -0500 Subject: [LLVMdev] ARM backend playing with alternative jump table implementations Message-ID: <8e3491100902171404v53c0c3dao495cdbd5890dbd10@mail.gmail.com> Hi list: I have been trying to get my feet wet with the ARM backend. 
As a warm-up exercise I wanted to be able to move jump tables, especially large ones, out of the code section.

Currently the idiom for jump tables looks like this:

  // .set PCRELV0, (.LJTI9_0_0-(.LPCRELL0+8))
  // .LPCRELL0:
  // add r3, pc, #PCRELV0
  // ldr pc, [r3, +r0, lsl #2]
  // .LJTI9_0_0:
  // .long .LBB9_2
  // .long .LBB9_5
  // .long .LBB9_7
  // .long .LBB9_4
  // .long .LBB9_8

I would like to be able to change this to something like:

  ldr r3, .POOL_ADDR
  ldr pc, [r3, +r0, lsl #2]
  .POOL_ADDR:
  .text .LJTI9_0_0:
  .data
  .LJTI9_0_0:
  .long .LBB9_2
  .long .LBB9_5
  .long .LBB9_7
  .long .LBB9_4
  .long .LBB9_8
  .text

The code for the lowering lives mostly in SDValue ARMTargetLowering::LowerBR_JT, with some more heavy lifting done by ARMISD::WrapperJT.

My attempts at this are marked in the code below. My problem is to come up with the right item/value to put into the constant pool.

SDValue ARMTargetLowering::LowerBR_JT(SDValue Op, SelectionDAG &DAG) {
  SDValue Chain = Op.getOperand(0);
  SDValue Table = Op.getOperand(1);
  SDValue Index = Op.getOperand(2);
  DebugLoc dl = Op.getDebugLoc();

  MVT PTy = getPointerTy();
  JumpTableSDNode *JT = cast<JumpTableSDNode>(Table);
  ARMFunctionInfo *AFI = DAG.getMachineFunction().getInfo<ARMFunctionInfo>();
  SDValue UId = DAG.getConstant(AFI->createJumpTableUId(), PTy);
  SDValue JTI = DAG.getTargetJumpTable(JT->getIndex(), PTy);

#if 1
  // @@ GET TABLE BASE: current code
  Table = DAG.getNode(ARMISD::WrapperJT, MVT::i32, JTI, UId);
#else
  // @ MY ATTEMPT AT MOVING THIS OUT
  ARMConstantPoolValue *CPV = new ARMConstantPoolValue("a_jump_table", 666);
  SDValue TableValue = DAG.getTargetConstantPool(CPV, PTy, 2);
  SDValue CPAddr = DAG.getNode(ARMISD::Wrapper, MVT::i32, TableValue);
  Table = DAG.getLoad(PTy, dl, DAG.getEntryNode(), CPAddr, NULL, 0);
#endif

  Index = DAG.getNode(ISD::MUL, dl, PTy, Index, DAG.getConstant(4, PTy));
  //Index = DAG.getNode(ISD::MUL, dl, PTy, TableAddress, DAG.getConstant(4, PTy));
  SDValue Addr = DAG.getNode(ISD::ADD, dl, PTy, Index, Table);

Any help would be greatly
appreciated. Robert -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090217/664f8ae8/attachment.html From scooter.phd at gmail.com Tue Feb 17 16:21:23 2009 From: scooter.phd at gmail.com (Scott Michel) Date: Tue, 17 Feb 2009 14:21:23 -0800 Subject: [LLVMdev] svn pre-commit hook: help needed Message-ID: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com> Anyone out there interested in helping out with a subversion pre-commit hook to: - remove trailing whitespace, - expand tabs to spaces, - detect 80-col violations, as well as detect other style guideline breakage? I just ran into the trailing whitespace problem: Eclipse and other editors like to trim excess whitespace from source. However, when one commits a patch with trailing whitespace removed, the extraneous diffs make reading the patch more difficult. Reply to me privately if you're interested in helping out. -scooter -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090217/07e158e5/attachment.html From mrs at apple.com Tue Feb 17 16:35:35 2009 From: mrs at apple.com (Mike Stump) Date: Tue, 17 Feb 2009 14:35:35 -0800 Subject: [LLVMdev] svn pre-commit hook: help needed In-Reply-To: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com> References: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com> Message-ID: On Feb 17, 2009, at 2:21 PM, Scott Michel wrote: > - remove trailing whitespace, > - expand tabs to spaces, I'd argue for not changing anything, just fail it. 
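A check-only hook along the lines suggested above can be sketched as follows. This is a minimal illustration, not an actual Subversion hook: the names Violation, checkLine and shouldRejectCommit are invented here, the 80-column test counts bytes rather than display columns, and obtaining the added lines of a transaction (for example via svnlook) is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Kinds of style problem mentioned in the thread.
enum class Violation { TrailingWhitespace, Tab, Over80Columns };

// Inspect one line (without its trailing newline) and report what is wrong
// with it. Nothing is rewritten; the caller only learns why to fail.
std::vector<Violation> checkLine(const std::string &line) {
  std::vector<Violation> found;
  if (!line.empty() && (line.back() == ' ' || line.back() == '\t'))
    found.push_back(Violation::TrailingWhitespace);
  if (line.find('\t') != std::string::npos)
    found.push_back(Violation::Tab);
  if (line.size() > 80) // bytes, which equals columns only for ASCII text
    found.push_back(Violation::Over80Columns);
  return found;
}

// A commit is rejected as soon as any added line has a violation.
bool shouldRejectCommit(const std::vector<std::string> &addedLines) {
  for (const std::string &line : addedLines)
    if (!checkLine(line).empty())
      return true;
  return false;
}
```

Test inputs that need trailing whitespace on purpose would have to be exempted by the caller, for instance by skipping files under a test directory; that policy is not modelled here.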
From scooter.phd at gmail.com Tue Feb 17 16:46:46 2009
From: scooter.phd at gmail.com (Scott Michel)
Date: Tue, 17 Feb 2009 14:46:46 -0800
Subject: [LLVMdev] svn pre-commit hook: help needed
In-Reply-To:
References: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com>
Message-ID: <258cd3200902171446y3e05c738i1a4633b75e7a3050@mail.gmail.com>

On Tue, Feb 17, 2009 at 2:35 PM, Mike Stump wrote:

> On Feb 17, 2009, at 2:21 PM, Scott Michel wrote:
> > - remove trailing whitespace,
> > - expand tabs to spaces,
>
> I'd argue for not changing anything, just fail it.
>

Trimming whitespace is innocuous, at best. Expanding tabs to spaces, I might be inclined to agree is a 'fail' since weird formatting can result. 80-col violations are absolutely a 'fail'.

-scooter

From clattner at apple.com Tue Feb 17 16:51:48 2009
From: clattner at apple.com (Chris Lattner)
Date: Tue, 17 Feb 2009 14:51:48 -0800
Subject: [LLVMdev] svn pre-commit hook: help needed
In-Reply-To: <258cd3200902171446y3e05c738i1a4633b75e7a3050@mail.gmail.com>
References: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com> <258cd3200902171446y3e05c738i1a4633b75e7a3050@mail.gmail.com>
Message-ID: <96551653-7DB7-43E5-9E61-87A4C19752D3@apple.com>

On Feb 17, 2009, at 2:46 PM, Scott Michel wrote:

> On Tue, Feb 17, 2009 at 2:35 PM, Mike Stump wrote:
> On Feb 17, 2009, at 2:21 PM, Scott Michel wrote:
> > - remove trailing whitespace,
> > - expand tabs to spaces,
>
> I'd argue for not changing anything, just fail it.
>
> Trimming whitespace is innocuous, at best. Expanding tabs to spaces,
> I might be inclined to agree is a 'fail' since weird formatting can
> result. 80-col violations are absolutely a 'fail'.

I'd recommend just making everything be a fail.

-Chris

From mrs at apple.com Tue Feb 17 16:52:39 2009
From: mrs at apple.com (Mike Stump)
Date: Tue, 17 Feb 2009 14:52:39 -0800
Subject: [LLVMdev] svn pre-commit hook: help needed
In-Reply-To: <258cd3200902171446y3e05c738i1a4633b75e7a3050@mail.gmail.com>
References: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com> <258cd3200902171446y3e05c738i1a4633b75e7a3050@mail.gmail.com>
Message-ID: <97217948-442A-435E-896E-DB8A550CA49D@apple.com>

On Feb 17, 2009, at 2:46 PM, Scott Michel wrote:
> Trimming whitespace is innocuous, at best.

Unless you're doing a testcase that wants to verify a feature.

From scooter.phd at gmail.com Tue Feb 17 18:09:25 2009
From: scooter.phd at gmail.com (Scott Michel)
Date: Tue, 17 Feb 2009 16:09:25 -0800
Subject: [LLVMdev] svn pre-commit hook: help needed
In-Reply-To: <97217948-442A-435E-896E-DB8A550CA49D@apple.com>
References: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com> <258cd3200902171446y3e05c738i1a4633b75e7a3050@mail.gmail.com> <97217948-442A-435E-896E-DB8A550CA49D@apple.com>
Message-ID: <258cd3200902171609i105c0db7oac1333ffefa6722a@mail.gmail.com>

All I'm hoping for is a clean diff. Fail commits for trailing whitespace if they're source? Easy to detect .ll files.

On Tue, Feb 17, 2009 at 2:52 PM, Mike Stump wrote:

> On Feb 17, 2009, at 2:46 PM, Scott Michel wrote:
> > Trimming whitespace is innocuous, at best.
>
> Unless you're doing a testcase that wants to verify a feature.
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

From deeppatel1987 at gmail.com Tue Feb 17 18:41:04 2009
From: deeppatel1987 at gmail.com (Sandeep Patel)
Date: Tue, 17 Feb 2009 16:41:04 -0800
Subject: [LLVMdev] Using CallingConvLower in ARM target
In-Reply-To: <90A38A0E-3C84-4765-A521-A2ED66261056@apple.com>
References: <305d6f60812270430xdf1ebb9gf6d99f94215ab66b@mail.gmail.com> <305d6f60902121821u586a5ecat65c81d462d139e1d@mail.gmail.com> <305d6f60902131420g51a7f35ajf8eba18b6710951e@mail.gmail.com> <305d6f60902131625k460a4518k7fae41005d379d16@mail.gmail.com> <05C6EB7F-36AC-4B18-A32D-36A71530BBAB@apple.com> <305d6f60902131841p354431e6pa92dd9df14bc5555@mail.gmail.com> <305d6f60902132027j3cc822dfw1dc817bb4a3f67b5@mail.gmail.com> <90A38A0E-3C84-4765-A521-A2ED66261056@apple.com>
Message-ID: <305d6f60902171641l59329d2bq23193d70e39ae4fb@mail.gmail.com>

On Mon, Feb 16, 2009 at 11:00 AM, Evan Cheng wrote:
> /// Information about how the value is assigned.
> - LocInfo HTP : 7;
> + LocInfo HTP : 6;
>
> Do you know why this change is needed? Are we running out of bits?

HTP wasn't using all of these bits. I needed the hasCustom bit to come from somewhere unless we wanted to grow this struct, so I grabbed a bit from HTP.

> -      NeededStackSize = 4;
> -      break;
> -    case MVT::i64:
> -    case MVT::f64:
> -      if (firstGPR < 3)
> -        NeededGPRs = 2;
> -      else if (firstGPR == 3) {
> -        NeededGPRs = 1;
> -        NeededStackSize = 4;
> -      } else
> -        NeededStackSize = 8;
> +    State.addLoc(CCValAssign::getCustomMem(ValNo, ValVT,
> +                                           State.AllocateStack(4, 4),
> +                                           MVT::i32, LocInfo));
> +    return true; // we handled it
>
> Your change isn't handling the "NeededStackSize = 8" case.

I believe it is. I've attached two additional test cases. The difference is that this case isn't handled by the CCCustomFns. They fail to allocate any regs and then handling falls through to a CCAssignToStack in ARMCallingConv.td. This is how other targets handle similar allocations.
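A toy model of that fall-through may make the claim easier to check. It is a sketch under assumptions, not the patch: `ToyState`, `customAssign64`, `assignToStack` and `assign64` are hypothetical names, the register counting simply follows the removed `firstGPR` logic quoted above, and the 8-byte size with 4-byte alignment for the stack rule is assumed rather than taken from ARMCallingConv.td.

```cpp
#include <string>
#include <vector>

// Toy stand-in for CCState: four argument registers (R0-R3) plus a stack area.
struct ToyState {
  unsigned nextGPR = 0;           // index of the first free register
  unsigned stackBytes = 0;        // bytes of argument stack used so far
  std::vector<std::string> locs;  // human-readable record of assignments

  // Round the current offset up to align, reserve size bytes, return offset.
  unsigned allocateStack(unsigned size, unsigned align) {
    stackBytes = (stackBytes + align - 1) / align * align;
    unsigned offset = stackBytes;
    stackBytes += size;
    return offset;
  }
};

// Hypothetical custom handler for a 64-bit value. Returns true only if it
// assigned the value itself; false means "not handled, try the next rule".
bool customAssign64(ToyState &s) {
  if (s.nextGPR < 3) {   // two registers free: use a register pair
    s.locs.push_back("R" + std::to_string(s.nextGPR));
    s.locs.push_back("R" + std::to_string(s.nextGPR + 1));
    s.nextGPR += 2;
    return true;
  }
  if (s.nextGPR == 3) {  // one register free: split across R3 and the stack
    s.locs.push_back("R3");
    s.locs.push_back("stack+" + std::to_string(s.allocateStack(4, 4)));
    s.nextGPR = 4;
    return true;
  }
  return false;          // no registers left: decline
}

// Generic fallback, standing in for a CCAssignToStack rule.
void assignToStack(ToyState &s, unsigned size, unsigned align) {
  s.locs.push_back("stack+" + std::to_string(s.allocateStack(size, align)));
}

// The rule chain: custom handler first, stack rule only if it declines.
void assign64(ToyState &s) {
  if (!customAssign64(s))
    assignToStack(s, 8, 4);
}
```

Calling `assign64` three times on a state whose `nextGPR` starts at 1 gives a register pair, then the R3-plus-stack split, then a pure 8-byte stack slot. That last step is the "NeededStackSize = 8" case, reached without the custom handler having any code for it.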
> ++ static const unsigned HiRegList[] = { ARM::R0, ARM::R2 };
> +  static const unsigned LoRegList[] = { ARM::R1, ARM::R3 };
> +
> +  if (unsigned Reg = State.AllocateReg(HiRegList, LoRegList, 2)) {
> +    unsigned i;
> +    for (i = 0; i < 2; ++i)
> +      if (HiRegList[i] == Reg)
> +        break;
> +
> +    State.addLoc(CCValAssign::getCustomReg(ValNo, ValVT, Reg,
> +                                           MVT::i32, LocInfo));
> +    State.addLoc(CCValAssign::getCustomReg(ValNo, ValVT, LoRegList[i],
> +                                           MVT::i32, LocInfo));
>
> Since 'i' is used after the loop, please choose a better variable name.
>
> Actually, is the loop necessary? We know the low register is always
> one after the high register. Perhaps you can use
> ARMRegisterInfo::getRegisterNumbering(Reg), add one to 1. And the
> lookup the register enum with a new function (something like
> getRegFromRegisterNum(RegNo, ValVT)).
>
> The patch is looking good. I need to run it through some more tests.
> Unfortunately ARM target is a bit broken right now. I hope to fix it
> today.

I'll submit a revised patch after we've settled on the NeededStackSize=8 issue.

deep

> Thanks,
>
> Evan
>
> On Feb 13, 2009, at 8:27 PM, Sandeep Patel wrote:
>
>> Sorry left a small bit of cruft in ARMCallingConv.td. A corrected
>> patch it attached.
>>
>> deep
>>
>> On Fri, Feb 13, 2009 at 6:41 PM, Sandeep Patel wrote:
>>> Sure. Updated patches attached.
>>>
>>> deep
>>>
>>> On Fri, Feb 13, 2009 at 5:47 PM, Evan Cheng wrote:
>>>>
>>>> On Feb 13, 2009, at 4:25 PM, Sandeep Patel wrote:
>>>>
>>>>> ARMTargetLowering doesn't need case #1, but it seemed like you and Dan
>>>>> wanted a more generic way to inject C++ code into the process so I
>>>>> tried to make the mechanism a bit more general.
>>>>
>>>> Ok. Since ARM doesn't need it and it's the only client, I'd much
>>>> rather have CCCustomFn just return a single bool indicating whether it
>>>> can handle the arg. Would that be ok?
>>>>
>>>> Thanks,
>>>>
>>>> Evan
>>>>
>>>>>
>>>>> deep
>>>>>
>>>>> On Fri, Feb 13, 2009 at 2:34 PM, Evan Cheng wrote:
>>>>>>
>>>>>> On Feb 13, 2009, at 2:20 PM, Sandeep Patel wrote:
>>>>>>
>>>>>>> On Fri, Feb 13, 2009 at 12:33 PM, Evan Cheng wrote:
>>>>>>>>
>>>>>>>> On Feb 12, 2009, at 6:21 PM, Sandeep Patel wrote:
>>>>>>>>
>>>>>>>>> Although it's not generally needed for ARM's use of CCCustom, I return
>>>>>>>>> two bools to handle the four possible outcomes to keep the mechanism
>>>>>>>>> flexible:
>>>>>>>>>
>>>>>>>>> * if CCCustomFn handled the arg or not
>>>>>>>>> * if CCCustomFn wants to end processing of the arg or not
>>>>>>>>
>>>>>>>> +/// CCCustomFn - This function assigns a location for Val, possibly updating
>>>>>>>> +/// all args to reflect changes and indicates if it handled it. It must set
>>>>>>>> +/// isCustom if it handles the arg and returns true.
>>>>>>>> +typedef bool CCCustomFn(unsigned &ValNo, MVT &ValVT,
>>>>>>>> +                        MVT &LocVT, CCValAssign::LocInfo &LocInfo,
>>>>>>>> +                        ISD::ArgFlagsTy &ArgFlags, CCState &State,
>>>>>>>> +                        bool &result);
>>>>>>>>
>>>>>>>> Is "result" what you refer to as "isCustom" in the comments?
>>>>>>>>
>>>>>>>> Sorry, I am still confused. You mean it could return true but set
>>>>>>>> 'result' to false? That means it has handled the argument but it would
>>>>>>>> not process any more arguments? What scenario do you envision that
>>>>>>>> this will be useful? I'd rather keep it simple.
>>>>>>>
>>>>>>> As you note there are three actual legitimate cases (of the four combos):
>>>>>>>
>>>>>>> 1. The CCCustomFn wants the arg handling to proceed. This might be
>>>>>>>    used akin to CCPromoteToType.
>>>>>>> 2. The CCCustomFn entirely handled the arg. This might be used akin to
>>>>>>>    CCAssignToReg.
>>>>>>> 3. The CCCustomFn tried to handle the arg, but failed.
>>>>>>>
>>>>>>> these results are conveyed the following ways:
>>>>>>>
>>>>>>> 1. The CCCustomFn returns false, &result is not used.
>>>>>>> 2. The CCCustomFn returns true, &result is false;
>>>>>>> 3. The CCCustomFn returns true, &result is true.
>>>>>>
>>>>>> I don't think we want to support #1. If the target want to add custom
>>>>>> code to handle an argument, if should be responsible for outputting
>>>>>> legal code. Is there an actual need to support #1?
>>>>>>
>>>>>> Evan
>>>>>>
>>>>>>>
>>>>>>> I tried to keep these CCCustomFns looking like TableGen generated
>>>>>>> code. Suggestions of how to reorganize these results are welcome. :-)
>>>>>>> Perhaps better comments around the typedef for CCCustomFn would
>>>>>>> suffice?
>>>>>>>
>>>>>>> The isCustom flag is simply a means for this machinery to convey to
>>>>>>> the TargetLowering functions to process this arg specially. It may not
>>>>>>> always be possible for the TargetLowering functions to determine that
>>>>>>> the arg needs special handling after all the changes made by the
>>>>>>> CCCustomFn or CCPromoteToType and other transformations.
>>>>>>>
>>>>>>>>> I placed the "unsigned i" outside those loops because i is used after
>>>>>>>>> the loop. If there's a better index search pattern, I'd be happy to
>>>>>>>>> change it.
>>>>>>>>
>>>>>>>> Ok.
>>>>>>>>
>>>>>>>> One more nitpick:
>>>>>>>>
>>>>>>>> +/// CCCustom - calls a custom arg handling function
>>>>>>>>
>>>>>>>> Please capitalize "calls" and end with a period.
>>>>>>>
>>>>>>> Once we settle on the result handling changes, I'll submit an update
>>>>>>> with this change.
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Evan
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Attached is an updated patch against HEAD that has DebugLoc changes. I
>>>>>>>>> also split out the ARMAsmPrinter fix into it's own patch.
>>>>>>>>>
>>>>>>>>> deep
>>>>>>>>>
>>>>>>>>> On Mon, Feb 9, 2009 at 8:54 AM, Evan Cheng wrote:
>>>>>>>>>> Thanks Sandeep. I did a quick scan, this looks really good. But I do
>>>>>>>>>> have a question:
>>>>>>>>>>
>>>>>>>>>> +/// CCCustomFn - This function assigns a location for Val, possibly updating
>>>>>>>>>> +/// all args to reflect changes and indicates if it handled it. It must set
>>>>>>>>>> +/// isCustom if it handles the arg and returns true.
>>>>>>>>>> +typedef bool CCCustomFn(unsigned &ValNo, MVT &ValVT,
>>>>>>>>>> +                        MVT &LocVT, CCValAssign::LocInfo &LocInfo,
>>>>>>>>>> +                        ISD::ArgFlagsTy &ArgFlags, CCState &State,
>>>>>>>>>> +                        bool &result);
>>>>>>>>>>
>>>>>>>>>> Is it necessary to return two bools (the second is returned by
>>>>>>>>>> reference in 'result')? I am confused about the semantics of
>>>>>>>>>> 'result'.
>>>>>>>>>>
>>>>>>>>>> Also, a nitpick:
>>>>>>>>>>
>>>>>>>>>> +    unsigned i;
>>>>>>>>>> +    for (i = 0; i < 4; ++i)
>>>>>>>>>>
>>>>>>>>>> The convention we use is:
>>>>>>>>>>
>>>>>>>>>> +    for (unsigned i = 0; i < 4; ++i)
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Evan
>>>>>>>>>>
>>>>>>>>>> On Feb 6, 2009, at 6:02 PM, Sandeep Patel wrote:
>>>>>>>>>>
>>>>>>>>>>> I think I've got all the cases handled now, implementing with
>>>>>>>>>>> CCCustom<"foo"> callbacks into C++.
>>>>>>>>>>>
>>>>>>>>>>> This also fixes a crash when returning i128. I've also included a
>>>>>>>>>>> small asm constraint fix that was needed to build newlib.
>>>>>>>>>>>
>>>>>>>>>>> deep
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jan 19, 2009 at 10:18 AM, Evan Cheng wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On Jan 16, 2009, at 5:26 PM, Sandeep Patel wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Jan 3, 2009 at 11:46 AM, Dan Gohman wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> One problem with this approach is that since i64 isn't legal, the
>>>>>>>>>>>>>> bitcast would require custom C++ code in the ARM target to
>>>>>>>>>>>>>> handle properly. It might make sense to introduce something like
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> CCIfType<[f64], CCCustom>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> where CCCustom is a new entity that tells the calling convention
>>>>>>>>>>>>>> code to to let the target do something not easily representable
>>>>>>>>>>>>>> in the tablegen minilanguage.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am thinking that this requires two changes: add a flag to
>>>>>>>>>>>>> CCValAssign (take a bit from HTP) to indicate isCustom and a way to
>>>>>>>>>>>>> author an arbitrary CCAction by including the source directly in the
>>>>>>>>>>>>> TableGen mini-language. This latter change might want a generic change
>>>>>>>>>>>>> to the TableGen language. For example, the syntax might be like:
>>>>>>>>>>>>>
>>>>>>>>>>>>> class foo : CCCustomAction {
>>>>>>>>>>>>>   code <<< EOF
>>>>>>>>>>>>>   ....multi-line C++ code goes here that allocates regs & mem and
>>>>>>>>>>>>>   sets CCValAssign::isCustom....
>>>>>>>>>>>>>   EOF
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> Does this seem reasonable? An alternative is for CCCustom to take a
>>>>>>>>>>>>> string that names a function to be called:
>>>>>>>>>>>>>
>>>>>>>>>>>>> CCIfType<[f64], CCCustom<"MyCustomLoweringFunc">>
>>>>>>>>>>>>>
>>>>>>>>>>>>> the function signature for such functions will have to return two
>>>>>>>>>>>>> results: if the CC processing is finished and if it the func succeeded
>>>>>>>>>>>>> or failed:
>>>>>>>>>>>>
>>>>>>>>>>>> I like the second solution better. It seems rather cumbersome to embed
>>>>>>>>>>>> multi-line c++ code in td files.
>>>>>>>>>>>>
>>>>>>>>>>>> Evan
>>>>>>>>>>>>>
>>>>>>>>>>>>> typedef bool CCCustomFn(unsigned ValNo, MVT ValVT,
>>>>>>>>>>>>>                         MVT LocVT, CCValAssign::LocInfo LocInfo,
>>>>>>>>>>>>>                         ISD::ArgFlagsTy ArgFlags, CCState &State,
>>>>>>>>>>>>>                         bool &result);
>>>>>>>>>>> <arm_callingconv.diff>
>>>>>>>>> <arm_callingconv.diff>
>>>>>>>>> <arm_fixes.diff>
>> <arm_callingconv.diff>
>

From deeppatel1987 at gmail.com Tue Feb 17 18:42:43 2009
From: deeppatel1987 at gmail.com (Sandeep Patel)
Date: Tue, 17 Feb 2009 16:42:43 -0800
Subject: [LLVMdev] Using CallingConvLower in ARM target
In-Reply-To: <305d6f60902171641l59329d2bq23193d70e39ae4fb@mail.gmail.com>
References:
 <305d6f60812270430xdf1ebb9gf6d99f94215ab66b@mail.gmail.com> <305d6f60902131420g51a7f35ajf8eba18b6710951e@mail.gmail.com> <305d6f60902131625k460a4518k7fae41005d379d16@mail.gmail.com> <05C6EB7F-36AC-4B18-A32D-36A71530BBAB@apple.com> <305d6f60902131841p354431e6pa92dd9df14bc5555@mail.gmail.com> <305d6f60902132027j3cc822dfw1dc817bb4a3f67b5@mail.gmail.com> <90A38A0E-3C84-4765-A521-A2ED66261056@apple.com> <305d6f60902171641l59329d2bq23193d70e39ae4fb@mail.gmail.com>
Message-ID: <305d6f60902171642q1829daaqea5a5d206fefb162@mail.gmail.com>

This time with the test cases actually attached.

deep

-------------- next part --------------
A non-text attachment was scrubbed...
Name: arm_stack64_tests.diff
Type: application/octet-stream
Size: 1084 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090217/1d0dd739/attachment.obj

From zhangzhengjian at gmail.com Tue Feb 17 19:41:30 2009
From: zhangzhengjian at gmail.com (zhengjian zhang)
Date: Wed, 18 Feb 2009 09:41:30 +0800
Subject: [LLVMdev] sjlj-exceptions handlying
Message-ID: <8e3538210902171741q6b2158d3s756293939bc30f3@mail.gmail.com>

My (admittedly ugly) approach to sjlj-eh is to build the relevant sjlj-eh runtime functions in the last part of the LLVM codegen. I did it this way because I wanted to get sjlj-eh running quickly for my target. The method is based on the DWARF EH support that LLVM already uses.

The remaining work is emitting the exception table, also on top of LLVM, and that is where I have a problem: the DWARF EH information that LLVM uses is of no use for sjlj-eh. I may have to find another way to do this. Everything is machine dependent, so it is very difficult to port to other targets. If someone could add a general way to do this, that would be a great help!

regards
zhangzw

From delesley.spambox at googlemail.com Tue Feb 17 11:26:13 2009
From: delesley.spambox at googlemail.com (DeLesley SpamBox)
Date: Tue, 17 Feb 2009 10:26:13 -0700
Subject: [LLVMdev] Parametric polymorphism
Message-ID:

I'm a newcomer to llvm, but what you've done so far is very impressive. Llvm is a godsend to anybody who is attempting to implement their own language. :-) My company is considering using llvm as the backend for a small matlab-like language for scientific computation; our other option is MSIL.

After reading through the documentation, I noticed that llvm seems to have one major limitation -- the lack of parametric polymorphism. I would like to compile code such as the following:

max (T a, T b) {
  if (a > b) return a; else return b;
}

There are, of course, various ways to implement the above code.
I could compile the above function to a fully generic version with boxed arguments, but that is very slow for scalar types. I could also take the C++ template route, and generate different IR code for every type instantiation. However, I have spent way too much time fighting with templates and code bloat to like that idea. I believe that type instantiation should ideally be handled by llvm, rather than the high-level language. First of all, there are a lot of optimization passes that are type-invariant; it would be nice to be able to partially optimize the code before instantiation. Second, type substitution is very similar to many of the other optimizations that llvm already does, such as inlining, constant propagation, and so on. And third, I am planning to use llvm in JIT mode, and it just makes more sense (to me) to instantiate such functions on demand, at run-time. Are there any plans to add such capability to llvm? I tried looking through the list archives for any discussion, but the archives are not searchable (or I have not figured out how to search them) so I didn't find much; feel free to point me to the proper place. How difficult would it be to add such a capability to llvm? I was thinking of marking type variables like T as opaque types for the initial codegen, and then writing a custom pass that instantiates them to real types. However, I don't know if that would confuse or break other parts of the compiler infrastructure; parametric polymorphism is not necessarily a trivial modification. My personal background is in type theory; I received my doctorate from the functional programming group at the University of Edinburgh. I love the fact that llvm uses a typed assembly language, but the actual type system that is currently used is pretty limited; it seems to be mostly a copy of C. 
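[Editorial aside] The trade-off described earlier in this message, between one fully generic function over boxed arguments and one instantiation per type, can be sketched in plain C++. This is only an illustration and not LLVM IR or anything from the original post: the names `Box`, `BoxedDouble`, `GreaterFn`, `max_boxed`, `greater_double` and `max_specialized` are all hypothetical.

```cpp
#include <cassert>

// Strategy 1: a single generic function over "boxed" values.
// The function is compiled once, but every argument is reached through a
// reference to an opaque base type and every comparison is an indirect call.
struct Box {
    virtual ~Box() = default;
};

struct BoxedDouble : Box {
    explicit BoxedDouble(double v) : value(v) {}
    double value;
};

// The comparison is supplied by the caller, because the generic code knows
// nothing about the payload it is handed.
using GreaterFn = bool (*)(const Box&, const Box&);

const Box& max_boxed(const Box& a, const Box& b, GreaterFn greater) {
    return greater(a, b) ? a : b;
}

bool greater_double(const Box& a, const Box& b) {
    // Static type information is gone, so unboxing can only be checked at run time.
    const auto* lhs = dynamic_cast<const BoxedDouble*>(&a);
    const auto* rhs = dynamic_cast<const BoxedDouble*>(&b);
    assert(lhs != nullptr && rhs != nullptr);
    return lhs->value > rhs->value;
}

// Strategy 2: the C++ template route. The compiler emits a separate copy of
// this function for every T it is used with; scalars stay unboxed.
template <typename T>
T max_specialized(T a, T b) {
    return (a > b) ? a : b;
}
```

Using `max_specialized` with both `int` and `double` arguments yields two separate function bodies, which is the code-bloat concern raised above, while `max_boxed` has a single body but pays for the indirection on every call.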
I know that compatibility with C is very important to llvm for obvious reasons, but IMHO, the single biggest problem in making different languages talk to one another is the type system. I'm not a fan of Microsoft's common type system because it's far too OOP-centric, but the basic idea is a good one. I think llvm would really benefit from having a much stronger, but still low-level and language-neutral type theory; it would enable cross-platform multi-language libraries to be developed in any language, and then distributed as llvm IR. (The other major necessity is an accurate and high-performance garbage collector, but the intrinsics for that are already in place.) Is anyone on this list familiar with System F, System F_sub, or System F^\omega_sub? They comprise the basic, standard theories of parametric polymorphism used in the academic world, and have been around for about 20 years. You can obviously get more sophisticated, but the System-F series of calculi have the advantage that they are simple, well-known, off-the-shelf solutions. Pick one, plug it into llvm, and you have a type system that can compete with the JVM or .NET in terms of functionality, without being OOP centric or sacrificing language neutrality. (System F is low-level -- OOP can be easily implemented on top of it). -DeLesley From delesley.spambox at googlemail.com Tue Feb 17 11:30:59 2009 From: delesley.spambox at googlemail.com (DeLesley SpamBox) Date: Tue, 17 Feb 2009 10:30:59 -0700 Subject: [LLVMdev] Clang not mentioned on web site? Message-ID: Why isn't the clang project mentioned anywhere on the main llvm web site? -DeLesley From clattner at apple.com Tue Feb 17 21:49:09 2009 From: clattner at apple.com (Chris Lattner) Date: Tue, 17 Feb 2009 19:49:09 -0800 Subject: [LLVMdev] Clang not mentioned on web site? 
In-Reply-To: References: Message-ID: <2E9B3D47-02A7-4855-9D23-7606ED5CDE13@apple.com> On Feb 17, 2009, at 9:30 AM, DeLesley SpamBox wrote: > Why isn't the clang project mentioned anywhere on the main llvm > web site? Clang is a secret! Sssh! -Chris From alexei.svitkine at gmail.com Tue Feb 17 22:00:42 2009 From: alexei.svitkine at gmail.com (Alexei Svitkine) Date: Tue, 17 Feb 2009 23:00:42 -0500 Subject: [LLVMdev] Patch: Prefix for ParseCommandLineOptions() Message-ID: <62d9ffc00902172000h4a202e9ch72b37fcdc0ad1108@mail.gmail.com> The motivation behind this patch is that tools that use LLVM as a library and want to use its command line parsing facilities may not want all the various options defined in the LLVM libraries to be available - simply because they may not be relevant. This patch adds an optional "Prefix" parameter to ParseCommandLineOptions(). If set, this will make inaccessible any options that do not begin with the prefix string. Any options with the prefix string will act as if they were defined without that string. For example, if prefix string is "foo-", and "foo-bar" is a defined option, it will be displayed under --help as "--bar" and can be used as "--bar". Any options without "foo-" prefix will not be displayed under --help and will not be useable from the command line. -Alexei -------------- next part -------------- A non-text attachment was scrubbed... Name: CommandLinePrefix.diff Type: application/octet-stream Size: 11877 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090217/b4895e83/attachment.obj From me22.ca at gmail.com Tue Feb 17 23:08:43 2009 From: me22.ca at gmail.com (me22) Date: Wed, 18 Feb 2009 00:08:43 -0500 Subject: [LLVMdev] Parametric polymorphism In-Reply-To: References: Message-ID: On Tue, Feb 17, 2009 at 12:26, DeLesley SpamBox wrote: > After reading through the documentation, I noticed that llvm seems to > have one major limitation -- the lack of parametric polymorphism. 
I think the problem is deeper than that, in that LLVM has no official concept of a subtype, so I don't see how the idea of polymorphism could be defined in it. > I would like to compile code such as the following: > > max (T a, T b) { > if (a > b) return a; else return b; > } > Also, "Comparable" implies some kind of function associated with the type in order to actually perform it, and LLVM has no such association mechanism. I'd argue that structural typing means that it cannot, since I don't want std::pair's operator< to work on std::complex, despite them having the same representation. > There are, of course, various ways to implement the above code. I could > compile the above function to a fully generic version with boxed arguments, > but that is very slow for scalar types. I could also take the C++ template > route, and generate different IR code for every type instantiation. However, > I have spent way too much time fighting with templates and code bloat to > like that idea. > I'd be curious to see if inlining + mem2reg would be able to automatically unbox that example, if you always generate auto-boxing code. Even for things too big to inline, there may be a general pass to convert the mallocs from boxing into allocas if nothing captures the argument pointers. I'm by no means an LLVM or compiler expert, though, so I could be quite wrong. ~ Scott From baldrick at free.fr Wed Feb 18 01:31:48 2009 From: baldrick at free.fr (Duncan Sands) Date: Wed, 18 Feb 2009 08:31:48 +0100 Subject: [LLVMdev] svn pre-commit hook: help needed In-Reply-To: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com> References: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com> Message-ID: <200902180831.48813.baldrick@free.fr> Hi Scott, > Anyone out there interested in helping out with a subversion pre-commit hook > to: > > - remove trailing whitespace, > - expand tabs to spaces, > - detect 80-col violations, > > as well as detect other style guideline breakage? 
> > I just ran into the trailing whitespace problem: Eclipse and other editors > like to trim excess whitespace from source. However, when one commits a > patch with trailing whitespace removed, the extraneous diffs make reading > the patch more difficult. from the subversion manual: "While hook scripts can do almost anything, there is one dimension in which hook script authors should show restraint: do not modify a commit transaction using hook scripts. While it might be tempting to use hook scripts to automatically correct errors, shortcomings, or policy violations present in the files being committed, doing so can cause problems. Subversion keeps client-side caches of certain bits of repository data, and if you change a commit transaction in this way, those caches become indetectably stale. This inconsistency can lead to surprising and unexpected behavior. Instead of modifying the transaction, you should simply validate the transaction in the pre-commit hook and reject the commit if it does not meet the desired requirements. As a bonus, your users will learn the value of careful, compliance-minded work habits." Ciao, Duncan. From jlerouge at apple.com Wed Feb 18 03:41:51 2009 From: jlerouge at apple.com (Julien Lerouge) Date: Wed, 18 Feb 2009 01:41:51 -0800 Subject: [LLVMdev] svn pre-commit hook: help needed In-Reply-To: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com> References: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com> Message-ID: <20090218094151.GA6282@pom.apple.com> On Tue, Feb 17, 2009 at 02:21:23PM -0800, Scott Michel wrote: > Anyone out there interested in helping out with a subversion pre-commit hook > to: > > - remove trailing whitespace, > - expand tabs to spaces, > - detect 80-col violations, > > as well as detect other style guideline breakage? > > I just ran into the trailing whitespace problem: Eclipse and other editors > like to trim excess whitespace from source. 
However, when one commits a > patch with trailing whitespace removed, the extraneous diffs make reading > the patch more difficult. > > Reply to me privately if you're interested in helping out. > > -scooter Yet another _fun_ way of doing this is to setup a buildbot slave just for that. The slave can fix minor stuff like tabs and trailing whitespaces on its own (checking the changes back in), and yell for things like 80-col violations and whatnot where the changes would not be so trivial. People who don't care are not bothered too much (their code might be changed by the slave), and fascis^H^H^H^H^H^Hpeople who care can quickly find out where the errors are. By setting the tree stable timer for the slave doing the check lower than the slaves doing the actual builds, no extra build is generated in case the code is modified. Beware, some people might try to ddos your buildbot after seeing all their code rewritten, and some other will simply hate you for all the yelling ;-) Julien, -- Julien Lerouge PGP Key Id: 0xB1964A62 PGP Fingerprint: 392D 4BAD DB8B CE7F 4E5F FA3C 62DB 4AA7 B196 4A62 PGP Public Key from: keyserver.pgp.com From gordonhenriksen at me.com Wed Feb 18 07:06:23 2009 From: gordonhenriksen at me.com (Gordon Henriksen) Date: Wed, 18 Feb 2009 08:06:23 -0500 Subject: [LLVMdev] Parametric polymorphism In-Reply-To: References: Message-ID: <550B442B-1E9B-4C56-A1E7-A3584EDA2EA1@me.com> Hi DeLesley, This is by design. LLVM's type system is very low-level; it doesn't even have a concept of types as most languages reason about them. For instance: struct ColorRGB { float r, g, b; } struct ColorHSV { float h, s, v; } struct ColorHSV { float h, s, v; } struct Point3D { float x, y, z; } all become the same type { float, float, float } in LLVM IR, as you know. Expecting it to directly support generics seems a third-order-of- magnitude leap of faith. :) But there is good news for the faithful? > In regards instantiation on demand, you can do this today: 1. 
Create a facility such that you can determine which type-generic function x specialized types to instantiate based upon a symbol name. This could be a registry or a name mangling scheme.

2. Implement your own ModuleProvider, overriding materializeFunction to perform specialization rather than loading the type from disk.
http://llvm.org/doxygen/classllvm_1_1ModuleProvider.html

You'll be in complete control of which instantiations get created, of course.

One could argue that LLVM could have much better support for type genericity by simply allowing abstract data types (those containing opaque types) to be fully valid in IR, but not for codegen. In general, this creates a second-class "abstract function." The consequences of this would then need to ripple through the design. I wouldn't expect the impact to be too high, for the most part; LLVM already deals with objects of unknown size. Still, there are a large number of potential foibles here. For instance, passing an argument can require platform-specific contortions to conform to the platform ABI, and these contortions depend on information from the high-level (C) type which is not recoverable from LLVM's structural types.

Specialization in this scheme would entail a modification of the existing CloneFunction algorithm. The method Type::refineAbstractTypeTo is not useful here, because it would destroy the original type-generic template.
http://llvm.org/doxygen/namespacellvm.html#82ca1ea30b8e181ed30dc10bdd1bfbad

Instead of creating a literal copy, the algorithm would need to inspect the type of each IR object, replacing abstract data types (as LLVM already supports) with concrete ones as required by the instantiation. It could also jump through platform ABI hoops at call sites if required, but that would be quite complex.

But were I you, I wouldn't hold my breath waiting for someone else to implement it for me.
:) The answer today is to generate IR anew for each specialization, using the materializeFunction in a managed VM environment, or doing so statically. Improved support for specialization would be an interesting capability to add to LLVM's toolbox, but I would not expect LLVM to fully internalize support for one particular instantiation type system and scheme anytime in the foreseeable future. As for code bloat, you could take .NET's compromise example and use the same code instantiation (but different metadata in your VM) for reference types, but create full specializations for value types. Given your concerns, you clearly have strong ideas about how type specialization should be implemented; why do you think having LLVM make the decision for you internally would be better than making the decision yourself, as you can do today? I'll let others comment on the alternate type system. I wouldn't expect this to happen, personally. > On 2009-02-17, at 12:26, DeLesley SpamBox wrote: > I'm a newcomer to llvm, but what you've done so far is very > impressive. > Llvm is a godsend to anybody who is attempting to implement their own > their own language. :-) My company is considering using llvm as the > backend for a small matlab-like language for scientific computation; > our > other option is MSIL. > > After reading through the documentation, I noticed that llvm seems to > have one major limitation -- the lack of parametric polymorphism. > I would > like to compile code such as the following: > > max (T a, T b) { > if (a > b) return a; else return b; > } > > There are, of course, various ways to implement the above code. I > could > compile the above function to a fully generic version with boxed > arguments, > but that is very slow for scalar types. I could also take the C++ > template > route, and generate different IR code for every type instantiation. > However, > I have spent way too much time fighting with templates and code > bloat to > like that idea. 
> I believe that type instantiation should ideally be handled by llvm, > rather > than the high-level language. > First of all, there are a lot of > optimization passes > that are type-invariant; it would be nice to be able to partially > optimize the code > before instantiation. Second, type substitution is very similar to > many of the > other optimizations that llvm already does, such as inlining, constant > propagation, and so on. > And third, I am planning to use llvm in JIT mode, and > it just makes more sense (to me) to instantiate such functions on > demand, at > run-time. > Are there any plans to add such capability to llvm? I tried looking > through the > list archives for any discussion, but the archives are not > searchable (or I have > not figured out how to search them) so I didn't find much; feel free > to point me > to the proper place. > > How difficult would it be to add such a capability to llvm? I was > thinking of > marking type variables like T as opaque types for the initial > codegen, and then > writing a custom pass that instantiates them to real types. > However, I don't > know if that would confuse or break other parts of the compiler > infrastructure; > parametric polymorphism is not necessarily a trivial modification. > > My personal background is in type theory; I received my doctorate > from the > functional programming group at the University of Edinburgh. I love > the fact > that llvm uses a typed assembly language, but the actual type system > that > is currently used is pretty limited; it seems to be mostly a copy of > C. > > I know that compatibility with C is very important to llvm for > obvious reasons, > but IMHO, the single biggest problem in making different languages > talk to > one another is the type system. I'm not a fan of Microsoft's common > type > system because it's far too OOP-centric, but the basic idea is a > good one. 
> I think llvm would really benefit from having a much stronger, but > still low-level > and language-neutral type theory; it would enable cross-platform > multi-language > libraries to be developed in any language, and then distributed as > llvm IR. > (The other major necessity is an accurate and high-performance garbage > collector, but the intrinsics for that are already in place.) > > Is anyone on this list familiar with System F, System F_sub, or System > F^\omega_sub? They comprise the basic, standard theories of > parametric > polymorphism used in the academic world, and have been around for > about > 20 years. You can obviously get more sophisticated, but the System- > F series > of calculi have the advantage that they are simple, well-known, off- > the-shelf > solutions. Pick one, plug it into llvm, and you have a type system > that can > compete with the JVM or .NET in terms of functionality, without > being OOP > centric or sacrificing language neutrality. (System F is low-level -- > OOP can be > easily implemented on top of it). ? Gordon -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090218/cf773f85/attachment.html From jon at ffconsultancy.com Wed Feb 18 08:53:13 2009 From: jon at ffconsultancy.com (Jon Harrop) Date: Wed, 18 Feb 2009 14:53:13 +0000 Subject: [LLVMdev] Parametric polymorphism In-Reply-To: References: Message-ID: <200902181453.13508.jon@ffconsultancy.com> On Tuesday 17 February 2009 17:26:13 DeLesley SpamBox wrote: > I'm a newcomer to llvm, but what you've done so far is very impressive. > Llvm is a godsend to anybody who is attempting to implement their own > their own language. :-) My company is considering using llvm as the > backend for a small matlab-like language for scientific computation; Very interesting. My company is prototyping a high-level VM built upon LLVM that is also designed for scientific computing. > our other option is MSIL. 
We decided not to target MSIL because it will be impossible to compete with Microsoft's F#. Building upon LLVM offers *much* better performance and platform independence. If the start of this year has been anything to go by, LLVM also has much better commercial potential and I believe we could be earning more revenue from it than from .NET in 12 months' time.

> After reading through the documentation, I noticed that llvm seems to
> have one major limitation -- the lack of parametric polymorphism. I would
> like to compile code such as the following:
>
> max (T a, T b) {
>   if (a > b) return a; else return b;
> }
>
> There are, of course, various ways to implement the above code. I could
> compile the above function to a fully generic version with boxed arguments,
> but that is very slow for scalar types. I could also take the C++ template
> route, and generate different IR code for every type instantiation.
> However, I have spent way too much time fighting with templates and code
> bloat to like that idea.

I am using the latter approach and it is easy to implement and works well so far, although our test base is tiny so bloat is not a problem.

> I believe that type instantiation should ideally be handled by llvm, rather
> than the high-level language. First of all, there are a lot of optimization
> passes that are type-invariant; it would be nice to be able to partially
> optimize the code before instantiation. Second, type substitution is very
> similar to many of the other optimizations that llvm already does, such as
> inlining, constant propagation, and so on. And third, I am planning to use
> llvm in JIT mode, and it just makes more sense (to me) to instantiate such
> functions on demand, at run-time.

Excellent idea.

> Are there any plans to add such capability to llvm?

I do not believe so.

> How difficult would it be to add such a capability to llvm?
I was thinking
> of marking type variables like T as opaque types for the initial codegen,
> and then writing a custom pass that instantiates them to real types.
> However, I don't know if that would confuse or break other parts of the
> compiler infrastructure; parametric polymorphism is not necessarily a
> trivial modification.

What complications do you foresee?

> My personal background is in type theory; I received my doctorate from the
> functional programming group at the University of Edinburgh. I love the
> fact that llvm uses a typed assembly language, but the actual type system
> that is currently used is pretty limited; it seems to be mostly a copy of
> C.

Yes. LLVM has augmented the functionality required for C with some low-level but critical features like tail call elimination but it does not implement any of the higher-level features you may have expected. After all, it is the LLvm. ;-)

LLVM is ideal for building HLVMs and CLRs though.

> I know that compatibility with C is very important to llvm for obvious
> reasons, but IMHO, the single biggest problem in making different languages
> talk to one another is the type system. I'm not a fan of Microsoft's
> common type system because it's far too OOP-centric, but the basic idea is
> a good one.

Absolutely.

> I think llvm would really benefit from having a much stronger,
> but still low-level and language-neutral type theory; it would enable
> cross-platform multi-language libraries to be developed in any language,
> and then distributed as llvm IR.

The prospect of higher-level VMs is certainly the single most compelling aspect of LLVM for me (and many other people, I believe) but I am not convinced that such functionality deserves a place in LLVM itself.

You can implement parametric polymorphism easily on top of LLVM today. LLVM's optimizations will be run (probably redundantly) on multiple instantiations but the HLVM will also be rewriting code, e.g.
to interoperate with the GC and to perform optimizations like instantiating higher-order functions for a given function argument. Rather than trying to push high-level type system features into the LLVM I would opt for an HLVM that provided not only a decent type system but also garbage collection, reflection and any other high-level features of interest. I would also prioritize getting HLVM 1.0 shipped over the technical aspects of integration otherwise you are likely to end up with nothing. > (The other major necessity is an accurate and > high-performance garbage collector, but the intrinsics for that are already > in place.) I heard that LLVM's GC API is immature and largely untested so I chose to assume an uncooperative environment instead. > Is anyone on this list familiar with System F, System F_sub, or System > F^\omega_sub? I am only vaguely aware of System F because I use MLs extensively and HM is a relative. > They comprise the basic, standard theories of parametric > polymorphism used in the academic world, and have been around for about > 20 years. You can obviously get more sophisticated, but the System-F > series of calculi have the advantage that they are simple, well-known, > off-the-shelf solutions. Pick one, plug it into llvm, and you have a type > system that can compete with the JVM or .NET in terms of functionality, > without being OOP centric or sacrificing language neutrality. (System F is > low-level -- OOP can be > easily implemented on top of it). Sounds like a dream come true. Where's the catch? ;-) -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From lennart at augustsson.net Wed Feb 18 08:57:17 2009 From: lennart at augustsson.net (Lennart Augustsson) Date: Wed, 18 Feb 2009 14:57:17 +0000 Subject: [LLVMdev] Parametric polymorphism In-Reply-To: References: Message-ID: Yes, I'm familiar with System F and its siblings. It's not as easy as just plugging that into LLVM and you get something good. 
Plugging them together would take hard work, and then you have to do specialization at the LLVM level anyway to get good code. I don't think the LLVM should have polymorphism, like Jon Harrop I think that should be dealt with at a higher level. -- Lennart On Tue, Feb 17, 2009 at 5:26 PM, DeLesley SpamBox wrote: > Is anyone on this list familiar with System F, System F_sub, or System > F^\omega_sub? They comprise the basic, standard theories of parametric > polymorphism used in the academic world, and have been around for about > 20 years. You can obviously get more sophisticated, but the System-F series > of calculi have the advantage that they are simple, well-known, off-the-shelf > solutions. Pick one, plug it into llvm, and you have a type system that can > compete with the JVM or .NET in terms of functionality, without being OOP > centric or sacrificing language neutrality. (System F is low-level -- > OOP can be > easily implemented on top of it). From dag at cray.com Wed Feb 18 10:00:40 2009 From: dag at cray.com (David Greene) Date: Wed, 18 Feb 2009 10:00:40 -0600 Subject: [LLVMdev] LLVM 2.4 Dominance Frontier Problem Message-ID: <200902181000.40420.dag@cray.com> I just finished upgrading our LLVM to 2.4 and I immediately ran into a problem with dominance frontier calculation: llvm/lib/VMCore/PassManager.cpp:714: void llvm::PMDataManager::verifyDomInfo(llvm::Pass&, llvm::Function&): Assertion `0 && "Invalid dominator info"' failed. Strangely enough, the Pass running when the assert triggers is Dominance Frontier Construction. This is somewhat puzzling. Any idea where to start looking? 
-Dave

From dpatel at apple.com Wed Feb 18 10:35:14 2009
From: dpatel at apple.com (Devang Patel)
Date: Wed, 18 Feb 2009 08:35:14 -0800
Subject: [LLVMdev] LLVM 2.4 Dominance Frontier Problem
In-Reply-To: <200902181000.40420.dag@cray.com>
References: <200902181000.40420.dag@cray.com>
Message-ID: <2A9E9F2E-38F6-422F-8A51-54A40DB9F505@apple.com>

On Feb 18, 2009, at 8:00 AM, David Greene wrote:

> I just finished upgrading our LLVM to 2.4 and I immediately ran into
> a problem with dominance frontier calculation:
>
> llvm/lib/VMCore/PassManager.cpp:714: void
> llvm::PMDataManager::verifyDomInfo(llvm::Pass&, llvm::Function&):
> Assertion `0 && "Invalid dominator info"' failed.
>
> Strangely enough, the Pass running when the assert triggers is
> Dominance Frontier Construction.
>
> This is somewhat puzzling. Any idea where to start looking?

The PassManager must have printed dominator diffs before this assertion. Go and fix the last pass that manipulated dominator info, or file a PR with a reproducible test case for mainline. IIRC, the dom info verifier is not enabled by default.

- Devang

From czoccolo at gmail.com Wed Feb 18 10:37:54 2009
From: czoccolo at gmail.com (Corrado Zoccolo)
Date: Wed, 18 Feb 2009 17:37:54 +0100
Subject: [LLVMdev] LLVM C bindings
Message-ID: <4e5e476b0902180837o4ed31c68y1550ce78ec6156d5@mail.gmail.com>

On Mon, 16 Feb 2009 21:18:48 +0000 Jon Harrop wrote:
>
> On Monday 16 February 2009 20:04:38 Paul Melis wrote:
> Yes. I similarly found that tail calls, sret and parts of first-class structs
> are not usable from OCaml and much of the functionality was not implemented
> in the C API in LLVM 2.4.
>
>> SWIG (www.swig.org) recently added a C output mode, that is capable of
>> generating a C API for a C++ one.
>> It was a Summer of Code project, so I'm not sure how mature it is.
The >> docs are here: >> http://swig.svn.sourceforge.net/viewvc/swig/branches/gsoc2008-maciekd/Doc/M >>anual/C.html > > The generation of this FFI code should certainly be automated. However, if the > necessary tools are not yet stable perhaps it would be wise to consider > looser bindings such as XML-RPC? I assume there are tools that can examine a > C++ API from headers in order to create an XML-RPC server automatically? > > An XML-RPC API would be trivial to use and extend from languages like OCaml > and Python and the interface code should not require any maintenance at all. > XML-RPC assumes you want an external server, or it can be used from the same process? If SWIG is not mature enough, and/or doesn't provide the needed level of flexibility, we have other options: * write a llc backend that generates C-bindings for C++ compiled code (general solution, could replace SWIG). * write a perl script that parses nm output from LLVM libraries and creates the binding code. Maybe I'll try one of those approaches when I have time... Do we have coding conventions in LLVM to distinguish private methods from public methods in classes, in order to easily identify which methods should be exported? Corrado From dpatel at apple.com Wed Feb 18 10:53:00 2009 From: dpatel at apple.com (Devang Patel) Date: Wed, 18 Feb 2009 08:53:00 -0800 Subject: [LLVMdev] svn pre-commit hook: help needed In-Reply-To: <20090218094151.GA6282@pom.apple.com> References: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com> <20090218094151.GA6282@pom.apple.com> Message-ID: On Feb 18, 2009, at 1:41 AM, Julien Lerouge wrote: > Yet another _fun_ way of doing this is to setup a buildbot slave just > for that. The slave can fix minor stuff like tabs and trailing > whitespaces on its own (checking the changes back in), and yell for > things like 80-col violations and whatnot where the changes would > not be > so trivial. 
If you're going to change anything then this is the best alternative; otherwise I can live with the status quo.

Do not reject a commit just because of formatting issues. It can have a serious negative impact on productivity.

To folks who prefer to reject commits due to formatting errors -- you already rely on some kind of "tool" to make your day-to-day life easier. [ Most likely you have your editor automatically replacing tabs with spaces. Your terminal window is only 80 col. wide or your editor is displaying a vertical line to warn you about 80 col. and so on... ]. The build bot suggested by Julien is yet another "tool" that accomplishes the same. One day the slave bot can use clang static analyzer ... :)

- Devang

From jon at ffconsultancy.com Wed Feb 18 11:14:54 2009
From: jon at ffconsultancy.com (Jon Harrop)
Date: Wed, 18 Feb 2009 17:14:54 +0000
Subject: [LLVMdev] LLVM C bindings
In-Reply-To: <4e5e476b0902180837o4ed31c68y1550ce78ec6156d5@mail.gmail.com>
References: <4e5e476b0902180837o4ed31c68y1550ce78ec6156d5@mail.gmail.com>
Message-ID: <200902181714.54224.jon@ffconsultancy.com>

On Wednesday 18 February 2009 16:37:54 Corrado Zoccolo wrote:
> On Mon, 16 Feb 2009 21:18:48 +0000
> Jon Harrop wrote:
> > An XML-RPC API would be trivial to use and extend from languages like
> > OCaml and Python and the interface code should not require any
> > maintenance at all.
>
> XML-RPC assumes you want an external server, or it can be used from
> the same process?

You can do it entirely in process by passing strings. This requires a minimal C API at either end and autogenerated serialization code.

> If SWIG is not mature enough, and/or doesn't provide the needed level
> of flexibility, we have other options:
>
> * write a llc backend that generates C-bindings for C++ compiled code
> (general solution, could replace SWIG).

Metaprogramming using LLVM itself is certainly the best way forward in the long term (IMHO) but it would be much more alluring if Clang could compile LLVM itself...
In the meantime, I think it would probably be much easier to reuse some existing tools for C++. Surely there are some mature tools to expose a C++ API in a language agnostic way?!

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

From delesley.spambox at googlemail.com Wed Feb 18 11:32:39 2009
From: delesley.spambox at googlemail.com (DeLesley SpamBox)
Date: Wed, 18 Feb 2009 10:32:39 -0700
Subject: [LLVMdev] Parametric polymorphism
Message-ID:

> I think the problem is deeper than that, in that LLVM has no official
> concept of a subtype, so I don't see how the idea of polymorphism
> could be defined in it.

Parametric polymorphism is different from subtype polymorphism; you can have one without the other. Parametric polymorphism just means that you can use type variables (like T) in the IR, which are later instantiated to actual types.

>> max (T a, T b) {
>>   if (a > b) return a; else return b;
>> }
>>
>
> Also, "Comparable" implies some kind of function associated with the
> type in order to actually perform it...

True, but I'm not worried about that; method tables are easy to add once type substitutions are allowed. Dropping down to a mythical low-level language that I call template C, we would have:

  struct ComparableDict<T> {
    bool (*equals)(T a, T b);
    bool (*leq)(T a, T b);
    bool (*geq)(T a, T b);
  }

  T max<T>(ComparableDict<T>* dict, T a, T b) {
    if ((*dict->geq)(a,b)) return a; else return b;
  }

This mechanism duplicates the dictionary-passing used in Haskell type classes. The correct dictionary for any given type is inferred by the high-level language, but it is passed as an argument at the llvm level. Notice that there is no subtyping. The only things I need from llvm are:

(1) The ability to define a parameterized type, e.g. ComparableDict<T>.
(2) The ability to define a function that is parameterized by a type, e.g. max<T>.
(3) The ability to substitute a type for a type variable, e.g. specialize max<T> to max<int>.
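For readers who want to try the dictionary-passing idea, here is a minimal sketch in ordinary C++, with C++ templates standing in for the hypothetical "template C" above. The names maxWithDict, kIntDict, intEquals, intLeq, intGeq and maxInt are made up for illustration and are not part of LLVM or of the original proposal. It shows only the shape of the mechanism: every comparison goes through the dictionary argument, and fixing both the type and the dictionary gives the call that constant propagation and inlining could then simplify.

```cpp
// Sketch only: C++ templates stand in for the "template C" pseudocode.
// All names here are hypothetical; none of this is LLVM API.

// A dictionary of comparison operations for some type T.
template <typename T>
struct ComparableDict {
  bool (*equals)(T a, T b);
  bool (*leq)(T a, T b);
  bool (*geq)(T a, T b);
};

// Generic max: it knows nothing about T except what the dictionary offers.
template <typename T>
T maxWithDict(const ComparableDict<T>* dict, T a, T b) {
  return dict->geq(a, b) ? a : b;
}

// The dictionary a front-end would infer and pass for int.
static bool intEquals(int a, int b) { return a == b; }
static bool intLeq(int a, int b) { return a <= b; }
static bool intGeq(int a, int b) { return a >= b; }
static const ComparableDict<int> kIntDict = {intEquals, intLeq, intGeq};

// max with both the type and the dictionary fixed. Because the dictionary
// is a known constant here, an optimizer is free to inline intGeq.
int maxInt(int a, int b) { return maxWithDict<int>(&kIntDict, a, b); }
```

Calling maxInt(3, 7) takes the indirect route through kIntDict.geq unless the optimizer removes it, which is the point of specializing on the dictionary as well as on the type.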
In order to get good performance, you will need to instantiate max with both a type T, and the dictionary for T. In other words, max must be partially evaluated with respect to a given type -and- a given dictionary. While this is obviously a complication, partial evaluation with respect to constant arguments is something that llvm is already quite capable of. If there is not a pass that does it already, I could easily write one. The problem is that I have no way of declaring a parameterized type or a parameterized function within the llvm IR. Note also that when I talk about ``types'', I am referring to low-level llvm types -- scalars, records, pointers, etc. I'm not referring to complex functional types with pattern matching as found in Haskell, or complex OOP classes as found in JVM or .NET with constructors, methods, and all that jazz. -DeLesley From dag at cray.com Wed Feb 18 11:35:38 2009 From: dag at cray.com (David Greene) Date: Wed, 18 Feb 2009 11:35:38 -0600 Subject: [LLVMdev] LLVM 2.4 Dominance Frontier Problem In-Reply-To: <2A9E9F2E-38F6-422F-8A51-54A40DB9F505@apple.com> References: <200902181000.40420.dag@cray.com> <2A9E9F2E-38F6-422F-8A51-54A40DB9F505@apple.com> Message-ID: <200902181135.39172.dag@cray.com> On Wednesday 18 February 2009 10:35, Devang Patel wrote: > On Feb 18, 2009, at 8:00 AM, David Greene wrote: > > I just finished upgrading our LLVM to 2.4 and I immediately ran into > > a problem > > with dominance frontier calculation: > > > > llvm/lib/VMCore/PassManager.cpp:714: void > > llvm::PMDataManager::verifyDomInfo(llvm::Pass&, llvm::Function&): > > Assertion > > `0 && "Invalid dominator info"' failed. > > > > Strangely enough, the Pass running when the assert triggers is > > Dominance > > Frontier Construction. > > > > This is somewhat puzzling. Any idea where to start looking? > > The PassManager must have printed dominator diffs. before this > assertion. Yes, it did. There are no diffs. 
> Go and fix the last pass that manipulated dominator info or
> file a PR with reproducible test case for mainline. IIRC, dom info
> verifier is not enabled by default.

No, it's not and it looks like that was the problem.

This looks like it's another misuse of C++. When I turn on --enable-expensive-checks, things blow up all over the place. There's an increment of a singular iterator in DominanceFrontierBase::compare.

I fixed this issue and that solved the problem on my current testcase.

We really need to start testing with --enable-expensive-checks. After finishing up the details of our merge, I'm going to go polish the validator some more.

-Dave

From dpatel at apple.com Wed Feb 18 11:45:54 2009
From: dpatel at apple.com (Devang Patel)
Date: Wed, 18 Feb 2009 09:45:54 -0800
Subject: [LLVMdev] LLVM 2.4 Dominance Frontier Problem
In-Reply-To: <200902181135.39172.dag@cray.com>
References: <200902181000.40420.dag@cray.com> <2A9E9F2E-38F6-422F-8A51-54A40DB9F505@apple.com> <200902181135.39172.dag@cray.com>
Message-ID: <665DB2F3-9D91-472E-B840-B6FFEEC20C8B@apple.com>

On Feb 18, 2009, at 9:35 AM, David Greene wrote:

>> Go and fix the last pass that manipulated dominator info or
>> file a PR with reproducible test case for mainline. IIRC, dom info
>> verifier is not enabled by default.
>
> No, it's not and it looks like that was the problem.
>
> This looks like it's another misuse of C++. When I turn on
> --enable-expensive-checks, things blow up all over the place.
> There's an increment of a singular iterator in
> DominanceFrontierBase::compare.
>
> I fixed this issue and that solved the problem on my current testcase.

Great! Please apply the fix to mainline also.

> We really need to start testing with --enable-expensive-checks. After
> finishing up the details of our merge, I'm going to go polish the
> validator some more.

Thanks!
- Devang

From Micah.Villmow at amd.com Wed Feb 18 12:14:12 2009
From: Micah.Villmow at amd.com (Villmow, Micah)
Date: Wed, 18 Feb 2009 10:14:12 -0800
Subject: [LLVMdev] Possible error in LegalizeDAG
Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C827DA1@ssanexmb1.amd.com>

I'm still trying to track down some alignment issues with loads (i.e. 8/16-bit loads being turned into 32-bit sign-extending loads) and I cannot for the life of me seem to figure out how to enter this section of code:

  // If this is an unaligned load and the target doesn't support it,
  // expand it.
  if (!TLI.allowsUnalignedMemoryAccesses()) {
    unsigned ABIAlignment = TLI.getTargetData()->
      getABITypeAlignment(LD->getMemoryVT().getTypeForMVT());
    if (LD->getAlignment() < ABIAlignment){
      Result = ExpandUnalignedLoad(cast<LoadSDNode>(Result.getNode()), DAG,
                                   TLI);
      Tmp1 = Result.getOperand(0);
      Tmp2 = Result.getOperand(1);
      Tmp1 = LegalizeOp(Tmp1);
      Tmp2 = LegalizeOp(Tmp2);
    }
  }

This is from LegalizeDAG.cpp:2146.

The problem that I see is that LD->getAlignment() is set via the call getMVTAlignment(VT) in SelectionDAG.cpp:3385, which in turn calls TLI.getTargetData()->getABITypeAlignment(Ty). So, from what I can see, the condition if (LD->getAlignment() < ABIAlignment) is never true, because both sides are derived from the same ABI alignment.

Even if I set in my DataLayout that i8 should have a 32-bit ABI alignment, this does not work because the load alignment is set to the ABI alignment instead of being set based on the actual bit size.

Any hints would be greatly appreciated; this is a blocking issue that I just cannot seem to resolve without modifying the LLVM codebase to remove the extend + load -> extload combining step.

Micah Villmow
Systems Engineer
Advanced Technology & Performance
Advanced Micro Devices Inc.
S1-609 One AMD Place
Sunnyvale, CA. 94085
P: 408-749-3966

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090218/2d66731f/attachment.html

From delesley.spambox at googlemail.com Wed Feb 18 13:53:25 2009
From: delesley.spambox at googlemail.com (DeLesley Hutchins)
Date: Wed, 18 Feb 2009 12:53:25 -0700
Subject: [LLVMdev] Parametric polymorphism
Message-ID:

Thanks for the detailed response! :-)

> This is by design. LLVM's type system is very low-level...

Yes, and it should remain low-level. :-)

> Expecting it to directly support generics seems a third-order-of-
> magnitude leap of faith. :) But there is good news for the faithful?

Let us distinguish between generics as found in Java or .NET, and parametric polymorphism in general. Generics are intimately tied to classes, which are big, complex, OOP-centric, and definitely not low-level. Parametric polymorphism only means that the LLVM IR should support type variables.

I do not regard type variables as being high-level, or a leap of faith. The llvm type system already supports recursive types, which are a whole lot more complicated than simple type variables. In fact, the current version of recursive types already has type variables -- they are called "up references". In type theory, recursive types are generally represented using the syntax:

  rec X. <type>

where <type> is a type expression. The variable X can appear within <type> and acts as a recursive reference to <type>. The llvm type { int, \1* } would thus ordinarily be written as: rec X. { int, X* }. I am proposing to extend the llvm type system with types of the form:

  forall X. <type>

Conceptually, this is not any higher-level than a recursive type.

> One could argue that LLVM could have much better support for type
> genericity by simply allowing full use of abstract data types (those
> containing opaque types) to be valid in IR, but not for codegen.

That's more or less what I'm suggesting. A type variable refers to an unknown, or ``opaque'' type, which then becomes known later when the variable is instantiated.
However, abstract data types are not necessarily invalid for codegen; it depends on how the types are used. More on this below... > Still, there are a large number of potential foibles here. For > instance, passing an argument can require platform-specific > contortions to conform to the platform ABI... Are those contortions done by the native code generator back-end, or are they done when the C compiler generates llvm IR? I'm assuming it's done by the back-end, because it would be bad if the C compiler had to generate different IR for every platform. If ABI conformance is done by the back-end however, then that's a good reason to put type specialization in llvm, where the back-end can see it. :-) > Given your concerns, you clearly have strong ideas about how type > specialization should be implemented; why do you think having LLVM > make the decision for you internally would be better than making the > decision yourself, as you can do today? I want llvm to do the specialization, because specialization is inextricably tied to similar optimizations, like inlining and partial evaluation. Doing it within llvm has many advantages, such as JIT support and link-time optimization. Moreover, specialization should really be done at the codegen level in order to do it properly. C++ templates are a great example of why *NOT* to do specialization within the high-level language. The first problem with C++ templates is that they don't support separate compilation, which makes it a royal pain to design and use template libraries. The library has to be distributed in source code form, compilation times go through the roof, and the linker has to weed out all the duplicate instantiations. The second problem with C++ templates is that every template instantiation always generates completely new code, even when it doesn't need to. 
Consider the following two functions:

  struct PairPtr<T> { T* first, T* second };
  struct Pair<T> { T first, T second };

  T* getSecondPtr<T>(PairPtr<T>* pair) { return pair->second; }
  T getSecond<T>(Pair<T>* pair) { return pair->second; }

The generated code for getSecondPtr() is the same for every T. There is absolutely no need to generate a bazillion specialized copies. Notice that the .NET value-type/reference-type distinction would be overly naive in this case: we can instantiate T to a value type and still get the exact same generated code.

The generated code for getSecond() needs to know the size of T in order to calculate the correct offset into the record. However, we still don't need to generate a bazillion specialized copies; we can simply pass the size of T as an argument:

  void getSecond(int sizeT, void* pair, void* result) {
    memcpy(result, ((char*) pair) + sizeT, sizeT);
  }

Transformations of this kind are most definitely not high-level. In order to avoid code bloat, I have been forced to completely bypass the llvm type system. Llvm is supposed to take care of determining offsets and allocating space for return values on the stack. By trying to do that myself, I may have created alignment issues on certain architectures, and possibly broken the platform ABI. Moreover, existing llvm optimization passes probably have no idea how to deal with the code.

The code above is somewhat contrived, but I do have a real reason for wanting to pass offsets as arguments. Offsets are an elegant way of implementing mixin classes, e.g.

  class Mix<X> extends X { ... }

If I implement Mix as a template, then every instantiation of Mix would generate new code for every method. However, if I simply pass the size of X as a hidden argument to each method, then the code can be generated once, and compiled to a separate library.
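The size-passing version of getSecond() above can be tried out in plain C++. This is only a sketch under stated assumptions: PairDouble, PairChar, secondOfDoublePair and secondOfCharPair are hypothetical names chosen for the example, and it relies on both fields having the same type, so that the second field sits exactly sizeof(T) bytes into the record with no padding in between. A record with mixed field types would need the real offset passed instead of the size.

```cpp
#include <cstddef>
#include <cstring>

// One compiled copy serves every T: the caller supplies sizeof(T).
// Assumes the record is two fields of the same type T, so the second
// field starts at byte offset sizeT (no padding between equal-typed fields).
void getSecond(std::size_t sizeT, const void* pair, void* result) {
  std::memcpy(result, static_cast<const char*>(pair) + sizeT, sizeT);
}

// Two hypothetical instantiations of the Pair record, for illustration.
struct PairDouble { double first; double second; };
struct PairChar { char first; char second; };

// Thin typed wrappers, standing in for what a front-end would emit
// at each use site.
double secondOfDoublePair(const PairDouble& p) {
  double out;
  getSecond(sizeof(double), &p, &out);
  return out;
}

char secondOfCharPair(const PairChar& p) {
  char out;
  getSecond(sizeof(char), &p, &out);
  return out;
}
```

Both wrappers share the single getSecond body, which is the code-bloat saving described above. The cost is the one already mentioned: the offset arithmetic is done by hand rather than by the type system.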
-DeLesley From delesley.spambox at googlemail.com Wed Feb 18 14:14:10 2009 From: delesley.spambox at googlemail.com (DeLesley Hutchins) Date: Wed, 18 Feb 2009 13:14:10 -0700 Subject: [LLVMdev] Parametric polymorphism Message-ID: >> How difficult would it be to add such a capability to llvm? I was thinking >> of marking type variables like T as opaque types for the initial codegen, >> and then writing a custom pass that instantiates them to real types. >> However, I don't know if that would confuse or break other parts of the >> compiler infrastructure; parametric polymorphism is not necessarily a >> trivial modification. > > What complications do you forsee? The biggest change is that every type can suddenly contain type variables. That could possibly confuse anything that happens to be looking at the types, which I assume is almost every compiler pass. I'm not familiar with llvm internals, so I don't have a good sense of the scale of the change. >> Is anyone on this list familiar with System F, System F_sub, or System >> F^\omega_sub?... Pick one, plug it into llvm, and you have a type >> system that can compete with the JVM or .NET in terms of functionality, >> without being OOP centric or sacrificing language neutrality. (System F is >> low-level -- OOP can be easily implemented on top of it). > > Sounds like a dream come true. Where's the catch? ;-) As far as I'm concerned, there is no catch, aside from the work required to make it happen. The biggest technical problem is that type specialization can be implemented in a couple of different ways, and there are different time/space tradeoffs for each technique. System F_omega tells you how to do type-checking with parameterized types; it doesn't tell you how to generate efficient code. If the actual specialization was encapsulated within a separate pass, then different languages could use whatever technique was most appropriate. 
Unfortunately, different techniques are not binary compatible, so that would make it hard to design a universal type system. :-( -DeLesley From elijah.epifanov at gmail.com Wed Feb 18 14:25:47 2009 From: elijah.epifanov at gmail.com (Elijah Epifanov) Date: Wed, 18 Feb 2009 23:25:47 +0300 Subject: [LLVMdev] svn pre-commit hook: help needed Message-ID: <59dd1aec0902181225t1eb78a0am2a6b31aeaea382a4@mail.gmail.com> > On Feb 18, 2009, at 1:41 AM, Julien Lerouge wrote: > > > Yet another _fun_ way of doing this is to setup a buildbot slave just > > for that. The slave can fix minor stuff like tabs and trailing > > whitespaces on its own (checking the changes back in), and yell for > > things like 80-col violations and whatnot where the changes would > > not be > > so trivial. > > If you're going to change anything then this is the best alternative, > otherwise I can live with status quo. > > Do not reject commit just because of formatting issues. It can have > serious -ve impact on productivity. > > To folks who prefers to reject commits due to formatting errors -- You > already rely on a some kind of "tool" to make your day to day life > easier. [ Most likely you've your editor automatically replacing tabs > into spaces. Your terminal window is only 80 col. wide or your editor > is displaying a vertical line to warn you about 80 col. and so > on... ]. The build bot suggested by Julien is yet another "tool" that > accomplishes the same. One the slave bot can use clang static > analyzer ... :) > > - > Devang 1. buildbot doing reformats is worse than not taking care of formatting at all. 2. modifying svn transactions is a *crime* (the same effect as making a whole [every field, every function] c++ program const-qualified, and using const_cast everywhere to live with it) 3. pre-commit hook rejecting not properly formatted commits *is* a good thing. 
We use it at work and it saves tens or even a few hundred man-hours per month when merging commits between different releases, while costing less than 1 minute per commit (a formatter run per project takes about 1 min). At llvm, a code style policy probably will not save that much, because you don't have to maintain 3-5 branches (trunk, testing, production + approx 2 large branches for integration with other projects or major features not fitting in the release cycle). And I strongly suggest you use a very strict policy - this helps merging a lot.

Hope that helps (our [depersonalized] pre-commit hook included)

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pre-commit-example.sh
Type: application/x-sh
Size: 3317 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090218/c1e73bac/attachment-0001.sh

From gordonhenriksen at me.com Wed Feb 18 14:27:37 2009
From: gordonhenriksen at me.com (Gordon Henriksen)
Date: Wed, 18 Feb 2009 15:27:37 -0500
Subject: [LLVMdev] Parametric polymorphism
In-Reply-To:
References:
Message-ID: <527244AB-EE18-4884-84BE-A27F4FC33D04@me.com>

On 2009-02-18, at 14:53, DeLesley Hutchins wrote:

> On 2009-02-18, at 08:06, Gordon Henriksen wrote:
>
>> Still, there are a large number of potential foibles here. For
>> instance, passing an argument can require platform-specific
>> contortions to conform to the platform ABI...
>
> Are those contortions done by the native code generator back-end, or
> are they done when the C compiler generates llvm IR? I'm assuming
> it's done by the back-end, because it would be bad if the C compiler
> had to generate different IR for every platform.

It's done by the front-end. There are a variety of attributes and mechanisms which are used to convolute data and marshal it through call sites in an ABI-conformant manner.
> I want llvm to do the specialization, because specialization is
> inextricably tied to similar optimizations, like inlining and
> partial evaluation. Doing it within llvm has many advantages, such
> as JIT support and link-time optimization.

These are IR-level optimizations, which LLVM does not magically do of its own accord. If LLVM transparently performs specialization, then no post-specialization IR optimizations can be performed.

-- Gordon

From me22.ca at gmail.com Wed Feb 18 14:32:29 2009
From: me22.ca at gmail.com (me22)
Date: Wed, 18 Feb 2009 15:32:29 -0500
Subject: [LLVMdev] Parametric polymorphism
In-Reply-To:
References:
Message-ID:

On Wed, Feb 18, 2009 at 12:32, DeLesley SpamBox wrote:
>> I think the problem is deeper than that, in that LLVM has no official
>> concept of a subtype, so I don't see how the idea of polymorphism
>> could be defined in it.
>
> Parametric polymorphism is different from subtype polymorphism; you
> can have one without the other. Parametric polymorphism just means
> that you can use type variables (like T) in the IR, which are later
> instantiated to actual types.
>

I was thinking of the "T extends Comparable" part, which does involve subtype polymorphism. Apologies if I'm getting terms mixed up.

>
> True, but I'm not worried about that; method tables are easy to add once type
> substitutions are allowed. Dropping down to a mythical low-level language
> that I call template C, we would have:
>
> struct ComparableDict<T> {
>   bool (*equals)(T a, T b);
>   bool (*leq)(T a, T b);
>   bool (*geq)(T a, T b);
> }
>
> T max<T>(ComparableDict<T>* dict, T a, T b) {
>   if ((*dict->geq)(a,b)) return a; else return b;
> }
>

What do the parametrized types give you that you don't get from using opaque instead of T? Type checking has already passed by the time it reaches this level, and you would simply add the appropriate bitcasts when generating the functions to be added to the dictionaries.
You could even safely bitcast integral and floating-point values to pass them through the opaque to avoid boxing. > > In order to get good performance, you will need to instantiate max > with both a type > T, and the dictionary for T. In other words, max must be partially > evaluated with > respect to a given type -and- a given dictionary. While this is > obviously a complication, > partial evaluation with respect to constant arguments is something that llvm is > already quite capable of. If there is not a pass that does it > already, I could easily write > one. The problem is that I have no way of declaring a parameterized type or a > parameterized function within the llvm IR. > I wonder if the "instantiation" could be done instead as a normal pass taking advantage of a general call-site-context-sensitive inter-procedural points-to analysis. Getting rid of the indirection in the generic dictionary seems like the same problem as devirtualizing method calls, so a unified solution would be nice. On Wed, Feb 18, 2009 at 14:53, DeLesley Hutchins wrote: > > Moreover, specialization should really be done at the codegen level > in order to do it properly. C++ templates are a great example of > why *NOT* to do specialization within the high-level language. > But specialization (in the C++ template sense) is also a great example of why it's needed in the host language, as is overloading. > > The generated code for getSecond() needs to know the size of T in > order to calculate the correct offset into the record. However, > we still don't need to generate a bazillion specialized copies; > we can simply pass the size of T as an argument: > > void getSecond(int sizeT, void* pair, void* result) { > return memcpy(result, ((char*) pair) + sizeT, sizeT); > } > Of course, that only works for POD types, so you still need a different instantiation for std::string, std::vector, ... 
There's no possibility of implementing C++'s complicated lookup and resolution rules down at the LLVM level, and since you need the instantiations just to parse C++, using it as an example for why LLVM should do instantiation seems flawed. Once you restrict yourself to generics, you're only ever passing pointers around, which, as you said, is the easy case (relatively), since you don't need the type information at all once past the front-end. Thanks for putting up with the newbie, ~ Scott From delesley.spambox at googlemail.com Wed Feb 18 14:57:02 2009 From: delesley.spambox at googlemail.com (DeLesley Hutchins) Date: Wed, 18 Feb 2009 13:57:02 -0700 Subject: [LLVMdev] Parametric polymorphism In-Reply-To: <527244AB-EE18-4884-84BE-A27F4FC33D04@me.com> References: <527244AB-EE18-4884-84BE-A27F4FC33D04@me.com> Message-ID: > It's done by the front-end. There are a variety of attributes and > mechanisms which are used to convolute data and marshall it through > call sites in an ABI-conformant manner. Oh dear. :-( Do the attributes change depending on the type? I would assume that attributes like "ccc" are type-invariant; i.e. every instantiation should use the C-calling convention, whatever that happens to be for the types in question. >> I want llvm to do the specialization, because specialization is >> inextricably tied to similar optimizations, like inlining and >> partial evaluation. Doing it within llvm has many advantages, such >> as JIT support and link-time optimization. > > These are IR-level optimizations, which LLVM does not magically do of > its own accord. If LLVM transparently performs specialization, then no > post-specialization IR optimizations can be performed. Since these are IR-level optimizations, the logical place to put type variables is in the IR. If type variables only exist in the high-level language, then it's impossible for the llvm linker, JIT, or optimizers, or code generators to do anything intelligent with them. :-) I don't expect magic. 
If type specialization is implemented as an IR pass, then why couldn't post-specialization IR optimizations be performed? -DeLesley From jon at ffconsultancy.com Wed Feb 18 15:25:57 2009 From: jon at ffconsultancy.com (Jon Harrop) Date: Wed, 18 Feb 2009 21:25:57 +0000 Subject: [LLVMdev] Parametric polymorphism In-Reply-To: References: <527244AB-EE18-4884-84BE-A27F4FC33D04@me.com> Message-ID: <200902182125.57857.jon@ffconsultancy.com> On Wednesday 18 February 2009 20:57:02 DeLesley Hutchins wrote: > > It's done by the front-end. There are a variety of attributes and > > mechanisms which are used to convolute data and marshall it through > > call sites in an ABI-conformant manner. > > Oh dear. :-( I think many people were confused by this at first but an excellent counter example was provided in a previous thread: C99 ABIs can require that struct return values are returned via a pointer to a preallocated struct passed as an auxiliary argument *except* when you're talking about a C99 complex, in which case the return value is conveyed in a completely different way. So this can only be done in the front-end. > Do the attributes change depending on the type? You change the function signature depending whether the return type is a struct or not if you want C compatibility, if that is what you mean. > >> I want llvm to do the specialization, because specialization is > >> inextricably tied to similar optimizations, like inlining and > >> partial evaluation. Doing it within llvm has many advantages, such > >> as JIT support and link-time optimization. > > > > These are IR-level optimizations, which LLVM does not magically do of > > its own accord. If LLVM transparently performs specialization, then no > > post-specialization IR optimizations can be performed. > > Since these are IR-level optimizations, the logical place to put type > variables is in the IR. 
If type variables only exist in the high-level > language, then it's impossible for the llvm linker, JIT, or optimizers, > or code generators to do anything intelligent with them. :-) What would you want them to do? > I don't expect magic. If type specialization is implemented as an IR > pass, then why couldn't post-specialization IR optimizations be > performed? They could but my impression is that this will only ever be an academic exercise: I doubt LLVM's type system will ever be changed in such a fundamental way simply because the back-end is reaching maturity and that would destabilize a large part of the library but also because there is no clear advantage in doing that, not only for the majority of LLVM users but also in the context of its goals. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From delesley.spambox at googlemail.com Wed Feb 18 15:27:21 2009 From: delesley.spambox at googlemail.com (DeLesley Hutchins) Date: Wed, 18 Feb 2009 14:27:21 -0700 Subject: [LLVMdev] Parametric polymorphism In-Reply-To: References: Message-ID: > I was thinking of the "T extends Comparable" part, which does involve > subtype polymorphism. Apologies if I'm getting terms mixed up. It was a bad example -- not close enough to actual LLVM. :-) > What do the parametrized types give you that you don't get from using > opaque instead of T? Possibly nothing. I don't really understand the limitations of opaque. Is it possible to declare a structure type { int, opaque, int }, and then use getelementptr to retrieve the third element? I'm guessing not, because there's no way for the code generator to calculate the correct offset. If T is a type variable, then the type { int, T, int } should be valid. Eventually of course, the native code generator will need to know the size of T, but that shouldn't worry other optimization passes that are only dealing with the IR. 
> Type checking has already passed by the time it > reaches this level, and you would simply add the appropriate bitcasts > when generating the functions to be added to the dictionaries. Assuming that all values are 32-bit. Oops -- can't use doubles, long doubles, or compile to 64-bit architectures. Doh! ;-) > I wonder if the "instantiation" could be done instead as a normal pass Yes, and it should. > inter-procedural points-to analysis. Getting rid of the indirection > in the generic dictionary seems like the same problem as > devirtualizing method calls, so a unified solution would be nice. Devirtualization is actually a lot trickier, since it relies on whole program analysis and type information from the high-level language; we need to know that a class C has no subclasses in order to devirtualize its methods. Getting rid of the dictionary indirection is a simple matter of constant propagation and inlining; no type wizardry required. (This is one reason why I'm not a big fan of OOP type systems.) > But specialization (in the C++ template sense) is also a great example > of why it's needed in the host language, as is overloading. There's no reason why specialization -has- to be done by llvm. Idiotic languages like C++ can do it themselves, badly, if they want. ;-) I want llvm to do it for me because it can do a better job. ;-) >> void getSecond(int sizeT, void* pair, void* result) { >> return memcpy(result, ((char*) pair) + sizeT, sizeT); >> } > > Of course, that only works for POD types, so you still need a > different instantiation for std::string, std::vector, ... Every llvm type is a POD type. It's either a scalar, a vector, an array, a struct, or a pointer. Every one of those types has a fixed size. The C++ compiler is supposed to translate complicated thingies like std::string into a POD type that llvm can understand. 
> There's no possibility of implementing C++'s complicated lookup and
> resolution rules down at the LLVM level,

Why on earth would we want to do anything like C++ lookup? I tried
writing a C++ parser once, and I think the Obama administration can easily
use it as a geneva-convention-friendly alternative to waterboarding on
suspected terrorists. :-)

> Once you restrict yourself to generics, you're only ever passing
> pointers around, which, as you said, is the easy case (relatively),
> since you don't need the type information at all once past the
> front-end.

Yes, and Java generics are dog-slow because they can only use pointers.
Try implementing a generic complex number class in Java, and watch the
two-order-of-magnitude drop in performance on scientific code.

-DeLesley

From delesley.spambox at googlemail.com Wed Feb 18 15:43:30 2009
From: delesley.spambox at googlemail.com (DeLesley Hutchins)
Date: Wed, 18 Feb 2009 14:43:30 -0700
Subject: [LLVMdev] Parametric polymorphism
In-Reply-To: <200902182125.57857.jon@ffconsultancy.com>
References: <527244AB-EE18-4884-84BE-A27F4FC33D04@me.com> <200902182125.57857.jon@ffconsultancy.com>
Message-ID:

> I think many people were confused by this at first but an excellent counter
> example was provided in a previous thread: C99 ABIs can require that struct
> return values are returned via a pointer to a preallocated struct passed as
> an auxiliary argument *except* when you're talking about a C99 complex, in
> which case the return value is conveyed in a completely different way.

Thanks for the explanation. That definitely does make type instantiation
in the IR a whole lot more annoying.

>> I don't expect magic. If type specialization is implemented as an IR
>> pass, then why couldn't post-specialization IR optimizations be
>> performed?
>
> They could but my impression is that this will only ever be an academic
> exercise: I doubt LLVM's type system will ever be changed in such a
> fundamental way simply because the back-end is reaching maturity and that
> would destabilize a large part of the library but also because there is no
> clear advantage in doing that, not only for the majority of LLVM users but
> also in the context of its goals.

The majority of llvm users are using llvm to compile C, or things that are
similar to C. People who want to compile something other than C (Haskell,
ML, OCaml, C#, etc.) would benefit from having type variables in the IR.

Focusing too much on one language leads to a limited VM. Witness the JVM
(only supports Java), or MSIL (designed for C#, with other features tacked
on as an afterthought).

What you say about maturity and destabilization is probably true, but
that's more of a political problem or a manpower problem, not a technical
problem.

-DeLesley

From jon at ffconsultancy.com Wed Feb 18 15:53:59 2009
From: jon at ffconsultancy.com (Jon Harrop)
Date: Wed, 18 Feb 2009 21:53:59 +0000
Subject: [LLVMdev] Parametric polymorphism
In-Reply-To:
References:
Message-ID: <200902182153.59104.jon@ffconsultancy.com>

On Wednesday 18 February 2009 21:27:21 DeLesley Hutchins wrote:
> Try implementing a generic complex number class in Java, and watch the
> two-order-of-magnitude drop in performance on scientific code.

Amen. I haven't proven it with a working HLVM yet but I believe LLVM will
make it possible (even easy?) to generate extremely performant code from
heavily abstracted high-level source. Complex numbers are a great example
where the JVM is terrible and .NET is much better but still many times
slower than necessary.

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

From jon at ffconsultancy.com Wed Feb 18 16:31:58 2009
From: jon at ffconsultancy.com (Jon Harrop)
Date: Wed, 18 Feb 2009 22:31:58 +0000
Subject: [LLVMdev] Parametric polymorphism
In-Reply-To:
References: <200902182125.57857.jon@ffconsultancy.com>
Message-ID: <200902182231.58323.jon@ffconsultancy.com>

On Wednesday 18 February 2009 21:43:30 DeLesley Hutchins wrote:
> The majority of llvm users are using llvm to compile C, or things that are
> similar to C.

C++, Objective-C and Cg.

> People who want to compile something other than C (Haskell,
> ML, OCaml, C#, etc.) would benefit from having type variables in the
> IR.

Absolutely. My HLVM is specifically designed to support MLs. However, I
believe my project can only be completed with reasonable effort by building
upon LLVM and taking a minimalist approach to placing requirements upon
LLVM. I am not only avoiding contributing revolutionary changes to LLVM
myself, I am avoiding everyone else's experimental features as well if at
all possible.

> Focusing too much on one language leads to a limited VM. Witness the JVM
> (only supports Java), or MSIL (designed for C#, with other features tacked
> on as an afterthought).

I agree completely but I do not believe that justifies making radical
changes to LLVM itself. Indeed, you can do a perfectly good job by building
upon LLVM precisely because LLVM does provide the esoteric low-level
features that you want (e.g. tail call elimination is better in LLVM than
on .NET!).

> What you say about maturity and destabilization is probably true, but
> that's more of a political problem or a manpower problem, not a technical
> problem.

Absolutely. I believe your proposal will not go ahead for non-technical
reasons. That is not to say that it would not be wonderful to have a HLVM
with such features that is built upon LLVM. Indeed, that is precisely what
I am trying to accomplish.
It will certainly not be theoretically optimal but it will exist and, I
believe, it will be of great practical value.

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

From delesley.spambox at googlemail.com Wed Feb 18 16:29:41 2009
From: delesley.spambox at googlemail.com (DeLesley Hutchins)
Date: Wed, 18 Feb 2009 15:29:41 -0700
Subject: [LLVMdev] Parametric polymorphism
In-Reply-To: <200902182153.59104.jon@ffconsultancy.com>
References: <200902182153.59104.jon@ffconsultancy.com>
Message-ID:

>> Try implementing a generic complex number class in Java, and watch the
>> two-order-of-magnitude drop in performance on scientific code.
>
> Amen. I haven't proven it with a working HLVM yet but I believe LLVM will make
> it possible (even easy?) to generate extremely performant code from heavily
> abstracted high-level source.

Partial evaluation is a fantastic way to completely eliminate the overhead
of using very abstract, high-level interfaces. Every computation on static
data can be eliminated by a partial evaluator, and in most cases, all of
the overhead of a high-level interface involves information that is
statically known. Partial evaluation is also old technology and very simple
to implement; I am continually amazed that it hasn't been more widely
applied.

I think the biggest issue is that partial evaluation works best for
programs that are free of side-effects. Most imperative programs do a lot
of their computation by mutating the heap, and heap mutations are a lot
harder to track.

In any case, combining partial evaluation in a HLVM with the optimization
passes provided by LLVM should yield blazingly fast code.

The goal of making Java or C# ``as fast as C'' has always seemed somewhat
uninspiring to me. C is actually rather hard to optimize; a high-level DSL
should easily be able to outperform C on any task within its particular
domain.
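
As a hand-worked illustration of partial evaluation (not the output of any
actual partial evaluator), consider exponentiation by squaring where the
exponent is statically known. The function names below are hypothetical.
With n fixed at 3, every test and update of n can be evaluated ahead of
time, leaving only the multiplications on the dynamic input x:

```c
/* General version: both x and n are dynamic. */
static double power(double x, unsigned n) {
    double r = 1.0;
    while (n > 0) {
        if (n & 1u)
            r *= x;   /* fold in the current power of x */
        x *= x;       /* square for the next bit of n   */
        n >>= 1;
    }
    return r;
}

/* Residual program for the static input n == 3: the loop, the bit tests
   and the shifts have all been computed away by hand. */
static double power_3(double x) {
    return x * (x * x);
}
```

The residual function contains no trace of the loop control, which is the
kind of overhead removal described above; the side-effect caveat applies
because this only works so directly when the static computation does not
depend on mutable state.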
-DeLesley

From lennart at augustsson.net Wed Feb 18 16:51:03 2009
From: lennart at augustsson.net (Lennart Augustsson)
Date: Wed, 18 Feb 2009 23:51:03 +0100
Subject: [LLVMdev] Parametric polymorphism
In-Reply-To:
References: <527244AB-EE18-4884-84BE-A27F4FC33D04@me.com> <200902182125.57857.jon@ffconsultancy.com>
Message-ID:

Why do you say that people who compile, e.g., functional languages
would benefit from type variables in LLVM?
I like the level the LLVM is at, and would prefer to deal with
instantiating parametric polymorphism at a higher level.

On Wed, Feb 18, 2009 at 10:43 PM, DeLesley Hutchins wrote:
>> I think many people were confused by this at first but an excellent counter
>> example was provided in a previous thread: C99 ABIs can require that struct
>> return values are returned via a pointer to a preallocated struct passed as
>> an auxiliary argument *except* when you're talking about a C99 complex, in
>> which case the return value is conveyed in a completely different way.
>
> Thanks for the explanation. That definitely does make type instantiation in
> the IR a whole lot more annoying.
>
>>> I don't expect magic. If type specialization is implemented as an IR
>>> pass, then why couldn't post-specialization IR optimizations be
>>> performed?
>>
>> They could but my impression is that this will only ever be an academic
>> exercise: I doubt LLVM's type system will ever be changed in such a
>> fundamental way simply because the back-end is reaching maturity and that
>> would destabilize a large part of the library but also because there is no
>> clear advantage in doing that, not only for the majority of LLVM users but
>> also in the context of its goals.
>
> The majority of llvm users are using llvm to compile C, or things that are
> similar to C. People who want to compile something other than C (Haskell,
> ML, OCaml, C#, etc.) would benefit from having type variables in the IR.
>
> Focusing too much on one language leads to a limited VM.
> Witness the JVM (only supports Java), or MSIL (designed for C#, with other
> features tacked on as an afterthought).
>
> What you say about maturity and destabilization is probably true, but
> that's more of a political problem or a manpower problem, not a technical
> problem.
>
> -DeLesley
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

From eli.friedman at gmail.com Wed Feb 18 17:01:04 2009
From: eli.friedman at gmail.com (Eli Friedman)
Date: Wed, 18 Feb 2009 15:01:04 -0800
Subject: [LLVMdev] Possible error in LegalizeDAG
In-Reply-To: <5BA674C5FF7B384A92C2C95D8CC71E1C827DA1@ssanexmb1.amd.com>
References: <5BA674C5FF7B384A92C2C95D8CC71E1C827DA1@ssanexmb1.amd.com>
Message-ID:

On Wed, Feb 18, 2009 at 10:14 AM, Villmow, Micah wrote:
> I'm still trying to track down some alignment issues with loads (i.e. 8/16
> bit loads being turned into 32bit sign extending loads) and I cannot for the
> life of me seem to figure out how to enter this section of code:
>
> // If this is an unaligned load and the target doesn't support it,
> // expand it.

Why do you expect to enter this section of code? It's impossible for
an i8 load to be unaligned.

> Any hints would be greatly appreciated, this is a blocking issue that I just
> cannot seem to resolve without modifying the LLVM codebase to remove the
> extend + load -> extload combining step.

LLVM will "uncombine" it for you if you use setLoadExtAction with the
appropriate arguments.
-Eli

From delesley.spambox at googlemail.com Wed Feb 18 17:36:27 2009
From: delesley.spambox at googlemail.com (DeLesley Hutchins)
Date: Wed, 18 Feb 2009 16:36:27 -0700
Subject: [LLVMdev] Parametric polymorphism
In-Reply-To:
References: <527244AB-EE18-4884-84BE-A27F4FC33D04@me.com> <200902182125.57857.jon@ffconsultancy.com>
Message-ID:

> Why do you say that people who compile, e.g., functional languages
> would benefit from type variables in LLVM?
> I like the level the LLVM is at, and would prefer to deal with
> instantiating parametric polymorphism at a higher level.

I'm surprised you're happy with a non-polymorphic llvm. Does
Cayenne target llvm? Dependent types take polymorphism to new
heights -- but perhaps you feel that since llvm cannot hope to
match Cayenne, you might as well do everything yourself. :-)

My reasons are:

First, if type variables are handled at the higher-level, then every
functional language will have its own implementation. It will not be
possible to define and reuse optimization passes between languages,
which is one of the goals of llvm.

Second, polymorphism can be made more efficient if it has low-level
codegen support -- see my earlier offset code.

Third, any language which has polymorphism is faced with the task
of ``erasing'' it before it can hand the code to llvm. Such erasure can be
done in different ways -- compare Java type erasure (use pointers,
instantiate nothing), with C++ (instantiate everything), with .Net
(instantiate some things). (I don't know how OCaml does it.) It is very
difficult to share code between languages if the erasure semantics differ
so widely.

-DeLesley

From dag at cray.com Wed Feb 18 18:49:38 2009
From: dag at cray.com (David Greene)
Date: Wed, 18 Feb 2009 18:49:38 -0600
Subject: [LLVMdev] Possible DAGCombiner or TargetData Bug
Message-ID: <200902181849.38871.dag@cray.com>

I got bit by this in LLVM 2.4 DAGCombiner.cpp and it's still in trunk:

SDValue DAGCombiner::visitSTORE(SDNode *N) {

[...]
  // If this is a store of a bit convert, store the input value if the
  // resultant store does not need a higher alignment than the original.
  if (Value.getOpcode() == ISD::BIT_CONVERT && !ST->isTruncatingStore() &&
      ST->isUnindexed()) {
    unsigned Align = ST->getAlignment();
    MVT SVT = Value.getOperand(0).getValueType();
    unsigned OrigAlign = TLI.getTargetData()->
      getABITypeAlignment(SVT.getTypeForMVT());
    if (Align <= OrigAlign &&
        ((!LegalOperations && !ST->isVolatile()) ||
         TLI.isOperationLegalOrCustom(ISD::STORE, SVT)))
      return DAG.getStore(Chain, N->getDebugLoc(), Value.getOperand(0),
                          Ptr, ST->getSrcValue(),
                          ST->getSrcValueOffset(), ST->isVolatile(),
                          OrigAlign);
  }

Uhh...this doesn't seem legal to me. How can we just willy-nilly create a
store with a greater alignment? In this case Align is 8 and OrigAlign is 16
because SVT.getTypeForMVT() is Type::VectorTyID (<2 x i64>) which has an ABI
type of VECTOR_ALIGN.

Hmm...why is the ABI alignment for VectorTyID 16? The ABI certainly doesn't
guarantee it. It only guarantees it for __int128, __float128 and __m128.
Lots of other types can map to <2 x i64>.

Any opinions on this?

-Dave

From scottm at aero.org Wed Feb 18 18:57:28 2009
From: scottm at aero.org (Scott Michel)
Date: Wed, 18 Feb 2009 16:57:28 -0800
Subject: [LLVMdev] Modeling GPU vector registers, again (with my implementation)
In-Reply-To: <5BA674C5FF7B384A92C2C95D8CC71E1C827A08@ssanexmb1.amd.com>
References: <4d77c5f20902130947s349dcab6r74e2057dd18161@mail.gmail.com><8CE3FC1A-1211-48E4-A319-AAC6FF346039@apple.com> <22034856.post@talk.nabble.com> <5BA674C5FF7B384A92C2C95D8CC71E1C827A08@ssanexmb1.amd.com>
Message-ID:

On Feb 16, 2009, at 9:24 AM, Villmow, Micah wrote:

> In order to get swizzling to work you only need to handle three
> SDNodes, insert_vector_elt, extract_vector_elt and build_vector while
> expanding the rest.
> For those three nodes I then custom lowered them to a target specific
> node with an extra integer constant per register that would encode the
> swizzle mask in 32bits.

Villmow, Micah:

This problem argues for why SDNode should be target polymorphic. If
they were target polymorphic, then a target-specific node would be
largely unnecessary. (By a target-specific node, I mean extending the
ISD enumeration for your target.)

A target polymorphic SDNode would still capture all of the behaviors
and attributes with insert_vector_elt, extract_vector_elt and
build_vector, but also allow you to add additional behaviors and
attributes. Which is mostly the point in object-oriented programming.
Assuming you don't need to do extra DAGCombine work, you would get that
for free from the parent class.

Unfortunately, that would mean a lot of work at this juncture and a
heavy overhaul of SelectionDAGNodes.h and associated SelectionDAG
source. Node allocation, in particular, would become more complicated
(but not unsolvable.)

Were anyone going to tackle this problem, the solution would have to be
largely incremental, i.e., the source can't be overhauled all at once,
but should permit incremental transition of SDNodes to a behavioral
interface style.

-scooter

From jon at ffconsultancy.com Wed Feb 18 19:21:33 2009
From: jon at ffconsultancy.com (Jon Harrop)
Date: Thu, 19 Feb 2009 01:21:33 +0000
Subject: [LLVMdev] Parametric polymorphism
In-Reply-To:
References:
Message-ID: <200902190121.33667.jon@ffconsultancy.com>

On Wednesday 18 February 2009 23:36:27 DeLesley Hutchins wrote:
> > Why do you say that people who compile, e.g., functional languages
> > would benefit from type variables in LLVM?
> > I like the level the LLVM is at, and would prefer to deal with
> > instantiating parametric polymorphism at a higher level.
>
> I'm surprised you're happy with a non-polymorphic llvm. Does
> Cayenne target llvm?
> Dependent types take polymorphism to new
> heights -- but perhaps you feel that since llvm cannot hope to
> match Cayenne, you might as well do everything yourself. :-)
>
> My reasons are:
>
> First, if type variables are handled at the higher-level, then every
> functional language will have its own implementation.

No, they just need to share a HLVM.

> Second, polymorphism can be made more efficient if it has low-level
> codegen support -- see my earlier offset code.

In theory, perhaps.

> Third, any language which has polymorphism is faced with the task
> of ``erasing'' it before it can hand the code to llvm. Such erasure can be
> done in different ways -- compare Java type erasure (use pointers,
> instantiate nothing), with C++ (instantiate everything), with .Net
> (instantiate some things). (I don't know how Ocaml does it) It is very
> difficult to share code between languages if the erasure semantics differ
> so widely.

The same can be said of closures, garbage collection and a dozen other
features that also cannot feasibly be added to LLVM.

The only logical solution is to build a HLVM on top of LLVM and share that
between these high-level language implementations.

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

From Micah.Villmow at amd.com Wed Feb 18 19:18:53 2009
From: Micah.Villmow at amd.com (Villmow, Micah)
Date: Wed, 18 Feb 2009 17:18:53 -0800
Subject: [LLVMdev] Bug in BranchFolding.cpp:OptimizeBlock
Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C827E9C@ssanexmb1.amd.com>

I've run across an issue in BranchFolding.cpp where it is incorrectly
folding a branch to the wrong fallthrough location. This is in LLVM 2.4 and
seems to be in 2.5 also.

The code in question is:

void BranchFolder::OptimizeBlock(MachineBasicBlock *MBB) {
  MachineFunction::iterator FallThrough = MBB;
  ++FallThrough;

  // If this block is empty, make everyone use its fall-through, not the block
  // explicitly.
  // Landing pads should not do this since the landing-pad table
  // points to this block.
  if (MBB->empty() && !MBB->isLandingPad()) {
    // Dead block? Leave for cleanup later.
    if (MBB->pred_empty()) return;

    if (FallThrough == MBB->getParent()->end()) {
      // TODO: Simplify preds to not branch here if possible!
    } else {
      // Rewrite all predecessors of the old block to go to the fallthrough
      // instead.
      while (!MBB->pred_empty()) {
        MachineBasicBlock *Pred = *(MBB->pred_end()-1);
        Pred->ReplaceUsesOfBlockWith(MBB, FallThrough);
      }

      // If MBB was the target of a jump table, update jump tables to go to the
      // fallthrough instead.
      MBB->getParent()->getJumpTableInfo()->
        ReplaceMBBInJumpTables(MBB, FallThrough);
      MadeChange = true;
    }
    return;
  }

The problem with this section of code is that FallThrough is not guaranteed
to be a successor of MBB or even a descendant of MBB.

The bitcode I've attached is a case where there are 5 basic blocks, where
the first four end with conditional branches to an early return, as
specified with initial.dot. TailMergeBlocks in
BranchFolding::runOnMachineFunction merges the 4 early return blocks to a
single basic block and renumbers them, as specified with tailmerge.dot.
When it runs optimize block on the if.end14 block, it enters the above
segment of code, removing it and replacing it with FallThrough, which is
NOT its successor block and links two blocks changing the structure of the
program as shown in Optimizeblock.dot.

I've attached a possible solution as a p4diff as I don't have svn setup on
this machine, but let me know of any comments about the patch.

All of the files are attached in bugzilla, #3616,
http://llvm.org/bugs/show_bug.cgi?id=3616

Micah Villmow
Systems Engineer
Advanced Technology & Performance
Advanced Micro Devices Inc.
S1-609 One AMD Place
Sunnyvale, CA. 94085
P: 408-749-3966

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090218/2b321739/attachment.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: optimizeblock.diff
Type: application/octet-stream
Size: 873 bytes
Desc: optimizeblock.diff
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090218/2b321739/attachment.obj

From dag at cray.com Wed Feb 18 19:19:20 2009
From: dag at cray.com (David Greene)
Date: Wed, 18 Feb 2009 19:19:20 -0600
Subject: [LLVMdev] Modeling GPU vector registers, again (with my implementation)
In-Reply-To: <4d77c5f20902130947s349dcab6r74e2057dd18161@mail.gmail.com>
References: <4d77c5f20902130947s349dcab6r74e2057dd18161@mail.gmail.com>
Message-ID: <200902181919.20897.dag@cray.com>

On Friday 13 February 2009 11:47, Alex wrote:
> It seems to me that LLVM sub-register is not for the following hardware
> architecture.
>
> All instructions of a hardware are vector instructions. All registers
> contain 4 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w.
>
> Most instructions write more than one element in this way:
>
> mul r0.xyw, r1, r2
> add r0.z, r3, r4
> sub r5, r0, r1
>
> Notice that the four elements of r0 are written by two different
> instructions.
>
> My question is how should I model these sub-registers. If I treat each
> component as a register, and do the register allocation individually, it
> seems very difficult to merge the scalars operations back into one vector
> operation.

This is a very good use case for vector masks in LLVM.

Expressing this as two masked operations and a merge:

** Warning, pseudo-LLVM code ***

mul r0, r1, r2, [1101] ; [xy_w]
add r6, r3, r4, [0010] ; [__z_]

** The assumption here is that masked elements are undefined, so we need a
   merge **

select r0, r0, r6, [1101] ; Select 1's from r0, 0's from r6, merge
sub r5, r0, r1, [1111]    ; Or have no mask == full mask

The registers are just vector registers then. They don't have component
pieces.
Regalloc will have no problem with them. The MachineInstrs for your
architecture would have to preserve the mask semantics. In the AsmPrinter
for your architecture, it would be a simple matter to dump out the mask as
the field specifier on a register name.

The masks would allow you to get rid of the shufflevector stuff. Since you
don't have a hardware merge instruction you could keep your pre- and
post-regalloc passes to rewrite things or a very simple post-regalloc
peephole pass could examine the masks of the merge and rewrite the registers
in the defs without a pre-regalloc pass needed to remember things.

Alas, we do not have masks in LLVM just yet. But I'm getting to the point
where I'm ready to restart that discussion. :)

This also won't directly handle the more general case of swizzling:

r0.wyzx = ...

But a "regular" masked operation followed by a shufflevector should do it.

-Dave

From dag at cray.com Wed Feb 18 19:30:58 2009
From: dag at cray.com (David Greene)
Date: Wed, 18 Feb 2009 19:30:58 -0600
Subject: [LLVMdev] Possible DAGCombiner or TargetData Bug
In-Reply-To: <200902181849.38871.dag@cray.com>
References: <200902181849.38871.dag@cray.com>
Message-ID: <200902181930.58839.dag@cray.com>

On Wednesday 18 February 2009 18:49, David Greene wrote:
> Hmm...why is the ABI alignment for VectorTyID 16? The ABI certainly
> doesn't guarantee it. It only guarantees it for __int128, __float128 and
> __m128. Lots of other types can map to <2 x i64>.

I should mention this is x86-64. The data layout for x86-64 doesn't even
mention vector types, so the default from TargetData gets used, which is to
set VECTOR_ALIGN to 16 for preferred and ABI alignment:

class X86Subtarget : public TargetSubtarget {

[...]
  std::string getDataLayout() const {
    const char *p;
    if (is64Bit())
      p = "e-p:64:64-s:64-f64:64:64-i64:64:64-f80:128:128";
    else {
      if (isTargetDarwin())
        p = "e-p:32:32-f64:32:64-i64:32:64-f80:128:128";
      else
        p = "e-p:32:32-f64:32:64-i64:32:64-f80:32:32";
    }
    return std::string(p);
  }

So maybe the problem is here?

-Dave

From delesley.spambox at googlemail.com Wed Feb 18 21:31:04 2009
From: delesley.spambox at googlemail.com (DeLesley Hutchins)
Date: Wed, 18 Feb 2009 20:31:04 -0700
Subject: [LLVMdev] Parametric polymorphism
In-Reply-To: <200902190121.33667.jon@ffconsultancy.com>
References: <200902190121.33667.jon@ffconsultancy.com>
Message-ID:

> The same can be said of closures, garbage collection and a dozen other
> features that also cannot feasibly be added to LLVM.
>
> The only logical solution is to build a HLVM on top of LLVM and share that
> between these high-level language implementations.

This is an excellent point. You have convinced me. :-)

BTW, what garbage collector are you using for your HLVM? You complain
about mono's use of the Boehm-Weiser collector on your blog; but you also
said that you assumed an uncooperative environment. I'm not a GC expert,
so why are the current GC intrinsics insufficient?

-DeLesley

From gohman at apple.com Wed Feb 18 21:43:15 2009
From: gohman at apple.com (Dan Gohman)
Date: Wed, 18 Feb 2009 19:43:15 -0800 (PST)
Subject: [LLVMdev] Possible DAGCombiner or TargetData Bug
In-Reply-To: <200902181849.38871.dag@cray.com>
References: <200902181849.38871.dag@cray.com>
Message-ID:

I agree, that doesn't look right. It looks like this is what
was intended:

Index: lib/CodeGen/SelectionDAG/DAGCombiner.cpp
===================================================================
--- lib/CodeGen/SelectionDAG/DAGCombiner.cpp (revision 65000)
+++ lib/CodeGen/SelectionDAG/DAGCombiner.cpp (working copy)
@@ -4903,9 +4903,9 @@
   // resultant store does not need a higher alignment than the original.
   if (Value.getOpcode() == ISD::BIT_CONVERT && !ST->isTruncatingStore() &&
       ST->isUnindexed()) {
-    unsigned Align = ST->getAlignment();
+    unsigned OrigAlign = ST->getAlignment();
     MVT SVT = Value.getOperand(0).getValueType();
-    unsigned OrigAlign = TLI.getTargetData()->
+    unsigned Align = TLI.getTargetData()->
       getABITypeAlignment(SVT.getTypeForMVT());
     if (Align <= OrigAlign &&
         ((!LegalOperations && !ST->isVolatile()) ||

Does that look right to you?

Dan

On Wed, February 18, 2009 4:49 pm, David Greene wrote:
> I got bit by this in LLVM 2.4 DagCombiner.cpp and it's still in trunk:
>
> SDValue DAGCombiner::visitSTORE(SDNode *N) {
>
> [...]
>
>   // If this is a store of a bit convert, store the input value if the
>   // resultant store does not need a higher alignment than the original.
>   if (Value.getOpcode() == ISD::BIT_CONVERT && !ST->isTruncatingStore() &&
>       ST->isUnindexed()) {
>     unsigned Align = ST->getAlignment();
>     MVT SVT = Value.getOperand(0).getValueType();
>     unsigned OrigAlign = TLI.getTargetData()->
>       getABITypeAlignment(SVT.getTypeForMVT());
>     if (Align <= OrigAlign &&
>         ((!LegalOperations && !ST->isVolatile()) ||
>          TLI.isOperationLegalOrCustom(ISD::STORE, SVT)))
>       return DAG.getStore(Chain, N->getDebugLoc(), Value.getOperand(0),
>                           Ptr, ST->getSrcValue(),
>                           ST->getSrcValueOffset(), ST->isVolatile(),
>                           OrigAlign);
>   }
>
> Uhh...this doesn't seem legal to me. How can we just willy-nilly create a
> store with a greater alignment? In this case Align is 8 and OrigAlign is 16
> because SVT.getTypeForMVT() is Type::VectorTyID (<2 x i64>) which has an
> ABI type of VECTOR_ALIGN.
>
> Hmm...why is the ABI alignment for VectorTyID 16? The ABI certainly
> doesn't guarantee it. It only guarantees it for __int128, __float128 and
> __m128. Lots of other types can map to <2 x i64>.
>
> Any opinions on this?
>
> -Dave
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

From pg at cs.stanford.edu Wed Feb 18 22:20:17 2009
From: pg at cs.stanford.edu (Philip Guo)
Date: Wed, 18 Feb 2009 20:20:17 -0800
Subject: [LLVMdev] what's correct behavior for struct forward declarations?
Message-ID: <79e41e9f0902182020u221558f4rd6175646dbc236a7@mail.gmail.com>

hi all,

i'm trying to use LLVM to compile some linux kernel code, and i noticed a
mismatch with gcc. here is a simplified test case:

struct foo {
  int a;
  int b;
  int c;
};

static struct foo x; // 'forward' declaration?

int bar() {
  printf("a: %d, b: %d, c: %d\n", x.a, x.b, x.c);
}

static struct foo x = {
  .a = 1, .b = 2, .c = 3,
};

int main() {
  bar();
  return 0;
}

when this code is compiled with gcc and run, stdout prints "a: 1, b: 2, c:
3", which means that it takes the true declaration of x, initialized to 1,
2, 3. however, when it's compiled with llvm, llvm emits the following code
for x:

@x = internal global %struct.foo zeroinitializer ; <%struct.foo*> [#uses=3]

which seems to me like it's taking the first declaration of x, which is a
forward declaration. is that the correct behavior? i believe that the
kernel developers intended for the second (real declaration) of x to be
visible, even in bar(), but that's not what's happening with llvm. is there
an easy workaround where i can get llvm to emit code initializing x to
{1,2,3}? thanks!

Philip
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090218/2f65fc21/attachment.html

From clattner at apple.com Wed Feb 18 22:30:31 2009
From: clattner at apple.com (Chris Lattner)
Date: Wed, 18 Feb 2009 20:30:31 -0800
Subject: [LLVMdev] what's correct behavior for struct forward declarations?
In-Reply-To: <79e41e9f0902182020u221558f4rd6175646dbc236a7@mail.gmail.com>
References: <79e41e9f0902182020u221558f4rd6175646dbc236a7@mail.gmail.com>
Message-ID:

On Feb 18, 2009, at 8:20 PM, Philip Guo wrote:

> hi all,
>
> i'm trying to use LLVM to compile some linux kernel code, and i
> noticed a mismatch with gcc. here is a simplified test case:

This definitely looks like a bug, but I can't reproduce it with
mainline. Are you sure this is not fixed with SVN?

-Chris

>
>
> struct foo {
>   int a;
>   int b;
>   int c;
> };
>
> static struct foo x; // 'forward' declaration?
>
> int bar() {
>   printf("a: %d, b: %d, c: %d\n", x.a, x.b, x.c);
> }
>
> static struct foo x = {
>   .a = 1, .b = 2, .c = 3,
> };
>
> int main() {
>   bar();
>   return 0;
> }
>
>
> when this code is compiled with gcc and run, stdout prints "a: 1, b:
> 2, c: 3", which means that it takes the true declaration of x,
> initialized to 1, 2, 3. however, when it's compiled with llvm, llvm
> emits the following code for x:
>
> @x = internal global %struct.foo zeroinitializer ; <%struct.foo*>
> [#uses=3]
>
> which seems to me like it's taking the first declaration of x, which
> is a forward declaration. is that the correct behavior? i believe
> that the kernel developers intended for the second (real
> declaration) of x to be visible, even in bar(), but that's not
> what's happening with llvm. is there an easy workaround where i can
> get llvm to emit code initializing x to {1,2,3}? thanks!
>
> Philip
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From echeng at apple.com Thu Feb 19 00:36:25 2009
From: echeng at apple.com (Evan Cheng)
Date: Wed, 18 Feb 2009 22:36:25 -0800
Subject: [LLVMdev] ARM backend playing with alternative jump table implementations
In-Reply-To: <8e3491100902171404v53c0c3dao495cdbd5890dbd10@mail.gmail.com>
References: <8e3491100902171404v53c0c3dao495cdbd5890dbd10@mail.gmail.com>
Message-ID: <7CE4FE61-1423-4632-8326-254B8E701815@apple.com>

On Feb 17, 2009, at 2:04 PM, robert muth wrote:

> Hi list:
>
> I have been trying to get my feet wet with the ARM backend.

Welcome.

>
> As a warmup exercise I wanted to be able to move
> jumptables especially large ones out of the code section.
> Currently the idiom for jump tables looks like this
>
> // .set PCRELV0, (.LJTI9_0_0-(.LPCRELL0+8))
> // .LPCRELL0:
> // add r3, pc, #PCRELV0
> // ldr pc, [r3, +r0, lsl #2]
> // .LJTI9_0_0:
> // .long .LBB9_2
> // .long .LBB9_5
> // .long .LBB9_7
> // .long .LBB9_4
> // .long .LBB9_8
>
> I would like to be able to change this to something like:
>
> ldr r3, .POOL_ADDR
> ldr pc, [r3, +r0, lsl #2]
>
> .POOL_ADDR:
> .text .LJTI9_0_0:
>
> .data
> .LJTI9_0_0:
> .long .LBB9_2
> .long .LBB9_5
> .long .LBB9_7
> .long .LBB9_4
> .long .LBB9_8
> .text

Ok. I think it's a worthwhile alternative jumptable codegen scheme.

>
>
> The code for the lowering lives mostly in SDValue
> ARMTargetLowering::LowerBR_JT
> with some more heavy lifting done by ARMISD::WrapperJT
> My attempts at this are marked in the code below.
> My problem is to come up with the right item/value to put into the
> constant pool.

Off the top of my head, I think you need to enhance
ARMConstantPoolValue. Perhaps add a new ARMCPKind for jumptable base
address. You also have to encode the jumptable id in
ARMConstantPoolValue.
Evan

>
> SDValue ARMTargetLowering::LowerBR_JT(SDValue Op, SelectionDAG &DAG) {
>   SDValue Chain = Op.getOperand(0);
>   SDValue Table = Op.getOperand(1);
>   SDValue Index = Op.getOperand(2);
>   DebugLoc dl = Op.getDebugLoc();
>
>   MVT PTy = getPointerTy();
>   JumpTableSDNode *JT = cast<JumpTableSDNode>(Table);
>   ARMFunctionInfo *AFI =
>     DAG.getMachineFunction().getInfo<ARMFunctionInfo>();
>   SDValue UId = DAG.getConstant(AFI->createJumpTableUId(), PTy);
>   SDValue JTI = DAG.getTargetJumpTable(JT->getIndex(), PTy);
>
> #if 1
>   // @@ GET TABLE BASE: current code
>   Table = DAG.getNode(ARMISD::WrapperJT, MVT::i32, JTI, UId);
> #else
>   // @ MY ATTEMPT AT MOVING THIS OUT
>   ARMConstantPoolValue *CPV = new ARMConstantPoolValue("a_jump_table", 666);
>   SDValue TableValue = DAG.getTargetConstantPool(CPV, PTy, 2);
>   SDValue CPAddr = DAG.getNode(ARMISD::Wrapper, MVT::i32, TableValue);
>   Table = DAG.getLoad(PTy, dl, DAG.getEntryNode(), CPAddr, NULL, 0);
> #endif
>
>   Index = DAG.getNode(ISD::MUL, dl, PTy, Index, DAG.getConstant(4, PTy));
>   //Index = DAG.getNode(ISD::MUL, dl, PTy, TableAddress, DAG.getConstant(4, PTy));
>   SDValue Addr = DAG.getNode(ISD::ADD, dl, PTy, Index, Table);SDValue APTy, Index, Table);
>
> Any help would be greatly appreciated.
> > Robert > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From echeng at apple.com Thu Feb 19 00:40:01 2009 From: echeng at apple.com (Evan Cheng) Date: Wed, 18 Feb 2009 22:40:01 -0800 Subject: [LLVMdev] Using CallingConvLower in ARM target In-Reply-To: <305d6f60902171641l59329d2bq23193d70e39ae4fb@mail.gmail.com> References: <305d6f60812270430xdf1ebb9gf6d99f94215ab66b@mail.gmail.com> <305d6f60902121821u586a5ecat65c81d462d139e1d@mail.gmail.com> <305d6f60902131420g51a7f35ajf8eba18b6710951e@mail.gmail.com> <305d6f60902131625k460a4518k7fae41005d379d16@mail.gmail.com> <05C6EB7F-36AC-4B18-A32D-36A71530BBAB@apple.com> <305d6f60902131841p354431e6pa92dd9df14bc5555@mail.gmail.com> <305d6f60902132027j3cc822dfw1dc817bb4a3f67b5@mail.gmail.com> <90A38A0E-3C84-4765-A521-A2ED66261056@apple.com> <305d6f60902171641l59329d2bq23193d70e39ae4fb@mail.gmail.com> Message-ID: On Feb 17, 2009, at 4:41 PM, Sandeep Patel wrote: > On Mon, Feb 16, 2009 at 11:00 AM, Evan Cheng > wrote: >> /// Information about how the value is assigned. >> - LocInfo HTP : 7; >> + LocInfo HTP : 6; >> >> Do you know why this change is needed? Are we running out of bits? > > HTP was't using all of these bits. I needed the hasCustom bit to come > from somewhere unless we wanted to grow this struct, so I grabbed a > bit from HTP. > >> - NeededStackSize = 4; >> - break; >> - case MVT::i64: >> - case MVT::f64: >> - if (firstGPR < 3) >> - NeededGPRs = 2; >> - else if (firstGPR == 3) { >> - NeededGPRs = 1; >> - NeededStackSize = 4; >> - } else >> - NeededStackSize = 8; >> + State.addLoc(CCValAssign::getCustomMem(ValNo, ValVT, >> + >> State.AllocateStack(4, 4), >> + MVT::i32, LocInfo)); >> + return true; // we handled it >> >> Your change isn't handling the "NeededStackSize = 8" case. > > I believe it is. I've attached two additional test cases. 
The > difference is that this case isn't handled by the CCCustomFns. They > fail to allocate any regs and then handling falls through to an > CCAssignToStack in ARMCallingConv.td. This is how other targets handle > similar allocations. Ok. > > >> ++ static const unsigned HiRegList[] = { ARM::R0, ARM::R2 }; >> + static const unsigned LoRegList[] = { ARM::R1, ARM::R3 }; >> + >> + if (unsigned Reg = State.AllocateReg(HiRegList, LoRegList, 2)) { >> + unsigned i; >> + for (i = 0; i < 2; ++i) >> + if (HiRegList[i] == Reg) >> + break; >> + >> + State.addLoc(CCValAssign::getCustomReg(ValNo, ValVT, Reg, >> + MVT::i32, LocInfo)); >> + State.addLoc(CCValAssign::getCustomReg(ValNo, ValVT, >> LoRegList[i], >> + MVT::i32, LocInfo)); >> >> Since 'i' is used after the loop, please choose a better variable >> name. >> >> Actually, is the loop necessary? We know the low register is always >> one after the high register. Perhaps you can use >> ARMRegisterInfo::getRegisterNumbering(Reg), add one to 1. And the >> lookup the register enum with a new function (something like >> getRegFromRegisterNum(RegNo, ValVT)). >> >> The patch is looking good. I need to run it through some more tests. >> Unfortunately ARM target is a bit broken right now. I hope to fix it >> today. > > I'll submit a revised patch after we've settled on the > NeededStackSize=8 issue. ARM target is fairly healthy now. I'll run some tests with your patch in the next few days. Thanks, Evan > > > deep > >> Thanks, >> >> Evan >> >> On Feb 13, 2009, at 8:27 PM, Sandeep Patel wrote: >> >>> Sorry left a small bit of cruft in ARMCallingConv.td. A corrected >>> patch it attached. >>> >>> deep >>> >>> On Fri, Feb 13, 2009 at 6:41 PM, Sandeep Patel >>> wrote: >>>> Sure. Updated patches attached. 
>>>> >>>> deep >>>> >>>> On Fri, Feb 13, 2009 at 5:47 PM, Evan Cheng >>>> wrote: >>>>> >>>>> On Feb 13, 2009, at 4:25 PM, Sandeep Patel wrote: >>>>> >>>>>> ARMTargetLowering doesn't need case #1, but it seemed like you >>>>>> and Dan >>>>>> wanted a more generic way to inject C++ code into the process >>>>>> so I >>>>>> tried to make the mechanism a bit more general. >>>>> >>>>> Ok. Since ARM doesn't need it and it's the only client, I'd much >>>>> rather have CCCustomFn just return a single bool indicating >>>>> whether it >>>>> can handle the arg. Would that be ok? >>>>> >>>>> Thanks, >>>>> >>>>> Evan >>>>> >>>>>> >>>>>> >>>>>> deep >>>>>> >>>>>> On Fri, Feb 13, 2009 at 2:34 PM, Evan Cheng >>>>>> >>>>>> wrote: >>>>>>> >>>>>>> On Feb 13, 2009, at 2:20 PM, Sandeep Patel wrote: >>>>>>> >>>>>>>> On Fri, Feb 13, 2009 at 12:33 PM, Evan Cheng >>>>>>>> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> On Feb 12, 2009, at 6:21 PM, Sandeep Patel wrote: >>>>>>>>> >>>>>>>>>> Although it's not generally needed for ARM's use of >>>>>>>>>> CCCustom, I >>>>>>>>>> return >>>>>>>>>> two bools to handle the four possible outcomes to keep the >>>>>>>>>> mechanism >>>>>>>>>> flexible: >>>>>>>>>> >>>>>>>>>> * if CCCustomFn handled the arg or not >>>>>>>>>> * if CCCustomFn wants to end processing of the arg or not >>>>>>>>> >>>>>>>>> +/// CCCustomFn - This function assigns a location for Val, >>>>>>>>> possibly >>>>>>>>> updating >>>>>>>>> +/// all args to reflect changes and indicates if it handled >>>>>>>>> it. It >>>>>>>>> must set >>>>>>>>> +/// isCustom if it handles the arg and returns true. >>>>>>>>> +typedef bool CCCustomFn(unsigned &ValNo, MVT &ValVT, >>>>>>>>> + MVT &LocVT, CCValAssign::LocInfo >>>>>>>>> &LocInfo, >>>>>>>>> + ISD::ArgFlagsTy &ArgFlags, CCState >>>>>>>>> &State, >>>>>>>>> + bool &result); >>>>>>>>> >>>>>>>>> Is "result" what you refer to as "isCustom" in the comments? >>>>>>>>> >>>>>>>>> Sorry, I am still confused. 
You mean it could return true but >>>>>>>>> set >>>>>>>>> 'result' to false? That means it has handled the argument >>>>>>>>> but it >>>>>>>>> would >>>>>>>>> not process any more arguments? What scenario do you envision >>>>>>>>> that >>>>>>>>> this will be useful? I'd rather keep it simple. >>>>>>>> >>>>>>>> As you note there are three actual legitimate cases (of the >>>>>>>> four >>>>>>>> combos): >>>>>>>> >>>>>>>> 1. The CCCustomFn wants the arg handling to proceed. This might >>>>>>>> be >>>>>>>> used akin to CCPromoteToType. >>>>>>>> 2. The CCCustomFn entirely handled the arg. This might be used >>>>>>>> akin to >>>>>>>> CCAssignToReg. >>>>>>>> 3. The CCCustomFn tried to handle the arg, but failed. >>>>>>>> >>>>>>>> these results are conveyed the following ways: >>>>>>>> >>>>>>>> 1. The CCCustomFn returns false, &result is not used. >>>>>>>> 2. The CCCustomFn returns true, &result is false; >>>>>>>> 3. The CCCustomFn returns true, &result is true. >>>>>>> >>>>>>> I don't think we want to support #1. If the target want to add >>>>>>> custom >>>>>>> code to handle an argument, if should be responsible for >>>>>>> outputting >>>>>>> legal code. Is there an actual need to support #1? >>>>>>> >>>>>>> Evan >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> I tried to keep these CCCustomFns looking like TableGen >>>>>>>> generated >>>>>>>> code. Suggestions of how to reorganize these results are >>>>>>>> welcome. :-) >>>>>>>> Perhaps better comments around the typedef for CCCustomFn would >>>>>>>> suffice? >>>>>>>> >>>>>>>> The isCustom flag is simply a means for this machinery to >>>>>>>> convey to >>>>>>>> the TargetLowering functions to process this arg specially. It >>>>>>>> may >>>>>>>> not >>>>>>>> always be possible for the TargetLowering functions to >>>>>>>> determine >>>>>>>> that >>>>>>>> the arg needs special handling after all the changes made by >>>>>>>> the >>>>>>>> CCCustomFn or CCPromoteToType and other transformations. 
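To make the three outcomes listed above easier to follow, here is a small C model of the two-bool protocol. It is only a sketch: the names cc_outcome, cc_custom_fn and run_custom are hypothetical, the callback takes a cut-down argument list, and the real CCCustomFn is a C++ function with reference parameters.

```c
#include <stdbool.h>

/* The three outcomes from the list above, named for illustration only. */
enum cc_outcome {
    CC_CONTINUE, /* case 1: fn returned false, later rules still run  */
    CC_HANDLED,  /* case 2: fn returned true and left result == false */
    CC_FAILED    /* case 3: fn returned true and set result == true   */
};

/* Hypothetical stand-in for the proposed CCCustomFn signature. */
typedef bool cc_custom_fn(unsigned val_no, bool *result);

/* Decodes the (return value, result) pair into one of the three cases. */
enum cc_outcome run_custom(cc_custom_fn *fn, unsigned val_no)
{
    bool result = false;
    if (!fn(val_no, &result))
        return CC_CONTINUE;
    return result ? CC_FAILED : CC_HANDLED;
}

/* Example callbacks, one per case. */
bool promote_only(unsigned val_no, bool *result)
{
    (void)val_no;
    (void)result;
    return false;   /* let the remaining rules process the argument */
}

bool assign_ok(unsigned val_no, bool *result)
{
    (void)val_no;
    *result = false; /* handled, no failure */
    return true;
}

bool assign_fail(unsigned val_no, bool *result)
{
    (void)val_no;
    *result = true;  /* tried to handle the argument but failed */
    return true;
}
```

Dropping case 1, as discussed later in the thread, collapses this to a single bool return.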
>>>>>>>> >>>>>>>>>> I placed the "unsigned i" outside those loops because i is >>>>>>>>>> used >>>>>>>>>> after >>>>>>>>>> the loop. If there's a better index search pattern, I'd be >>>>>>>>>> happy >>>>>>>>>> to >>>>>>>>>> change it. >>>>>>>>> >>>>>>>>> Ok. >>>>>>>>> >>>>>>>>> One more nitpick: >>>>>>>>> >>>>>>>>> +/// CCCustom - calls a custom arg handling function >>>>>>>>> >>>>>>>>> Please capitalize "calls" and end with a period. >>>>>>>> >>>>>>>> Once we settle on the result handling changes, I'll submit an >>>>>>>> update >>>>>>>> with this change. >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Evan >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Attached is an updated patch against HEAD that has DebugLoc >>>>>>>>>> changes. I >>>>>>>>>> also split out the ARMAsmPrinter fix into it's own patch. >>>>>>>>>> >>>>>>>>>> deep >>>>>>>>>> >>>>>>>>>> On Mon, Feb 9, 2009 at 8:54 AM, Evan Cheng >>>>>>>>>> wrote: >>>>>>>>>>> Thanks Sandeep. I did a quick scan, this looks really good. >>>>>>>>>>> But I >>>>>>>>>>> do >>>>>>>>>>> have a question: >>>>>>>>>>> >>>>>>>>>>> +/// CCCustomFn - This function assigns a location for Val, >>>>>>>>>>> possibly >>>>>>>>>>> updating >>>>>>>>>>> +/// all args to reflect changes and indicates if it handled >>>>>>>>>>> it. It >>>>>>>>>>> must set >>>>>>>>>>> +/// isCustom if it handles the arg and returns true. >>>>>>>>>>> +typedef bool CCCustomFn(unsigned &ValNo, MVT &ValVT, >>>>>>>>>>> + MVT &LocVT, CCValAssign::LocInfo >>>>>>>>>>> &LocInfo, >>>>>>>>>>> + ISD::ArgFlagsTy &ArgFlags, CCState >>>>>>>>>>> &State, >>>>>>>>>>> + bool &result); >>>>>>>>>>> >>>>>>>>>>> Is it necessary to return two bools (the second is >>>>>>>>>>> returned by >>>>>>>>>>> reference in 'result')? I am confused about the semantics of >>>>>>>>>>> 'result'. 
>>>>>>>>>>> >>>>>>>>>>> Also, a nitpick: >>>>>>>>>>> >>>>>>>>>>> + unsigned i; >>>>>>>>>>> + for (i = 0; i < 4; ++i) >>>>>>>>>>> >>>>>>>>>>> The convention we use is: >>>>>>>>>>> >>>>>>>>>>> + for (unsigned i = 0; i < 4; ++i) >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Evan >>>>>>>>>>> >>>>>>>>>>> On Feb 6, 2009, at 6:02 PM, Sandeep Patel wrote: >>>>>>>>>>> >>>>>>>>>>>> I think I've got all the cases handled now, implementing >>>>>>>>>>>> with >>>>>>>>>>>> CCCustom<"foo"> callbacks into C++. >>>>>>>>>>>> >>>>>>>>>>>> This also fixes a crash when returning i128. I've also >>>>>>>>>>>> included a >>>>>>>>>>>> small asm constraint fix that was needed to build newlib. >>>>>>>>>>>> >>>>>>>>>>>> deep >>>>>>>>>>>> >>>>>>>>>>>> On Mon, Jan 19, 2009 at 10:18 AM, Evan Cheng >>>>>>>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> On Jan 16, 2009, at 5:26 PM, Sandeep Patel wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> On Sat, Jan 3, 2009 at 11:46 AM, Dan Gohman >>>>>>>>>>>>>> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> One problem with this approach is that since i64 isn't >>>>>>>>>>>>>>> legal, >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> bitcast would require custom C++ code in the ARM >>>>>>>>>>>>>>> target to >>>>>>>>>>>>>>> handle properly. It might make sense to introduce >>>>>>>>>>>>>>> something >>>>>>>>>>>>>>> like >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> CCIfType<[f64], CCCustom> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> where CCCustom is a new entity that tells the calling >>>>>>>>>>>>>>> convention >>>>>>>>>>>>>>> code to to let the target do something not easily >>>>>>>>>>>>>>> representable >>>>>>>>>>>>>>> in the tablegen minilanguage. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> I am thinking that this requires two changes: add a >>>>>>>>>>>>>> flag to >>>>>>>>>>>>>> CCValAssign (take a bit from HTP) to indicate isCustom >>>>>>>>>>>>>> and a >>>>>>>>>>>>>> way >>>>>>>>>>>>>> to >>>>>>>>>>>>>> author an arbitrary CCAction by including the source >>>>>>>>>>>>>> directly in >>>>>>>>>>>>>> the >>>>>>>>>>>>>> TableGen mini-language. This latter change might want a >>>>>>>>>>>>>> generic >>>>>>>>>>>>>> change >>>>>>>>>>>>>> to the TableGen language. For example, the syntax might >>>>>>>>>>>>>> be >>>>>>>>>>>>>> like: >>>>>>>>>>>>>> >>>>>>>>>>>>>> class foo : CCCustomAction { >>>>>>>>>>>>>> code <<< EOF >>>>>>>>>>>>>> ....multi-line C++ code goes here that allocates regs & >>>>>>>>>>>>>> mem >>>>>>>>>>>>>> and >>>>>>>>>>>>>> sets CCValAssign::isCustom.... >>>>>>>>>>>>>> EOF >>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>>>> Does this seem reasonable? An alternative is for CCCustom >>>>>>>>>>>>>> to >>>>>>>>>>>>>> take a >>>>>>>>>>>>>> string that names a function to be called: >>>>>>>>>>>>>> >>>>>>>>>>>>>> CCIfType<[f64], CCCustom<"MyCustomLoweringFunc">> >>>>>>>>>>>>>> >>>>>>>>>>>>>> the function signature for such functions will have to >>>>>>>>>>>>>> return >>>>>>>>>>>>>> two >>>>>>>>>>>>>> results: if the CC processing is finished and if it the >>>>>>>>>>>>>> func >>>>>>>>>>>>>> succeeded >>>>>>>>>>>>>> or failed: >>>>>>>>>>>>> >>>>>>>>>>>>> I like the second solution better. It seems rather >>>>>>>>>>>>> cumbersome >>>>>>>>>>>>> to >>>>>>>>>>>>> embed >>>>>>>>>>>>> multi-line c++ code in td files. 
>>>>>>>>>>>>> >>>>>>>>>>>>> Evan >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> typedef bool CCCustomFn(unsigned ValNo, MVT ValVT, >>>>>>>>>>>>>> MVT LocVT, CCValAssign::LocInfo LocInfo, >>>>>>>>>>>>>> ISD::ArgFlagsTy ArgFlags, CCState >>>>>>>>>>>>>> &State, >>>>>>>>>>>>>> bool &result);
From hbrenkun at yahoo.cn Thu Feb 19 04:00:52 2009 From: hbrenkun at yahoo.cn (=?gb2312?B?yM7ApA==?=) Date: Thu, 19 Feb 2009 18:00:52 +0800 (CST) Subject: [LLVMdev] help: about how to use tblgen to constraint operand. Message-ID: <440668.85384.qm@web92401.mail.cnh.yahoo.com> I define a pattern to move two 32bits gpr to 64bits fpr. like arm instructure fmdrr. But I need to use an even/odd register pair to save its 2 operands. I define in mytarget.td: myfmdrr: SDTypeProfile<1, 2, [SDTCisVT<0, f64>, SDTCisVT<1, i32>, SDTCisSameAs<1, 2>]>; def my_fmdrr : ........... def myFMDRR : ....
(outs FPR: $result), ins(GPR: $op1, GPR:$op2 ) [(setFPR: $result, (my_fmdrr GPR: $op1, GPR:$op2) )] I create the myfmdrr instruction in mytargetISelLowering.cpp, and its operands are in R0 and R1. But after optimization, the operands are saved in R2 and R1. I know the optimization pass does not know about the myfmdrr operand constraint. But how do I tell the optimization pass about it via tblgen? Can I control the operand constraint in mytargetISelLowering.cpp? How do I do that? -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090219/dc5f40ae/attachment.html From jon at ffconsultancy.com Thu Feb 19 06:18:37 2009 From: jon at ffconsultancy.com (Jon Harrop) Date: Thu, 19 Feb 2009 12:18:37 +0000 Subject: [LLVMdev] Parametric polymorphism In-Reply-To: References: <200902190121.33667.jon@ffconsultancy.com> Message-ID: <200902191218.37545.jon@ffconsultancy.com> On Thursday 19 February 2009 03:31:04 DeLesley Hutchins wrote: > > The same can be said of closures, garbage collection and a dozen other > > features that also cannot feasibly be added to LLVM. > > > > The only logical solution is to build a HLVM on top of LLVM and share > > that between these high-level language implementations. > > This is an excellent point. You have convinced me. :-) > > BTW, what garbage collector are you using for your HLVM? > > You > complain about mono's use of the Boehm-Weiser collector on your > blog; but you also said that you assumed an uncooperative > environment. I am creating a new (very simple) one by keeping any live local reference variables on a shadow stack. That assumes an uncooperative environment but it is still precise. That will suffice for now. > I'm not a GC expert, so why are the current GC intrinsics insufficient?
They may well be sufficient but I am avoiding them for the same non-technical reason: I consider them to be an experimental feature of LLVM so I don't want to take a dependency on them if possible and, in this case, there is a simple workaround. Other people are creating far more bleeding edge VMs (e.g. VMKit) using LLVM's GC API so they would be much better positioned to discuss the technical aspects than I am. I would like to hear any status updates they have! -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From nicolas.geoffray at lip6.fr Thu Feb 19 07:39:24 2009 From: nicolas.geoffray at lip6.fr (Nicolas Geoffray) Date: Thu, 19 Feb 2009 14:39:24 +0100 Subject: [LLVMdev] Parametric polymorphism In-Reply-To: <200902191218.37545.jon@ffconsultancy.com> References: <200902190121.33667.jon@ffconsultancy.com> <200902191218.37545.jon@ffconsultancy.com> Message-ID: <499D610C.6090801@lip6.fr> Hi Jon, Jon Harrop wrote: > Other people are creating far more bleeding edge VMs (e.g. VMKit) using LLVM's > GC API so they would be much better positioned to discuss the technical > aspects than I am. I would like to hear any status updates they have! > > VMKit uses conservative GCs (Boehm or Mmap2, a GC developed in our lab), so we don't use the GC API. But we are thinking on using it some day. Nicolas From aaronngray.lists at googlemail.com Thu Feb 19 10:40:22 2009 From: aaronngray.lists at googlemail.com (Aaron Gray) Date: Thu, 19 Feb 2009 16:40:22 +0000 Subject: [LLVMdev] -fPIC warning on every compile on Cygwin Message-ID: <9719867c0902190840vaa85105qa75da734ff1a5b0c@mail.gmail.com> Hi, I partly built LLVM on Cygwin yesterday and it was fine as far as it went.
But after doing a svn update today I am getting the following warning on every compile :- llvm[3]: Compiling LowerAllocations.cpp for Debug build /usr/src/llvm/lib/Transforms/Utils/LowerAllocations.cpp:1: warning: -fPIC ignored for target (all code is position independent) This maybe happening on other targets too. Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090219/c4881d78/attachment.html From jon at ffconsultancy.com Thu Feb 19 11:03:55 2009 From: jon at ffconsultancy.com (Jon Harrop) Date: Thu, 19 Feb 2009 17:03:55 +0000 Subject: [LLVMdev] VMKit (was Parametric polymorphism) In-Reply-To: <499D610C.6090801@lip6.fr> References: <200902191218.37545.jon@ffconsultancy.com> <499D610C.6090801@lip6.fr> Message-ID: <200902191703.55889.jon@ffconsultancy.com> On Thursday 19 February 2009 13:39:24 Nicolas Geoffray wrote: > Jon Harrop wrote: > > Other people are creating far more bleeding edge VMs (e.g. VMKit) using > > LLVM's GC API so they would be much better positioned to discuss the > > technical aspects than I am. I would like to hear any status updates they > > have! > > VMKit uses conservative GCs (Boehm or Mmap2, a GC developed in our lab), > so we don't use the GC API. Right. > But we are thinking on using it some day. I think it would be great if there were a simple working demo but I suppose the best such demo would be a minimal HLVM... What approach do you take to generics on the CLR? Also, are you gearing up for another release of VMKit to coincide with LLVM 2.5? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From echeng at apple.com Thu Feb 19 11:11:50 2009 From: echeng at apple.com (Evan Cheng) Date: Thu, 19 Feb 2009 09:11:50 -0800 Subject: [LLVMdev] help: about how to use tblgen to constraint operand. 
In-Reply-To: <440668.85384.qm@web92401.mail.cnh.yahoo.com> References: <440668.85384.qm@web92401.mail.cnh.yahoo.com> Message-ID: <027429E6-A8A3-4562-A48F-6D32D4707DF2@apple.com> Currently there is no constraint that tells the register allocator to allocate a consecutive register pair. What I would suggest you do is to declare pseudo register pair registers (and corresponding register class, say PAIR_GPR). In this case, your myFMDRR would take one input of PAIR_GPR class. The asm printer should be taught to print a PAIR_GPR register as two GPR registers (you should also teach the JIT of the same thing). A PAIR_GPR register should be a super register of two GPR registers. e.g. r0r1_pair is a super register of r0 and r1. In order to *construct* a PAIR_GPR register, you have to use two INSERT_SUBREG. To extract out a GPR from a PAIR_GPR, you need to issue EXTRACT_SUBREG. In most cases, these will be nop's. In other cases, they are copies. Evan On Feb 19, 2009, at 2:00 AM, ?? wrote: > I define a pattern to move two 32bits gpr to 64bits fpr. like arm > instructure fmdrr. > But I need to use an even/odd register pair to save its 2 operands. > I define in mytarget.td: > > myfmdrr: > SDTypeProfile<1, 2, [SDTCisVT<0, f64>, SDTCisVT<1, i32>, > SDTCisSameAs<1, 2>]>; > def my_fmdrr : ........... > def myFMDRR : .... > (outs FPR: $result), ins(GPR: $op1, GPR:$op2 ) > [(setFPR: $result, (my_fmdrr GPR: $op1, GPR: > $op2) )] > > I create myfmdrr instructure in mytargetISelLowering.cpp. and its > operands are in R0 and R1. > But after optimization, the operands are save R2 and R1. I know > optimization pass does not > know myfmdrr operands constraint. But How I tell optimzition pass by > tblgen?? > > Could I can control operand constraint in mytargetiSelLowering.cpp? > How do I control??
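The super-register suggestion in the reply above can be pictured with a toy C model. This is only a sketch under assumptions: the enums and the helpers extract_subreg, find_pair and print_pair are hypothetical names, not LLVM APIs, and the model covers just four GPRs. It shows a pair register that owns an even/odd GPR pair, an extract operation that recovers either half, and a printer that emits the pair as two GPR names.

```c
#include <stdio.h>

/* Toy register file: four GPRs and the two even/odd pairs built on them. */
enum gpr { R0, R1, R2, R3 };
enum pair_gpr { R0R1_PAIR, R2R3_PAIR };
enum subreg { SUB_EVEN = 0, SUB_ODD = 1 };

/* Models EXTRACT_SUBREG: returns one half of a pair register. */
enum gpr extract_subreg(enum pair_gpr pair, enum subreg idx)
{
    return (enum gpr)(2 * (int)pair + (int)idx);
}

/* Returns 1 and stores the pair if (first, second) is a legal even/odd
 * pair such as (R0, R1); returns 0 otherwise, e.g. for (R2, R1). */
int find_pair(enum gpr first, enum gpr second, enum pair_gpr *out)
{
    if ((int)first % 2 != 0 || (int)second != (int)first + 1)
        return 0;
    *out = (enum pair_gpr)((int)first / 2);
    return 1;
}

/* Models the asm printer change: a pair prints as its two GPRs. */
int print_pair(char *buf, size_t len, enum pair_gpr pair)
{
    return snprintf(buf, len, "r%d, r%d",
                    (int)extract_subreg(pair, SUB_EVEN),
                    (int)extract_subreg(pair, SUB_ODD));
}
```

Because the instruction takes a single pair operand, an allocation like (R2, R1) from the original question cannot be expressed at all.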
> _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090219/ba38a9af/attachment.html From echeng at apple.com Thu Feb 19 11:20:55 2009 From: echeng at apple.com (Evan Cheng) Date: Thu, 19 Feb 2009 09:20:55 -0800 Subject: [LLVMdev] direct calls to inttoptr constants In-Reply-To: <499439CC.5060307@dcs.gla.ac.uk> References: <0A1521D7F5274AAB9FC7AD18516F42DB@dev> <4992C8E4.50804@dcs.gla.ac.uk> <499439CC.5060307@dcs.gla.ac.uk> Message-ID: <24997091-E4E8-44F4-9CD1-9FB2CC4AA086@apple.com> It's a instruction selection issue. Please file a bugzilla with a test case. Thanks. Evan On Feb 12, 2009, at 7:01 AM, Mark Shannon wrote: > Tobias, > I've looked into this a bit more. > You are right. > The confusion arose as I have two versions of my compiler: > The ahead-of-time compiler uses symbolic info and does the right > thing. > The JIT compiler uses runtime addresses (in effect integers) and > when I examined the code in the debugger I found that LLVM produces > indirect calls, like this: > mov $0x8153c8c,%eax > call *%eax > > Sadly, however, I have no idea how to fix this :( > but I will try and investigate. > > Do you have any ideas yet? > > Mark. > > Tobias wrote: >> Hello Mark, >> >> I've followed your advice and changed the IR to: >> %0 = call i32 inttoptr (i32 12345678 to i32 (i32)*)(i32 0) nounwind >> the call is still indirect. >> >> IMHO llc does not call it directly because the address is neither >> a globalvalue (JIT) nor a external symbol. >> That's why it uses a fallback mechanism to call it indirectly >> assuming the address is not constant and is calculated at runtime. 
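For readers less familiar with the IR form quoted above, a call through an inttoptr constant corresponds roughly to the following C. This is a sketch under assumptions: it relies on function pointers converting to and from uintptr_t, which is implementation-defined but works on common flat-address platforms, and the names add_one, call_at and demo are made up for illustration. Whether a given compiler emits a direct or an indirect call for call_at is exactly the code generation question raised in this thread, so the sketch makes no claim about that.

```c
#include <stdint.h>

typedef int (*unary_fn)(int);

/* Stand-in for a function whose address a JIT only knows as a number. */
int add_one(int v)
{
    return v + 1;
}

/* Calls whatever code lives at a raw integer address; this is the C
 * counterpart of calling through an inttoptr constant. */
int call_at(uintptr_t addr, int arg)
{
    unary_fn fn = (unary_fn)addr;
    return fn(arg);
}

/* Round-trips a real function address through an integer so the
 * sketch can run without hard-coding a made-up address. */
int demo(void)
{
    uintptr_t addr = (uintptr_t)&add_one;
    return call_at(addr, 41);
}
```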
>> >> tobias >> >> Mark wrote: >>> I'm doing something similar (I use LLVM as part of my JIT >>> compiler) and >>> if I remember correctly, LLVM does the correct thing. >>> >>> I think you need to try changing the i64 value to an i32 value. >>> If that doesn't work you could also try replacing the tail call >>> with a >>> normal call. >>> >>> >>> Mark. >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From nicholas at mxc.ca Thu Feb 19 11:52:25 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Thu, 19 Feb 2009 09:52:25 -0800 Subject: [LLVMdev] -fPIC warning on every compile on Cygwin In-Reply-To: <9719867c0902190840vaa85105qa75da734ff1a5b0c@mail.gmail.com> References: <9719867c0902190840vaa85105qa75da734ff1a5b0c@mail.gmail.com> Message-ID: <499D9C59.4060600@mxc.ca> Aaron Gray wrote: > Hi, > > I partly built LLVM on Cygwin yesterday and it was fine as far as it > went. But after doing a svn update today I am getting the following > warning on every compile :- > > > llvm[3]: Compiling LowerAllocations.cpp for Debug build > /usr/src/llvm/lib/Transforms/Utils/LowerAllocations.cpp:1: warning: > -fPIC ignored for target (all code is position independent) > > > This maybe happening on other targets too. Thanks for the report. This is certainly due to my change last night to make LLVM build as PIC by default. Part of the fix is going to be splitting apart whether we're building PIC and whether we pass in the -fPIC flag. We want to know whether the build is PIC in order to control whether libLTO should be built. On your platform we'll still want to build libLTO but don't want to pass the -fPIC flag. The other part of the fix I'm not so sure about. 
How should the build system detect that we're building PIC without the -fPIC flag on this platform? Nick > Aaron > > > > > ------------------------------------------------------------------------ > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From aaronngray.lists at googlemail.com Thu Feb 19 12:32:40 2009 From: aaronngray.lists at googlemail.com (Aaron Gray) Date: Thu, 19 Feb 2009 18:32:40 +0000 Subject: [LLVMdev] -fPIC warning on every compile on Cygwin In-Reply-To: <499D9C59.4060600@mxc.ca> References: <9719867c0902190840vaa85105qa75da734ff1a5b0c@mail.gmail.com> <499D9C59.4060600@mxc.ca> Message-ID: <9719867c0902191032n4478ad07o1145c0c8a50ae245@mail.gmail.com> On Thu, Feb 19, 2009 at 5:52 PM, Nick Lewycky wrote: > Aaron Gray wrote: > > Hi, > > > > I partly built LLVM on Cygwin yesterday and it was fine as far as it > > went. But after doing a svn update today I am getting the following > > warning on every compile :- > > > > > > llvm[3]: Compiling LowerAllocations.cpp for Debug build > > /usr/src/llvm/lib/Transforms/Utils/LowerAllocations.cpp:1: warning: > > -fPIC ignored for target (all code is position independent) > > > > > > This maybe happening on other targets too. > > Thanks for the report. This is certainly due to my change last night to > make LLVM build as PIC by default. > I was a little perplexed by the warning as I could not ascertain where it was coming from and could see no obvious commit :) Out of interest, could you point me to the patch or commit, please. > > Part of the fix is going to be splitting apart whether we're building > PIC and whether we pass in the -fPIC flag. We want to know whether the > build is PIC in order to control whether libLTO should be built. On your > platform we'll still want to build libLTO but don't want to pass the > -fPIC flag.
> Nice >The other part of the fix I'm not so sure about. How should the build >system detect that we're building PIC without the -fPIC flag on this >platform? Looks like configure/autoconf territory. 'configure' flashes up that Cygwin supports PIC. See attached config.out file. Aaron > > Nick > > > Aaron > > > > > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090219/83dc5ec6/attachment.html -------------- next part -------------- checking build system type... i686-pc-cygwin checking host system type... i686-pc-cygwin checking target system type... i686-pc-cygwin checking type of operating system we're going to host on... Cygwin checking target architecture... x86 checking for gcc... gcc checking for C compiler default output file name... a.exe checking whether the C compiler works... yes checking whether we are cross compiling... no checking for suffix of executables... .exe checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for gcc option to accept ISO C89... none needed checking how to run the C preprocessor... gcc -E checking for grep that handles long lines and -e... /usr/bin/grep checking for egrep... /usr/bin/grep -E checking for ANSI C header files... yes checking for sys/types.h... yes checking for sys/stat.h... yes checking for stdlib.h... yes checking for string.h... yes checking for memory.h... 
yes checking for strings.h... yes checking for inttypes.h... yes checking for stdint.h... yes checking for unistd.h... yes checking whether byte ordering is bigendian... no checking how to run the C preprocessor... gcc -E checking whether we are using the GNU C compiler... (cached) yes checking whether gcc accepts -g... (cached) yes checking for gcc option to accept ISO C89... (cached) none needed checking for g++... g++ checking whether we are using the GNU C++ compiler... yes checking whether g++ accepts -g... yes checking ... checking for flex... flex checking for yywrap in -lfl... yes checking lex output file root... lex.yy checking whether yytext is a pointer... yes checking ... checking for bison... bison -y checking for BSD-compatible nm... /usr/bin/nm -B checking for GNU make... make checking whether ln -s works... yes checking for cmp... /usr/bin/cmp checking for cp... /usr/bin/cp checking for date... /usr/bin/date checking for find... /usr/bin/find checking for grep... (cached) /usr/bin/grep checking for mkdir... /usr/bin/mkdir checking for mv... /usr/bin/mv checking for ranlib... ranlib checking for rm... /usr/bin/rm checking for sed... /usr/bin/sed checking for tar... /usr/bin/tar checking for pwd... /usr/bin/pwd checking for Graphviz... echo Graphviz checking for dot... echo dot checking for gv... no checking for gsview32... no checking for dotty... echo dotty checking for perl... /usr/bin/perl checking for Perl 5.006 or newer... yes checking for a BSD-compatible install... /usr/bin/install -c checking for bzip2... /usr/bin/bzip2 checking for doxygen... /usr/bin/doxygen checking for groff... /usr/bin/groff checking for gzip... /usr/bin/gzip checking for pod2html... /usr/bin/pod2html checking for pod2man... /usr/bin/pod2man checking for runtest... /usr/bin/runtest checking for the tclsh program in tclinclude directory... none checking for tclsh8.4... no checking for tclsh8.4.8... no checking for tclsh8.4.7... no checking for tclsh8.4.6... 
no checking for tclsh8.4.5... no checking for tclsh8.4.4... no checking for tclsh8.4.3... no checking for tclsh8.4.2... no checking for tclsh8.4.1... no checking for tclsh8.4.0... no checking for tclsh8.3... no checking for tclsh8.3.5... no checking for tclsh8.3.4... no checking for tclsh8.3.3... no checking for tclsh8.3.2... no checking for tclsh8.3.1... no checking for tclsh8.3.0... no checking for tclsh... /usr/bin/tclsh checking for zip... no checking for ocamlc... /usr/bin/ocamlc checking for ocamlopt... /usr/bin/ocamlopt checking for ocamldep... /usr/bin/ocamldep checking for ocamldoc... /usr/bin/ocamldoc checking for gas... no checking for as... /usr/bin/as checking for compiler -Wl,-R option... yes checking for an ANSI C-conforming const... yes checking for dirent.h that defines DIR... yes checking for library containing opendir... none required checking dlfcn.h usability... yes checking dlfcn.h presence... yes checking for dlfcn.h... yes checking dynamic linker characteristics... Win32 ld.exe checking which extension is used for loadable modules... .dll checking which variable specifies run-time library path... PATH checking for the default library search path... /lib /usr/lib checking for objdir... .libs checking command to parse /usr/bin/nm -B output from object... ok checking whether libtool supports -dlopen/-dlpreopen... yes checking for shl_load... no checking for shl_load in -ldld... no checking for dlopen in -ldl... yes checking for dlerror... yes checking for _ prefix in compiled symbols... yes checking whether we have to add an underscore for dlsym... unknown checking whether deplibs are loaded by dlopen... unknown checking argz.h usability... yes checking argz.h presence... yes checking for argz.h... yes checking for error_t... yes checking for argz_append... yes checking for argz_create_sep... yes checking for argz_insert... yes checking for argz_next... yes checking for argz_stringify... yes checking assert.h usability... 
yes checking assert.h presence... yes checking for assert.h... yes checking ctype.h usability... yes checking ctype.h presence... yes checking for ctype.h... yes checking errno.h usability... yes checking errno.h presence... yes checking for errno.h... yes checking malloc.h usability... yes checking malloc.h presence... yes checking for malloc.h... yes checking for memory.h... (cached) yes checking for stdlib.h... (cached) yes checking stdio.h usability... yes checking stdio.h presence... yes checking for stdio.h... yes checking for unistd.h... (cached) yes checking dl.h usability... no checking dl.h presence... no checking for dl.h... no checking sys/dl.h usability... no checking sys/dl.h presence... no checking for sys/dl.h... no checking dld.h usability... no checking dld.h presence... no checking for dld.h... no checking mach-o/dyld.h usability... no checking mach-o/dyld.h presence... no checking for mach-o/dyld.h... no checking for string.h... (cached) yes checking for strchr... yes checking for strrchr... yes checking for memcpy... yes checking for memmove... yes checking for strcmp... yes checking for closedir... yes checking for opendir... yes checking for readdir... yes checking for a sed that does not truncate output... /usr/bin/sed checking for ld used by gcc... /usr/i686-pc-cygwin/bin/ld.exe checking if the linker (/usr/i686-pc-cygwin/bin/ld.exe) is GNU ld... yes checking for /usr/i686-pc-cygwin/bin/ld.exe option to reload object files... -r checking how to recognise dependent libraries... file_magic ^x86 archive import|^x86 DLL checking how to run the C++ preprocessor... g++ -E checking for g77... no checking for f77... no checking for xlf... no checking for frt... no checking for pgf77... no checking for cf77... no checking for fort77... no checking for fl32... no checking for af77... no checking for f90... no checking for xlf90... no checking for pgf90... no checking for pghpf... no checking for epcf90... no checking for gfortran... 
no checking for g95... no checking for f95... no checking for fort... no checking for xlf95... no checking for ifort... no checking for ifc... no checking for efc... no checking for pgf95... no checking for lf95... no checking for ftn... no checking whether we are using the GNU Fortran 77 compiler... no checking whether accepts -g... no checking the maximum length of command line arguments... 8192 checking command to parse /usr/bin/nm -B output from gcc object... (cached) ok checking for objdir... .libs checking for ar... ar checking for ranlib... (cached) ranlib checking for strip... strip checking if gcc supports -fno-rtti -fno-exceptions... no checking for gcc option to produce PIC... checking if gcc static flag -static works... yes checking if gcc supports -c -o file.o... yes checking whether the gcc linker (/usr/i686-pc-cygwin/bin/ld.exe) supports shared libraries... yes checking whether -lc should be explicitly linked in... yes checking dynamic linker characteristics... Win32 ld.exe checking how to hardcode library paths into programs... immediate checking whether stripping libraries is possible... yes checking whether a program can dlopen itself... no checking if libtool supports shared libraries... yes checking whether to build shared libraries... yes checking whether to build static libraries... yes configure: creating mklib appending configuration tag "CXX" to mklib checking for ld used by g++... /usr/i686-pc-cygwin/bin/ld.exe checking if the linker (/usr/i686-pc-cygwin/bin/ld.exe) is GNU ld... yes checking whether the g++ linker (/usr/i686-pc-cygwin/bin/ld.exe) supports shared libraries... yes checking for g++ option to produce PIC... checking if g++ static flag -static works... yes checking if g++ supports -c -o file.o... yes checking whether the g++ linker (/usr/i686-pc-cygwin/bin/ld.exe) supports shared libraries... yes checking dynamic linker characteristics... Win32 ld.exe checking how to hardcode library paths into programs... 
immediate appending configuration tag "F77" to mklib checking for llvm-gcc.exe... no checking for llvm-g++.exe... no checking tool compatibility... ok checking for elf_begin in -lelf... no checking for sin in -lm... yes checking for library containing dlopen... none required checking for ffi_call in -lffi... no checking for library containing mallinfo... none required checking for pthread_mutex_init in -lpthread... yes checking for library containing pthread_mutex_lock... none required checking for dirent.h that defines DIR... (cached) yes checking for library containing opendir... (cached) none required checking for MAP_ANONYMOUS vs. MAP_ANON... yes checking whether stat file-mode macros are broken... no checking for ANSI C header files... (cached) yes checking for sys/wait.h that is POSIX.1 compatible... yes checking whether time.h and sys/time.h may both be included... yes checking for dlfcn.h... (cached) yes checking execinfo.h usability... no checking execinfo.h presence... no checking for execinfo.h... no checking fcntl.h usability... yes checking fcntl.h presence... yes checking for fcntl.h... yes checking for inttypes.h... (cached) yes checking limits.h usability... yes checking limits.h presence... yes checking for limits.h... yes checking link.h usability... no checking link.h presence... no checking for link.h... no checking for malloc.h... (cached) yes checking setjmp.h usability... yes checking setjmp.h presence... yes checking for setjmp.h... yes checking signal.h usability... yes checking signal.h presence... yes checking for signal.h... yes checking for stdint.h... (cached) yes checking for unistd.h... (cached) yes checking utime.h usability... yes checking utime.h presence... yes checking for utime.h... yes checking windows.h usability... yes checking windows.h presence... yes checking for windows.h... yes checking sys/mman.h usability... yes checking sys/mman.h presence... yes checking for sys/mman.h... yes checking sys/param.h usability... 
yes checking sys/param.h presence... yes checking for sys/param.h... yes checking sys/resource.h usability... yes checking sys/resource.h presence... yes checking for sys/resource.h... yes checking sys/time.h usability... yes checking sys/time.h presence... yes checking for sys/time.h... yes checking for sys/types.h... (cached) yes checking malloc/malloc.h usability... no checking malloc/malloc.h presence... no checking for malloc/malloc.h... no checking mach/mach.h usability... no checking mach/mach.h presence... no checking for mach/mach.h... no checking pthread.h usability... yes checking pthread.h presence... yes checking for pthread.h... yes checking for HUGE_VAL sanity... yes checking for pid_t... yes checking for size_t... yes checking return type of signal handlers... void checking whether struct tm is in sys/time.h or time.h... time.h checking for int64_t... yes checking for uint64_t... yes checking for backtrace... no checking for ceilf... yes checking for floorf... yes checking for roundf... yes checking for rintf... yes checking for nearbyintf... yes checking for getcwd... yes checking for powf... yes checking for fmodf... yes checking for strtof... yes checking for round... yes checking for getpagesize... yes checking for getrusage... yes checking for getrlimit... yes checking for setrlimit... yes checking for gettimeofday... yes checking for isatty... yes checking for mkdtemp... yes checking for mkstemp... yes checking for mktemp... yes checking for realpath... yes checking for sbrk... yes checking for setrlimit... (cached) yes checking for strdup... yes checking for strerror... yes checking for strerror_r... yes checking for strtoll... yes checking for strtoq... no checking for sysconf... yes checking for malloc_zone_statistics... no checking for setjmp... yes checking for longjmp... yes checking for sigsetjmp... no checking for siglongjmp... no checking if printf has the %a format character... checking for working alloca.h... 
yes checking for alloca... yes checking for srand48/lrand48/drand48 in ... yes checking whether the compiler implements namespaces... yes checking whether the compiler has defining template class std::hash_map... no checking whether the compiler has defining template class __gnu_cxx::hash_map... yes checking whether the compiler has defining template class ::hash_map... no checking whether the compiler has defining template class std::hash_set... no checking whether the compiler has defining template class __gnu_cxx::hash_set... yes checking whether the compiler has defining template class ::hash_set... no checking whether the compiler has the standard iterator... yes checking whether the compiler has the bidirectional iterator... no checking whether the compiler has forward iterators... no checking for isnan in ... yes checking for isnan in ... no checking for std::isnan in ... yes checking for isinf in ... yes checking for isinf in ... no checking for std::isinf in ... no checking for finite in ... yes checking for stdlib.h... (cached) yes checking for unistd.h... (cached) yes checking for getpagesize... (cached) yes checking for working mmap... no checking for mmap of files... yes checking if /dev/zero is needed for mmap... no checking for __dso_handle... no checking whether llvm-gcc is sane... no checking for compiler -fvisibility-inlines-hidden option... 
no configure: creating ./config.status config.status: creating Makefile.config config.status: creating llvm.spec config.status: creating docs/doxygen.cfg config.status: creating tools/llvm-config/llvm-config.in config.status: creating include/llvm/Config/config.h config.status: creating include/llvm/Support/DataTypes.h config.status: creating include/llvm/ADT/hash_map.h config.status: creating include/llvm/ADT/hash_set.h config.status: creating include/llvm/ADT/iterator.h config.status: executing setup commands config.status: executing Makefile commands config.status: executing Makefile.common commands config.status: executing examples/Makefile commands config.status: executing lib/Makefile commands config.status: executing runtime/Makefile commands config.status: executing test/Makefile commands config.status: executing test/Makefile.tests commands config.status: executing unittests/Makefile commands config.status: executing tools/Makefile commands config.status: executing utils/Makefile commands config.status: executing projects/Makefile commands config.status: executing bindings/Makefile commands config.status: executing bindings/ocaml/Makefile.ocaml commands === configuring in projects/sample (/usr/build/llvm/projects/sample) configure: running /bin/sh /usr/src/llvm/projects/sample/configure --prefix=/usr/llvm --cache-file=/dev/null --srcdir=/usr/src/llvm/projects/sample configure: creating ./config.status config.status: creating Makefile.common config.status: executing setup commands config.status: executing Makefile commands config.status: executing lib/Makefile commands config.status: executing lib/sample/Makefile commands config.status: executing tools/Makefile commands config.status: executing tools/sample/Makefile commands From Micah.Villmow at amd.com Thu Feb 19 12:35:54 2009 From: Micah.Villmow at amd.com (Villmow, Micah) Date: Thu, 19 Feb 2009 10:35:54 -0800 Subject: [LLVMdev] Possible error in LegalizeDAG In-Reply-To: References: 
<5BA674C5FF7B384A92C2C95D8CC71E1C827DA1@ssanexmb1.amd.com>
Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C8D4632@ssanexmb1.amd.com>

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Eli Friedman
Sent: Wednesday, February 18, 2009 3:01 PM
To: LLVM Developers Mailing List
Subject: Re: [LLVMdev] Possible error in LegalizeDAG

On Wed, Feb 18, 2009 at 10:14 AM, Villmow, Micah wrote:
> I'm still trying to track down some alignment issues with loads (i.e. 8/16
> bit loads being turned into 32bit sign extending loads) and I cannot for the
> life of me seem to figure out how to enter this section of code:
>
> // If this is an unaligned load and the target doesn't support it,
>
> // expand it.

>Why do you expect to enter this section of code? It's impossible for
>an i8 load to be unaligned.

On the hardware that I am targeting, which is not a CPU, I must support i8 loads. However, the hardware natively supports only 32-bit aligned loads, so I have to read in four i8's, then unpack and shift them based on the read address. Any i8 load therefore has a 75% chance of being unaligned on my hardware, so I need a way to tell LLVM not to generate sext_loads, or, if it does, to expand them. Nothing that has been suggested so far has worked.

> Any hints would be greatly appreciated, this is a blocking issue that I just
> cannot seem to resolve without modifying the LLVM codebase to remove the
> extend + load -> extload combining step.

>LLVM will "uncombine" it for you if you use setLoadExtAction with the
>appropriate arguments.

>-Eli

I've tried setting setLoadXAction to Custom, Legal, Expand and Promote.

I hit an assertion somewhere when I try to custom-expand this operation, because it expects the operation to be in a certain form, but my custom load instruction has a different form.

Setting it to Legal generates the sext_load in the first DAG combine pass, because it never checks whether it should make this combination. Since it doesn't enter the section of code I mentioned earlier, it never uncombines it.

When I set it to Promote, it asserts on "not yet implemented".

Setting it to Expand does not expand it to sign_extend and load but to extload and sign_extend, and I don't support extload either.

Please correct me if I am wrong, but I've been looking at this issue for a while now and I cannot see where it uncombines the sextload into a load and a sign_extend. My current solution is to just comment out that combination so that it never occurs.

Thanks,
Micah

From jon at ffconsultancy.com Thu Feb 19 13:00:14 2009
From: jon at ffconsultancy.com (Jon Harrop)
Date: Thu, 19 Feb 2009 19:00:14 +0000
Subject: [LLVMdev] Improving performance with optimization passes
Message-ID: <200902191900.14089.jon@ffconsultancy.com>

I'm toying with benchmarks on my HLVM and am unable to get any performance improvement from optimization passes. Moreover, some of my programs generate a lot of redundant code (e.g. alloca a struct, store a struct into it and read only one field without using the rest of the struct) and this does not appear to be optimized away.

I simply copied the use of PassManager from the Kaleidoscope tutorial:

  let pm = PassManager.create_function mp in
  TargetData.add (ExecutionEngine.target_data ee) pm;

  add_constant_propagation pm;

  (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
  add_instruction_combining pm;

  (* reassociate expressions. *)
  add_reassociation pm;

  (* Eliminate Common SubExpressions. *)
  add_gvn pm;

  (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
  add_cfg_simplification pm;

  add_memory_to_register_promotion pm;

and then I apply "PassManager.run_function" to every function after it is validated.

Any idea what I might be doing wrong? Has anyone else got this functionality giving performance boosts from OCaml?

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

From stuart at apple.com Thu Feb 19 12:55:25 2009
From: stuart at apple.com (Stuart Hastings)
Date: Thu, 19 Feb 2009 10:55:25 -0800
Subject: [LLVMdev] please review this fix for PR3510
Message-ID: <83E688BB-CE51-451A-93B4-5B2A0A7EEA57@apple.com>

Please review this patch for PR3510 (and ). The bug is a failure to handle a "hole" inside an initialized structure, where the hole may be induced by a designated initializer or by alignment:

http://llvm.org/bugs/show_bug.cgi?id=3510

The original code was greatly simplified by using FieldNo to index the LLVM fields and the initializer in lock-step. Alas, that fails when the initializer requires alignment padding; such padding is not recorded in the LLVM structure type. This implies that the initialized ResultElts[] may have many more fields than the LLVM type.

The patched code tracks the HighWaterMarkInBits; it points to the next bit to allocate. If the starting offset of the next initializing value doesn't match, either byte-alignment or padding is indicated. FieldNo counts the fields (initializers or padding) created in ResultElts[], and LLVMFieldNo walks through the LLVM structure type fields. Note that ResultElts[] is now created to hold 2X the number of LLVM fields in order to accommodate padding fields; it is shrunk-to-fit after the initializer is complete.

I've seen hints in the code that C++ can generate overlapping field declarations, but I haven't personally seen this, and my patch has no explicit provision for it. If any reader has experience with this issue, I would be grateful for assistance or a demonstrating testcase. I would not be surprised if my patch failed when encountering an unholy convergence of packed designated bitfields or whatever.

The existing ProcessBitFieldInitialization() is invoked for every bitfield; it expects to be passed the same FieldNo pointing at a previously-created initializer so it can modify a previously-existing constant value. This is counter-intuitive when compared with the non-bitfield case that increments FieldNo after every initialization is processed, but I believe this is necessary. However, there is an awkward mental gear-shift when encountering a non-bitfield initializer following a bitfield; this is handled with an ugly "PredecessorWasBitfield" state-variable. (Suggestions for a less-inelegant solution are welcome.)

The patch handles the PR3510 testcase correctly and has successfully passed the GCC DejaGNU testsuite. However, I don't think we have sufficient tests for this issue, so I'm working on a few new testcases in the background.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: llvm-gcc.test.diffs.txt
Url: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090219/f3d97bb6/attachment.txt
-------------- next part --------------

stuart

From nicholas at mxc.ca Thu Feb 19 12:55:53 2009
From: nicholas at mxc.ca (Nick Lewycky)
Date: Thu, 19 Feb 2009 10:55:53 -0800
Subject: [LLVMdev] -fPIC warning on every compile on Cygwin
In-Reply-To: <9719867c0902191032n4478ad07o1145c0c8a50ae245@mail.gmail.com>
References: <9719867c0902190840vaa85105qa75da734ff1a5b0c@mail.gmail.com> <499D9C59.4060600@mxc.ca> <9719867c0902191032n4478ad07o1145c0c8a50ae245@mail.gmail.com>
Message-ID: <499DAB39.9090600@mxc.ca>

Aaron Gray wrote:
> On Thu, Feb 19, 2009 at 5:52 PM, Nick Lewycky
> wrote:
>
> Aaron Gray wrote:
> > Hi,
> >
> > I partly built LLVM on Cygwin yesterday and it was fine as far as it
> > went. But after doing a svn update today I am getting the following
> > warning on every compile :-
> >
> >
> > llvm[3]: Compiling LowerAllocations.cpp for Debug build
> > /usr/src/llvm/lib/Transforms/Utils/LowerAllocations.cpp:1:
> warning:
> > -fPIC ignored for target (all code is position independent)
> >
> >
> > This may be happening on other targets too.
>
> Thanks for the report.
> This is certainly due to my change last night to
> make LLVM build as PIC by default.
>
> I was a little perplexed by the warnings, as I could not ascertain where
> they were coming from and could see no obvious commit !:)
>
> Out of interest, could you point me to the patch or commit, please.

r65019 / r65020:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090216/073983.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090216/073984.html

(note: links will change)

Nick

>
> Part of the fix is going to be splitting apart whether we're building
> PIC and whether we pass in the -fPIC flag. We want to know whether the
> build is PIC in order to control whether libLTO should be built. On your
> platform we'll still want to build libLTO but don't want to pass the
> -fPIC flag.
>
> Nice
>
> >The other part of the fix I'm not so sure about. How should the build
> >system detect that we're building PIC without the -fPIC flag on this
> >platform?
>
> Looks like configure/autoconf territory. 'configure' flashes up that
> Cygwin supports PIC. See attached config.out file.
>
> Aaron
>
> > Nick
> >
> > > Aaron
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From eli.friedman at gmail.com Thu Feb 19 13:17:59 2009
From: eli.friedman at gmail.com (Eli Friedman)
Date: Thu, 19 Feb 2009 11:17:59 -0800
Subject: [LLVMdev] Possible error in LegalizeDAG
In-Reply-To: <5BA674C5FF7B384A92C2C95D8CC71E1C8D4632@ssanexmb1.amd.com>
References: <5BA674C5FF7B384A92C2C95D8CC71E1C827DA1@ssanexmb1.amd.com> <5BA674C5FF7B384A92C2C95D8CC71E1C8D4632@ssanexmb1.amd.com>
Message-ID:

On Thu, Feb 19, 2009 at 10:35 AM, Villmow, Micah wrote:
> On the hardware that I am targeting, which is not a CPU, I must support
> i8 loads, however the hardware only supports natively 32bit aligned
> loads, therefore I have to read in 4 i8's and unpack them and shift them
> based on the read address. So any i8 load has a 75% chance of being
> unaligned on my hardware,

Oh, okay, makes sense.

> I've tried setting setLoadXAction to Custom, Legal, Expand and Promote.
> Setting it to Expand does not expand it to
> sign_extend and load but to extload and sign_extend, but I don't
> support extload either.

I suppose you could consider that a bug. That said, why is this difficult to implement? You can just treat an extload of an i8 as a load of an i8 and get correct code, no?
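
For reference, the unpack-and-shift scheme described in the quoted text, followed by a separate sign_extend step, can be sketched roughly as below. This is only an illustration in Python, not LLVM code: it assumes little-endian byte lanes and models memory as a list of 32-bit words, and load_aligned_word and load_i8_sext are hypothetical names rather than LLVM APIs. A real lowering would be expressed with SelectionDAG nodes in the target's C++ code.

```python
def load_aligned_word(memory_words, byte_addr):
    """Model of the only native load: a 32-bit word at a 4-byte-aligned address."""
    # Hypothetical memory model: one list entry per aligned 32-bit word.
    return memory_words[byte_addr >> 2] & 0xFFFFFFFF


def load_i8_sext(memory_words, byte_addr):
    """Emulate a sign-extending i8 load using only aligned 32-bit loads."""
    word = load_aligned_word(memory_words, byte_addr)
    shift = (byte_addr & 3) * 8      # byte lane within the word (little-endian assumed)
    byte = (word >> shift) & 0xFF    # the plain i8 load, zero-extended
    # Separate sign_extend step: values with the top bit set become negative.
    return byte - 0x100 if byte & 0x80 else byte


# One word holding the bytes 0x01, 0x7F, 0xFF, 0x80 at byte addresses 0..3.
memory = [0x80FF7F01]
values = [load_i8_sext(memory, addr) for addr in range(4)]
print(values)  # prints [1, 127, -1, -128]
```

The load and the extension are kept as two distinct steps here, which mirrors the "load plus sign_extend" form asked for in the thread rather than a fused sextload.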
-Eli

From gordonhenriksen at me.com Thu Feb 19 13:32:14 2009
From: gordonhenriksen at me.com (Gordon Henriksen)
Date: Thu, 19 Feb 2009 14:32:14 -0500
Subject: [LLVMdev] Improving performance with optimization passes
In-Reply-To: <200902191900.14089.jon@ffconsultancy.com>
References: <200902191900.14089.jon@ffconsultancy.com>
Message-ID:

Hi Jon,

On 2009-02-19, at 14:00, Jon Harrop wrote:

> I'm toying with benchmarks on my HLVM and am unable to get any
> performance improvement from optimization passes. I simply copied
> the use of PassManager from the Kaleidoscope tutorial:
>
> Any idea what I might be doing wrong? Has anyone else got this
> functionality giving performance boosts from OCaml?

That's a pretty barren optimization pipeline.

http://llvm.org/docs/Passes.html

See opt --help and look into what -std-compile-opts is. That's the usual starting point for new front ends, although it's a whole-module pass pipeline. But it's only a starting point, since it's tuned for llvm-gcc's codegen and yours will probably differ.

> Moreover, some of my programs generate a lot of redundant code (e.g.
> alloca a struct, store a struct into it and read only one field
> without using the rest of the struct) and this does not appear to be
> optimized away.

I think first-class aggregates are mostly used for passing arguments in llvm-gcc and clang; maybe mem2reg can't unravel the loads. Have you tried emitting loads and stores of the scalar elements to see if mem2reg can eliminate the allocas then?

-- Gordon

P.S. This is not a trivial problem domain. Here's an interesting paper on the subject.
COLE: Compiler Optimization Level Exploration
http://users.elis.ugent.be/~leeckhou/papers/cgo08.pdf

From jon at ffconsultancy.com Thu Feb 19 13:44:29 2009
From: jon at ffconsultancy.com (Jon Harrop)
Date: Thu, 19 Feb 2009 19:44:29 +0000
Subject: [LLVMdev] Improving performance with optimization passes
In-Reply-To: <200902191900.14089.jon@ffconsultancy.com>
References: <200902191900.14089.jon@ffconsultancy.com>
Message-ID: <200902191944.29570.jon@ffconsultancy.com>

On Thursday 19 February 2009 19:00:14 Jon Harrop wrote:
> I'm toying with benchmarks on my HLVM and am unable to get any performance
> improvement from optimization passes...

I just disassembled some of the IR before and after optimization. This example function squares a complex number:

  let zsqr(r, i) = (r*r - i*i, 2*r*i)

My compiler is generating:

define fastcc i32 @zsqr({ double, double }*, { double, double }) {
entry:
  %2 = alloca { double, double }  ; <{ double, double }*> [#uses=2]
  %3 = getelementptr { double, double }* %2, i32 0  ; <{ double, double }*> [#uses=1]
  store { double, double } %1, { double, double }* %3
  %4 = getelementptr { double, double }* %2, i32 0, i32 0  ; <double*> [#uses=1]
  %5 = load double* %4  ; <double> [#uses=1]
  %6 = alloca { double, double }  ; <{ double, double }*> [#uses=2]
  %7 = getelementptr { double, double }* %6, i32 0  ; <{ double, double }*> [#uses=1]
  store { double, double } %1, { double, double }* %7
  %8 = getelementptr { double, double }* %6, i32 0, i32 0  ; <double*> [#uses=1]
  %9 = load double* %8  ; <double> [#uses=1]
  %10 = mul double %5, %9  ; <double> [#uses=1]
  %11 = alloca { double, double }  ; <{ double, double }*> [#uses=2]
  %12 = getelementptr { double, double }* %11, i32 0  ; <{ double, double }*> [#uses=1]
  store { double, double } %1, { double, double }* %12
  %13 = getelementptr { double, double }* %11, i32 0, i32 1  ; <double*> [#uses=1]
  %14 = load double* %13  ; <double> [#uses=1]
  %15 = alloca { double, double }  ; <{ double, double }*> [#uses=2]
  %16 = getelementptr { double, double }* %15, i32 0  ; <{ double, double }*> [#uses=1]
  store { double, double } %1, { double, double }* %16
  %17 = getelementptr { double, double }* %15, i32 0, i32 1  ; <double*> [#uses=1]
  %18 = load double* %17  ; <double> [#uses=1]
  %19 = mul double %14, %18  ; <double> [#uses=1]
  %20 = sub double %10, %19  ; <double> [#uses=1]
  %21 = alloca { double, double }  ; <{ double, double }*> [#uses=2]
  %22 = getelementptr { double, double }* %21, i32 0  ; <{ double, double }*> [#uses=1]
  store { double, double } %1, { double, double }* %22
  %23 = getelementptr { double, double }* %21, i32 0, i32 0  ; <double*> [#uses=1]
  %24 = load double* %23  ; <double> [#uses=1]
  %25 = mul double 2.000000e+00, %24  ; <double> [#uses=1]
  %26 = alloca { double, double }  ; <{ double, double }*> [#uses=2]
  %27 = getelementptr { double, double }* %26, i32 0  ; <{ double, double }*> [#uses=1]
  store { double, double } %1, { double, double }* %27
  %28 = getelementptr { double, double }* %26, i32 0, i32 1  ; <double*> [#uses=1]
  %29 = load double* %28  ; <double> [#uses=1]
  %30 = mul double %25, %29  ; <double> [#uses=1]
  %31 = alloca { double, double }  ; <{ double, double }*> [#uses=3]
  %32 = getelementptr { double, double }* %31, i32 0, i32 0  ; <double*> [#uses=1]
  store double %20, double* %32
  %33 = getelementptr { double, double }* %31, i32 0, i32 1  ; <double*> [#uses=1]
  store double %30, double* %33
  %34 = getelementptr { double, double }* %31, i32 0  ; <{ double, double }*> [#uses=1]
  %35 = load { double, double }* %34  ; <{ double, double }> [#uses=1]
  %36 = getelementptr { double, double }* %0, i32 0  ; <{ double, double }*> [#uses=1]
  store { double, double } %35, { double, double }* %36
  ret i32 0
}

But those LLVM optimization passes only reduce it to:

define fastcc i32 @zsqr({ double, double }*, { double, double }) {
entry:
  %2 = alloca { double, double }  ; <{ double, double }*> [#uses=2]
  store { double, double } %1, { double, double }* %2, align 8
  %3 = getelementptr { double, double }* %2, i32 0, i32 0  ; <double*> [#uses=1]
  %4 = load double* %3, align 8  ; <double> [#uses=1]
  %5 = alloca { double, double }  ; <{ double, double }*> [#uses=2]
  store { double, double } %1, { double, double }* %5, align 8
  %6 = getelementptr { double, double }* %5, i32 0, i32 0  ; <double*> [#uses=1]
  %7 = load double* %6, align 8  ; <double> [#uses=1]
  %8 = mul double %4, %7  ; <double> [#uses=1]
  %9 = alloca { double, double }  ; <{ double, double }*> [#uses=2]
  store { double, double } %1, { double, double }* %9, align 8
  %10 = getelementptr { double, double }* %9, i32 0, i32 1  ; <double*> [#uses=1]
  %11 = load double* %10, align 8  ; <double> [#uses=1]
  %12 = alloca { double, double }  ; <{ double, double }*> [#uses=2]
  store { double, double } %1, { double, double }* %12, align 8
  %13 = getelementptr { double, double }* %12, i32 0, i32 1  ; <double*> [#uses=1]
  %14 = load double* %13, align 8  ; <double> [#uses=1]
  %15 = mul double %11, %14  ; <double> [#uses=1]
  %16 = sub double %8, %15  ; <double> [#uses=1]
  %17 = alloca { double, double }  ; <{ double, double }*> [#uses=2]
  store { double, double } %1, { double, double }* %17, align 8
  %18 = getelementptr { double, double }* %17, i32 0, i32 0  ; <double*> [#uses=1]
  %19 = load double* %18, align 8  ; <double> [#uses=1]
  %20 = mul double %19, 2.000000e+00  ; <double> [#uses=1]
  %21 = alloca { double, double }  ; <{ double, double }*> [#uses=2]
  store { double, double } %1, { double, double }* %21, align 8
  %22 = getelementptr { double, double }* %21, i32 0, i32 1  ; <double*> [#uses=1]
  %23 = load double* %22, align 8  ; <double> [#uses=1]
  %24 = mul double %20, %23  ; <double> [#uses=1]
  %25 = alloca { double, double }  ; <{ double, double }*> [#uses=3]
  %26 = getelementptr { double, double }* %25, i32 0, i32 0  ; <double*> [#uses=1]
  store double %16, double* %26, align 8
  %27 = getelementptr { double, double }* %25, i32 0, i32 1  ; <double*> [#uses=1]
  store double %24, double* %27, align 8
  %28 = load { double, double }* %25, align 8  ; <{ double, double }> [#uses=1]
  store { double, double } %28, { double, double }* %0
  ret i32 0
}

So the optimization passes are at least doing something but they are a long way from generating optimal code. Does LLVM have any optimization passes that would promote these structs out of the stack and replace the loads with extractvalue instructions?
The ideal result is probably:

define fastcc i32 @zsqr({ double, double }*, { double, double }) {
entry:
  %2 = extractvalue { double, double } %1, 0
  %3 = extractvalue { double, double } %1, 1
  %4 = mul double %2, %2
  %5 = mul double %3, %3
  %6 = sub double %4, %5
  %7 = getelementptr { double, double }* %0, i32 0, i32 0
  store double %6, double* %7, align 8
  %8 = mul double %2, 2.0
  %9 = mul double %8, %3
  %10 = getelementptr { double, double }* %0, i32 0, i32 1
  store double %9, double* %10, align 8
  ret i32 0
}

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

From gohman at apple.com Thu Feb 19 13:44:47 2009
From: gohman at apple.com (Dan Gohman)
Date: Thu, 19 Feb 2009 11:44:47 -0800
Subject: [LLVMdev] Improving performance with optimization passes
In-Reply-To: <200902191900.14089.jon@ffconsultancy.com>
References: <200902191900.14089.jon@ffconsultancy.com>
Message-ID: <06998A17-CE1C-4308-B704-5B42C6FA098A@apple.com>

To add to what Gordon said, the SROA pass, aka -scalarrepl, aka
scalar replacement of aggregates, is the main pass which splits
struct allocas into fields that the rest of the optimizer can work
with.

Dan

On Feb 19, 2009, at 11:00 AM, Jon Harrop wrote:

>
> I'm toying with benchmarks on my HLVM and am unable to get any
> performance improvement from optimization passes. Moreover, some of
> my programs generate a lot of redundant code (e.g. alloca a struct,
> store a struct into it and read only one field without using the rest
> of the struct) and this does not appear to be optimized away.
>
> I simply copied the use of PassManager from the Kaleidoscope tutorial:
>
>   let pm = PassManager.create_function mp in
>   TargetData.add (ExecutionEngine.target_data ee) pm;
>
>   add_constant_propagation pm;
>
>   (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
>   add_instruction_combining pm;
>
>   (* reassociate expressions. *)
>   add_reassociation pm;
>
>   (* Eliminate Common SubExpressions.
*)
>   add_gvn pm;
>
>   (* Simplify the control flow graph (deleting unreachable blocks,
>      etc). *)
>   add_cfg_simplification pm;
>
>   add_memory_to_register_promotion pm;
>
> and then I apply "PassManager.run_function" to every function after
> it is validated.
>
> Any idea what I might be doing wrong? Has anyone else got this
> functionality giving performance boosts from OCaml?
>
> --
> Dr Jon Harrop, Flying Frog Consultancy Ltd.
> http://www.ffconsultancy.com/?e
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From Micah.Villmow at amd.com Thu Feb 19 13:49:09 2009
From: Micah.Villmow at amd.com (Villmow, Micah)
Date: Thu, 19 Feb 2009 11:49:09 -0800
Subject: [LLVMdev] Possible error in LegalizeDAG
In-Reply-To:
References: <5BA674C5FF7B384A92C2C95D8CC71E1C827DA1@ssanexmb1.amd.com><5BA674C5FF7B384A92C2C95D8CC71E1C8D4632@ssanexmb1.amd.com>
Message-ID: <5BA674C5FF7B384A92C2C95D8CC71E1C8D4659@ssanexmb1.amd.com>

> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Eli Friedman
> Sent: Thursday, February 19, 2009 11:18 AM
> To: LLVM Developers Mailing List
> Subject: Re: [LLVMdev] Possible error in LegalizeDAG
>
> On Thu, Feb 19, 2009 at 10:35 AM, Villmow, Micah wrote:
> > On the hardware that I am targeting, which is not a CPU, I must support
> > i8 loads, however the hardware only supports natively 32bit aligned
> > loads, therefore I have to read in 4 i8's and unpack them and shift them
> > based on the read address. So any i8 load has a 75% chance of being
> > unaligned on my hardware,
>
> Oh, okay, makes sense.
>
> > I've tried setting setLoadXAction to Custom, Legal, Expand and Promote.
> > Setting it to Expand does not expand it to
> > sign_extend and load but to extload and sign_extend, but I don't
> > support extload either.
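
[Editorial sketch, not part of the original thread.] The scheme Micah
describes above (fetch the aligned 32-bit word containing the byte, then
shift and mask according to the low address bits) can be pictured in
plain C as follows. This is only an illustration under stated
assumptions: little-endian byte lanes, a memory modelled as an array of
32-bit words, and the hypothetical helper names load_i8_zext and
load_i8_sext. It says nothing about how the lowering should be
expressed in LegalizeDAG.

```c
#include <stdint.h>

/* Zero-extending byte load built only from an aligned 32-bit load.
   Assumes little-endian lanes: byte 0 is the least significant byte. */
uint32_t load_i8_zext(const uint32_t *words, uint32_t addr) {
    uint32_t word  = words[addr >> 2];      /* aligned 32-bit load     */
    uint32_t shift = (addr & 3u) * 8u;      /* lane chosen by address  */
    return (word >> shift) & 0xFFu;         /* unpack the one byte     */
}

/* Sign-extending variant: the load followed by a sign_extend. */
int32_t load_i8_sext(const uint32_t *words, uint32_t addr) {
    uint32_t byte = load_i8_zext(words, addr);
    /* Flip the sign bit, then subtract it back out: 0x80..0xFF map to
       -128..-1 and 0x00..0x7F are unchanged. */
    return (int32_t)(byte ^ 0x80u) - 0x80;
}
```

Three of the four possible values of addr & 3 need a non-zero shift,
which matches the "75% chance of being unaligned" remark.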
>
> I suppose you could consider that a bug. That said, why is this
> difficult to implement? You can just treat an extload of an i8 as a
> load of an i8 and get correct code, no?
>
[Micah Villmow] The problem with the extload is that it is still
generating a 32bit extload instead of an 8bit extload.

> -Eli

From jon at ffconsultancy.com Thu Feb 19 14:01:38 2009
From: jon at ffconsultancy.com (Jon Harrop)
Date: Thu, 19 Feb 2009 20:01:38 +0000
Subject: [LLVMdev] Improving performance with optimization passes
In-Reply-To:
References: <200902191900.14089.jon@ffconsultancy.com>
Message-ID: <200902192001.38845.jon@ffconsultancy.com>

On Thursday 19 February 2009 19:32:14 Gordon Henriksen wrote:
> Hi Jon,
>
> On 2009-02-19, at 14:00, Jon Harrop wrote:
> > I'm toying with benchmarks on my HLVM and am unable to get any
> > performance improvement from optimization passes. I simply copied
> > the use of PassManager from the Kaleidoscope tutorial:
> >
> > Any idea what I might be doing wrong? Has anyone else got this
> > functionality giving performance boosts from OCaml?
>
> That's a pretty barren optimization pipeline.

Right but am I correct in believing that what I have done is pretty much
all that you can do from OCaml right now?

> http://llvm.org/docs/Passes.html
>
> See opt --help and look into what -std-compile-opts is. That's the
> usual starting point for new front ends, although it's a whole-module
> pass pipeline. But it's only a starting point, since it's tuned for
> llvm-gcc's codegen and yours will probably differ.

Thanks.

> > Moreover, some of my programs generate a lot of redundant code (e.g.
> > alloca a struct, store a struct into it and read only one field
> > without using the rest of the struct) and this does not appear to be
> > optimized away.
>
> I think first-class aggregates are mostly used for passing arguments
> in llvm-gcc and clang; maybe mem2reg can't see unravel the loads. Have
> you tried emitting loads and stores of the scalar elements to see if
> mem2reg can eliminate the allocas then?

If I could do that I wouldn't be in this mess! ;-)

I am only generating these temporary structs on the stack because I
cannot load and store struct elements any other way because the OCaml
bindings do not yet have insertvalue and extractvalue.

> P.S. This is not a trivial problem domain. Here's an interesting paper
> on the subject.
>
> COLE: Compiler Optimization Level Exploration
> http://users.elis.ugent.be/~leeckhou/papers/cgo08.pdf

I'll check it out, thanks.

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

From anton at korobeynikov.info Thu Feb 19 14:02:10 2009
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Thu, 19 Feb 2009 23:02:10 +0300
Subject: [LLVMdev] Improving performance with optimization passes
In-Reply-To: <200902191944.29570.jon@ffconsultancy.com>
References: <200902191900.14089.jon@ffconsultancy.com> <200902191944.29570.jon@ffconsultancy.com>
Message-ID: <42A78FE5-3B67-4EE3-BC7B-3FAB444D69FF@korobeynikov.info>

>
> On Thursday 19 February 2009 19:00:14 Jon Harrop wrote:
>> I'm toying with benchmarks on my HLVM and am unable to get any
>> performance
>> improvement from optimization passes...
>
> I just disassembled some of the IR before and after optimization.
> This example
> function squares a complex number:

Something is definitely wrong with the way you're using optimization
passes.

> The ideal result is probably:

It is indeed so:

./opt -std-compile-opts test.bc | ./llvm-dis
; ModuleID = ''

define fastcc i32 @zsqr({ double, double }* nocapture, { double, double }) nounwind {
entry:
  %2 = extractvalue { double, double } %1, 0 ; [#uses=1]
  %3 = extractvalue { double, double } %1, 0 ; [#uses=1]
  %4 = mul double %2, %3 ; [#uses=1]
  %5 = extractvalue { double, double } %1, 1 ; [#uses=1]
  %6 = extractvalue { double, double } %1, 1 ; [#uses=1]
  %7 = mul double %5, %6 ; [#uses=1]
  %8 = sub double %4, %7 ; [#uses=1]
  %9 = extractvalue { double, double } %1, 0 ; [#uses=1]
  %10 = mul double %9, 2.000000e+00 ; [#uses=1]
  %11 = extractvalue { double, double } %1, 1 ; [#uses=1]
  %12 = mul double %10, %11 ; [#uses=1]
  %insert = insertvalue { double, double } undef, double %8, 0 ; <{ double, double }> [#uses=1]
  %insert2 = insertvalue { double, double } %insert, double %12, 1 ; <{ double, double }> [#uses=1]
  store { double, double } %insert2, { double, double }* %0
  ret i32 0
}

---
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090219/1d985257/attachment-0001.html

From gordonhenriksen at me.com Thu Feb 19 14:09:34 2009
From: gordonhenriksen at me.com (Gordon Henriksen)
Date: Thu, 19 Feb 2009 15:09:34 -0500
Subject: [LLVMdev] Improving performance with optimization passes
In-Reply-To: <200902192001.38845.jon@ffconsultancy.com>
References: <200902191900.14089.jon@ffconsultancy.com> <200902192001.38845.jon@ffconsultancy.com>
Message-ID:

On 2009-02-19, at 15:01, Jon Harrop wrote:

> On Thursday 19 February 2009 19:32:14 Gordon Henriksen wrote:
>>
>> Have you tried emitting loads and stores of the scalar elements to
>> see if mem2reg can eliminate the allocas then?
>
> If I could do that I wouldn't be in this mess!
;-)
>
> I am only generating these temporary structs on the stack because I
> cannot load and store struct elements any other way because the
> OCaml bindings do not yet have insertvalue and extractvalue.

Path of least resistance would definitely be to add some new bindings.

-- Gordon

From aaronngray.lists at googlemail.com Thu Feb 19 14:36:34 2009
From: aaronngray.lists at googlemail.com (Aaron Gray)
Date: Thu, 19 Feb 2009 20:36:34 +0000
Subject: [LLVMdev] Whats GoogleTest ?
Message-ID: <9719867c0902191236x508d106di48dcef46ba04fc04@mail.gmail.com>

What is googletest? It's awfully messy warnings-wise on Cygwin.

Aaron

From tonic at nondot.org Thu Feb 19 14:53:39 2009
From: tonic at nondot.org (Tanya M. Lattner)
Date: Thu, 19 Feb 2009 12:53:39 -0800 (PST)
Subject: [LLVMdev] Whats GoogleTest ?
In-Reply-To: <9719867c0902191236x508d106di48dcef46ba04fc04@mail.gmail.com>
References: <9719867c0902191236x508d106di48dcef46ba04fc04@mail.gmail.com>
Message-ID:

> What is googletest ?

Framework for unit tests in LLVM.
http://code.google.com/p/googletest/

> Its aufully messy warnings wise on Cygwin.

If you have gcc 3.X you are going to get a bunch of warnings. What gcc
are you using?

-Tanya From aaronngray.lists at googlemail.com Thu Feb 19 14:48:20 2009 From: aaronngray.lists at googlemail.com (Aaron Gray) Date: Thu, 19 Feb 2009 20:48:20 +0000 Subject: [LLVMdev] -fPIC warning on every compile on Cygwin In-Reply-To: <499DAB39.9090600@mxc.ca> References: <9719867c0902190840vaa85105qa75da734ff1a5b0c@mail.gmail.com> <499D9C59.4060600@mxc.ca> <9719867c0902191032n4478ad07o1145c0c8a50ae245@mail.gmail.com> <499DAB39.9090600@mxc.ca> Message-ID: <9719867c0902191248k5f49eb5cg1ec4930b84a680a9@mail.gmail.com> On Thu, Feb 19, 2009 at 6:55 PM, Nick Lewycky wrote: > Aaron Gray wrote: > > On Thu, Feb 19, 2009 at 5:52 PM, Nick Lewycky > > wrote: > > > > Aaron Gray wrote: > > > Hi, > > > > > > I partly built LLVM on Cygwin yesterday and it was fine as far as > it > > > went. But after doing a svn update today I am getting the > following > > > warning on every compile :- > > > > > > > > > llvm[3]: Compiling LowerAllocations.cpp for Debug build > > > /usr/src/llvm/lib/Transforms/Utils/LowerAllocations.cpp:1: > > warning: > > > -fPIC ignored for target (all code is position independent) > > > > > > > > > This maybe happening on other targets too. > > > > Thanks for the report. This is certainly due to my change last night > to > > make LLVM build as PIC by default. > > > > I was a little perplex at the warning as I could not ascertain where > > thery were coming from and could see no obvious commit !:) > > > > Out of interest, could you point me to the patch or commit, please. > > r65019 / r65020: > > http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090216/073983.html > > http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090216/073984.html > Can you please email me if and when it is fixed :) Thanks, Aaron -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090219/3f83f521/attachment.html From eli.friedman at gmail.com Thu Feb 19 15:29:02 2009 From: eli.friedman at gmail.com (Eli Friedman) Date: Thu, 19 Feb 2009 13:29:02 -0800 Subject: [LLVMdev] Whats GoogleTest ? In-Reply-To: References: <9719867c0902191236x508d106di48dcef46ba04fc04@mail.gmail.com> Message-ID: On Thu, Feb 19, 2009 at 12:53 PM, Tanya M. Lattner wrote: >> Its aufully messy warnings wise on Cygwin. > > If you have gcc 3.X you are going to get a bunch of warnings. What gcc are > you using? Cygwin still packages gcc 3.4, AFAIK. -Eli From mrs at apple.com Thu Feb 19 15:58:40 2009 From: mrs at apple.com (Mike Stump) Date: Thu, 19 Feb 2009 13:58:40 -0800 Subject: [LLVMdev] Whats GoogleTest ? In-Reply-To: References: <9719867c0902191236x508d106di48dcef46ba04fc04@mail.gmail.com> Message-ID: <073BE6DC-0920-489E-A985-5AB0F50AE921@apple.com> On Feb 19, 2009, at 12:53 PM, Tanya M. Lattner wrote: >> Its aufully messy warnings wise on Cygwin. > > If you have gcc 3.X you are going to get a bunch of warnings. Since we don't develop googletest, we could use -w as a compilation flag on the bits we don't care about or maintain. From clattner at apple.com Thu Feb 19 16:31:29 2009 From: clattner at apple.com (Chris Lattner) Date: Thu, 19 Feb 2009 14:31:29 -0800 Subject: [LLVMdev] Improving performance with optimization passes In-Reply-To: <200902191944.29570.jon@ffconsultancy.com> References: <200902191900.14089.jon@ffconsultancy.com> <200902191944.29570.jon@ffconsultancy.com> Message-ID: On Feb 19, 2009, at 11:44 AM, Jon Harrop wrote: > On Thursday 19 February 2009 19:00:14 Jon Harrop wrote: >> I'm toying with benchmarks on my HLVM and am unable to get any >> performance >> improvement from optimization passes... > > I just disassembled some of the IR before and after optimization. 
> This example
> function squares a complex number:
>
> let zsqr(r, i) = (r*r - i*i, 2*r*i)
>
> My compiler is generating:
>
> define fastcc i32 @zsqr({ double, double }*, { double, double }) {
> entry:
> %2 = alloca { double, double } ; <{ double, double }*> [#uses=2]

Jon, make sure you always emit allocas into the entry block of the
function. Otherwise mem2reg and SROA won't hack on them.

-Chris

From aaronngray.lists at googlemail.com Thu Feb 19 16:57:03 2009
From: aaronngray.lists at googlemail.com (Aaron Gray)
Date: Thu, 19 Feb 2009 22:57:03 +0000
Subject: [LLVMdev] Whats GoogleTest ?
In-Reply-To:
References: <9719867c0902191236x508d106di48dcef46ba04fc04@mail.gmail.com>
Message-ID: <9719867c0902191457m46509cabm1f8b7086bf2a77bc@mail.gmail.com>

On Thu, Feb 19, 2009 at 8:53 PM, Tanya M. Lattner wrote:

> > What is googletest ?
>
> Framework for units tests in llvm.
> http://code.google.com/p/googletest/
>
> > Its aufully messy warnings wise on Cygwin.
>
> If you have gcc 3.X you are going to get a bunch of warnings. What gcc are
> you using?

Yes, 3.4.4; it's not too bad though.

> -Tanya

From nicolas.geoffray at lip6.fr Thu Feb 19 17:51:14 2009
From: nicolas.geoffray at lip6.fr (Nicolas Geoffray)
Date: Fri, 20 Feb 2009 00:51:14 +0100
Subject: [LLVMdev] VMKit (was Parametric polymorphism)
In-Reply-To: <200902191703.55889.jon@ffconsultancy.com>
References: <200902191218.37545.jon@ffconsultancy.com> <499D610C.6090801@lip6.fr> <200902191703.55889.jon@ffconsultancy.com>
Message-ID: <499DF072.3060800@lip6.fr>

Jon Harrop wrote:
> What approach do you take to generics on the CLR?

I didn't implement the generics on the CLR port, but I think it
instantiates a new LLVM method the first time the method is called. Then
I think it's using caches to avoid instantiating the method multiple
times with the same type.

> Also, are you gearing up for
> another release of VMKit to coincide with LLVM 2.5?
>

Yes, soon to come, with many exciting new features! :)

Nicolas

From hbrenkun at yahoo.cn Thu Feb 19 22:26:44 2009
From: hbrenkun at yahoo.cn (=?gb2312?B?yM7ApA==?=)
Date: Fri, 20 Feb 2009 12:26:44 +0800 (CST)
Subject: [LLVMdev] help: about how to use tblgen to constraint operand.
Message-ID: <971617.13744.qm@web92414.mail.cnh.yahoo.com>

hi, Dear Evan Cheng:

My CPU is an i32 embedded CPU. I define pseudo register pair registers.

In mytargetRegisterInfo.td:

def T0: RegisterWithSubRegs<"t0",[R0,R1]>;
...
def GPR64 : RegisterClass<"mytarget", [i64], 64, [T0, T1.....]

In mytargetISelLowering.cpp:

I define i1, i8, i16 and i32 as legal.

1. I still have a problem. I save my function's return double value in
R0 and R1. It is expanded into two i32. But my GPR64 is defined to save
i64. LLVM finds I have an i64 GPR register. It will automatically decide
not to expand i64 to two i32.

2. I guess I need a special pseudo instruction to move between GPR32 and
GPR64. How do I move R0, R1 to T1 (the R2, R3 pair) and not convert two
i32 to i64? Could I use MyTargetInstrInfo::copyRegToReg() to handle this
logic issue?

3. Maybe I can study INSERT_SUBREG/EXTRACT_SUBREG in the X86 port. I
will do some more research. I think the best way would be for TableGen
to have a register pair TypeProfile feature. :(

But I find i64 data will not be ex

--- On 2009-02-20, Evan Cheng wrote:

From: Evan Cheng
Subject: Re: [LLVMdev] help: about how to use tblgen to constraint operand.
To: hbrenkun at yahoo.cn, "LLVM Developers Mailing List"
Date: 2009-02-20, 1:11

Currently there is no constraint that tells the register allocator to
allocate a consecutive register pair.
What I would suggest you do is to declare pseudo register pair registers
(and corresponding register class, say PAIR_GPR). In this case, your
myFMDRR would take one input of PAIR_GPR class. The asm printer should
be taught to print a PAIR_GPR register as two GPR registers (you should
also teach the JIT of the same thing).

A PAIR_GPR register should be a super register of two GPR registers.
e.g. r0r1_pair is a super register of r0 and r1. In order to *construct*
a PAIR_GPR register, you have to use two INSERT_SUBREG. To extract out a
GPR from a PAIR_GPR, you need to issue EXTRACT_SUBREG. In most cases,
these will be nop's. In other cases, they are copies.

Evan

On Feb 19, 2009, at 2:00 AM, ?? wrote:

I define a pattern to move two 32-bit GPRs to a 64-bit FPR, like the ARM
instruction fmdrr. But I need to use an even/odd register pair to save
its 2 operands.

I define in mytarget.td:

myfmdrr: SDTypeProfile<1, 2, [SDTCisVT<0, f64>, SDTCisVT<1, i32>,
    SDTCisSameAs<1, 2>]>;
def my_fmdrr : ...........
def myFMDRR : ....
    (outs FPR:$result), (ins GPR:$op1, GPR:$op2)
    [(set FPR:$result, (my_fmdrr GPR:$op1, GPR:$op2))]

I create the myfmdrr instruction in mytargetISelLowering.cpp, and its
operands are in R0 and R1. But after optimization, the operands are
saved in R2 and R1. I know the optimization pass does not know the
myfmdrr operand constraint. But how do I tell the optimization pass
through tblgen? Could I control the operand constraint in
mytargetISelLowering.cpp? How do I control it?

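[Editorial sketch, not part of the original thread.] Evan's description
of a pair register as a super-register of two GPRs, built with two
INSERT_SUBREG operations and read back with EXTRACT_SUBREG, can be
pictured with plain integers. The C below is only a mental model: the
helpers insert_subreg, extract_subreg, pair_low_gpr and pair_high_gpr
are hypothetical names for this illustration, not LLVM APIs, and which
half lands in the even register is an assumption.

```c
#include <stdint.h>

/* Sub-register indices of a 64-bit pair (assumed layout). */
enum { SUB_LO = 0, SUB_HI = 1 };

/* Model of INSERT_SUBREG: replace one 32-bit half of the pair and
   leave the other half untouched. */
uint64_t insert_subreg(uint64_t pair, uint32_t value, int idx) {
    unsigned shift = (idx == SUB_HI) ? 32u : 0u;
    uint64_t mask  = (uint64_t)0xFFFFFFFFu << shift;
    return (pair & ~mask) | ((uint64_t)value << shift);
}

/* Model of EXTRACT_SUBREG: read one 32-bit half of the pair. */
uint32_t extract_subreg(uint64_t pair, int idx) {
    unsigned shift = (idx == SUB_HI) ? 32u : 0u;
    return (uint32_t)(pair >> shift);
}

/* Pair Tn covers GPRs R(2n) and R(2n+1), as with T0 = {R0, R1} and
   T1 = {R2, R3}, so the first half is always an even register. */
int pair_low_gpr(int pair_index)  { return 2 * pair_index; }
int pair_high_gpr(int pair_index) { return 2 * pair_index + 1; }
```

Because the pair is allocated as a single register, the even/odd
adjacency holds by construction instead of being a constraint the
allocator has to discover.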

From scooter.phd at gmail.com Fri Feb 20 00:41:45 2009
From: scooter.phd at gmail.com (Scott Michel)
Date: Thu, 19 Feb 2009 22:41:45 -0800
Subject: [LLVMdev] svn pre-commit hook: help needed
In-Reply-To:
References: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com> <20090218094151.GA6282@pom.apple.com>
Message-ID: <258cd3200902192241k6f629524ub923ae46382143ea@mail.gmail.com>

For the complete truth in advertising, this was pretty much a trial
balloon to gauge reaction. I'm not a big fan of rejecting commits for
style violations, but the dev guide has certain guidelines regarding
formatting and style. And we're all supposed to be good citizens...

My biggest nit, however, was contemplating a commit where 80%+ was
trailing whitespace trimming. Yeah, my editor happens to practice good
hygiene. I could have been a complete *$$**le and committed a global
hygiene patch. I only touched the files that I'll end up committing in
another day or two.

Since there are style rules, I also decided to see how much reaction
there would be if they were enforced at commit time. Evidently, it's as
popular as a skunk at a garden party. So, it's not something I'm
personally looking to invest much time into.

-scooter

On Wed, Feb 18, 2009 at 8:53 AM, Devang Patel wrote:

>
> On Feb 18, 2009, at 1:41 AM, Julien Lerouge wrote:
>
> > Yet another _fun_ way of doing this is to setup a buildbot slave just
> > for that. The slave can fix minor stuff like tabs and trailing
> > whitespaces on its own (checking the changes back in), and yell for
> > things like 80-col violations and whatnot where the changes would
> > not be
> > so trivial.
>
> If you're going to change anything then this is the best alternative,
> otherwise I can live with status quo.
>
> Do not reject commit just because of formatting issues. It can have
> serious -ve impact on productivity.
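
[Editorial sketch, not part of the original thread.] The split Julien
proposes above, where a bot silently fixes trivial problems (tabs,
trailing whitespace) and only reports the ones that need a human
(80-column violations), could look roughly like this per-line check. It
is a minimal illustration, not an existing LLVM or buildbot tool; the
names check_line, trim_trailing_ws and the STYLE_* flags are made up
here.

```c
#include <stddef.h>
#include <string.h>

/* Bit flags describing the style problems found on one line. */
enum { STYLE_TAB = 1, STYLE_TRAILING_WS = 2, STYLE_TOO_LONG = 4 };

/* Inspect one line (without its newline) and return a bitmask of
   problems. Column counting is naive: one char is one column. */
int check_line(const char *line) {
    size_t len = strlen(line);
    int flags = 0;
    if (memchr(line, '\t', len) != NULL)
        flags |= STYLE_TAB;
    if (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t'))
        flags |= STYLE_TRAILING_WS;
    if (len > 80)
        flags |= STYLE_TOO_LONG;
    return flags;
}

/* The kind of fix a bot could apply on its own: strip trailing
   spaces and tabs in place. */
void trim_trailing_ws(char *line) {
    size_t len = strlen(line);
    while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t'))
        line[--len] = '\0';
}
```

A line flagged only with STYLE_TRAILING_WS can be repaired mechanically,
while STYLE_TOO_LONG would be reported back to the committer.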
> > To folks who prefers to reject commits due to formatting errors -- You > already rely on a some kind of "tool" to make your day to day life > easier. [ Most likely you've your editor automatically replacing tabs > into spaces. Your terminal window is only 80 col. wide or your editor > is displaying a vertical line to warn you about 80 col. and so > on... ]. The build bot suggested by Julien is yet another "tool" that > accomplishes the same. One the slave bot can use clang static > analyzer ... :) > > - > Devang > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090219/e2dc0829/attachment.html From syoyofujita at gmail.com Fri Feb 20 01:31:45 2009 From: syoyofujita at gmail.com (Syoyo Fujita) Date: Fri, 20 Feb 2009 16:31:45 +0900 Subject: [LLVMdev] Loop elimination with floating point counter. In-Reply-To: References: <1f06970b0901080436t4223f990m1de1fac4cebb1696@mail.gmail.com> <1f06970b0901080922k1e80e4b6ree7239a9ff4c427e@mail.gmail.com> <5296F463-E937-4774-A4DE-26858DA14942@mac.com> <759EC041-923A-4F12-9833-26EE246DF42F@gmx.net> <92381DB0-F6B9-4AD8-8242-7F03B280C132@mac.com> <4aca3dc20901082011u74266dc2mf88e3490a6145e01@mail.gmail.com> <1f06970b0901140511ufea2358u92f458c9d252282@mail.gmail.com> Message-ID: <1f06970b0902192331w7d9f94aawfd77218af377a31@mail.gmail.com> On Sat, Jan 17, 2009 at 5:24 AM, Chris Lattner wrote: > > On Jan 14, 2009, at 5:11 AM, Syoyo Fujita wrote: > > > Thanks for many comments. > > > > The loop with finite fp values(which could be representable in IEEE754 > > fp format) such like, > > Sure, LLVM could definitely do this. If you're interested, I'd > suggest starting by extending the existing code that we have to do > this. 
The existing code just handles increments by unit constants, so
> it doesn't trigger with 1.2.

Yes, I'd like to commit the patches as much as possible.
Would you tell me the source code location where I have to investigate?

--
Syoyo

From nicholas at mxc.ca Fri Feb 20 02:30:59 2009
From: nicholas at mxc.ca (Nick Lewycky)
Date: Fri, 20 Feb 2009 00:30:59 -0800
Subject: [LLVMdev] Improving performance with optimization passes
In-Reply-To: <42A78FE5-3B67-4EE3-BC7B-3FAB444D69FF@korobeynikov.info>
References: <200902191900.14089.jon@ffconsultancy.com> <200902191944.29570.jon@ffconsultancy.com> <42A78FE5-3B67-4EE3-BC7B-3FAB444D69FF@korobeynikov.info>
Message-ID: <499E6A43.7030500@mxc.ca>

Anton Korobeynikov wrote:
>>
>> On Thursday 19 February 2009 19:00:14 Jon Harrop wrote:
>>> I'm toying with benchmarks on my HLVM and am unable to get any
>>> performance
>>> improvement from optimization passes...
>>
>> I just disassembled some of the IR before and after optimization. This
>> example
>> function squares a complex number:
> Something is definitely wrong with the way you're using optimization passes.
>
>> The ideal result is probably:
> It is indeed so:
>
> ./opt -std-compile-opts test.bc | ./llvm-dis
> ; ModuleID = ''
>
> define fastcc i32 @zsqr({ double, double }* nocapture, { double, double
> }) nounwind {
> entry:
> %2 = extractvalue { double, double } %1, 0 ; [#uses=1]
> %3 = extractvalue { double, double } %1, 0 ; [#uses=1]

I've filed llvm.org/PR3623 to track the fact that these aren't merged.

Nick > %4 = mul double %2, %3 ; [#uses=1] > %5 = extractvalue { double, double } %1, 1 ; [#uses=1] > %6 = extractvalue { double, double } %1, 1 ; [#uses=1] > %7 = mul double %5, %6 ; [#uses=1] > %8 = sub double %4, %7 ; [#uses=1] > %9 = extractvalue { double, double } %1, 0 ; [#uses=1] > %10 = mul double %9, 2.000000e+00 ; [#uses=1] > %11 = extractvalue { double, double } %1, 1 ; [#uses=1] > %12 = mul double %10, %11 ; [#uses=1] > %insert = insertvalue { double, double } undef, double %8, 0 ; <{ > double, double }> [#uses=1] > %insert2 = insertvalue { double, double } %insert, double %12, 1 ; <{ > double, double }> [#uses=1] > store { double, double } %insert2, { double, double }* %0 > ret i32 0 > } > > --- > With best regards, Anton Korobeynikov > Faculty of Mathematics and Mechanics, Saint Petersburg State University > > > ------------------------------------------------------------------------ > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From mainmanmauricio at gmail.com Fri Feb 20 02:37:47 2009 From: mainmanmauricio at gmail.com (Maurice Gittens) Date: Fri, 20 Feb 2009 09:37:47 +0100 Subject: [LLVMdev] libLTO warning Message-ID: <305911e30902200037k4d6b937fn67d095822386847a@mail.gmail.com> Hi all, I just svn-updated the 2.5 branch on my machine and I noticed this warning during the build. 
*** Warning: Linking the shared library /home/maurice/installation/llvm/Debug/lib/libLTO.la against the non-libtool *** objects /home/maurice/installation/llvm/Debug/lib/LLVMCppBackend.o /home/maurice/installation/llvm/Debug/lib/LLVMMSIL.o /home/maurice/installation/llvm/Debug/lib/LLVMCBackend.o /home/maurice/installation/llvm/Debug/lib/LLVMXCore.o /home/maurice/installation/llvm/Debug/lib/LLVMPIC16.o /home/maurice/installation/llvm/Debug/lib/LLVMCellSPUCodeGen.o /home/maurice/installation/llvm/Debug/lib/LLVMCellSPUAsmPrinter.o /home/maurice/installation/llvm/Debug/lib/LLVMMips.o /home/maurice/installation/llvm/Debug/lib/LLVMARMAsmPrinter.o /home/maurice/installation/llvm/Debug/lib/LLVMARMCodeGen.o /home/maurice/installation/llvm/Debug/lib/LLVMIA64.o /home/maurice/installation/llvm/Debug/lib/LLVMAlphaCodeGen.o /home/maurice/installation/llvm/Debug/lib/LLVMAlphaAsmPrinter.o /home/maurice/installation/llvm/Debug/lib/LLVMPowerPCAsmPrinter.o /home/maurice/installation/llvm/Debug/lib/LLVMPowerPCCodeGen.o /home/maurice/installation/llvm/Debug/lib/LLVMSparcCodeGen.o /home/maurice/installation/llvm/Debug/lib/LLVMSparcAsmPrinter.o /home/maurice/installation/llvm/Debug/lib/LLVMX86AsmPrinter.o /home/maurice/installation/llvm/Debug/lib/LLVMX86CodeGen.o is not portable! I haven't noticed any ill-effects. Is this a harmless warning or does it need to be fixed? I'm on Linux/Fedora 9, x86-64. Kind regards, Maurice -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090220/83a15957/attachment-0001.html From nicholas at mxc.ca Fri Feb 20 02:51:51 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Fri, 20 Feb 2009 00:51:51 -0800 Subject: [LLVMdev] libLTO warning In-Reply-To: <305911e30902200037k4d6b937fn67d095822386847a@mail.gmail.com> References: <305911e30902200037k4d6b937fn67d095822386847a@mail.gmail.com> Message-ID: <499E6F27.20909@mxc.ca> Maurice Gittens wrote: > Hi all, > > I just svn-updated the 2.5 branch on my machine and I noticed this > warning during the build. > > *** Warning: Linking the shared library > /home/maurice/installation/llvm/Debug/lib/libLTO.la against the non-libtool > *** objects /home/maurice/installation/llvm/Debug/lib/LLVMCppBackend.o > /home/maurice/installation/llvm/Debug/lib/LLVMMSIL.o > /home/maurice/installation/llvm/Debug/lib/LLVMCBackend.o > /home/maurice/installation/llvm/Debug/lib/LLVMXCore.o > /home/maurice/installation/llvm/Debug/lib/LLVMPIC16.o > /home/maurice/installation/llvm/Debug/lib/LLVMCellSPUCodeGen.o > /home/maurice/installation/llvm/Debug/lib/LLVMCellSPUAsmPrinter.o > /home/maurice/installation/llvm/Debug/lib/LLVMMips.o > /home/maurice/installation/llvm/Debug/lib/LLVMARMAsmPrinter.o > /home/maurice/installation/llvm/Debug/lib/LLVMARMCodeGen.o > /home/maurice/installation/llvm/Debug/lib/LLVMIA64.o > /home/maurice/installation/llvm/Debug/lib/LLVMAlphaCodeGen.o > /home/maurice/installation/llvm/Debug/lib/LLVMAlphaAsmPrinter.o > /home/maurice/installation/llvm/Debug/lib/LLVMPowerPCAsmPrinter.o > /home/maurice/installation/llvm/Debug/lib/LLVMPowerPCCodeGen.o > /home/maurice/installation/llvm/Debug/lib/LLVMSparcCodeGen.o > /home/maurice/installation/llvm/Debug/lib/LLVMSparcAsmPrinter.o > /home/maurice/installation/llvm/Debug/lib/LLVMX86AsmPrinter.o > /home/maurice/installation/llvm/Debug/lib/LLVMX86CodeGen.o is not > portable! > > I haven't noticed any ill-effects. Is this a harmless warning or does it > need to be fixed? 
> > I'm on Linux/Fedora 9, x86-64. The warnings are harmless, but regardless libLTO shouldn't be trying to build on any platform except Darwin. It looks like the 2.5 rebranching took place at just the wrong time and picked up a change when I tried to have libLTO build on Linux as well. That change should never have made it into the 2.5 branch. Tanya, could you please revert 62987 from the 2.5 branch? It's this patch: http://llvm.org/viewvc/llvm-project/llvm/branches/release_25/tools/Makefile?r1=62895&r2=62987 fortunately just a one-liner. I'm really sorry for the trouble... Nick > > Kind regards, > Maurice > > > ------------------------------------------------------------------------ > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From nicholas at mxc.ca Fri Feb 20 02:55:23 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Fri, 20 Feb 2009 00:55:23 -0800 Subject: [LLVMdev] libLTO warning In-Reply-To: <499E6F27.20909@mxc.ca> References: <305911e30902200037k4d6b937fn67d095822386847a@mail.gmail.com> <499E6F27.20909@mxc.ca> Message-ID: <499E6FFB.9040601@mxc.ca> Nick Lewycky wrote: > Maurice Gittens wrote: >> Hi all, >> >> I just svn-updated the 2.5 branch on my machine and I noticed this >> warning during the build. 
>>
>> *** Warning: Linking the shared library
>> /home/maurice/installation/llvm/Debug/lib/libLTO.la against the non-libtool
>> *** objects /home/maurice/installation/llvm/Debug/lib/LLVMCppBackend.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMMSIL.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMCBackend.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMXCore.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMPIC16.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMCellSPUCodeGen.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMCellSPUAsmPrinter.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMMips.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMARMAsmPrinter.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMARMCodeGen.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMIA64.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMAlphaCodeGen.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMAlphaAsmPrinter.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMPowerPCAsmPrinter.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMPowerPCCodeGen.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMSparcCodeGen.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMSparcAsmPrinter.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMX86AsmPrinter.o
>> /home/maurice/installation/llvm/Debug/lib/LLVMX86CodeGen.o is not
>> portable!
>>
>> I haven't noticed any ill-effects. Is this a harmless warning or does it
>> need to be fixed?
>>
>> I'm on Linux/Fedora 9, x86-64.
>
> The warnings are harmless, but regardless libLTO shouldn't be trying to
> build on any platform except Darwin.
>
> It looks like the 2.5 rebranching took place at just the wrong time and
> picked up a change when I tried to have libLTO build on Linux as well.
> That change should never have made it into the 2.5 branch.
>
> Tanya, could you please revert 62987 from the 2.5 branch?
> It's this patch:
> http://llvm.org/viewvc/llvm-project/llvm/branches/release_25/tools/Makefile?r1=62895&r2=62987
>
> fortunately just a one-liner. I'm really sorry for the trouble...

BTW, this has already been reverted on trunk. Here's the revert:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090209/073590.html

> Nick
>
>> Kind regards,
>> Maurice

From alex.lavoro.propio at gmail.com  Fri Feb 20 04:49:01 2009
From: alex.lavoro.propio at gmail.com (Alex)
Date: Fri, 20 Feb 2009 11:49:01 +0100
Subject: [LLVMdev] Obfuscation/software watermarking backend
Message-ID: <4d77c5f20902200249r775a0038lce2477c828a5bdf3@mail.gmail.com>

I'd like to know if there is any known project doing obfuscated code
generation or software watermarking in LLVM.

Obfuscation or software watermarking at the machine-instruction level
usually requires inserting dead code, constant "unfolding", computationally
intensive "opaque predicates", redundant calculation, duplicated
calculation, etc., all of which make the program inefficient.

But it seems possible to generate the code in an obfuscated or watermarked
manner in the first place, so that the generated code may not be as
optimized, but is still more efficient than applying an additional
(independent) obfuscation pass later.

Alex.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090220/6922fe68/attachment.html

From evan.cheng at apple.com  Fri Feb 20 12:51:47 2009
From: evan.cheng at apple.com (Evan Cheng)
Date: Fri, 20 Feb 2009 10:51:47 -0800
Subject: [LLVMdev] help: about how to use tblgen to constraint operand.
In-Reply-To: <971617.13744.qm@web92414.mail.cnh.yahoo.com>
References: <971617.13744.qm@web92414.mail.cnh.yahoo.com>
Message-ID: <519F9A3E-36F2-44A8-9525-5D89908EC57B@apple.com>

On Feb 19, 2009, at 8:26 PM, ?? wrote:

> hi, Dear Evan Cheng:
>
> My CPU is an i32 embedded CPU. I define pseudo register pair registers.
>
> In mytargetRegisterInfo.td:
> def T0: RegisterWithSubRegs<"t0",[R0,R1]>;
> ...
> def GPR64 : RegisterClass<"mytarget", [i64], 64, [T0, T1.....]
>
> In mytargetISelLowering.cpp:
> I define i1, i8, i16 and i32 as legal.
>
> 1. I still have a problem. I save my function's double return value in
> R0 and R1. It is expanded into two i32. But my GPR64 is defined to hold
> i64. LLVM finds that I have an i64 GPR register class, so it
> automatically decides not to expand i64 to two i32.
>
> 2. I guess I need a special pseudo instruction to move between GPR32
> and GPR64.
> How do I move R0, R1 to T1 (the R2, R3 pair) without converting two i32
> to i64?
> Could I use MyTargetInstrInfo::copyRegToReg() to handle this logic
> issue?

No. copyRegToReg only supports copying registers of the same (or
compatible) register classes.

> 3. Maybe I can study INSERT_SUBREG/EXTRACT_SUBREG in the X86 port.

Yes.

> I will do some deeper research. I think the best way would be for
> TableGen to have a register pair TypeProfile feature. :(

It's not a tablegen issue. It's easy to add the constraint to tablegen
but the register allocator has to be able to allocate register pairs.
That is definitely not a trivial task.

Evan

> But I find i64 data will not be ex
> --- On 2009-02-20, Evan Cheng wrote:
> From: Evan Cheng
> Subject: Re: [LLVMdev] help: about how to use tblgen to constraint
> operand.
> To: hbrenkun at yahoo.cn, "LLVM Developers Mailing List"
> Date: 2009-02-20, 1:11
>
> Currently there is no constraint that tells the register allocator
> to allocate a consecutive register pair. What I would suggest you do
> is to declare pseudo register pair registers (and corresponding
> register class, say PAIR_GPR). In this case, your myFMDRR would take
> one input of PAIR_GPR class. The asm printer should be taught to
> print a PAIR_GPR register as two GPR registers (you should also
> teach the JIT of the same thing).
>
> A PAIR_GPR register should be a super register of two GPR registers.
> e.g. r0r1_pair is a super register of r0 and r1. In order to
> *construct* a PAIR_GPR register, you have to use two INSERT_SUBREG.
> To extract out a GPR from a PAIR_GPR, you need to issue
> EXTRACT_SUBREG. In most cases, these will be nop's. In other cases,
> they are copies.
>
> Evan
>
> On Feb 19, 2009, at 2:00 AM, ?? wrote:
>
>> I define a pattern to move two 32-bit GPRs to a 64-bit FPR, like the
>> ARM instruction fmdrr.
>> But I need to use an even/odd register pair to hold its 2 operands.
>> I define in mytarget.td:
>>
>> myfmdrr:
>> SDTypeProfile<1, 2, [SDTCisVT<0, f64>, SDTCisVT<1, i32>,
>> SDTCisSameAs<1, 2>]>;
>> def my_fmdrr : ...........
>> def myFMDRR : ....
>> (outs FPR: $result), ins(GPR: $op1, GPR:$op2 )
>> [(setFPR: $result, (my_fmdrr GPR: $op1, GPR:
>> $op2) )]
>>
>> I create the myfmdrr instruction in mytargetISelLowering.cpp, and its
>> operands are in R0 and R1.
>> But after optimization, the operands are saved in R2 and R1. I know the
>> optimization pass does not know the myfmdrr operand constraint. But how
>> do I tell the optimization pass through tblgen?
>>
>> Can I control the operand constraint in mytargetISelLowering.cpp?
>> How do I control it?
>>
From brukman at gmail.com  Fri Feb 20 14:45:29 2009
From: brukman at gmail.com (Misha Brukman)
Date: Fri, 20 Feb 2009 15:45:29 -0500
Subject: [LLVMdev] svn pre-commit hook: help needed
In-Reply-To: <258cd3200902192241k6f629524ub923ae46382143ea@mail.gmail.com>
References: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com>
	<20090218094151.GA6282@pom.apple.com>
	<258cd3200902192241k6f629524ub923ae46382143ea@mail.gmail.com>
Message-ID:

On a related note, I wrote a few scripts to detect and correct some types of
such style errors, see llvm/utils/lint/* .

I also added a function to llvm/utils/vim/vimrc to delete trailing
whitespace and highlight existing trailing whitespace -- if anyone's an
Emacs-lisp hacker, please add it to the emacs config file as well.

Sure, this doesn't enforce anything, but I'm hoping folks will start to use
these tools and will over time clean up the style in the entire code base.

2009/2/20 Scott Michel

> For the complete truth in advertising, this was pretty much a trial balloon
> to gauge reaction. I'm not a big fan of rejecting commits for style
> violations, but the dev guide has certain guidelines regarding formatting
> and style. And we're all supposed to be good citizens...
>
> My biggest nit, however, was contemplating a commit where 80%+ was trailing
> whitespace trimming. Yeah, my editor happens to practice good hygiene.
> I could have been a complete *$$**le and committed a global hygiene patch.
> I only touched the files that I'll end up committing in another day or two.

I've been fixing things on a directory-by-directory basis as I come across
style violations while browsing the code. I'm not in favor of a single
global change to fix everything everywhere; I think this can be done
gradually over time and the diff will be easier to read if it's smaller, so
you can verify that the script (or your editor) did not mangle anything.

Misha

From anton at korobeynikov.info  Fri Feb 20 15:26:53 2009
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Sat, 21 Feb 2009 00:26:53 +0300
Subject: [LLVMdev] svn pre-commit hook: help needed
In-Reply-To:
References: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com>
	<20090218094151.GA6282@pom.apple.com>
	<258cd3200902192241k6f629524ub923ae46382143ea@mail.gmail.com>
Message-ID: <94020932-947F-462A-9BA0-923B0F631071@korobeynikov.info>

> I also added a function to llvm/utils/vim/vimrc to delete trailing
> whitespace and highlight existing trailing whitespace -- if anyone's
> an Emacs-lisp hacker, please add it to the emacs config file as well.

Usually this is done via develock minor mode (one can google for
'develock.el')

---
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From dio.rahman at gmail.com  Fri Feb 20 15:24:59 2009
From: dio.rahman at gmail.com (Dio)
Date: Fri, 20 Feb 2009 13:24:59 -0800 (PST)
Subject: [LLVMdev] Support for Visual Studio 2008
Message-ID: <081abcf8-ceed-4a31-934e-35d29f67c6de@n10g2000vbl.googlegroups.com>

Hello all!

I need to know whether this project intends to support the Visual Studio
2008 C++ compiler or not.

Thanks!

Dio

PS.
I have a hard time getting the project to compile with MSVC 2008,
even though it compiles smoothly with MSVC 2005.

From ofv at wanadoo.es  Fri Feb 20 15:49:42 2009
From: ofv at wanadoo.es (Óscar Fuentes)
Date: Fri, 20 Feb 2009 22:49:42 +0100
Subject: [LLVMdev] Support for Visual Studio 2008
References: <081abcf8-ceed-4a31-934e-35d29f67c6de@n10g2000vbl.googlegroups.com>
Message-ID:

Dio writes:

[snip]

> PS. I have a hard time to make the project compiled by MSVC 2008 --
> eventhough it is smoothly compiled by MSVC 2005

Have you tried cmake?

http://llvm.org/docs/CMake.html

--
Oscar

From scottm at aero.org  Fri Feb 20 16:10:32 2009
From: scottm at aero.org (Scott Michel)
Date: Fri, 20 Feb 2009 14:10:32 -0800
Subject: [LLVMdev] svn pre-commit hook: help needed
In-Reply-To:
References: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com>
	<20090218094151.GA6282@pom.apple.com>
	<258cd3200902192241k6f629524ub923ae46382143ea@mail.gmail.com>
Message-ID:

On Feb 20, 2009, at 12:45 PM, Misha Brukman wrote:

> I've been fixing things on a directory-by-directory basis as I come
> across style violations while browsing the code. I'm not in favor
> of a single global change to fix everything everywhere; I think
> this can be done gradually over time and the diff will be easier to
> read if it's smaller, so you can verify that the script (or your
> editor) did not mangle anything.

I've got a fairly simple perl script that trims trailing whitespace
and it's remarkably effective. It even works over recursive
directories (not hard to do, but it's the way to do this globally.)

I'm sure it's not too much of a stretch to translate tabs to spaces,
although that's controversial.

-scooter

From dag at cray.com  Fri Feb 20 17:05:41 2009
From: dag at cray.com (David Greene)
Date: Fri, 20 Feb 2009 17:05:41 -0600
Subject: [LLVMdev] Possible DAGCombiner or TargetData Bug
In-Reply-To:
References: <200902181849.38871.dag@cray.com>
Message-ID: <200902201705.41785.dag@cray.com>

On Wednesday 18 February 2009 21:43, Dan Gohman wrote:
> I agree, that doesn't look right. It looks like this
> is what was intended:
>
> Index: lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> ===================================================================
> --- lib/CodeGen/SelectionDAG/DAGCombiner.cpp (revision 65000)
> +++ lib/CodeGen/SelectionDAG/DAGCombiner.cpp (working copy)
> @@ -4903,9 +4903,9 @@
>    // resultant store does not need a higher alignment than the original.
>    if (Value.getOpcode() == ISD::BIT_CONVERT && !ST->isTruncatingStore() &&
>        ST->isUnindexed()) {
> -    unsigned Align = ST->getAlignment();
> +    unsigned OrigAlign = ST->getAlignment();
>      MVT SVT = Value.getOperand(0).getValueType();
> -    unsigned OrigAlign = TLI.getTargetData()->
> +    unsigned Align = TLI.getTargetData()->
>        getABITypeAlignment(SVT.getTypeForMVT());
>      if (Align <= OrigAlign &&
>          ((!LegalOperations && !ST->isVolatile()) ||
>
> Does that look right to you?

Yes, and it fixes the problem.

What's your opinion about how TargetData and X86Subtarget define ABI alignment
for SSE registers? I think that's suspect too. It's too bad we can't specify
separate ABI alignments for v16i/f8, v8i/f16, v4i/f32 and v2i/f64 as we
should probably set the ABI alignment to the element alignment. But I guess
to be conservative we should set it to 8 bits. Unless I'm misunderstanding
the purpose of the ABI alignment.

-Dave

From brukman at gmail.com  Fri Feb 20 17:18:02 2009
From: brukman at gmail.com (Misha Brukman)
Date: Fri, 20 Feb 2009 18:18:02 -0500
Subject: [LLVMdev] svn pre-commit hook: help needed
In-Reply-To:
References: <258cd3200902171421x67f6f437he937c6ee661b2a65@mail.gmail.com>
	<20090218094151.GA6282@pom.apple.com>
	<258cd3200902192241k6f629524ub923ae46382143ea@mail.gmail.com>
Message-ID:

2009/2/20 Scott Michel

> On Feb 20, 2009, at 12:45 PM, Misha Brukman wrote:
> I've got a fairly simple perl script that trims trailing whitespace [...]
>

% cat llvm/utils/lint/remove_trailing_whitespace.sh
#!/bin/sh
# Deletes trailing whitespace in-place in the passed-in files.
# Sample syntax:
#   $0 *.cpp

perl -pi -e 's/\s+$/\n/' $*

Yep, it's a one-liner.

> [...] and it's remarkably effective. It even works over recursive
> directories (not hard to do, but it's the way to do this globally.)

With recursion into subdirectories:
% remove_trailing_whitespace.sh `find . -name \*\.h`

> I'm sure it's not too much of a stretch to translate tabs to spaces,
> although that's controversial.

Good point, I should add a verifier to the lint tool to check for tabs in
non-Makefiles.

From tonic at nondot.org  Fri Feb 20 17:21:52 2009
From: tonic at nondot.org (Tanya M. Lattner)
Date: Fri, 20 Feb 2009 15:21:52 -0800 (PST)
Subject: [LLVMdev] 2.5 Pre-release1 available for testing
In-Reply-To: <200902091900.10593.baldrick@free.fr>
References: <200902091900.10593.baldrick@free.fr>
Message-ID:

Did you file PRs for these?

-Tanya

On Mon, 9 Feb 2009, Duncan Sands wrote:

> Hi Tanya, I see the following warnings when building. I'm not sure
> how to fix any of them. The last one looks like it might be serious
> (seems like a job for Chris).
>
> llvm[1]: Compiling Path.cpp for Release build
> In file included from Path.cpp:270:
> Unix/Path.inc: In member function ‘bool llvm::sys::Path::eraseFromDisk(bool, std::string*) const’:
> Unix/Path.inc:661: warning: ignoring return value of ‘int system(const char*)’, declared with attribute warn_unused_result
>
>
> llvm[1]: Compiling raw_ostream.cpp for Release build
> raw_ostream.cpp: In member function ‘virtual void llvm::raw_fd_ostream::flush_impl()’:
> raw_ostream.cpp:245: warning: ignoring return value of ‘ssize_t write(int, const void*, size_t)’, declared with attribute warn_unused_result
>
>
> llvm[2]: Compiling LLParser.cpp for Release build
> LLParser.cpp: In member function ‘bool llvm::LLParser::ParseGlobal(const std::string&, const char*, unsigned int, bool, unsigned int)’:
> LLParser.cpp:448: warning: ‘IsConstant’ may be used uninitialized in this function
>
>
> Ciao,
>
> Duncan.
>

From tonic at nondot.org  Fri Feb 20 17:45:45 2009
From: tonic at nondot.org (Tanya Lattner)
Date: Fri, 20 Feb 2009 15:45:45 -0800
Subject: [LLVMdev] 2.5 Pre-release2 available for testing
Message-ID:

LLVMers,

The 2.5 pre-release2 is finally available for testing:
http://llvm.org/prereleases/2.5/

If you have time, I'd appreciate anyone who can help test the release.
Please do the following:

1) Download/compile llvm source, and either compile llvm-gcc source or use
llvm-gcc binary (please compile llvm-gcc with fortran if you can).
2) Run make check, send me the testrun.log
3) Run "make TEST=nightly report" and send me the report.nightly.txt
4) Please provide details on what platform you compiled LLVM on, how you
built LLVM (src == obj, or src != obj), gcc version, and if you compiled
llvm-gcc with support for fortran. The more details, the better.

Please COMPLETE ALL TESTING BY end of the day on Feb. 28th!

We hope to have the final release out on 3/2/2009 (assuming no new
regressions).

Thanks,
Tanya Lattner

From gohman at apple.com  Fri Feb 20 17:53:59 2009
From: gohman at apple.com (Dan Gohman)
Date: Fri, 20 Feb 2009 15:53:59 -0800
Subject: [LLVMdev] Possible DAGCombiner or TargetData Bug
In-Reply-To: <200902201705.41785.dag@cray.com>
References: <200902181849.38871.dag@cray.com>
	<200902201705.41785.dag@cray.com>
Message-ID: <73A1B838-0530-4CBF-8302-3B3429F745FA@apple.com>

On Feb 20, 2009, at 3:05 PM, David Greene wrote:

> On Wednesday 18 February 2009 21:43, Dan Gohman wrote:
>> I agree, that doesn't look right. It looks like this
>> is what was intended:
>>
>> Index: lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>> ===================================================================
>> --- lib/CodeGen/SelectionDAG/DAGCombiner.cpp (revision 65000)
>> +++ lib/CodeGen/SelectionDAG/DAGCombiner.cpp (working copy)
>> @@ -4903,9 +4903,9 @@
>>    // resultant store does not need a higher alignment than the original.
>>    if (Value.getOpcode() == ISD::BIT_CONVERT && !ST->isTruncatingStore() &&
>>        ST->isUnindexed()) {
>> -    unsigned Align = ST->getAlignment();
>> +    unsigned OrigAlign = ST->getAlignment();
>>      MVT SVT = Value.getOperand(0).getValueType();
>> -    unsigned OrigAlign = TLI.getTargetData()->
>> +    unsigned Align = TLI.getTargetData()->
>>        getABITypeAlignment(SVT.getTypeForMVT());
>>      if (Align <= OrigAlign &&
>>          ((!LegalOperations && !ST->isVolatile()) ||
>>
>> Does that look right to you?
>
> Yes, and it fixes the problem.

Cool. I've committed this on trunk now. If you have a reasonably reduced
testcase for this, please add it.

> What's your opinion about how TargetData and X86Subtarget define ABI
> alignment for SSE registers? I think that's suspect too.
> It's too bad we can't specify
> separate ABI alignments for v16i/f8, v8i/f16, v4i/f32 and v2i/f64 as we
> should probably set the ABI alignment to the element alignment. But I
> guess to be conservative we should set it to 8 bits. Unless I'm
> misunderstanding the purpose of the ABI alignment.

The purpose of ABI alignment is to govern things like struct layouts,
global variables, allocas, and so on. So SSE types on x86 should
probably all remain ABI-aligned at 16 bytes.

I think the particular DAGCombine you pointed out is using ABI alignment
as a conservative heuristic. In some cases it may be safe to transform the
store to a store that doesn't have the ABI alignment for the stored value,
but DAGCombine doesn't know when it's safe. I guess this could be fixed
by having the target provide a third kind of alignment value: the minimum
alignment that the target can store values of a particular type to.

Dan

From dag at cray.com  Fri Feb 20 18:02:52 2009
From: dag at cray.com (David Greene)
Date: Fri, 20 Feb 2009 18:02:52 -0600
Subject: [LLVMdev] Possible DAGCombiner or TargetData Bug
In-Reply-To: <73A1B838-0530-4CBF-8302-3B3429F745FA@apple.com>
References: <200902181849.38871.dag@cray.com>
	<200902201705.41785.dag@cray.com>
	<73A1B838-0530-4CBF-8302-3B3429F745FA@apple.com>
Message-ID: <200902201802.52921.dag@cray.com>

On Friday 20 February 2009 17:53, Dan Gohman wrote:

> > Yes, and it fixes the problem.
>
> Cool. I've committed this on trunk now. If you have a reasonably
> reduced testcase for this, please add it.

Working on it. I'm getting our build validated first.

> The purpose of ABI alignment is to govern things like struct layouts,
> global variables, allocas, and so on. So SSE types on x86 should
> probably all remain ABI-aligned at 16 bytes.

Ok, makes sense.

> I think the particular DAGCombine you pointed out is using ABI alignment
> as a conservative heuristic.
> In some cases it may be safe to transform the
> store to a store that doesn't have the ABI alignment for the stored
> value, but DAGCombine doesn't know when it's safe. I guess this could be
> fixed by having the target provide a third kind of alignment value: the
> minimum alignment that the target can store values of a particular type to.

Yes, that would provide more information. I'm not sure how critical it is.
My concern is whether someone might think "ABI alignment" is equivalent to
"safe alignment."

If there were a third option, we would somehow want to discriminate based on
vector element type.

-Dave

From baldrick at free.fr  Sat Feb 21 01:25:35 2009
From: baldrick at free.fr (Duncan Sands)
Date: Sat, 21 Feb 2009 08:25:35 +0100
Subject: [LLVMdev] please review this fix for PR3510
In-Reply-To: <83E688BB-CE51-451A-93B4-5B2A0A7EEA57@apple.com>
References: <83E688BB-CE51-451A-93B4-5B2A0A7EEA57@apple.com>
Message-ID: <200902210825.35413.baldrick@free.fr>

Hi Stuart, thanks for doing this. I will try to review in the next
few days. Small comment: please use spaces rather than tabs.

Ciao,

Duncan.

From dio.rahman at gmail.com  Sat Feb 21 08:03:03 2009
From: dio.rahman at gmail.com (Dio)
Date: Sat, 21 Feb 2009 06:03:03 -0800 (PST)
Subject: [LLVMdev] Support for Visual Studio 2008
In-Reply-To:
References: <081abcf8-ceed-4a31-934e-35d29f67c6de@n10g2000vbl.googlegroups.com>
Message-ID: <2b62f4de-c12b-4642-800b-2bd2daefc8fb@z10g2000prl.googlegroups.com>

> Have you tried cmake?
>
> http://llvm.org/docs/CMake.html

Yes! Of course! :-) I have configured cmake to create a VS 2008
solution. But it fails.

Dio

From ofv at wanadoo.es  Sat Feb 21 08:09:48 2009
From: ofv at wanadoo.es (Óscar Fuentes)
Date: Sat, 21 Feb 2009 15:09:48 +0100
Subject: [LLVMdev] Support for Visual Studio 2008
References: <081abcf8-ceed-4a31-934e-35d29f67c6de@n10g2000vbl.googlegroups.com>
	<2b62f4de-c12b-4642-800b-2bd2daefc8fb@z10g2000prl.googlegroups.com>
Message-ID:

Dio writes:

>> Have you tried cmake?
>>
>> http://llvm.org/docs/CMake.html
>
> Yes! Of course! :-) I have configured the cmake to create VS 2008
> solution. But , it is failed.

Okay, and how does it fail? Please file a bug report on

http://www.llvm.org/bugs/

with the relevant info.

--
Oscar

From freddyisaac at yahoo.com  Sat Feb 21 03:45:33 2009
From: freddyisaac at yahoo.com (Frederick Isaac)
Date: Sat, 21 Feb 2009 01:45:33 -0800 (PST)
Subject: [LLVMdev] llvm on mips
Message-ID: <429314.23467.qm@web34201.mail.mud.yahoo.com>

Hi,

I see that llvm has a Mips target already written up. Does anyone know how
current this is? Does anyone maintain this and/or have any details about
this target in terms of functionality regarding JIT etc.?

Also, I was trying to configure it with CC and CXX set to the mips 7.4.4
compilers but I get the error gcc|icc not configured. I am a real newb and
don't really get how the gcc front end is used on an SGI Irix platform.

Basically any info would be appreciated.

Thanks
Freddy

From sherief at mganin.com  Sat Feb 21 13:11:12 2009
From: sherief at mganin.com (Sherief N. Farouk)
Date: Sat, 21 Feb 2009 14:11:12 -0500
Subject: [LLVMdev] .lib file naming scheme
Message-ID: <000001c99458$31241150$936c33f0$@com>

Building 2.5 with VS2008, I noticed that the generated libraries for both
debug and release builds share the same name, making them troublesome to
merge into a unified 'lib'-like directory. Would a boost-like naming scheme
be considered?

- Sherief
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090221/2b2ba643/attachment.html

From nicholas at mxc.ca  Sat Feb 21 13:14:35 2009
From: nicholas at mxc.ca (Nick Lewycky)
Date: Sat, 21 Feb 2009 11:14:35 -0800
Subject: [LLVMdev] -fPIC warning on every compile on Cygwin
In-Reply-To: <9719867c0902191248k5f49eb5cg1ec4930b84a680a9@mail.gmail.com>
References: <9719867c0902190840vaa85105qa75da734ff1a5b0c@mail.gmail.com>
	<499D9C59.4060600@mxc.ca>
	<9719867c0902191032n4478ad07o1145c0c8a50ae245@mail.gmail.com>
	<499DAB39.9090600@mxc.ca>
	<9719867c0902191248k5f49eb5cg1ec4930b84a680a9@mail.gmail.com>
Message-ID: <49A0529B.3040104@mxc.ca>

Aaron Gray wrote:
> On Thu, Feb 19, 2009 at 6:55 PM, Nick Lewycky wrote:
>> Aaron Gray wrote:
>>> On Thu, Feb 19, 2009 at 5:52 PM, Nick Lewycky wrote:
>>>> Aaron Gray wrote:
>>>>> Hi,
>>>>>
>>>>> I partly built LLVM on Cygwin yesterday and it was fine as far as it
>>>>> went. But after doing a svn update today I am getting the following
>>>>> warning on every compile :-
>>>>>
>>>>> llvm[3]: Compiling LowerAllocations.cpp for Debug build
>>>>> /usr/src/llvm/lib/Transforms/Utils/LowerAllocations.cpp:1: warning:
>>>>> -fPIC ignored for target (all code is position independent)
>>>>>
>>>>> This may be happening on other targets too.
>>>>
>>>> Thanks for the report. This is certainly due to my change last night to
>>>> make LLVM build as PIC by default.
>>>
>>> I was a little perplexed at the warning as I could not ascertain where
>>> they were coming from and could see no obvious commit !:)
>>>
>>> Out of interest, could you point me to the patch or commit, please.
>>
>> r65019 / r65020:
>> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090216/073983.html
>> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090216/073984.html
>
> Can you please email me if and when it is fixed :)

Should be fixed in r65229:
http://llvm.org/viewvc/llvm-project?view=rev&revision=65229

> Thanks,
>
> Aaron

From aaronngray.lists at googlemail.com  Sat Feb 21 15:54:23 2009
From: aaronngray.lists at googlemail.com (Aaron Gray)
Date: Sat, 21 Feb 2009 21:54:23 +0000
Subject: [LLVMdev] 2.5 Pre-release2 available for testing
In-Reply-To:
References:
Message-ID: <9719867c0902211354k2464afb5ke98897fa6ac33c4e@mail.gmail.com>

On Fri, Feb 20, 2009 at 11:45 PM, Tanya Lattner wrote:

> LLVMers,
>
> The 2.5 pre-release2 is finally available for testing:
> http://llvm.org/prereleases/2.5/
>
> If you have time, I'd appreciate anyone who can help test the release.
> Please do the following:
>
> 1) Download/compile llvm source, and either compile llvm-gcc source or use
> llvm-gcc binary (please compile llvm-gcc with fortran if you can).
> 2) Run make check, send me the testrun.log
> 3) Run "make TEST=nightly report" and send me the report.nightly.txt
> 4) Please provide details on what platform you compiled LLVM on, how you
> built LLVM (src == obj, or src != obj), gcc version, and if you compiled
> llvm-gcc with support for fortran. The more details, the better.
>

Cygwin with GCC 3.4.4 is failing llvm-gcc-4.2-2.5 with :-

cc1plus: error: unrecognized command line option "-Wno-variadic-macros"

Is anyone testing Cygwin with GCC 4.2 or 4.4 ?

Aaron
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090221/ee1692a8/attachment.html

From aaronngray.lists at googlemail.com  Sat Feb 21 20:56:35 2009
From: aaronngray.lists at googlemail.com (Aaron Gray)
Date: Sun, 22 Feb 2009 02:56:35 +0000
Subject: [LLVMdev] 2.5 Pre-release2 available for testing
In-Reply-To: <9719867c0902211354k2464afb5ke98897fa6ac33c4e@mail.gmail.com>
References: <9719867c0902211354k2464afb5ke98897fa6ac33c4e@mail.gmail.com>
Message-ID: <9719867c0902211856r3550e543xd7d050502e26c0ed@mail.gmail.com>

On Sat, Feb 21, 2009 at 9:54 PM, Aaron Gray wrote:

> On Fri, Feb 20, 2009 at 11:45 PM, Tanya Lattner wrote:
>
>> LLVMers,
>>
>> The 2.5 pre-release2 is finally available for testing:
>> http://llvm.org/prereleases/2.5/
>>
>> If you have time, I'd appreciate anyone who can help test the release.
>> Please do the following:
>>
>> 1) Download/compile llvm source, and either compile llvm-gcc source or use
>> llvm-gcc binary (please compile llvm-gcc with fortran if you can).
>> 2) Run make check, send me the testrun.log
>> 3) Run "make TEST=nightly report" and send me the report.nightly.txt
>> 4) Please provide details on what platform you compiled LLVM on, how you
>> built LLVM (src == obj, or src != obj), gcc version, and if you compiled
>> llvm-gcc with support for fortran. The more details, the better.
>>
>
> Cygwin with GCC 3.4.4 is failing llvm-gcc-4.2-2.5 with :-
>
> cc1plus: error: unrecognized command line option "-Wno-variadic-macros"
>
> Is anyone testing Cygwin with GCC 4.2 or 4.4 ?
>

I hacked around the "-Wno-variadic-macros" bug, but llvm-gcc now hangs
later on GCC 3.4.4 Cygwin, in configure during the first bootstrap pass at
"checking executable suffix"; presumably this is the first self call.

On the face of it, it looks like it may be a tough one to debug; I don't
know if GCC 3.4.4 Cygwin will make it to the 2.5 release.

I'll try with GCC 4.2 on Cygwin tomorrow, unless anyone else is covering it.

Aaron

From baldrick at free.fr  Sun Feb 22 02:29:55 2009
From: baldrick at free.fr (Duncan Sands)
Date: Sun, 22 Feb 2009 09:29:55 +0100
Subject: [LLVMdev] 2.5 Pre-release2 available for testing
In-Reply-To:
References:
Message-ID: <200902220929.55192.baldrick@free.fr>

Hi Tanya, the gcc testsuite doesn't seem to be present in
llvm-gcc4.2-2.5.source. I don't think removing it is a good
idea. I noticed this when I wanted to check that the release
passes the Ada checks.

Ciao,

Duncan.

From jon at ffconsultancy.com  Sun Feb 22 08:17:34 2009
From: jon at ffconsultancy.com (Jon Harrop)
Date: Sun, 22 Feb 2009 14:17:34 +0000
Subject: [LLVMdev] Broke my tail (call)
Message-ID: <200902221417.34957.jon@ffconsultancy.com>

I have written a variety of tests of tail calls for my HLVM and all passed
with flying colors until I wrote this test (which is actually for algebraic
datatypes) and discovered that it segfaults after ~100k iterations through
what I think should be a tail call.
Here's the IR:

define fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* }, i32) {
entry:
  %2 = alloca { i32, { { i8*, i8* }*, i8* } } ; <{ i32, { { i8*, i8* }*, i8* } }*> [#uses=3]
  %3 = alloca { { i8*, i8* }*, i8* } ; <{ { i8*, i8* }*, i8* }*> [#uses=3]
  br label %start

start: ; preds = %entry
  %4 = getelementptr { i32, { { i8*, i8* }*, i8* } }* %2, i32 0, i32 0 ; [#uses=1]
  store i32 %1, i32* %4
  %5 = getelementptr { i32, { { i8*, i8* }*, i8* } }* %2, i32 0, i32 1 ; <{ { i8*, i8* }*, i8* }*> [#uses=1]
  store { { i8*, i8* }*, i8* } %0, { { i8*, i8* }*, i8* }* %5
  %6 = getelementptr { i32, { { i8*, i8* }*, i8* } }* %2, i32 0 ; <{ i32, { { i8*, i8* }*, i8* } }*> [#uses=1]
  %7 = load { i32, { { i8*, i8* }*, i8* } }* %6 ; <{ i32, { { i8*, i8* }*, i8* } }> [#uses=1]
  %8 = malloc { i32, { { i8*, i8* }*, i8* } } ; <{ i32, { { i8*, i8* }*, i8* } }*> [#uses=2]
  %9 = getelementptr { i32, { { i8*, i8* }*, i8* } }* %8, i32 0 ; <{ i32, { { i8*, i8* }*, i8* } }*> [#uses=1]
  store { i32, { { i8*, i8* }*, i8* } } %7, { i32, { { i8*, i8* }*, i8* } }* %9
  %10 = getelementptr { { i8*, i8* }*, i8* }* %3, i32 0, i32 0 ; <{ i8*, i8* }**> [#uses=1]
  store { i8*, i8* }* @Cons, { i8*, i8* }** %10
  %11 = bitcast { i32, { { i8*, i8* }*, i8* } }* %8 to i8* ; [#uses=1]
  %12 = getelementptr { { i8*, i8* }*, i8* }* %3, i32 0, i32 1 ; [#uses=1]
  store i8* %11, i8** %12
  %13 = getelementptr { { i8*, i8* }*, i8* }* %3, i32 0 ; <{ { i8*, i8* }*, i8* }*> [#uses=1]
  %14 = load { { i8*, i8* }*, i8* }* %13 ; <{ { i8*, i8* }*, i8* }> [#uses=2]
  %15 = icmp eq i32 %1, 0 ; [#uses=1]
  br i1 %15, label %pass, label %fail

fail: ; preds = %start
  %16 = sub i32 %1, 1 ; [#uses=1]
  %17 = tail call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* } %14, i32 %16) ; <{ { i8*, i8* }*, i8* }> [#uses=1]
  ret { { i8*, i8* }*, i8* } %17

pass: ; preds = %start
  ret { { i8*, i8* }*, i8* } %14
}

Am I going mad or should that tail call three lines up not be leaking stack space?
The only possible explanation I can think of is that LLVM believes it cannot make this a tail call because it thinks it is passing a pointer to a struct that is local to the caller. Is that correct and, if so, how can I work around it? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From jon at ffconsultancy.com Sun Feb 22 10:12:46 2009 From: jon at ffconsultancy.com (Jon Harrop) Date: Sun, 22 Feb 2009 16:12:46 +0000 Subject: [LLVMdev] Broke my tail (call) In-Reply-To: <200902221417.34957.jon@ffconsultancy.com> References: <200902221417.34957.jon@ffconsultancy.com> Message-ID: <200902221612.46213.jon@ffconsultancy.com> On Sunday 22 February 2009 14:17:34 Jon Harrop wrote: > define fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* }, i32) { I just noticed that I am accidentally returning a struct rather than going via a pointer in the first argument. Might that be related? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From aaronngray.lists at googlemail.com Sun Feb 22 11:19:39 2009 From: aaronngray.lists at googlemail.com (Aaron Gray) Date: Sun, 22 Feb 2009 17:19:39 +0000 Subject: [LLVMdev] 2.5 Pre-release2 available for testing In-Reply-To: <9719867c0902211856r3550e543xd7d050502e26c0ed@mail.gmail.com> References: <9719867c0902211354k2464afb5ke98897fa6ac33c4e@mail.gmail.com> <9719867c0902211856r3550e543xd7d050502e26c0ed@mail.gmail.com> Message-ID: <9719867c0902220919x2df43fbci88b0c71d1ddc788c@mail.gmail.com> On Sun, Feb 22, 2009 at 2:56 AM, Aaron Gray wrote: > On Sat, Feb 21, 2009 at 9:54 PM, Aaron Gray < > aaronngray.lists at googlemail.com> wrote: > >> On Fri, Feb 20, 2009 at 11:45 PM, Tanya Lattner wrote: >> >>> LLVMers, >>> >>> The 2.5 pre-release2 is finally available for testing: >>> http://llvm.org/prereleases/2.5/ >>> >>> If you have time, I'd appreciate anyone who can help test the release.
>>> Please do the following: >>> >>> 1) Download/compile llvm source, and either compile llvm-gcc source or >>> use llvm-gcc binary *(please compile llvm-gcc with fortran if you can).* >>> 2) Run make check, send me the testrun.log >>> 3) Run "make TEST=nightly report" and send me the report.nightly.txt >>> 4) Please provide details on what platform you compiled LLVM on, how you >>> built LLVM (src == obj, or src != obj), gcc version, and if you compiled >>> llvm-gcc with support for fortran. The more details, the better. >>> >> >> >> Cygwin with GCC 3.4.4 is failing llvm-gcc-4.2-2.5 with :- >> >> cc1plus: error: unrecognized command line option >> "-Wno-variadic-macros" >> >> Is anyone testing Cygwin with GCC 4.2 or 4.4 ? >> > > I hacked around the "-Wno-variadic-macros" bug but llvm-gcc is hanging on GCC 3.4.4 > Cygwin later on first bootstrap pass in configure when "checking executable > suffix", presumably this is the first self call. > > On the face of it, it looks like it may be a tough one to debug; I don't know > if GCC 3.4.4 Cygwin will make it to the 2.5 release. > Actually it's [configure-stage3-intl] where it's hanging. Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090222/540067b5/attachment.html From tonic at nondot.org Sun Feb 22 11:38:50 2009 From: tonic at nondot.org (Tanya Lattner) Date: Sun, 22 Feb 2009 09:38:50 -0800 Subject: [LLVMdev] 2.5 Pre-release2 available for testing In-Reply-To: <200902220929.55192.baldrick@free.fr> References: <200902220929.55192.baldrick@free.fr> Message-ID: <879FFABE-8640-4758-A3D6-8D891793DFA8@nondot.org> On Feb 22, 2009, at 12:29 AM, Duncan Sands wrote: > Hi Tanya, the gcc testsuite doesn't seem to be present > in llvm-gcc4.2-2.5.source. I don't think removing it > is a good idea. I noticed this when I wanted to check > that the release passes the Ada checks. > It's been removed from the final release for many releases now.
I personally don't care but was told to do this. I'll check with Chris when he gets back. -Tanya > Ciao, > > Duncan. From baldrick at free.fr Sun Feb 22 14:36:52 2009 From: baldrick at free.fr (Duncan Sands) Date: Sun, 22 Feb 2009 21:36:52 +0100 Subject: [LLVMdev] Broke my tail (call) In-Reply-To: <200902221417.34957.jon@ffconsultancy.com> References: <200902221417.34957.jon@ffconsultancy.com> Message-ID: <200902222136.52661.baldrick@free.fr> Hi Jon, > I have written a variety of tests of tail calls for my HLVM and all passed with > flying colors until I wrote this test (which is actually for algebraic > datatypes) and discovered that it segfaults after ~100k iterations through > what I think should be a tail call. Here's the IR: is this really a tail call? I didn't look closely but at a glance it seems to be passing a local stack variable as a call parameter. Ciao, Duncan. From jon at ffconsultancy.com Sun Feb 22 17:20:22 2009 From: jon at ffconsultancy.com (Jon Harrop) Date: Sun, 22 Feb 2009 23:20:22 +0000 Subject: [LLVMdev] Broke my tail (call) In-Reply-To: <200902222136.52661.baldrick@free.fr> References: <200902221417.34957.jon@ffconsultancy.com> <200902222136.52661.baldrick@free.fr> Message-ID: <200902222320.22928.jon@ffconsultancy.com> On Sunday 22 February 2009 20:36:52 Duncan Sands wrote: > Hi Jon, > > > I have written a variety of tests of tail calls for my HLVM and all passed > > with flying colors until I wrote this test (which is actually for > > algebraic datatypes) and discovered that it segfaults after ~100k > > iterations through what I think should be a tail call. Here's the IR: > > is this really a tail call? From what I have understood of the LLVM docs about when tail calls get eliminated on x86 and x64 it should be a tail call, yes. http://llvm.org/docs/CodeGenerator.html#tailcallopt . Caller and callee have the calling convention fastcc. .
The call is a tail call - in tail position (ret immediately follows call and ret uses value of call or is void). . Option -tailcallopt is enabled. . No variable argument lists are used. . On x86-64 when generating GOT/PIC code only module-local calls (visibility = hidden or protected) are supported. Those are all satisfied. > I didn't look closely but at a glance it seems to be passing a local stack > variable as a call parameter. In this case, the arguments are a { { i8*, i8* }*, i8* } and an i32. As I understand it, first-class structs are simply unpacked for argument passing so that is equivalent to passing { i8*, i8* }* and i8* and i32. In this case, the first is a pointer to a global variable and the second is a pointer to a malloc'd block. So I don't see why any of the arguments should be inhibiting tail call elimination. I just tested my theory that returning a first-class struct from a function inhibits tail call elimination and it seems that I was correct: altering this function to pass its return struct by pointer in the first argument fixes the stack overflow. Is this a bug in LLVM? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From anton at korobeynikov.info Sun Feb 22 17:15:49 2009 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Mon, 23 Feb 2009 02:15:49 +0300 Subject: [LLVMdev] 2.5 Pre-release2 available for testing In-Reply-To: <9719867c0902220919x2df43fbci88b0c71d1ddc788c@mail.gmail.com> References: <9719867c0902211354k2464afb5ke98897fa6ac33c4e@mail.gmail.com> <9719867c0902211856r3550e543xd7d050502e26c0ed@mail.gmail.com> <9719867c0902220919x2df43fbci88b0c71d1ddc788c@mail.gmail.com> Message-ID: > > Actually it's [configure-stage3-intl] where it's hanging. This can easily be due to inline FP math in the stdlib headers. For example - I had to maintain slightly hacked mingw32 headers which do not contain inline FP assembler, otherwise at least libstdc++ configure would hang. No idea about cygwin though.
--- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090223/f4769baf/attachment.html From wjl at icecavern.net Sun Feb 22 17:25:55 2009 From: wjl at icecavern.net (Wesley J. Landaker) Date: Sun, 22 Feb 2009 16:25:55 -0700 Subject: [LLVMdev] Creating an LLVM backend for a very small stack machine Message-ID: <200902221626.01038.wjl@icecavern.net> Hi folks, I am interested in creating an LLVM backend for a very small stack machine. Before I start in on the actual implementation, I'd like to make sure that I'm on the right track. Any comments, suggestions, warnings, tips, etc. would be greatly appreciated. Background ---------- There are a number of other small embedded microprocessors that are often used in FPGAs, such as Xilinx's Picoblaze, Lattice's Mico8, or Bernd Paysan's b16-small. I am developing a very small stack machine primarily for use inside FPGAs and ASICs. This serves roughly the same niche as the above processors; however, from my perspective, my design has a number of advantages. For example, mine has a much smaller minimal configuration, has higher code density, is much more parameterizable, fully supports indirect branches, is FPGA-architecture independent, etc. The main goal, however, is to not just have a shiny new architecture, but to have an entire optimized supporting toolchain. I am hoping to build on LLVM as infrastructure. I don't think I need to sell you all on the reasons why this would be handy. I am already pretty "familiar" with LLVM, in that I have successfully made my own front-end DSL for another project by basically following the Kaleidoscope tutorial.
I have a good grasp of the LLVM assembly language, and have a lot of notes (just a sanity-check paper exercise) about how I would lower just about every instruction into my architecture if I were doing it by hand. I've read through all the LLVM documentation available online, and (superficially) looked at a lot of the other backend targets. I have written custom assemblers and peephole optimizers outside of LLVM in the past. I am familiar academically with stack-machine-specific "register allocation" optimizations. I'm basically ready to get started on an LLVM backend for my processor. My main concerns are that in my searching online, I've gleaned a number of things that urge me to caution. I don't know how much of this is "true", but it's my impression after spending a lot of time googling: * A Picoblaze backend was apparently attempted both for LLVM and GCC, but apparently didn't go anywhere on either due to missing support for register indirect jumps, and possibly for other reasons. Of course my processor does support this, but ... * There were a number of threads in various GCC lists and other places implying that, at least for GCC, targeting a stack machine would be very very difficult because of its backend assumptions. Of course LLVM is not GCC, but TableGen seems register-biased ... * There seems to be a (to me) strange negative vibe whenever someone brings up targeting a compiler to a very small microprocessor, usually including an argument that it's not worth it. Of course, I'm the one doing the work, and I already have decided it's worth it ... Anyway, hopefully that gives you an idea of where I'm coming from. Before I get too knee-deep into writing a backend, I have a few questions that I hope someone can help me with. Questions --------- * Has anyone else out there targeted (or tried to target) a stack machine before? Was it successful? What problems did you have?
* What parts of the LLVM backend code generator infrastructure would be usable for targeting a stack machine? e.g. Is it even possible to use TableGen to target a stack machine? * When/where/how do things like big integer (iXXXXX), phi nodes, llvm.* intrinsics get lowered; e.g. does my target have to do that, or is it done generically? Ultimately, I'm wondering if targeting a stack machine with the current LLVM infrastructure is going to be somewhat straightforward even if it's not totally optimal (desirable), or if it's going to be so problematic that I'd be better off implementing an entire new code-generator myself (undesirable). Any other comments or discussion is welcome. All of my work (hardware design and all software) will be publicly and freely available. -- Wesley J. Landaker OpenPGP FP: 4135 2A3B 4726 ACC5 9094 0097 F0A9 8A4C 4CD6 E3D2 -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 197 bytes Desc: This is a digitally signed message part. Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090222/4db71cb1/attachment.bin From eli.friedman at gmail.com Sun Feb 22 18:06:06 2009 From: eli.friedman at gmail.com (Eli Friedman) Date: Sun, 22 Feb 2009 16:06:06 -0800 Subject: [LLVMdev] Creating an LLVM backend for a very small stack machine In-Reply-To: <200902221626.01038.wjl@icecavern.net> References: <200902221626.01038.wjl@icecavern.net> Message-ID: On Sun, Feb 22, 2009 at 3:25 PM, Wesley J. Landaker wrote: > * Has anyone else out there targeted (or tried to target) a stack machine > before? Was it successful? What problems did you have? Haven't done that, and I don't think there are any existing backends like this. It should be feasible, though; the backend code is pretty flexible. > * What parts of the LLVM backend code generator infrastructure would be > usable for targeting a stack machine? e.g. Is it even possible to use > TableGen to target a stack machine?
You should be able to use existing LLVM backend code and TableGen at least through instruction selection; I'm not sure whether you'll want register allocation or not, but it should be easy to choose either way. The whole thing is quite flexible; see LLVMTargetMachine::addCommonCodeGenPasses in lib/CodeGen/LLVMTargetMachine.cpp for a high-level overview of how CodeGen works. It might also be useful to look at how LLVM handles x87 floating-point; the relevant code is in lib/Target/X86/X86FloatingPoint.cpp. > * When/where/how do things like big integer (iXXXXX), phi nodes, llvm.* > intrinsics get lowered; e.g. does my target have to do that, or is it done > generically? Arbitrary-width integers, vectors, llvm.*, etc. are lowered generically by the Legalize infrastructure; the backend just has to say what it can and can't support. See lib/Target/X86/X86ISelLowering.cpp for an example. I don't know the details of PHI nodes, but that's also taken care of by instruction selection. -Eli From aaronngray.lists at googlemail.com Sun Feb 22 18:12:24 2009 From: aaronngray.lists at googlemail.com (Aaron Gray) Date: Mon, 23 Feb 2009 00:12:24 +0000 Subject: [LLVMdev] 2.5 Pre-release2 available for testing In-Reply-To: References: <9719867c0902211354k2464afb5ke98897fa6ac33c4e@mail.gmail.com> <9719867c0902211856r3550e543xd7d050502e26c0ed@mail.gmail.com> <9719867c0902220919x2df43fbci88b0c71d1ddc788c@mail.gmail.com> Message-ID: <9719867c0902221612w2254359agcbd5282f77176834@mail.gmail.com> On Sun, Feb 22, 2009 at 11:15 PM, Anton Korobeynikov < anton at korobeynikov.info> wrote: > > Actually it's [configure-stage3-intl] where it's hanging. > > This can easily be due to inline FP math in the stdlib headers. For example > - I had to maintain slightly hacked mingw32 headers which do not contain > inline FP assembler, otherwise at least libstdc++ configure would hang. No > idea about cygwin though. > Don't think it is that.
I looked at the configure code and I cannot work out why it's failing on working out what the exe suffix is. Weird bug. I have GCC 4.2.2 built on Cygwin now and will try with that, Cheers, Aaron > > --- > With best regards, Anton Korobeynikov > Faculty of Mathematics and Mechanics, Saint Petersburg State University > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090223/ccdc6c89/attachment.html From wjl at icecavern.net Sun Feb 22 19:43:18 2009 From: wjl at icecavern.net (Wesley J. Landaker) Date: Sun, 22 Feb 2009 18:43:18 -0700 Subject: [LLVMdev] Creating an LLVM backend for a very small stack machine In-Reply-To: References: <200902221626.01038.wjl@icecavern.net> Message-ID: <200902221843.23257.wjl@icecavern.net> On Sunday 22 February 2009 17:06:06 Eli Friedman wrote: > On Sun, Feb 22, 2009 at 3:25 PM, Wesley J. Landaker wrote: > > * Has anyone else out there targeted (or tried to target) a stack > > machine before? Was it successful? What problems did you have? > > Haven't done that, and I don't think there are any existing backends > like this. It should be feasible, though; the backend code is pretty > flexible. At the very least, there isn't anything in the LLVM instruction set that I think I would have any trouble lowering to the architecture ... but so far that's just on paper. ;) I would love to see a Kaleidoscope-like tutorial that goes step-by-step through making a backend. At the very least, I'll be documenting my adventure, so maybe once I know what I'm doing I can turn it into a tutorial.
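As a rough illustration of what such an on-paper lowering might look like, here is a minimal sketch in C++. Everything in it is an assumption made for illustration: the `Inst` record, the `lowerToStack` helper, and the LOAD/STORE/ADD/SUB/MUL mnemonics are hypothetical and describe neither the actual processor nor any LLVM API. It treats each virtual register as a numbered local slot and naively brackets every operation with two loads and a store.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical three-address instruction over virtual registers:
//   r<dst> = <op> r<lhs>, r<rhs>
struct Inst {
    std::string op;  // e.g. "add", "sub", "mul"
    int dst;
    int lhs;
    int rhs;
};

// Upper-case the opcode to form an (invented) stack-machine mnemonic.
static std::string mnemonic(const std::string& op) {
    std::string m;
    for (unsigned char c : op) {
        m.push_back(static_cast<char>(std::toupper(c)));
    }
    return m;
}

// Naive lowering: every virtual register lives in a numbered slot.
// Each instruction pushes both operands, applies the operator to the
// top two stack entries, and pops the result back into a slot.
std::vector<std::string> lowerToStack(const std::vector<Inst>& code) {
    std::vector<std::string> out;
    for (const Inst& i : code) {
        out.push_back("LOAD " + std::to_string(i.lhs));
        out.push_back("LOAD " + std::to_string(i.rhs));
        out.push_back(mnemonic(i.op));
        out.push_back("STORE " + std::to_string(i.dst));
    }
    return out;
}
```

Under these assumptions, lowering `r2 = add r0, r1` yields LOAD 0, LOAD 1, ADD, STORE 2. The output is deliberately unoptimized; removing the redundant slot traffic would be the job of a later stack-scheduling or peephole step.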
> You should be able to use existing LLVM backend code and TableGen at > least through instruction selection; I'm not sure whether you'll want > register allocation or not, but it should be easy to choose either > way. The whole thing is quite flexible; see > LLVMTargetMachine::addCommonCodeGenPasses in > lib/CodeGen/LLVMTargetMachine.cpp for a high-level overview of how > CodeGen works. It might also be useful to look at how LLVM handles x87 > floating-point; the relevant code is in > lib/Target/X86/X86FloatingPoint.cpp. Thank you for the references, I'll have a look at those. I've read quite a few papers on stack-machine-specific "register allocation" algorithms, but at least for the first pass, I want to make this as straightforward as possible. One thing I was considering was pretending I had multiple registers, and then when they actually get emitted I would thunk in code to do stack manipulations, hoping that my architecture-specific peephole optimizer would be able to clean it up (or not, for the first cut). > Arbitrary-width integers, vectors, llvm.*, etc. are lowered > generically by the Legalize infrastructure; the backend just has to > say what it can and can't support. See > lib/Target/X86/X86ISelLowering.cpp for an example. I don't know the > details of PHI nodes, but that's also taken care of by instruction > selection. Okay, I obviously need to learn more about the infrastructure here, but this at least sounds promising. I was worried that if I didn't have a register architecture that I'd have to reinvent the wheel in more places than it sounds like I will have to. I'm sure I will be back with more questions once I seriously try starting a target. -- Wesley J. Landaker OpenPGP FP: 4135 2A3B 4726 ACC5 9094 0097 F0A9 8A4C 4CD6 E3D2 -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 197 bytes Desc: This is a digitally signed message part.
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20090222/051ef2c6/attachment-0001.bin From clattner at apple.com Mon Feb 23 00:18:25 2009 From: clattner at apple.com (Chris Lattner) Date: Sun, 22 Feb 2009 22:18:25 -0800 Subject: [LLVMdev] Creating an LLVM backend for a very small stack machine In-Reply-To: <200902221843.23257.wjl@icecavern.net> References: <200902221626.01038.wjl@icecavern.net> <200902221843.23257.wjl@icecavern.net> Message-ID: On Feb 22, 2009, at 5:43 PM, Wesley J. Landaker wrote: > > I would love to see a Kaleidoscope-like tutorial that goes step-by-step > through > making a backend. At the very least, I'll be documenting my > adventure, so > maybe once I know what I'm doing I can turn it into a tutorial. Have you seen: http://llvm.org/docs/WritingAnLLVMBackend.html If you're targeting a stack machine, I'd strongly recommend not using the llvm register allocators and just running your own custom stackifier pass instead. -Chris From baldrick at free.fr Mon Feb 23 02:17:07 2009 From: baldrick at free.fr (Duncan Sands) Date: Mon, 23 Feb 2009 09:17:07 +0100 Subject: [LLVMdev] Broke my tail (call) In-Reply-To: <200902222320.22928.jon@ffconsultancy.com> References: <200902221417.34957.jon@ffconsultancy.com> <200902222136.52661.baldrick@free.fr> <200902222320.22928.jon@ffconsultancy.com> Message-ID: <200902230917.07656.baldrick@free.fr> Hi Jon, > From what I have understood of the LLVM docs about when tail calls get > eliminated on x86 and x64 it should be a tail call, yes. > > http://llvm.org/docs/CodeGenerator.html#tailcallopt > > . Caller and callee have the calling convention fastcc. > . The call is a tail call - in tail position (ret immediately follows call and > ret uses value of call or is void). > . Option -tailcallopt is enabled. > . No variable argument lists are used. > . On x86-64 when generating GOT/PIC code only module-local calls (visibility = > hidden or protected) are supported. > > Those are all satisfied.
this list is for the code generator, and it seems obviously incomplete: it makes no mention of local variables (alloca). Probably it is implicitly assuming that the call was marked "tail call" by the LLVM optimizers. So you also need to check under what conditions the LLVM optimizers do that. > > I didn't look closely but at a glance it seems to be passing a local stack > > variable as a call parameter. > > In this case, the arguments are a { { i8*, i8* }*, i8* } and an i32. Maybe, but I'm pretty sure at least one of these values was calculated by mucking around with allocas. Did you add the "tail call" mark yourself? If not, try removing it, and see if the LLVM optimizers add it back. > I just tested my theory that returning a first-class struct from a function > inhibits tail call elimination and it seems that I was correct: altering this > function to pass its return struct by pointer in the first argument fixes the > stack overflow. > > Is this a bug in LLVM? Could be, but first I'd like to be sure that you are not misusing tail calls. Ciao, Duncan. From marks at dcs.gla.ac.uk Mon Feb 23 04:23:59 2009 From: marks at dcs.gla.ac.uk (Mark Shannon) Date: Mon, 23 Feb 2009 10:23:59 -0000 Subject: [LLVMdev] Creating an LLVM backend for a very small stack machine References: <200902221626.01038.wjl@icecavern.net> Message-ID: <58577CAC1C0FB34DAA24FA9DC76613F1029622FB@ex1.ad.dcs.gla.ac.uk> Hi Wesley, I've done quite a lot of work on register allocation for stack machines. You might want to look at my papers: http://www.dcs.gla.ac.uk/~marks/euroforth.pdf http://www.dcs.gla.ac.uk/~marks/thesis.pdf Good luck, Mark. -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu on behalf of Wesley J. Landaker Sent: Sun 22/02/2009 23:25 To: llvmdev at cs.uiuc.edu Subject: [LLVMdev] Creating an LLVM backend for a very small stack machine Hi folks, I am interested in creating an LLVM backend for a very small stack machine.
Before I start in on the actual implementation, I'd like to make sure that I'm on the right track. Any comments, suggestions, warnings, tips, etc. would be greatly appreciated. Background ---------- There are a number of other small embedded microprocessors that are often used in FPGAs, such as Xilinx's Picoblaze, Lattice's Mico8, or Bernd Paysan's b16-small. I am developing a very small stack machine primarily for use inside FPGAs and ASICs. This serves roughly the same niche as the above processors; however, from my perspective, my design has a number of advantages. For example, mine has a much smaller minimal configuration, has higher code density, is much more parameterizable, fully supports indirect branches, is FPGA-architecture independent, etc. The main goal, however, is to not just have a shiny new architecture, but to have an entire optimized supporting toolchain. I am hoping to build on LLVM as infrastructure. I don't think I need to sell you all on the reasons why this would be handy. I am already pretty "familiar" with LLVM, in that I have successfully made my own front-end DSL for another project by basically following the Kaleidoscope tutorial. I have a good grasp of the LLVM assembly language, and have a lot of notes (just a sanity-check paper exercise) about how I would lower just about every instruction into my architecture if I were doing it by hand. I've read through all the LLVM documentation available online, and (superficially) looked at a lot of the other backend targets. I have written custom assemblers and peephole optimizers outside of LLVM in the past. I am familiar academically with stack-machine-specific "register allocation" optimizations. I'm basically ready to get started on an LLVM backend for my processor. My main concerns are that in my searching online, I've gleaned a number of things that urge me to caution. I don't know how much of this is "true", but it's my impression after spending a lot of time googling: