From evan.cheng at apple.com Tue Mar 1 00:26:07 2011 From: evan.cheng at apple.com (Evan Cheng) Date: Mon, 28 Feb 2011 22:26:07 -0800 Subject: [LLVMdev] Use of movupd instead of movapd for x86 In-Reply-To: <17F9E444F61B644FAAA6EA20EE53E4DBBA44E9D69F@SAFEX1MAIL2.st.com> References: <17F9E444F61B644FAAA6EA20EE53E4DBBA44E9D0AD@SAFEX1MAIL2.st.com> <17F9E444F61B644FAAA6EA20EE53E4DBBA44E9D69F@SAFEX1MAIL2.st.com> Message-ID: <933C840E-64F7-4E43-A342-A0A02522EB40@apple.com> On Feb 28, 2011, at 2:58 AM, Sebastien DELDON-GNB wrote: > Understood for the aligned case, I want to measure performance degradation for unaligned case. > I mean unaligned case versus aligned. I know this is stupid, but I want to try to pass a <4 x float>* as parameter of a routine and at the call site I want to pass a misaligned pointer. Since LLVM is generating movapd instruction it will raise an exception (SEGFAULT), I just want to know if there is a way to enforce > generation of movupd instruction instead of movapd. If llvm is generating movapd then it believes the pointer is aligned. Without having more information it's impossible to tell what the issue is. Evan > > Seb > >> -----Original Message----- >> From: David A. Greene [mailto:greened at obbligato.org] >> Sent: Friday, February 25, 2011 5:13 PM >> To: Sebastien DELDON-GNB >> Cc: llvmdev at cs.uiuc.edu >> Subject: Re: [LLVMdev] Use of movupd instead of movapd for x86 >> >> Sebastien DELDON-GNB writes: >> >>> Hi all, >>> >>> Is there a way to force llc to generate movupd instruction instead of >> movapd for x86 target ? >>> >>> I know that movapd is more performant, but I would like to measure >> degradation when alignment constraints are not met. >> >> On modern processors a movupd on aligned data is going to be >> indistinguishable in performance from a movapd. 
>>
>> -Dave
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From clattner at apple.com Tue Mar 1 00:51:29 2011
From: clattner at apple.com (Chris Lattner)
Date: Mon, 28 Feb 2011 22:51:29 -0800
Subject: [LLVMdev] Using clang+llvm from Xcode 3 project yields 1.5k linkage warnings
In-Reply-To:
References:
Message-ID:

On Feb 28, 2011, at 8:07 PM, Félix Cloutier wrote:
> I'm using Xcode 3 to program with LLVM and Clang (both about yesterday's latest revisions), and when I compile, I get 1501 link-time warnings. All those I read were about symbol visibility. Here's an example:
>
> ld: warning: namespace::class::method() has different visibility (default) in /usr/local/lib/libclangCodeGen.a(CodeGenAction.o) and (hidden) in /Users/myself/Projets/path/build/project.build/Debug/project.build/Objects-normal/x86_64/object.o
>
> 1501 seems large enough that it could simply be all symbols referenced from the static libraries. It works anyways though.
> What am I doing wrong? As usual with stuff causing spectacular diagnostics, I must be missing something fairly elementary.

Hi Félix,

The most likely cause of this is that you're building some llvm code with -fvisibility-inlines-hidden. Xcode likes to add this flag by default, you can change it in the target build options.

-Chris
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110228/3de2cdb7/attachment.html From christian.plessl at uni-paderborn.de Tue Mar 1 02:52:24 2011 From: christian.plessl at uni-paderborn.de (Christian Plessl) Date: Tue, 1 Mar 2011 09:52:24 +0100 Subject: [LLVMdev] LLVM teaching materials In-Reply-To: <4D6BB4BA.8090709@fim.uni-passau.de> References: <3B03B939-47AA-4E6F-8213-0F2F8DAAAA6F@uni-paderborn.de> <4D6BB4BA.8090709@fim.uni-passau.de> Message-ID: Hi Tobi On 28.02.2011, at 15:44, Tobias Grosser wrote: > On 02/28/2011 05:27 AM, Christian Plessl wrote: >> Does anybody know of good teaching materials on LLVM? For example, a basic compiler course that use LLVM as a example? I searched the web a bit but didn't find anything suitable. > > I put some slides I used on my webpage. > > http://www.grosser.es > > You can use them. However, they are just a bunch and only about specific > topics. No general compiler introduction. Furthermore, we did have a lot > of interactive discussions, so without having attended the class they > may be difficult to understand. Thanks for sharing these slides. Though not exactly what I was looking for it is a good inspiration for my own slides. > It would be great to create a svn branch to share slides for such a course. I fully agree. Such slides would be useful for teaching LLVM to other but also for people that want to learn LLVM themselves. Recently AMD has published teaching materials for OpenCL (http://developer.amd.com/zones/openclzone/universities/pages/default.aspx). I think something similar for LLVM would be very helpful. Cheers, Christian -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 4919 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110301/063d518b/attachment.bin From Matthieu.Moy at grenoble-inp.fr Tue Mar 1 03:40:47 2011 From: Matthieu.Moy at grenoble-inp.fr (Matthieu Moy) Date: Tue, 01 Mar 2011 10:40:47 +0100 Subject: [LLVMdev] Live values detection in LLVM In-Reply-To: <4D676B40.8000308@imag.fr> (Julien Henry's message of "Fri, 25 Feb 2011 09:41:36 +0100") References: <4D676B40.8000308@imag.fr> Message-ID: [ Note: I'm working with Julien, we talked about the issue off-list ] Julien Henry writes: > Hi all, > > > At some points of my program, I would like to know if some LLVM values > are live or not. For that, I'm using the LiveValues pass, which gives me > methods such as : > > isLiveThroughBlock(Value * v, BasicBlock * b) > isKilledInBlock(Value * v, BasicBlock * b) This pass has just been removed in the trunk LLVM: commit 11ae8292f73d31a3740097cc446a789ca13cdd7f Author: Dan Gohman Date: Mon Feb 28 19:37:59 2011 +0000 Delete the LiveValues pass. I won't get get back to the project it was started for in the foreseeable future. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk at 126668 91177308-0d34-0410-b5e6-96231b3b80d8 You should probably look at LiveVariables (yes, "Variables", not "Values") instead: http://llvm.org/docs/doxygen/html/LiveVariables_8h_source.html -- Matthieu Moy http://www-verimag.imag.fr/~moy/ From xinfinity_a at yahoo.com Tue Mar 1 04:10:56 2011 From: xinfinity_a at yahoo.com (Xinfinity) Date: Tue, 1 Mar 2011 02:10:56 -0800 (PST) Subject: [LLVMdev] metadata to inform the optimizers that some code should stay unchanged Message-ID: <31039471.post@talk.nabble.com> Hello LLVM, I am working on some passes that perform code transformations. Since I am interested in performance, I apply the O3 passes, right after my pass. However, the optimization passes modify the code inserted by my pass in an undesirable way. 
As far as I know, there is no way to prevent the optimizers from optimizing some regions of code. So what I intend to do is to attach metadata to the instructions contained in basic blocks that I want to remain unchanged, and to modify some of the optimization passes to be aware of the metadata. For now, I am interested only in passes that would affect the control flow graph, so for a start I modify the Simplify CFG pass and the jump threading pass, but I will check which other passes might duplicate the code, merge blocks, etc.

Do you think the performance will drop significantly if some regions of code are not optimized?
And do you consider this modification would be of any use to the community?

Thank you.
Alexandra

--
View this message in context: http://old.nabble.com/metadata-to-inform-the-optimizers-that-some-code-should-stay-unchanged-tp31039471p31039471.html
Sent from the LLVM - Dev mailing list archive at Nabble.com.

From felixcca at yahoo.ca Tue Mar 1 06:02:44 2011
From: felixcca at yahoo.ca (Félix Cloutier)
Date: Tue, 1 Mar 2011 07:02:44 -0500
Subject: [LLVMdev] Using clang+llvm from Xcode 3 project yields 1.5k linkage warnings
In-Reply-To:
References:
Message-ID: <4507E1DE-17D1-42E6-92C1-DB8D48F16385@yahoo.ca>

Yes, that was it. Thank you very much.

On 2011-03-01 at 01:51:29, Chris Lattner wrote:
>
> On Feb 28, 2011, at 8:07 PM, Félix Cloutier wrote:
>
>> I'm using Xcode 3 to program with LLVM and Clang (both about yesterday's latest revisions), and when I compile, I get 1501 link-time warnings. All those I read were about symbol visibility. Here's an example:
>>
>> ld: warning: namespace::class::method() has different visibility (default) in /usr/local/lib/libclangCodeGen.a(CodeGenAction.o) and (hidden) in /Users/myself/Projets/path/build/project.build/Debug/project.build/Objects-normal/x86_64/object.o
>>
>> 1501 seems large enough that it could simply be all symbols referenced from the static libraries. It works anyways though.
>> What am I doing wrong? As usual with stuff causing spectacular diagnostics, I must be missing something fairly elementary.
>
> Hi Félix,
>
> The most likely cause of this is that you're building some llvm code with -fvisibility-inlines-hidden. Xcode likes to add this flag by default, you can change it in the target build options.
>
> -Chris

From baldrick at free.fr Tue Mar 1 06:07:37 2011
From: baldrick at free.fr (Duncan Sands)
Date: Tue, 01 Mar 2011 13:07:37 +0100
Subject: [LLVMdev] Can CallGraphSCCPass distinguish different function pointer types in CallGraphNode?
In-Reply-To:
References:
Message-ID: <4D6CE189.9000503@free.fr>

Hi Heming Cui,

> For example,
>
> int aa(int i);
>
> int main() {
> int (*foo)(int);
> foo = &aa;
> foo(1);
> }
>
> If I want to use the CallGraphSCCPass to get to know that the "foo"
> pointer, which is an external node in the CallGraphSCCPass, actually has the
> same function type the same as aa(int), how could I get this type information
> from CallGraphSCCPass?

the CallGraphNode for "main" keeps a list of (call_instruction, called_function) pairs. In it you should find the pair ("foo(1)", aa). You can extract the type of the callee from the call (or invoke) instruction.

Ciao, Duncan.
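
For anyone who wants to reproduce the situation discussed above, here is a minimal sketch of the quoted test case filled out so that it compiles on its own. Only the declaration of aa and the foo pointer come from the original question; the body of aa and the helper name call_indirect are made up for illustration. Compiling it with something like "clang -O0 -emit-llvm -S" should show the call through foo, and the function type int (int) is visible from that call site; whether the call graph records aa as the callee may depend on which passes have already run.

```c
/* Hypothetical body for illustration; the original post only declares aa. */
int aa(int i) {
    return i + 1;
}

/* Hypothetical helper mirroring main() from the question:
   it calls aa through a function pointer of type int (*)(int). */
int call_indirect(int arg) {
    int (*foo)(int);   /* pointer to a function taking int, returning int */
    foo = &aa;         /* same assignment as in the quoted test case */
    return foo(arg);   /* indirect call through the pointer */
}
```

Calling call_indirect(1) from a main() gives the same call pattern as foo(1) in the question.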
From baldrick at free.fr Tue Mar 1 06:15:13 2011 From: baldrick at free.fr (Duncan Sands) Date: Tue, 01 Mar 2011 13:15:13 +0100 Subject: [LLVMdev] Missing some passes in llvm-ld In-Reply-To: <4FEB04A6-E925-4B14-BE82-9FD2F81CAAC2@gmail.com> References: <89e0944507495fc8eb3f14dae3633306.squirrel@webmail.cs.wisc.edu> <4D34D996.1030806@mxc.ca> <7221226ddb884b77fdf4c4525421299c.squirrel@webmail.cs.wisc.edu> <4D351751.1030607@illinois.edu> <8285a75e191c231030f87197aee5ac54.squirrel@webmail.cs.wisc.edu> <4FEB04A6-E925-4B14-BE82-9FD2F81CAAC2@gmail.com> Message-ID: <4D6CE351.1080309@free.fr> Hi Haohui, > It seems that I can't force some passes to run in llvm-ld as what I can do with opt. > > $ ~/opt/bin/llvm-ld -reassociate > llvm-ld: Unknown command line argument '-reassociate. Try: 'opt/bin/llvm-ld -help' > > llvm-ld definitely linked with scalaropts, and RegisterPass is in the library. > > Running with these passes with opt definitely work, but it'll take some time as my .bc is big (~40M). > > I'll appreciate any ideas how to make llvm-ld run these passes. you can always have llvm-ld output a bitcode file which you then optimize using opt. Alternatively you can modify createStandardLTOPasses in StandardPasses.h, since these are the passes llvm-ld uses. Ciao, Duncan. From dpatel at apple.com Tue Mar 1 12:44:10 2011 From: dpatel at apple.com (Devang Patel) Date: Tue, 1 Mar 2011 10:44:10 -0800 Subject: [LLVMdev] metadata to inform the optimizers that some code should stay unchanged In-Reply-To: <31039471.post@talk.nabble.com> References: <31039471.post@talk.nabble.com> Message-ID: <008BF062-0CD4-469B-9F72-DEB755906343@apple.com> On Mar 1, 2011, at 2:10 AM, Xinfinity wrote: > > > Hello LLVM, > > I am working on some passes that perform code transformations. Since I am > interested in performance, I apply the O3 passes, right after my pass. > However, the optimization passes modify the code inserted by my pass in an > undesirable way. 
As far as I know, there is no way to prevent the optimizers
> from optimizing some regions of code. So what I intend to do is to attach
> metadata to the instructions contained in basicblocks that I want to remain
> unchanged, and to modify some of the optimization passes to be aware of the
> metadata.

IMO, this is not the right way to go.

> For now, I am interested only in passes that would affect the
> control flow graph, so for a start I modify the Simplify CFG pass and the
> jump threading pass, but I will check which other passes might duplicate the
> code, merge blocks etc.
>
> Do you think the performance will drop significantly if some regions of code
> are not optimized ?
> And do you consider this modification would be of any
> use to the community?
>
> Thank you.
> Alexandra
>
> --
> View this message in context: http://old.nabble.com/metadata-to-inform-the-optimizers-that-some-code-should-stay-unchanged-tp31039471p31039471.html
> Sent from the LLVM - Dev mailing list archive at Nabble.com.

From bijoy123_8 at yahoo.com Tue Mar 1 14:04:22 2011
From: bijoy123_8 at yahoo.com (akramul azim)
Date: Tue, 1 Mar 2011 12:04:22 -0800 (PST)
Subject: [LLVMdev] warnings for LLVM in MinGW
Message-ID: <426082.47494.qm@web121706.mail.ne1.yahoo.com>

Hi,

I can successfully build LLVM 2.8 with Clang 2.8 and LLVM 2.8 with LLVM-GCC 2.8 on Windows using MinGW. However, when I go to run a simple Hello World program, I get warnings. Using LLVM-GCC, the warning I get is the following:

C:\MinGW\bin/ld.exe: Warning: type of symbol `_main' changed from 32 to 512 in D:/DOCUME~2/AZIM/LOCALS~1/Temp/ccnsGHVf.o

Using Clang, the warning I get is the following:

C:/MinGW/bin/../lib/gcc/mingw32/4.5.0/../../../../mingw32/bin/ld.exe: Warning: type of symbol `_main' changed from 32 to 512 in C:/DOCUME~2/AZIM/LOCALS~1/Temp/cc-000001.o

In both cases, the executable is produced and I can run the executable. I am wondering is there any way to fix the warnings.

Thanks,
Akramul

From anton at korobeynikov.info Tue Mar 1 14:49:05 2011
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Tue, 1 Mar 2011 23:49:05 +0300
Subject: [LLVMdev] warnings for LLVM in MinGW
In-Reply-To: <426082.47494.qm@web121706.mail.ne1.yahoo.com>
References: <426082.47494.qm@web121706.mail.ne1.yahoo.com>
Message-ID:

> In both cases, the executable is produced and I can run the executable. I am wondering is there any way to fix the warnings.

This was fixed in ToT.

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From shaosu.liu at gmail.com Tue Mar 1 14:47:56 2011
From: shaosu.liu at gmail.com (Shaosu Liu)
Date: Tue, 1 Mar 2011 14:47:56 -0600
Subject: [LLVMdev] cannot build safecode.
Message-ID:

Hello,
I am trying to build llvm-poolalloc and safecode under current trunk llvm. After building llvm, I cannot build poolalloc. I got following error message:

make[1]: Entering directory `/host/llvm/projects/llvm-poolalloc/lib'
make[2]: Entering directory `/host/llvm/projects/llvm-poolalloc/lib/DSA'
llvm[2]: Compiling AddressTakenAnalysis.cpp for Debug+Asserts build (PIC)
AddressTakenAnalysis.cpp: In constructor ‘llvm::AddressTakenAnalysis::AddressTakenAnalysis()’:
AddressTakenAnalysis.cpp:34: error: no matching function for call to ‘llvm::ModulePass::ModulePass(char*)’
/host/llvm/include/llvm/Pass.h:235: note: candidates are: llvm::ModulePass::ModulePass(char&) /host/llvm/include/llvm/Pass.h:220: note: llvm::ModulePass::ModulePass(const llvm::ModulePass&) /bin/rm: cannot remove `/host/llvm/projects/llvm-poolalloc/lib/DSA/Debug+Asserts/AddressTakenAnalysis.d.tmp': No such file or directory make[2]: *** [/host/llvm/projects/llvm-poolalloc/lib/DSA/Debug+Asserts/AddressTakenAnalysis.o] Error 1 make[2]: Leaving directory `/host/llvm/projects/llvm-poolalloc/lib/DSA' make[1]: *** [all] Error 1 make[1]: Leaving directory `/host/llvm/projects/llvm-poolalloc/lib' make: *** [all] Error 1 It seems to be that a function is passed a pointer but it is supposed to be passed with a reference. I tried to fix this one. But lots of similar error popped up. I am not sure if I did something wrong with the configuring llvm. Any help is appreciated. Shaosu Liu -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110301/9075c22e/attachment.html From willdtz at gmail.com Tue Mar 1 14:57:10 2011 From: willdtz at gmail.com (Will Dietz) Date: Tue, 1 Mar 2011 14:57:10 -0600 Subject: [LLVMdev] cannot build safecode. In-Reply-To: References: Message-ID: On Tue, Mar 1, 2011 at 2:47 PM, Shaosu Liu wrote: > Hello, > I am trying to build llvm-poolalloc and safecode under current trunk llvm. > After building llvm, I cannot build poolalloc. I got following error > message: The poolalloc and safecode projects currently build against llvm 2.7, and haven't been updated to 2.8 yet (and especially not chasing ToT). Try building with llvm 2.7, sorry :). ~Will From criswell at illinois.edu Tue Mar 1 14:58:05 2011 From: criswell at illinois.edu (John Criswell) Date: Tue, 1 Mar 2011 14:58:05 -0600 Subject: [LLVMdev] cannot build safecode. 
In-Reply-To:
References:
Message-ID: <4D6D5DDD.9080100@illinois.edu>

On 3/1/11 2:47 PM, Shaosu Liu wrote:
> Hello,
> I am trying to build llvm-poolalloc and safecode under current trunk llvm.
> After building llvm, I cannot build poolalloc. I got following error
> message:

SAFECode and Poolalloc do not compile with LLVM trunk. They must be compiled with LLVM 2.6 or LLVM 2.7 (depending on whether you're using the release_26 branch or mainline of these projects, respectively).

-- John T.

>
> make[1]: Entering directory `/host/llvm/projects/llvm-poolalloc/lib'
> make[2]: Entering directory `/host/llvm/projects/llvm-poolalloc/lib/DSA'
> llvm[2]: Compiling AddressTakenAnalysis.cpp for Debug+Asserts build (PIC)
> AddressTakenAnalysis.cpp: In constructor
> ‘llvm::AddressTakenAnalysis::AddressTakenAnalysis()’:
> AddressTakenAnalysis.cpp:34: error: no matching function for call to
> ‘llvm::ModulePass::ModulePass(char*)’
> /host/llvm/include/llvm/Pass.h:235: note: candidates are:
> llvm::ModulePass::ModulePass(char&)
> /host/llvm/include/llvm/Pass.h:220: note:
> llvm::ModulePass::ModulePass(const llvm::ModulePass&)
> /bin/rm: cannot remove
> `/host/llvm/projects/llvm-poolalloc/lib/DSA/Debug+Asserts/AddressTakenAnalysis.d.tmp':
> No such file or directory
> make[2]: ***
> [/host/llvm/projects/llvm-poolalloc/lib/DSA/Debug+Asserts/AddressTakenAnalysis.o]
> Error 1
> make[2]: Leaving directory `/host/llvm/projects/llvm-poolalloc/lib/DSA'
> make[1]: *** [all] Error 1
> make[1]: Leaving directory `/host/llvm/projects/llvm-poolalloc/lib'
> make: *** [all] Error 1
>
> It seems to be that a function is passed a pointer but it is supposed
> to be passed with a reference. I tried to fix this one. But lots of
> similar error popped up.
>
> I am not sure if I did something wrong with the configuring llvm.
> Any help is appreciated.
> Shaosu Liu

From dneto.llvm at gmail.com Tue Mar 1 15:06:07 2011
From: dneto.llvm at gmail.com (David Neto)
Date: Tue, 1 Mar 2011 16:06:07 -0500
Subject: [LLVMdev] Language-specific vs target-specific address spaces (was Re: [PATCH] OpenCL support - update on keywords)
In-Reply-To: <20110228214100.GB4192@pcc.me.uk>
References: <20101216223130.GA1885@pcc.me.uk> <20110104214222.GA22362@pcc.me.uk> <20110223192611.GA30802@pcc.me.uk> <-1506191760974906508@unknownmsgid> <20110228214100.GB4192@pcc.me.uk>
Message-ID:

On Mon, Feb 28, 2011 at 4:41 PM, Peter Collingbourne wrote:
>
> The more I think about it, the more I become uncomfortable with the
> concept of language-specific address spaces in LLVM. These are the
> main issues I see with language-specific address spaces:

...

> Instead of language-specific address spaces, each target should
> concentrate on exposing all of its address spaces as target-specific
> address spaces, and frontends should use a language -> target mapping
> in target-specific code. We can continue to expose the target's main
> shared writable address space as address space 0 as we do now.
>
> For example, Clang could define a set of internal address space
> constants for OpenCL and use TargetCodeGenInfo to provide the mapping
> to target address spaces.

In principle this is a fine idea.

I think the difficulty is that LLVM and Clang provide an infrastructure for numbered address spaces, but no standard assignment on top of that infrastructure. The trick is to define some conventions, e.g. what the numbers might mean for a language front-end, and whether the interpretation of the numbers changes as the IR moves to later stages. We're working in a bit of a vacuum.

For example, you're proposing a remapping step somewhere along the line: that could be entirely inside a back-end code generator. Or it could conceivably be an LLVM pass itself, which then could be used with multiple backends that understand the new convention.

So I think we need a couple of things: - proposals for number assignments and their associated semantics. - code to flesh out and embody those semantics. e.g. a sample implementation / translation layer Basically Anton got the ball rolling: his code patch was a bit of both. And I think he's planning to post a number of OpenCL proposals in general. As it is, I hope that backends that do not understand address spaces at all know to error out when they receive IR that uses address spaces. david From damien.llvm at gmail.com Tue Mar 1 15:24:54 2011 From: damien.llvm at gmail.com (Damien Vincent) Date: Tue, 1 Mar 2011 13:24:54 -0800 Subject: [LLVMdev] Sub registers in inline assembly Message-ID: I was wondering if llvm supports sub registers in an inline asm string. For example, in gcc, using modifiers %w0 makes it possible to access ax if %0 refers to eax. If there is any support, do you know where it is implemented ? I'd like to add such a support for another target. Thank you ! Damien -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110301/e74873b4/attachment.html From brooks at freebsd.org Tue Mar 1 08:39:04 2011 From: brooks at freebsd.org (Brooks Davis) Date: Tue, 1 Mar 2011 08:39:04 -0600 Subject: [LLVMdev] RFC: LLVM Release Documentation Changes In-Reply-To: References: <834D88FE-FC76-43DB-9043-7E8B087CF5E9@apple.com> Message-ID: <20110301143904.GB57437@lor.one-eyed-alien.net> On Sat, Feb 26, 2011 at 10:50:23PM +0000, Renato Golin wrote: > On 25 February 2011 22:46, Bill Wendling wrote: > > Duncan brought up the question of how to release clang. It would be nice to package clang by itself without all of the LLVM tools. Basically, making it a standalone package. I have no problem with this (in fact, I think it's a great idea), but it does mean changing Makefiles and stuff. Any help with this would be greatly appreciated. 
>
> Hi Bill,
>
> We were discussing exactly that this week. With Clang inside LLVM's
> tree, it's hard for us to create separate (internal) products for
> them, so we can build and test them separately (EDG+LLVM against
> Clang+LLVM). But I'm not sure we're the most common types of users...
>
> I believe the Debian package for Clang is separate from LLVM, which
> makes sense, but that might be hard to produce, given that they're too
> tightly coupled... Debian maintainers would know more... ;)

We're doing separate builds in the FreeBSD ports collection. You have to have the LLVM source tree around and we configure against the whole tree, but we can build against an installed version of llvm with only a few hacks. The basic process is:

- extract llvm and clang sources.
cd
./configure
ln -s ${PREFIX}/include/llvm/Intrinsics.gen /include/llvm
cd utils/unittest/googletest
gmake
cd /tools/clang
gmake
gmake install

The need to build googletest is relatively recent. Oddly, if you just install libGoogleTest.a as part of the LLVM package build you get build errors.

-- Brooks
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 188 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110301/7fac90a2/attachment.bin

From clattner at apple.com Tue Mar 1 16:59:28 2011
From: clattner at apple.com (Chris Lattner)
Date: Tue, 1 Mar 2011 14:59:28 -0800
Subject: [LLVMdev] Sub registers in inline assembly
In-Reply-To:
References:
Message-ID: <01EC01F1-CC1C-4E38-991A-B032F31397C0@apple.com>

On Mar 1, 2011, at 1:24 PM, Damien Vincent wrote:
>
> I was wondering if llvm supports sub registers in an inline asm string.
> For example, in gcc, using modifiers %w0 makes it possible to access ax if %0 refers to eax.
>
> If there is any support, do you know where it is implemented ? I'd like to add such a support for another target.

Hi Damien,

It sure does.
These are considered "modifier" characters and are handled in (e.g.) X86AsmPrinter::PrintAsmOperand. You can send some C code with asms in it through x86 clang to see what IR it generates or use the llvm demo page. -Chris From miyamoto31b at gmail.com Tue Mar 1 16:47:07 2011 From: miyamoto31b at gmail.com (moesasji) Date: Tue, 1 Mar 2011 22:47:07 +0000 (UTC) Subject: [LLVMdev] Vim auto completion References: <4CC0AC96.2080408@fim.uni-passau.de> Message-ID: Dan Gohman apple.com> writes: > > The clang patches are now applied, so vim auto completion now works mostly > out-of-the-box -- just build clang, set up your PATH for clang and clang++, > and copy utils/vim/vimrc to ~/.vimrc (or symlink, or do your own thing). > > To configure the clang command-line, look for the g: configuration > variables in the vimrc file. > I've been trying to get VIM to compile c++ code using clang++ based on the above comments as I prefer its debug output over gcc. Unfortunately so far without much success and Google does not really help me in finding a solution. So I hope that someone can point me in the correct direction as I would like to see how the vim autocompletion works. I'm currently using the 2.9 branch of clang + llvm on FreeBSD stable and can compile a hello world program from the commandline directly so both clang and clang++ are in the path. However when running vim with the above vimrc file I however fail to compile the same bit of code from within VIM (using :make ). In this I set make in VIM to cLang by typing :set makeprg=clang in VIM. Yet the output I get makes no sense to me although it clearly is using clang to do the compile. Hopefully somebody can point out what I do wrong here? fyi) below is the compiler or linker output I get when running :make hello although it probably is not relevant. 
The same code compiles without problems when compiling from the CMD-line --- hello:(.data+0x8): multiple definition of `__dso_handle' /usr/lib/crtbegin.o:(.data+0x0): first defined here hello: In function `_init': (.init+0x0): multiple definition of `_init' /usr/lib/crti.o:/usr/src/lib/csu/amd64/crti.S:(.init+0x0): first defined here hello:(.data+0x0): multiple definition of `__progname' /usr/lib/crt1.o:(.data+0x0): first defined here hello: In function `_start': (.text+0x0): multiple definition of `_start' /usr/lib/crt1.o:crt1.c:(.text+0x0): first defined here hello: In function `_fini': (.fini+0x0): multiple definition of `_fini' /usr/lib/crti.o:/usr/src/lib/csu/amd64/crti.S:(.fini+0x0): first defined here /usr/local/bin/ld: error in hello(.eh_frame); no .eh_frame_hdr table will be created. clang: error: linker command failed with exit code 1 (use -v to see invocation) --- From damien.llvm at gmail.com Tue Mar 1 17:55:35 2011 From: damien.llvm at gmail.com (Damien Vincent) Date: Tue, 1 Mar 2011 15:55:35 -0800 Subject: [LLVMdev] Sub registers in inline assembly In-Reply-To: <01EC01F1-CC1C-4E38-991A-B032F31397C0@apple.com> References: <01EC01F1-CC1C-4E38-991A-B032F31397C0@apple.com> Message-ID: Thank you Chris ! I didn't realize this was just an asm printing issue... and I just checked X86ISelLowering.cpp where you have all the asm constraints... On Tue, Mar 1, 2011 at 2:59 PM, Chris Lattner wrote: > > On Mar 1, 2011, at 1:24 PM, Damien Vincent wrote: > > > > > I was wondering if llvm supports sub registers in an inline asm string. > > For example, in gcc, using modifiers %w0 makes it possible to access ax > if %0 refers to eax. > > > > If there is any support, do you know where it is implemented ? I'd like > to add such a support for another target. > > Hi Damien, > > It sure does. These are considered "modifier" characters and are handled > in (e.g.) X86AsmPrinter::PrintAsmOperand. 
> > You can send some C code with asms in it through x86 clang to see what IR > it generates or use the llvm demo page. > > -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110301/4d2df32a/attachment.html From greened at obbligato.org Tue Mar 1 21:12:21 2011 From: greened at obbligato.org (David A. Greene) Date: Tue, 01 Mar 2011 21:12:21 -0600 Subject: [LLVMdev] [cfe-dev] Reminder: LLVM 2.9 Branching in One Week In-Reply-To: (NAKAMURA Takumi's message of "Tue, 1 Mar 2011 13:03:16 +0900") References: <2E128B2A-16F5-40D4-B4EC-ADD60A49E015@apple.com> <17885E39-D839-4185-BD04-9C585B1D3CCD@mac.com> Message-ID: <87mxlei5be.fsf@smith.obbligato.org> NAKAMURA Takumi writes: > Matthieu, > > On Tue, Mar 1, 2011 at 8:02 AM, Matthieu Moy > wrote: >> At some point in the past, an anti-git-svn system had been set up on >> llvm.org. Has this been disabled since? I don't manage to do much with >> git-svn: > > Maybe sure. Anton said it is disabled to access upper directories with svn. > Thus, we (accessing llvm.org remotely) cannot do git-svn with branches > for homebrew. This is exactly right. > Andreas and Anton, could you please launch git release branches for llvm.org? I think the trouble with branches is the lockdown of the root repository directory. I tried something like this but it breaks due to the restrictions: git svn init --stdlayout https://@llvm.org/svn/llvm-project/llvm \ --ignore-paths="^.*(Apple|PowerPC.*|SVA|eh-experimental|ggreif|non-call-eh|parallel|release_.*|vector_llvm|wendling|May2007|checker|cremebrulee|start|RELEASE_1.*|RELEASE_2[0-7])" Obviously, replace with whatever it needs to be to allow dcommit to work. Ideally we'd have clang and llvm-gcc git mirrors as well via the --prefix argument to git-svn init, but let's not get ahead of ourselves. 
:) It appears that there's not much those of us who would like git branches for svn release tags can do without help from the server side. -Dave From speziale.ettore at gmail.com Wed Mar 2 01:12:23 2011 From: speziale.ettore at gmail.com (Speziale Ettore) Date: Wed, 02 Mar 2011 08:12:23 +0100 Subject: [LLVMdev] [cfe-dev] Language-specific vs target-specific address spaces (was Re: [PATCH] OpenCL support - update on keywords) In-Reply-To: References: <20101216223130.GA1885@pcc.me.uk> <20110104214222.GA22362@pcc.me.uk> <20110223192611.GA30802@pcc.me.uk> <-1506191760974906508@unknownmsgid> <20110228214100.GB4192@pcc.me.uk> Message-ID: <1299049943.3583.20.camel@mars> Hi, > On Mon, Feb 28, 2011 at 4:41 PM, Peter Collingbourne wrote: > > > > The more I think about it, the more I become uncomfortable with the > > concept of language-specific address spaces in LLVM. These are the > > main issues I see with language-specific address spaces: > > ... > > > Instead of language-specific address spaces, each target should > > concentrate on exposing all of its address spaces as target-specific > > address spaces, and frontends should use a language -> target mapping > > in target-specific code. We can continue to expose the target's main > > shared writable address space as address space 0 as we do now. > > > > For example, Clang could define a set of internal address space > > constants for OpenCL and use TargetCodeGenInfo to provide the mapping > > to target address spaces. > > In principle this is a fine idea. > > I think the difficulty is that LLVM and Clang provide an > infrastructure for numbered address spaces, but no standard assignment > on top of that infrastructure. The trick is define some conventions, > e.g. what the numbers might mean for a language front-end, and whether > the interpretation of the numbers change as the IR moves to later > stages. We're working in a bit of a vacuum. 
> > For example, you're proposing a remapping step somewhere along the > line: that could be entirely inside a back-end code generator. Or it > could conceivably be an LLVM pass itself, which then could be used > with multiple backends that understand the new convention. > > So I think we need a couple of things: > - proposals for number assignments and their associated semantics. > - code to flesh out and embody those semantics. e.g. a sample > implementation / translation layer > > Basically Anton got the ball rolling: his code patch was a bit of > both. And I think he's planning to post a number of OpenCL proposals > in general. > > As it is, I hope that backends that do not understand address spaces > at all know to error out when they receive IR that uses address > spaces.

The OpenCL standard talks about address spaces, but I think they can be interpreted as scopes (except __constant):

* __global: globally accessible variables
* __private: visible only to a work item
* __local: accessible by all work items in a work group

The address space is the way scoping rules are implemented in hardware, e.g. __local variables are mapped in the address space X which is a fast memory shared by all ALUs inside a GPU multiprocessor. Maybe by introducing such "scopes", it is possible to decouple backends from frontends.
__constant is a corner case: it can be modelled as a global scope that contains read-only data. Have a nice day, speziale.ettore at gmail.com From kd at kendyck.com Wed Mar 2 08:38:55 2011 From: kd at kendyck.com (Ken Dyck) Date: Wed, 2 Mar 2011 09:38:55 -0500 Subject: [LLVMdev] Language-specific vs target-specific address spaces (was Re: [PATCH] OpenCL support - update on keywords) In-Reply-To: References: <20101216223130.GA1885@pcc.me.uk> <20110104214222.GA22362@pcc.me.uk> <20110223192611.GA30802@pcc.me.uk> <-1506191760974906508@unknownmsgid> <20110228214100.GB4192@pcc.me.uk> Message-ID: On Tue, Mar 1, 2011 at 4:06 PM, David Neto wrote: > On Mon, Feb 28, 2011 at 4:41 PM, Peter Collingbourne wrote: >> >> The more I think about it, the more I become uncomfortable with the >> concept of language-specific address spaces in LLVM. These are the >> main issues I see with language-specific address spaces: > > ... > >> Instead of language-specific address spaces, each target should >> concentrate on exposing all of its address spaces as target-specific >> address spaces, and frontends should use a language -> target mapping >> in target-specific code. We can continue to expose the target's main >> shared writable address space as address space 0 as we do now. >> >> For example, Clang could define a set of internal address space >> constants for OpenCL and use TargetCodeGenInfo to provide the mapping >> to target address spaces. > > In principle this is a fine idea. > > I think the difficulty is that LLVM and Clang provide an > infrastructure for numbered address spaces, but no standard assignment > on top of that infrastructure. You can trace back the origins of the addrspace attribute in the mailing list archives to this thread: http://lists.cs.uiuc.edu/pipermail/llvmdev/2007-November/011385.html.
From there, it is pretty clear that addrspace was introduced specifically as a mechanism for implementing the 'named address space' extensions defined in the Embedded C standard (ISO/IEC TR 18037, http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf). The Embedded C standard gives this overview of the 'named address space' extension:

    Many embedded processors have multiple distinct banks of memory and require that data be grouped in different banks to achieve maximum performance. Ensuring the simultaneous flow of data and coefficient data to the multiplier/accumulator of processors designed for FIR filtering, for example, is critical to their operation. In order to allow the programmer to declare the memory space from which a specific data object must be fetched, this Technical Report specifies basic support for multiple address spaces. As a result, optimizing compilers can utilize the ability of processors that support multiple address spaces, for instance, to read data from two separate memories in a single cycle to maximize execution speed.

If you dig into the Embedded C standard, you'll find that the 'named address space' extension is highly target-specific. It is only portable insofar as two target processors have similar memory organization and use identical names for their address spaces. So the reason that there aren't any conventions for the address space numbers in clang/llvm is because there aren't any conventions for how chip designers incorporate memories into the architectures that they design. The one convention that the Embedded C standard does specify is that when the address space of a type is unspecified, the type is assumed to be in the 'generic' space. Clang currently emits an address space of zero in this case. Arguably, LLVM could define a single enum value, GENERIC, for use by the code generators. > The trick is define some conventions, > e.g.
what the numbers might mean for a language front-end, and whether > the interpretation of the numbers change as the IR moves to later > stages. We're working in a bit of a vacuum. > > ... > > So I think we need a couple of things: > - proposals for number assignments and their associated semantics. > - code to flesh out and embody those semantics. e.g. a sample > implementation / translation layer In my opinion, any knowledge that front ends have of address spaces should be dictated by the target's back end. Perhaps we should add some virtual methods to LLVM's TargetMachine interface so front ends can query the back end for the names and numbers of the address spaces that they recognize, and expose them to end users in a standard way. But having front ends impose the requirement on back ends that they recognize some arbitrary set of language-specific address spaces seems like a great misuse of the feature to me for reasons that Peter has already pointed out. > Basically Anton got the ball rolling: his code patch was a bit of > both. And I think he's planning to post a number of OpenCL proposals > in general. It seems to me, as Speziale already pointed out, that the OpenCL type qualifiers aren't address space qualifiers at all (in the Embedded C sense). They might be better implemented as a separate set of qualifiers in the way that Objective-C defines its garbage-collection qualifiers, __strong and __weak. See the Qualifiers class in AST/Type.h. > As it is, I hope that backends that do not understand address spaces > at all know to error out when they receive IR that uses address > spaces. This is currently not the case. The back ends for architectures that don't have multiple address spaces simply ignore the address space number on the address operands of load and store nodes. The back ends that do support multiple address spaces treat any address space number that they don't recognize in the same way that they treat address space 0.
-Ken From vincent_de_bruyne at hotmail.com Wed Mar 2 09:44:56 2011 From: vincent_de_bruyne at hotmail.com (Vincent De Bruyne) Date: Wed, 2 Mar 2011 15:44:56 +0000 Subject: [LLVMdev] Compile C files to one .bc file Message-ID: Hi I'm trying to compile the "bh" C program from the Olden benchmark to one bc file. # compile source files into an LLVM bitcode file llvm-gcc -emit-llvm -c args.c -o args.bc -w -DTORONTO llvm-gcc -emit-llvm -c newbh.c -o newbh.bc -w -DTORONTO llvm-gcc -emit-llvm -c util.c -o util.bc -w -DTORONTO llvm-gcc -emit-llvm -c walksub.c -o walksub.bc -w -DTORONTO # To link files together using llvm-ld llvm-ld -o bh.bc newbh.bc args.bc util.bc walksub.bc -lm But when I try to run my pass over bh.bc file or just compile it to native code llc bh.bc -o bh.s I get the following error. llc: bh.bc:1:1: error: expected top-level entity Do you need to do some special stuff when you want to compile a C program with different files to one bc file. Thx, Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110302/b65c52a3/attachment.html From rafael.espindola at gmail.com Wed Mar 2 09:57:16 2011 From: rafael.espindola at gmail.com (Rafael Avila de Espindola) Date: Wed, 02 Mar 2011 10:57:16 -0500 Subject: [LLVMdev] Compile C files to one .bc file In-Reply-To: References: Message-ID: <4D6E68DC.9000203@gmail.com> On 11-03-02 10:44 AM, Vincent De Bruyne wrote: > Hi > > I'm trying to compile the "bh" C program from the Olden benchmark to one > bc file. > > # compile source files into an LLVM bitcode file > llvm-gcc -emit-llvm -c args.c -o args.bc -w -DTORONTO > llvm-gcc -emit-llvm -c newbh.c -o newbh.bc -w -DTORONTO > llvm-gcc -emit-llvm -c util.c -o util.bc -w -DTORONTO > llvm-gcc -emit-llvm -c walksub.c -o walksub.bc -w -DTORONTO > > # To link files together using llvm-ld > llvm-ld -o bh.bc newbh.bc args.bc util.bc walksub.bc -lm Check what is in bh.bc. 
If I remember correctly llvm-ld used to create a shell script that would run lli on the actual IL file. > But when I try to run my pass over bh.bc file or just compile it to > native code > llc bh.bc -o bh.s > > I get the following error. > llc: bh.bc:1:1:*error: expected top-level entity* > > Do you need to do some special stuff when you want to compile a C > program with different files to one bc file. For large programs you will probably need a system linker that supports LLVM (the Apple one) or plugins (gold and very recent versions of GNU ld). For small programs using llvm-link or llvm-ld should be ok. > Thx, > Vincent > Cheers, Rafael From samuraileumas at yahoo.com Wed Mar 2 10:10:16 2011 From: samuraileumas at yahoo.com (Samuel Crow) Date: Wed, 2 Mar 2011 08:10:16 -0800 (PST) Subject: [LLVMdev] Fw: Compile C files to one .bc file In-Reply-To: References: Message-ID: <697340.29170.qm@web62007.mail.re1.yahoo.com> Sorry, forgot to CC the list. ----- Forwarded Message ---- > From: Samuel Crow > To: Vincent De Bruyne > Sent: Wed, March 2, 2011 10:08:57 AM > Subject: Re: [LLVMdev] Compile C files to one .bc file > > Hi Vincent, > > You probably need the C runtime library. > > --Sam > > > > >From: Vincent De Bruyne > >To: llvmdev > >Sent: Wed, March 2, 2011 9:44:56 AM > >Subject: [LLVMdev] Compile C files to one .bc file > > > --snip-- > > > > > >Do you need to do some special stuff when you want to compile a C program >with > > >different files to one bc file. > > > > > > From justin.holewinski at gmail.com Wed Mar 2 10:10:33 2011 From: justin.holewinski at gmail.com (Justin Holewinski) Date: Wed, 2 Mar 2011 11:10:33 -0500 Subject: [LLVMdev] Compile C files to one .bc file In-Reply-To: References: Message-ID: On Wed, Mar 2, 2011 at 10:44 AM, Vincent De Bruyne < vincent_de_bruyne at hotmail.com> wrote: > Hi > > I'm trying to compile the "bh" C program from the Olden benchmark to one bc > file.
> > # compile source files into an LLVM bitcode file > llvm-gcc -emit-llvm -c args.c -o args.bc -w -DTORONTO > llvm-gcc -emit-llvm -c newbh.c -o newbh.bc -w -DTORONTO > llvm-gcc -emit-llvm -c util.c -o util.bc -w -DTORONTO > llvm-gcc -emit-llvm -c walksub.c -o walksub.bc -w -DTORONTO > > # To link files together using llvm-ld > llvm-ld -o bh.bc newbh.bc args.bc util.bc walksub.bc -lm > llvm-link would be better suited for this purpose. It takes in several byte-code files and combines them into a single byte-code file. Also, you do not need the -lm flag until you generate the final executable. > > But when I try to run my pass over bh.bc file or just compile it to native > code > llc bh.bc -o bh.s > > I get the following error. > llc: bh.bc:1:1:* error: expected top-level entity* > > Do you need to do some special stuff when you want to compile a C program > with different files to one bc file. > > Thx, > Vincent > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > -- Thanks, Justin Holewinski -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110302/4ede2064/attachment.html From drizzle76 at gmail.com Wed Mar 2 10:23:55 2011 From: drizzle76 at gmail.com (drizzle drizzle) Date: Wed, 2 Mar 2011 08:23:55 -0800 Subject: [LLVMdev] live variable analysis Message-ID: Hi As I understand live variable analysis will set the def/kill properties of operands. In that case, is it still needed to set the kill flags when possible during lowering? 
thanks dz From stoklund at 2pi.dk Wed Mar 2 11:20:56 2011 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Wed, 2 Mar 2011 09:20:56 -0800 Subject: [LLVMdev] live variable analysis In-Reply-To: References: Message-ID: <6158B189-92DD-41A6-A872-DD332E197D54@2pi.dk> On Mar 2, 2011, at 8:23 AM, drizzle drizzle wrote: > Hi > As I understand live variable analysis will set the def/kill > properties of operands. In that case, is it still needed to set the > kill flags when possible during lowering? Are any passes before register allocation using the kill flags? LiveVariables is not run in the -O0 pipeline. The fast register allocator should work fine without kill flags. /jakob From dneto.llvm at gmail.com Wed Mar 2 17:14:09 2011 From: dneto.llvm at gmail.com (David Neto) Date: Wed, 2 Mar 2011 18:14:09 -0500 Subject: [LLVMdev] Language-specific vs target-specific address spaces (was Re: [PATCH] OpenCL support - update on keywords) In-Reply-To: References: <20101216223130.GA1885@pcc.me.uk> <20110104214222.GA22362@pcc.me.uk> <20110223192611.GA30802@pcc.me.uk> <-1506191760974906508@unknownmsgid> <20110228214100.GB4192@pcc.me.uk> Message-ID: On Wed, Mar 2, 2011 at 9:38 AM, Ken Dyck wrote: > > You can trace back the origins of the addrspace attribute in the > mailing list archives to this thread: > http://lists.cs.uiuc.edu/pipermail/llvmdev/2007-November/011385.html. > From there, it is pretty clear that addrspace was introduced > specifically as a mechanism for implementing the 'named address space' > extensions defined in the Embedded C standard (ISO/IEC TR 18037, > http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf). > ... > > If you dig into the Embedded C standard, you'll find that the 'named > address space' extension is highly target-specific. It is only > portable insofar as two target processors have similar memory > organization and use identical names for their address spaces. Yes, I have read that section of the Embedded C standard. 
I agree that named address spaces are target specific. I read OpenCL's use of address spaces as a subsetting of the Embedded C concept of address spaces, i.e. a specific set of them, with specific names, and language-level restrictions. I disagree with Speziale. OpenCL address spaces are not "scopes" or scope-like things. OpenCL address spaces describe disjoint storage locations with incompatible pointer types, and with independent address numbering (i.e. if you happened to cast a pointer to an integer). For example, Clang makes sure that you can't assign a pointer to one address space into a pointer to another address space. "illegal implicit conversion between two pointers with different address spaces". OpenCL has that restriction as well: Section 6.8 paragraph a. So I argue you really do want to use Clang address spaces to represent OpenCL address spaces. Speziale wrote: > The OpenCL standard talks about addess spaces, but I think they can be > interpreted as scopes (except __constants): > > * __global: globally accessible variables > * __private: visible only to a work item > * __local: accessible by all work item in a work group > > The address space is the way scoping rules are implemented in hardware, > e.g __local variables are mapped in the address space X which is a fast > memory shared by all ALU inside a GPU multiprocessor. Maybe introducing > such "scopes", it is possible to decouple backends fom frontends. > > __constant is a corner case: it can be modelled as a global scope that > contains read only data But note that OpenCL doesn't even express work items at the language level: it's all implied by the workings of the runtime. What Speziale calls scoping in hardware is really just the GPU view of how the program is run: yes, it was the original model, but not the only one. 
If you run the work items and work groups serially, then it's no longer "scoping" but rather data lifetime that comes into play: each __private lives only as long as its work item; each __local lives as long as the work group; each __global lives as long as the buffer is in global memory (possibly multiple kernel executions). (Yes, I understand that a program with barrier() calls requires some parallelism or interleaving between different work items.) But those lifetimes/scoping rules are extra semantics over and above the baseline (restricted) concept of address spaces from Embedded C. On Wed, Mar 2, 2011 at 9:38 AM, Ken Dyck wrote: > > So the reason that there aren't any conventions for the address space > numbers in clang/llvm is because there aren't any conventions for how > chip designers incorporate memories into the architectures that they > design. Sure. > > The one convention that the Embedded C standard does specify is that > when the address space of a type is unspecified, the type is assumed > to be in the 'generic' space. Clang currently emits an address space > of zero in this case. Arguably, LLVM could define a single enum value, > GENERIC, for use by the code generators. Similarly, OpenCL says that anything without an address space qualifier ends up in __private. > > In my opinion, any knowledge that front ends have of address spaces > should be dictated by the target's back end. Perhaps we should add > some virtual methods to LLVM's TargetMachine interface so front ends > can query the back end for the names and numbers of the address spaces > that they recognize, and expose them to end users in a standard way. > But having front ends impose the requirement on back ends that they > recognize some arbitrary set of language-specific address spaces seems > like a great misuse of the feature to me for reasons that Peter has > already pointed out. This makes sense to me.
So the TargetMachine would advertise what address spaces it has, and how they map to OpenCL address spaces (if at all). Then Clang could error out gracefully if the user is compiling OpenCL code and the target doesn't support OpenCL. That addresses the basic validity issue. If the target does support OpenCL, then the front end would dynamically adopt whatever backend numbers were defined by the target. We should probably keep the convention that address space 0 is the generic space, always. This neatly solves the ARM vs. someone-else difference between numberings of local vs. constant. > > It seems to me, as Speziale already pointed out, that the OpenCL type > qualifiers aren't address space qualifiers at all (in the Embedded C > sense). They might be better implemented as a separate set of > qualifiers in the way that Objective-C defines its garbage-collection > qualifiers, __strong and __weak. See the Qualifiers class in > AST/Type.h. I very much disagree, for reasons I gave above. Sorry if I've gone on too long. 
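
To make the idea of a target advertising its address spaces a bit more concrete, here is a minimal, self-contained C++ sketch. Everything in it is hypothetical: `TargetAddressSpaces`, `OpenCLSpace`, `lowerQualifier`, the three example targets and all of their numbers are made up for illustration and are not an existing LLVM or Clang interface. The only thing taken from the discussion is the convention that address space 0 stays the generic space, which this sketch also assumes is where __private lands.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical language-level address spaces known to the front end.
enum class OpenCLSpace { Private, Global, Local, Constant };

// Hypothetical table a target could expose to front ends.
// The numbers are chosen by the target; this is NOT a real LLVM class.
class TargetAddressSpaces {
public:
  explicit TargetAddressSpaces(std::string name) : name_(std::move(name)) {}

  // Record which target address space number backs an OpenCL space.
  void define(OpenCLSpace space, unsigned targetNumber) {
    table_[space] = targetNumber;
  }

  // The target supports OpenCL only if all four spaces are mapped.
  bool supportsOpenCL() const { return table_.size() == 4; }

  // Writes the target number to 'out' and returns true if it is mapped.
  bool lookup(OpenCLSpace space, unsigned &out) const {
    std::map<OpenCLSpace, unsigned>::const_iterator it = table_.find(space);
    if (it == table_.end())
      return false;
    out = it->second;
    return true;
  }

  const std::string &name() const { return name_; }

private:
  std::string name_;
  std::map<OpenCLSpace, unsigned> table_;
};

// Front-end side: adopt the target's number, or report failure so the
// driver can emit a diagnostic instead of producing bogus IR.
bool lowerQualifier(const TargetAddressSpaces &target, OpenCLSpace space,
                    unsigned &out) {
  if (!target.supportsOpenCL())
    return false;
  return target.lookup(space, out);
}

// Two made-up targets that disagree on the numbering of local vs. constant.
TargetAddressSpaces makeExampleTargetA() {
  TargetAddressSpaces t("example-a");
  t.define(OpenCLSpace::Private, 0); // assumption: generic space is 0
  t.define(OpenCLSpace::Global, 1);
  t.define(OpenCLSpace::Constant, 2);
  t.define(OpenCLSpace::Local, 3);
  return t;
}

TargetAddressSpaces makeExampleTargetB() {
  TargetAddressSpaces t("example-b");
  t.define(OpenCLSpace::Private, 0);
  t.define(OpenCLSpace::Global, 1);
  t.define(OpenCLSpace::Local, 2);
  t.define(OpenCLSpace::Constant, 3);
  return t;
}

// A made-up target with a single flat memory and no OpenCL mapping.
TargetAddressSpaces makeHostOnlyTarget() {
  return TargetAddressSpaces("host-only");
}
```

With this shape, a front end compiling a __local pointer for "example-a" would call lowerQualifier and adopt whatever number comes back, while for "host-only" the call fails and the front end can error out gracefully. The front end never hard-codes a number of its own.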
cheers, david From andrew at sidefx.com Wed Mar 2 18:27:59 2011 From: andrew at sidefx.com (Andrew Clinton) Date: Wed, 02 Mar 2011 19:27:59 -0500 Subject: [LLVMdev] How to write optimizer loop Message-ID: <4D6EE08F.3060200@sidefx.com> I've written an optimization loop, with the following form:

    PassManager lpm;
    lpm.add(createLoopDeletionPass());
    lpm.add(createSCCPPass());
    lpm.add(createAggressiveDCEPass());
    lpm.add(createGlobalOptimizerPass());
    lpm.add(createGlobalDCEPass());
    lpm.add(createDeadStoreEliminationPass());
    lpm.add(createLoopDeletionPass());
    lpm.add(createInstructionCombiningPass());
    lpm.add(createCFGSimplificationPass());

    const int maxit = 100;
    int it = 0;
    bool changed = true;
    while (changed && it < maxit)
    {
        changed = lpm.run(*myModule);
        it++;
    }

Aside from the possibility that the optimizations don't converge (handled by the "maxit" variable), this code is erroneous since the Loop Deletion pass incurs LCSSA and loop-simplify, which will likely always modify code that has been simplified via CFGSimplification. Is there a recommended method to write an optimization loop that correctly detects when an iteration has made changes to the module? I'm now thinking that it will be necessary to compare the new and previous module, possibly with a hash function. Also, how could I embed this loop into another PassManager so that it doesn't need to recompute stuff such as DominatorTree? Andrew From ezengbin at gmail.com Wed Mar 2 22:34:21 2011 From: ezengbin at gmail.com (Bin Zeng) Date: Wed, 02 Mar 2011 23:34:21 -0500 Subject: [LLVMdev] MachineOperand type Message-ID: <4D6F1A4D.80402@gmail.com> Hi all, I have a question about the types of MachineOperand. There are 12 different types of MachineOperand such as MO_Register, MO_Immediate and so on. Some of the names are self-explanatory such as MO_Register and MO_Immediate. Some of them are a little confusing such as MO_FrameIndex, MO_ConstantPoolIndex and so on.
For example, what is the difference between MO_ExternalSymbol and MO_GlobalAddress? Are these two types orthogonal? I found that memset and memcpy are MO_ExternalSymbol and printf and fprintf and so on are MO_GlobalAddress. Thanks a lot in advance. Any advice will be greatly appreciated. Bin From clattner at apple.com Thu Mar 3 00:21:56 2011 From: clattner at apple.com (Chris Lattner) Date: Wed, 2 Mar 2011 22:21:56 -0800 Subject: [LLVMdev] MachineOperand type In-Reply-To: <4D6F1A4D.80402@gmail.com> References: <4D6F1A4D.80402@gmail.com> Message-ID: <6B872D03-8E98-4FA9-8DD9-A2FBF3C34524@apple.com> On Mar 2, 2011, at 8:34 PM, Bin Zeng wrote: > Hi all, > > I have a question about the types of MachineOperand. There are 12 > different types of MachineOperand such as MO_Register, MO_Immediate and > so on. Some of the names are self-explanatory such as MO_Register and > MO_Immediate. Some of them are a little confusing such as MO_FrameIndex, > MO_ConstantPoolIndex and so on. For example, what is the difference > between MO_ExternalSymbol and MO_GlobalAddress? Are these two types > orthogonal? I found that memset and memcpy are MO_ExternalSymbol and > printf and fprintf and so on are MO_GlobalAddress. > > Thanks a lot in advance. Any advice will be greatly appreciated. GlobalAddress is used to refer to something that exists in IR, such as a global variable. ExternalSymbol is used to refer to something that isn't in the IR because it was synthesized by codegen, such as __addsi3. -Chris From 2sandeepchandran at gmail.com Thu Mar 3 01:08:05 2011 From: 2sandeepchandran at gmail.com (Sandeep) Date: Thu, 3 Mar 2011 02:08:05 -0500 Subject: [LLVMdev] Error when loading libraries using 'opt -load' command Message-ID: Hi, I have managed to install LLVM on my system following the documentation given on the site. But I am facing issues when trying to run some of the example transforms which are distributed as part of the installation.
Below is the error message : *Error opening '/home/sandy/llvm/src/Debug/lib/LLVMHello.so': /home/sandy/llvm/src/Debug/lib/LLVMHello.so: undefined symbol: _ZNK4llvm12FunctionPass17createPrinterPassERNS_11raw_ostreamERKSs* * -load request ignored. opt: Unknown command line argument '-Hello'. Try: 'opt -help'* I get the same error message when I try to load any of the generated shared objects (*FunctionPass* in the error message becomes *ModulePass* as is defined in the code). I have tried installing llvm multiple times but it has not helped. Please help me fix this. Regards, Sandeep -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110303/b59300e3/attachment.html From sebastien.deldon at st.com Thu Mar 3 04:27:11 2011 From: sebastien.deldon at st.com (Sebastien DELDON-GNB) Date: Thu, 3 Mar 2011 11:27:11 +0100 Subject: [LLVMdev] Use of movupd instead of movapd for x86 In-Reply-To: <933C840E-64F7-4E43-A342-A0A02522EB40@apple.com> References: <17F9E444F61B644FAAA6EA20EE53E4DBBA44E9D0AD@SAFEX1MAIL2.st.com> <17F9E444F61B644FAAA6EA20EE53E4DBBA44E9D69F@SAFEX1MAIL2.st.com> <933C840E-64F7-4E43-A342-A0A02522EB40@apple.com> Message-ID: <17F9E444F61B644FAAA6EA20EE53E4DBBA44F4AE05@SAFEX1MAIL2.st.com> OK, I found a work-around by adding ,align 8 to vector load/store. Thanks all for your answers > -----Original Message----- > From: Evan Cheng [mailto:evan.cheng at apple.com] > Sent: Tuesday, March 01, 2011 7:26 AM > To: Sebastien DELDON-GNB > Cc: David A. Greene; llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] Use of movupd instead of movapd for x86 > > > On Feb 28, 2011, at 2:58 AM, Sebastien DELDON-GNB wrote: > > > Understood for the aligned case, I want to measure performance > degradation for unaligned case. > > I mean unaligned case versus aligned. 
I know this is stupid, but I > want to try to pass a <4 x float>* as parameter of a routine and at the > call site I want to pass a misaligned pointer. Since LLVM is generating > movapd instruction it will raise an exception (SEGFAULT), I just want > to know if there is a way to enforce > > generation of movupd instruction instead of movapd. > > If llvm is generating movapd then it believes the pointer is aligned. > Without having more information it's impossible to tell what the issue > is. > > Evan > > > > > Seb > > > >> -----Original Message----- > >> From: David A. Greene [mailto:greened at obbligato.org] > >> Sent: Friday, February 25, 2011 5:13 PM > >> To: Sebastien DELDON-GNB > >> Cc: llvmdev at cs.uiuc.edu > >> Subject: Re: [LLVMdev] Use of movupd instead of movapd for x86 > >> > >> Sebastien DELDON-GNB writes: > >> > >>> Hi all, > >>> > >>> Is there a way to force llc to generate movupd instruction instead > of > >> movapd for x86 target ? > >>> > >>> I know that movapd is more performant, but I would like to measure > >> degradation when alignment constraints are not met. > >> > >> On modern processors a movupd on aligned data is going to be > >> indistinguishable in performance from a movapd. > >> > >> -Dave > > > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From scottm at mail.face.aero.org Thu Mar 3 11:57:10 2011 From: scottm at mail.face.aero.org (B. Scott Michel) Date: Thu, 03 Mar 2011 09:57:10 -0800 Subject: [LLVMdev] =?utf-8?q?Possible_CellSPU_Bug=3F?= In-Reply-To: References: <1296460241.1885.5.camel@LLVMbuilder.research.nokia.com> Message-ID: <4042a0ac8467b5a8089beec9fd5fb715@mail.face.aero.org> On Mon, 31 Jan 2011 11:34:05 -0600, greened at obbligato.org wrote: > Kalle Raiskila writes: > >> Looks like a bug to me. 
xshw (extend signed half-word(16bits) to >> word(32bits)) takes a v8i16 and produces a v4i32. This has likely >> gone >> unnoticed as there is only one type of vector register class (i.e. >> VECREG) that is used for all vectors. >> >> Nice catch :) Are there more of these? > > I don't know. I stopped implementing the stricter typechecking when > I > saw this. I wanted to make sure there wasn't some official trickery > going on. :) > > -Dave It's not official trickery. It's just the way things need to be done on Cell. -scooter -- B. Scott Michel, Ph.D. Director, Computer Systems Research Department The Aerospace Corporation Ofc: (310) 336-5034 Cell: (310) 426-4993 From scottm at mail.face.aero.org Thu Mar 3 11:59:55 2011 From: scottm at mail.face.aero.org (B. Scott Michel) Date: Thu, 03 Mar 2011 09:59:55 -0800 Subject: [LLVMdev] Possible CellSPU Bug? In-Reply-To: References: Message-ID: <48799bad3982a741614bcc9d6da9c376@mail.face.aero.org> On Sat, 29 Jan 2011 17:21:07 -0600, David Greene wrote: > I'm working on enhancing TableGen's type checking and it triggered > with > a problem in CellSPU's specification: > > XSHWv4i32: (set VECREG:v8i16:$rDest, (sext:v8i16 > VECREG:v4i32:$rSrc)) > > It's complaining that v4i32 is not smaller than v8i16, which is true > in > the sense of vector bit size, and true in the sense of vector element > size. To me, a sign extension from i32 to i16 makes no sense. > > From the .td file, it looks as if src and dest types have been > swapped: > > class XSHWVecInst: > XSHWInst<(outs VECREG:$rDest), (ins VECREG:$rSrc), > [(set (out_vectype VECREG:$rDest), > (sext (in_vectype VECREG:$rSrc)))]>; > > multiclass ExtendHalfwordWord { > def v4i32: XSHWVecInst; > > The multiclass name leads me to believe this was supposed to sign > extend > from i16 to i32 but the XSHWVecInst class takes the types in SRC -> > DST > order, not DST <- SRC order. > > Is this pattern as intended, or did I find a real problem? > > -Dave It's intentional.
Everything on Cell is a vector, with the exception of loads and stores. Unless you really want to write code that determines the exact vector element that needs to be changed and do all of the juggling to modify that element. There are no individual registers. If it's easier to flip the order of the operands, then do so. That's just style. -scooter -- B. Scott Michel, Ph.D. Director, Computer Systems Research Department The Aerospace Corporation Ofc: (310) 336-5034 Cell: (310) 426-4993 From damien.llvm at gmail.com Thu Mar 3 14:54:42 2011 From: damien.llvm at gmail.com (Damien Vincent) Date: Thu, 3 Mar 2011 12:54:42 -0800 Subject: [LLVMdev] Improving select_cc lowering for targets with conditional move Message-ID: Let's consider the following piece of C code: (incomplete and not compilable ;) ) result = initValue; for(i) { ... if(condition) result = updatedValue_i; ... } For targets with conditional moves, the result is updated using the following sequence of instructions: regTmp = regFalse; if(condition2) regTmp = regTrue; regResult = regTmp; Now, you have 2 cases: 1) either condition2 = condition In this case: regFalse is a phi node between the initial value of result and the current result. It's very likely that regFalse and regResult will be the same hardware register. So the sequence of instructions reduce to just: if(condition2) regResult = regTrue 2) either condition2 = !condition In this case, regTrue is a phi node between the initial value of result and the current result. In this case, the sequence of 3 instructions cannot be reduced easily ! So, my question is (finally ;)): Is there a way to introduce some intelligence in the select_cc lowering by reversing the condition if it is likely to generate more efficient code ? 
I am asking this question because the lowering works on basic blocks with a set of input virtual registers and output virtual registers and all the connections between these 2 set of registers seemed to be "lost" (I don't think we know at the lowering stage that an input register is a phi node between an output register and an initial value...) Thank you ! Damien -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110303/609f69ca/attachment.html From zhangwen at cse.ohio-state.edu Thu Mar 3 15:09:10 2011 From: zhangwen at cse.ohio-state.edu (Wenbin Zhang) Date: Thu, 3 Mar 2011 16:09:10 -0500 Subject: [LLVMdev] how can I have LoopInfo in a module pass? Message-ID: <85078525D9154266BF80CC611DE857F8@osuc90d096e394> Hi all, I tried to have a LoopInfo object in a function pass, add addRequired in getAnalysisUsage, and then use getAnalysis in runOnFunction(). It worked OK. Now I want to have a module pass to traverse the functions, and similarly I want to have to loop information of the functions. When I did the above in runOnModule, and run the pass, the following error popped out: AnalysisType& llvm::Pass::getAnalysis() const [with AnalysisType = llvm::DominatorTree]: Assertion `Resolver && "Pass has not been inserted into a PassManager object!"' failed. Can anyone tell me the correct way to handle this in a module pass? Thanks a lot! Best, --Wenbin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110303/03ca1631/attachment.html From criswell at illinois.edu Thu Mar 3 15:26:31 2011 From: criswell at illinois.edu (John Criswell) Date: Thu, 3 Mar 2011 15:26:31 -0600 Subject: [LLVMdev] how can I have LoopInfo in a module pass? 
In-Reply-To: <85078525D9154266BF80CC611DE857F8@osuc90d096e394> References: <85078525D9154266BF80CC611DE857F8@osuc90d096e394> Message-ID: <4D700787.7080808@illinois.edu> On 3/3/11 3:09 PM, Wenbin Zhang wrote: > Hi all, > I tried to have a LoopInfo object in a function pass, add > addRequired<LoopInfo>() in getAnalysisUsage, and then use > getAnalysis<LoopInfo>() in runOnFunction(). It worked OK. > Now I want to have a module pass to traverse the functions, and > similarly I want to have to loop information of the functions. When I > did the above in runOnModule, and run the pass, the following error > popped out: > AnalysisType& llvm::Pass::getAnalysis() const [with AnalysisType = > llvm::DominatorTree]: Assertion `Resolver && "Pass has not been > inserted into a PassManager object!"' failed. > > Can anyone tell me the correct way to handle this in a module pass? > Thanks a lot! LoopInfo is a FunctionPass, so you have to use getAnalysis<LoopInfo>(F), where F is the function that you want analyzed (passed by reference, as a Function&). Note that LoopInfo, in this instance, will be re-run every time you call getAnalysis on it (this is a result of using a FunctionPass within a ModulePass). Be sure to structure your code to call getAnalysis<LoopInfo>(F) on each function just once, if possible. Also be sure that F is not a function declaration (i.e., a function with no body). -- John T. > Best, > --Wenbin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110303/00370104/attachment.html From Micah.Villmow at amd.com Thu Mar 3 15:55:08 2011 From: Micah.Villmow at amd.com (Villmow, Micah) Date: Thu, 3 Mar 2011 15:55:08 -0600 Subject: [LLVMdev] Summer Intern Position Message-ID: GPU Compiler Intern, OpenCL Compiler Team AMD is looking for a summer intern to work with our core team developing OpenCL, an open standard for heterogeneous general purpose programming, compilers for multi-core CPU and many-core graphics systems.
The intern will be tasked with helping develop optimization and/or code generation passes for AMD GPUs. Knowledge of LLVM or contributions to LLVM is a plus. Knowledge of the OpenCL API or other GPGPU programming models is a plus but not needed. The individual will be a member of a team where communication and team skills are highly valued. Requirement: Basic knowledge of compiler implementation and design. Strong C and C++ programming skills are a must, along with an understanding of software engineering practices. Exposure to algorithms used for optimizations and code generation for CPU or graphics hardware is a strong plus but not a must, as the primary sought quality is a strong interest and ability in developing high quality software on top of lower level functionality. Experience with software development tools such as source level debuggers and code profilers is required. If interested, please send resume to micah dot villmow at amd dot com. Thanks, Micah -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110303/f8437d4c/attachment.html From zhangwen at cse.ohio-state.edu Thu Mar 3 16:04:37 2011 From: zhangwen at cse.ohio-state.edu (Wenbin Zhang) Date: Thu, 3 Mar 2011 17:04:37 -0500 Subject: [LLVMdev] how can I have LoopInfo in a module pass? References: <85078525D9154266BF80CC611DE857F8@osuc90d096e394> <4D700787.7080808@illinois.edu> Message-ID: Thanks John, I modified my code to look like this: bool XXX::ModulePass(Module &M){ .... LoopInfo &li = getAnalysis<LoopInfo>(fi); .... } Here fi is a Function* pointing to main(). Now when I run the pass, another error shows up: AnalysisType& llvm::Pass::getAnalysisID(const llvm::PassInfo*, llvm::Function&) [with AnalysisType = llvm::LoopInfo]: Assertion `ResultPass && "Unable to find requested analysis info"' failed. Did I miss something? Thanks!
Best, --Wenbin ----- Original Message ----- From: John Criswell To: Wenbin Zhang Cc: llvmdev at cs.uiuc.edu Sent: Thursday, March 03, 2011 4:26 PM Subject: Re: [LLVMdev] how can I have LoopInfo in a module pass? On 3/3/11 3:09 PM, Wenbin Zhang wrote: Hi all, I tried to have a LoopInfo object in a function pass, add addRequired in getAnalysisUsage, and then use getAnalysis in runOnFunction(). It worked OK. Now I want to have a module pass to traverse the functions, and similarly I want to have to loop information of the functions. When I did the above in runOnModule, and run the pass, the following error popped out: AnalysisType& llvm::Pass::getAnalysis() const [with AnalysisType = llvm::DominatorTree]: Assertion `Resolver && "Pass has not been inserted into a PassManager object!"' failed. Can anyone tell me the correct way to handle this in a module pass? Thanks a lot! LoopInfo is a FunctionPass, so you have to use getAnalysis(F) where F is a pointer to the function that you want analyzed. Note that LoopInfo, in this instance, will be re-run every time you call getAnalysis on it (this is a result of using a FunctionPass within a ModulePass). Be sure to structure you code to only call getAnalysis(F) on each function just once, if possible. Also be sure that F is not a function declaration (i.e., a function with no body). -- John T. Best, --Wenbin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110303/7ffb73cb/attachment.html From zhangwen at cse.ohio-state.edu Thu Mar 3 16:26:20 2011 From: zhangwen at cse.ohio-state.edu (Wenbin Zhang) Date: Thu, 3 Mar 2011 17:26:20 -0500 Subject: [LLVMdev] how can I have LoopInfo in a module pass? References: <85078525D9154266BF80CC611DE857F8@osuc90d096e394><4D700787.7080808@illinois.edu> Message-ID: I think this assertion failure may be caused by the getAnalysisUsage(). Mine is like the following: class myclass : public ModulePass{ ... 
virtual void getAnalysisUsage(AnalysisUsage &AU) const { AU.addRequired(); } ... } Is it enough? Thanks! Best, --Wenbin ----- Original Message ----- From: Wenbin Zhang To: John Criswell Cc: llvmdev at cs.uiuc.edu Sent: Thursday, March 03, 2011 5:04 PM Subject: Re: [LLVMdev] how can I have LoopInfo in a module pass? Thanks John, I modify my code to like this: bool XXX::ModulePass(Module &M){ .... LoopInfo &li = getAnalysis(fi); .... } Here fi is a Function* pointing to main(). Now when I run the pass, another error shows up: AnalysisType& llvm::Pass::getAnalysisID(const llvm::PassInfo*, llvm::Function&) [with AnalysisType = llvm::LoopInfo]: Assertion `ResultPass && "Unable to find requested analysis info"' failed. Did I miss something? Thanks! Best, --Wenbin ----- Original Message ----- From: John Criswell To: Wenbin Zhang Cc: llvmdev at cs.uiuc.edu Sent: Thursday, March 03, 2011 4:26 PM Subject: Re: [LLVMdev] how can I have LoopInfo in a module pass? On 3/3/11 3:09 PM, Wenbin Zhang wrote: Hi all, I tried to have a LoopInfo object in a function pass, add addRequired in getAnalysisUsage, and then use getAnalysis in runOnFunction(). It worked OK. Now I want to have a module pass to traverse the functions, and similarly I want to have to loop information of the functions. When I did the above in runOnModule, and run the pass, the following error popped out: AnalysisType& llvm::Pass::getAnalysis() const [with AnalysisType = llvm::DominatorTree]: Assertion `Resolver && "Pass has not been inserted into a PassManager object!"' failed. Can anyone tell me the correct way to handle this in a module pass? Thanks a lot! LoopInfo is a FunctionPass, so you have to use getAnalysis(F) where F is a pointer to the function that you want analyzed. Note that LoopInfo, in this instance, will be re-run every time you call getAnalysis on it (this is a result of using a FunctionPass within a ModulePass). 
Be sure to structure you code to only call getAnalysis(F) on each function just once, if possible. Also be sure that F is not a function declaration (i.e., a function with no body). -- John T. Best, --Wenbin ------------------------------------------------------------------------------ _______________________________________________ LLVM Developers mailing list LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110303/0ada6bfa/attachment.html From clattner at apple.com Thu Mar 3 16:34:41 2011 From: clattner at apple.com (Chris Lattner) Date: Thu, 3 Mar 2011 14:34:41 -0800 Subject: [LLVMdev] MachineOperand type In-Reply-To: <4D701669.4090800@gmail.com> References: <4D6F1A4D.80402@gmail.com> <6B872D03-8E98-4FA9-8DD9-A2FBF3C34524@apple.com> <4D701669.4090800@gmail.com> Message-ID: On Mar 3, 2011, at 2:30 PM, Bin Zeng wrote: >>> >>> Thanks a lot in advance. Any advice will be greatly appreciated. >> GlobalAddress is used to refer to something that exists in IR, such as a global variable. ExternalSymbol is used to refer to something that isn't in the IR because it was synthesized by codegen, such as __addsi3. >> >> -Chris > > Thanks for the quick reply. According to your reply, memset and memcpy are synthesized by codegen? I saw memcpy in the source file. Did it get optimized out in the IR and later codegen synthesized memset and memcpy? Yes, codegen does synthesize memcpy in some cases, also memset. 
-Chris From mmuller at enduden.com Thu Mar 3 16:34:43 2011 From: mmuller at enduden.com (Michael Muller) Date: Thu, 03 Mar 2011 17:34:43 -0500 Subject: [LLVMdev] LLVM IR Type System Rewrite References: <4BA62F76-735A-4CE1-AA37-C9E44EAF72C4@apple.com> Message-ID: <16504.1299191683.137533.1036889011@succubus> Chris Lattner wrote: > Several people have been prodding me to write up my thoughts on how to fix the IR type system for LLVM 3.0. Here are some (fairly stream-of-consciousness) thoughts on the matter: > http://nondot.org/sabre/LLVMNotes/TypeSystemRewrite.txt > > Comments welcome! I like this change a lot. I'm not sure if it is feasible or generally desirable, but one thing that would have been very helpful in my use case is being able to construct structure types incrementally and to access slots of the unfinished type, for example: StructType *NewSTy = StructType::get("mylist", TheModule); NewSTy->addField(Type::getInt32Ty(NewSTy->getContext())); Then, given a Value of type NewSTy, do a GEP on slot 0. > > -Chris > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > ============================================================================= michaelMuller = mmuller at enduden.com | http://www.mindhog.net/~mmuller ----------------------------------------------------------------------------- Society in every state is a blessing, but government even in its best state is but a necessary evil; in its worst state an intolerable one... - Thomas Paine ============================================================================= From zhangwen at cse.ohio-state.edu Thu Mar 3 17:03:52 2011 From: zhangwen at cse.ohio-state.edu (Wenbin Zhang) Date: Thu, 3 Mar 2011 18:03:52 -0500 Subject: [LLVMdev] how can I have LoopInfo in a module pass?
References: <85078525D9154266BF80CC611DE857F8@osuc90d096e394><4D700787.7080808@illinois.edu> Message-ID: <6C1D329C8625410EB2B8EEF887C3E52F@osuc90d096e394> Sorry for the spaming. I solved that problem, which is caused by my typo.... ----- Original Message ----- From: Wenbin Zhang To: John Criswell Cc: llvmdev at cs.uiuc.edu Sent: Thursday, March 03, 2011 5:26 PM Subject: Re: [LLVMdev] how can I have LoopInfo in a module pass? I think this assertion failure may be caused by the getAnalysisUsage(). Mine is like the following: class myclass : public ModulePass{ ... virtual void getAnalysisUsage(AnalysisUsage &AU) const { AU.addRequired(); } ... } Is it enough? Thanks! Best, --Wenbin ----- Original Message ----- From: Wenbin Zhang To: John Criswell Cc: llvmdev at cs.uiuc.edu Sent: Thursday, March 03, 2011 5:04 PM Subject: Re: [LLVMdev] how can I have LoopInfo in a module pass? Thanks John, I modify my code to like this: bool XXX::ModulePass(Module &M){ .... LoopInfo &li = getAnalysis(fi); .... } Here fi is a Function* pointing to main(). Now when I run the pass, another error shows up: AnalysisType& llvm::Pass::getAnalysisID(const llvm::PassInfo*, llvm::Function&) [with AnalysisType = llvm::LoopInfo]: Assertion `ResultPass && "Unable to find requested analysis info"' failed. Did I miss something? Thanks! Best, --Wenbin ----- Original Message ----- From: John Criswell To: Wenbin Zhang Cc: llvmdev at cs.uiuc.edu Sent: Thursday, March 03, 2011 4:26 PM Subject: Re: [LLVMdev] how can I have LoopInfo in a module pass? On 3/3/11 3:09 PM, Wenbin Zhang wrote: Hi all, I tried to have a LoopInfo object in a function pass, add addRequired in getAnalysisUsage, and then use getAnalysis in runOnFunction(). It worked OK. Now I want to have a module pass to traverse the functions, and similarly I want to have to loop information of the functions. 
When I did the above in runOnModule, and run the pass, the following error popped out: AnalysisType& llvm::Pass::getAnalysis() const [with AnalysisType = llvm::DominatorTree]: Assertion `Resolver && "Pass has not been inserted into a PassManager object!"' failed. Can anyone tell me the correct way to handle this in a module pass? Thanks a lot! LoopInfo is a FunctionPass, so you have to use getAnalysis(F) where F is a pointer to the function that you want analyzed. Note that LoopInfo, in this instance, will be re-run every time you call getAnalysis on it (this is a result of using a FunctionPass within a ModulePass). Be sure to structure you code to only call getAnalysis(F) on each function just once, if possible. Also be sure that F is not a function declaration (i.e., a function with no body). -- John T. Best, --Wenbin ---------------------------------------------------------------------------- _______________________________________________ LLVM Developers mailing list LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev ------------------------------------------------------------------------------ _______________________________________________ LLVM Developers mailing list LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110303/f2055399/attachment.html From Micah.Villmow at amd.com Thu Mar 3 18:07:16 2011 From: Micah.Villmow at amd.com (Villmow, Micah) Date: Thu, 3 Mar 2011 18:07:16 -0600 Subject: [LLVMdev] Full Time LLVM Compiler position Message-ID: Compiler Engineer, Stream Computing We are currently looking for a software engineer as part of the core team developing OpenCL, a new open standard for heterogonous general purpose programming, compilers for multi-core CPU and many-core graphics systems. 
The engineer will be involved in all aspects of OpenCL compiler features, development and maintenance and will participate in performance tuning for new multi-core x86 and graphics hardware running on multiple operating systems. The position will involve interfacing with ASIC design engineers and architects, OS engineers and peers in related development teams. Knowledge of OpenCL API or other GPGPU programming models is a plus but not needed. The individual will be a member of a team where communication and team skills are highly valued. Requirement: Bachelor of Science or equivalent degree in Electrical Engineering, Computer Science, Engineering or an equivalent field is required, MSc. Or Ph.D. preferred. Specialization in computer science, strong C and C++ programming skills is necessary, along with an understanding of software engineering practices. Exposure to algorithms used for optimizations and code generation for CPU or graphics hardware is a strong plus but not a must, as the primary sought quality is a strong interest and ability in developing high quality software on top of lower level functionality. Experience with software development tools such as source level debugging and code profiler is required. If interested, please send resume to micah dot villmow at amd dot com or apply here: https://www.amd.apply2jobs.com/ProfExt/index.cfm?fuseaction=mExternal.showJob&RID=14120# Thanks, Micah -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110303/6253b21e/attachment.html From jacob.zimmermann at oracle.com Thu Mar 3 19:45:08 2011 From: jacob.zimmermann at oracle.com (Jacob Zimmermann) Date: Fri, 04 Mar 2011 11:45:08 +1000 Subject: [LLVMdev] AllocaInst remapped as NULL in llvm::MapValue Message-ID: <1299203108.11155.14.camel@easteregg> Hello all When using llvm-ld to link several bitcode files produced by LLVM-GCC, I ran into the problem that the resulting linked file had missing dbg entries for AllocaInst values. After some digging I found that llvm::MapValue returns NULL when it encounters an AllocaInst. The attached trivial patch fixes the problem. Is this behaviour intended? Regards, Jacob -- Jacob Zimmermann Oracle Labs, Brisbane, Queensland, Australia jacob.zimmermann at oracle.com -------------- next part -------------- A non-text attachment was scrubbed... Name: llvm-dont-drop-allocainst.diff Type: text/x-patch Size: 854 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110304/3a4f538d/attachment.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 836 bytes Desc: This is a digitally signed message part Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110304/3a4f538d/attachment-0001.bin From nicholas at mxc.ca Thu Mar 3 23:43:52 2011 From: nicholas at mxc.ca (Nick Lewycky) Date: Thu, 03 Mar 2011 21:43:52 -0800 Subject: [LLVMdev] metadata to inform the optimizers that some code should stay unchanged In-Reply-To: <31039471.post@talk.nabble.com> References: <31039471.post@talk.nabble.com> Message-ID: <4D707C18.5000209@mxc.ca> Xinfinity wrote: > > > Hello LLVM, > > I am working on some passes that perform code transformations. Since I am > interested in performance, I apply the O3 passes, right after my pass. 
> However, the optimization passes modify the code inserted by my pass in an > undesirable way. As far I know, there is no way to prevent the optimizers > from optimizing some regions of code. So what I intend to do is to attach > metadata to the instructions contained in basicblocks that I want to remain > unchanged, and to modify some of the optimization passes to be aware of the > metadata. Please don't. Metadata is the wrong tool for anything where the metadata can't be discarded and correctness maintained. Is there any reason you can't run opt -O3 first and your pass second? What is your pass actually doing and why does running the optimizations break it? Nick For now, I am interested only in passes that would affect the > control flow graph, so for a start I modify the Simplify CFG pass and the > jump threading pass, but I will check which other passes might duplicate the > code, merge blocks etc. > > Do you think the performance will drop significantly if some regions of code > are not optimized ? And do you consider this modification would be of any > use to the community? > > Thank you. > Alexandra > From xinfinity_a at yahoo.com Fri Mar 4 03:33:38 2011 From: xinfinity_a at yahoo.com (Xinfinity) Date: Fri, 4 Mar 2011 01:33:38 -0800 (PST) Subject: [LLVMdev] metadata to inform the optimizers that some code should stay unchanged In-Reply-To: <4D707C18.5000209@mxc.ca> References: <31039471.post@talk.nabble.com> <4D707C18.5000209@mxc.ca> Message-ID: <31066087.post@talk.nabble.com> Nick Lewycky wrote: > > Xinfinity wrote: >> >> >> Hello LLVM, >> >> I am working on some passes that perform code transformations. Since I am >> interested in performance, I apply the O3 passes, right after my pass. >> However, the optimization passes modify the code inserted by my pass in >> an >> undesirable way. As far I know, there is no way to prevent the optimizers >> from optimizing some regions of code. 
So what I intend to do is to attach >> metadata to the instructions contained in basicblocks that I want to >> remain >> unchanged, and to modify some of the optimization passes to be aware of >> the >> metadata. > > Please don't. Metadata is the wrong tool for anything where the metadata > can't be discarded and correctness maintained. > > Is there any reason you can't run opt -O3 first and your pass second? > What is your pass actually doing and why does running the optimizations > break it? > > Nick > > You are right, changing the optimizers is not a good idea; I will modify my pass to make it O3 friendly. If I run opt -O3 first and my pass afterwards, there is a significant drop in performance. I use my pass to create multiple versions of regions of code and to insert a selection mechanism that communicates with a runtime system. Callbacks to the runtime system must be patched, so they are inserted as inline asm to have a fixed structure and size. O3 changes the position of the inline asm blocks or duplicates them. But it is better to work more on my pass than to change all optimization passes in LLVM. Thanks. Alexandra -- View this message in context: http://old.nabble.com/metadata-to-inform-the-optimizers-that-some-code-should-stay-unchanged-tp31039471p31066087.html Sent from the LLVM - Dev mailing list archive at Nabble.com. From j.wilhelmy at arcor.de Fri Mar 4 06:51:07 2011 From: j.wilhelmy at arcor.de (Jochen Wilhelmy) Date: Fri, 04 Mar 2011 13:51:07 +0100 Subject: [LLVMdev] Structure Types and ABI sizes In-Reply-To: References: <4D5E4F64.6080300@arcor.de> <4D643B0B.1090809@arcor.de> <4D667582.9050306@arcor.de> <4D6686E6.6030401@arcor.de> Message-ID: <4D70E03B.3060000@arcor.de> >> %I = type { i32, i8 }; // 5 bytes >> %I' = type { %I, tailpad}; // 8 bytes >> %J = type { %I, i8 } // 6 bytes >> > That would break C code (and whatever else relies on alignment). > why would it break C code? of course a C frontend should generate only tailpadded types.
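[Archive note] The tail-padding question in this thread can be made concrete at the source level. The following is only a minimal C++ sketch, assuming a typical target where int32_t is 4-byte aligned; the struct and constant names are hypothetical and do not come from the thread. The first pair of structs shows what a C frontend needs (the embedded struct keeps its tail padding), the second pair shows the C++ case where a derived class may reuse the tail padding of a base that is not a POD.

```cpp
#include <cstddef>
#include <cstdint>

// C-style aggregate: 5 bytes of data, rounded up to 8 by tail padding so that
// every element of an array of PodBase keeps its int32 member aligned.
struct PodBase {
  std::int32_t a;
  std::int8_t b;
};

// Composition, as a C frontend needs it: the new member starts after the
// tail padding of the embedded struct.
struct Composed {
  PodBase base;
  std::int8_t c;
};

// A base that is not a POD in the C++03 sense (user-provided constructor).
// Under the Itanium C++ ABI a derived class may place its members in this
// base's tail padding; other ABIs may not do so.
struct NonPodBase {
  NonPodBase() : a(0), b(0) {}
  std::int32_t a;
  std::int8_t b;
};

struct Derived : NonPodBase {
  Derived() : c(0) {}
  std::int8_t c;
};

constexpr std::size_t kPodBaseSize = sizeof(PodBase);
constexpr std::size_t kComposedFieldOffset = offsetof(Composed, c);
constexpr std::size_t kComposedSize = sizeof(Composed);
constexpr std::size_t kDerivedSize = sizeof(Derived);

// Byte offset of Derived::c, computed at run time because offsetof is only
// conditionally supported for non-standard-layout classes.
inline std::size_t derivedFieldOffset() {
  Derived d;
  return static_cast<std::size_t>(reinterpret_cast<const char *>(&d.c) -
                                  reinterpret_cast<const char *>(&d));
}
```

On such a target the composed struct places c at offset 8 and has size 12, which is the layout the tail-padded %I' type describes. With GCC or Clang on an Itanium-ABI platform, Derived is usually 8 bytes with c at offset 5, which is the layout the proposed 5-byte %I is meant to express; that second result is ABI-dependent and should be checked on the target in question rather than assumed.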
> I don't see a way of specifying two structures, but I like the idea of > using a packed structure for inheritance and the "normal" one for > types. > or something like %J = type { inherit %I, i8 } the inherit keyword before %I removes the tailpadding -Jochen From rengolin at systemcall.org Fri Mar 4 07:05:25 2011 From: rengolin at systemcall.org (Renato Golin) Date: Fri, 4 Mar 2011 13:05:25 +0000 Subject: [LLVMdev] Structure Types and ABI sizes In-Reply-To: <4D70E03B.3060000@arcor.de> References: <4D5E4F64.6080300@arcor.de> <4D643B0B.1090809@arcor.de> <4D667582.9050306@arcor.de> <4D6686E6.6030401@arcor.de> <4D70E03B.3060000@arcor.de> Message-ID: On 4 March 2011 12:51, Jochen Wilhelmy wrote: > why would it break C code? of course a C frontend should generate only > tailpadded types. It's not about the size, but the offset. If you had a char field in the inherited class: %I' = type { %I, i8, tailpad}; The offset of that i8 has to be 8, not 5. If all structures are packed, that would be 5, which is correct for non-POD in C++ but wrong for everything else. > %J = type { inherit %I, i8 } > > the inherit keyword before %I removes the tailpadding That's what the packed is for. %Base = type { i32, i8 }; // size = 8 %POSDerived = type { %Base, i8 }; // i8 offset = 8, size 12 %Basep = packed type { i32, i8 }; // size = 5 %nonPOSDerived = type { %Basep, i8 }; // i8 offset = 5, size 8 cheers, --renato From j.wilhelmy at arcor.de Fri Mar 4 08:13:17 2011 From: j.wilhelmy at arcor.de (Jochen Wilhelmy) Date: Fri, 04 Mar 2011 15:13:17 +0100 Subject: [LLVMdev] Structure Types and ABI sizes In-Reply-To: References: <4D5E4F64.6080300@arcor.de> <4D643B0B.1090809@arcor.de> <4D667582.9050306@arcor.de> <4D6686E6.6030401@arcor.de> <4D70E03B.3060000@arcor.de> Message-ID: <4D70F37D.9080907@arcor.de> >> why would it break C code? of course a C frontend should generate only >> tailpadded types. >> > It's not about the size, but the offset. 
If you had a char field in > the inherited class: > > %I' = type { %I, i8, tailpad}; > > The offset of that i8 has to be 8, not 5. If all structures are > packed, that would be 5, which is correct for non-POD in C++ but wrong > for everything else. > I know therefore in this case %I has to tailpadded. but packing and tailpadding are different things, aren't they? in a packet type {i8, i32} the i32 type has offset 1 while in a non-tailpadded type it still has offset 4. >> %J = type { inherit %I, i8 } >> >> the inherit keyword before %I removes the tailpadding >> > That's what the packed is for. > I don't think so because packing removes alignment constraints of all members. -Jochen From rengolin at systemcall.org Fri Mar 4 08:19:52 2011 From: rengolin at systemcall.org (Renato Golin) Date: Fri, 4 Mar 2011 14:19:52 +0000 Subject: [LLVMdev] Structure Types and ABI sizes In-Reply-To: <4D70F37D.9080907@arcor.de> References: <4D5E4F64.6080300@arcor.de> <4D643B0B.1090809@arcor.de> <4D667582.9050306@arcor.de> <4D6686E6.6030401@arcor.de> <4D70E03B.3060000@arcor.de> <4D70F37D.9080907@arcor.de> Message-ID: On 4 March 2011 14:13, Jochen Wilhelmy wrote: > I know therefore in this case %I has to tailpadded. but packing and > tailpadding are different > things, aren't they? in a packet type {i8, i32} the i32 type has offset 1 > while in a non-tailpadded > type it still has offset 4. True. cheers, --renato From jgu222 at gmail.com Fri Mar 4 12:47:50 2011 From: jgu222 at gmail.com (Junjie Gu) Date: Fri, 4 Mar 2011 10:47:50 -0800 Subject: [LLVMdev] configure llvm for 32-bit build on a 64-bit system Message-ID: I have TOT of llvm and it builds 64-bit without issues on my 64-bit ubuntu. My question is how to build 32-bit llvm on my 64-bit ubuntu ? I've not found any configure options to specify that. 
Thanks Junjie From rafael.espindola at gmail.com Fri Mar 4 13:44:36 2011 From: rafael.espindola at gmail.com (Rafael Avila de Espindola) Date: Fri, 04 Mar 2011 14:44:36 -0500 Subject: [LLVMdev] configure llvm for 32-bit build on a 64-bit system In-Reply-To: References: Message-ID: <4D714124.3090601@gmail.com> On 11-03-04 01:47 PM, Junjie Gu wrote: > I have TOT of llvm and it builds 64-bit without issues on my 64-bit > ubuntu. My question is how to build 32-bit llvm on my 64-bit ubuntu ? > I've not found any configure options to specify that. > CC="gcc -m32" CXX="g++ -m32" should do in. > Thanks > Junjie Cheers, Rafael From peterl95124 at sbcglobal.net Fri Mar 4 14:14:02 2011 From: peterl95124 at sbcglobal.net (Peter Lawrence) Date: Fri, 4 Mar 2011 12:14:02 -0800 Subject: [LLVMdev] Question about Value Range Propagation In-Reply-To: References: Message-ID: <99588BEE-DF18-44C0-B71C-8B366A3D94CC@sbcglobal.net> Chris, one way to look at array bounds check optimization, and the value range propagation that it can be based on, is that it's usefulness is language dependent. Ada and Java benefit from it greatly, C/C++ not at all, but then a "codesafe" version of C/C++ would, as John T was pointing out below. It seems like the software engineering modularity of llvm's various analysis and transform phases lends itself to having phases that fit this description, worthwhile to have, but not necessarily linked in and/or not necessarily invoked, for every front end language. Another approach is to only trigger the analysis/transform if any bounds check instructions have been encountered, which won't happen in traditional C/C++. Are either of these approaches consistent with your design philosophy ? thanks, Peter Lawrence. 
(ps, the possible performance problem with the previous ABCD should not be a factor in the above, since while I don't yet know why that implementation suffered, or if it even did, I do know from personal experience that VRP/ABCD can be both very efficient and very effective.) On Feb 22, 2011, at 4:46 PM, llvmdev-request at cs.uiuc.edu wrote: >> the big problem with Patterson's VRP is that it is expensive in >> terms of >> compile time. LLVM used to have some passes (ABCD, predsimplify) >> that did >> this kind of thing, but they were removed essentially because >> their compile >> time was too great for the goodness they brought. > > I was under the impression that ABCD was removed because no one was > maintaining and improving it. Is my impression incorrect? > > The SAFECode compiler adds additional run-time checks for array bounds > checking. If the ABCD code was working but just wasn't useful for > regular C code, I'd like to know. It may still have value for > projects > like SAFECode. > > -- John T. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110304/210de728/attachment.html From baldrick at free.fr Fri Mar 4 14:38:03 2011 From: baldrick at free.fr (Duncan Sands) Date: Fri, 04 Mar 2011 21:38:03 +0100 Subject: [LLVMdev] configure llvm for 32-bit build on a 64-bit system In-Reply-To: References: Message-ID: <4D714DAB.9050604@free.fr> Hi Junjie, > I have TOT of llvm and it builds 64-bit without issues on my 64-bit > ubuntu. My question is how to build 32-bit llvm on my 64-bit ubuntu ? > I've not found any configure options to specify that. as well as "gcc -m32" and "g++ -m32" you may want to configure with --build=i686-pc-linux-gnu Ciao, Duncan. 
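[Archive note] After configuring with CC="gcc -m32" CXX="g++ -m32" (and optionally --build=i686-pc-linux-gnu) as suggested above, it can be worth confirming that the compiler invocation really targets 32 bits before starting a full LLVM build. The snippet below is a minimal sketch for that check; the names kPointerBits and isThirtyTwoBitBuild are hypothetical and not part of LLVM or its build system.

```cpp
#include <climits>
#include <cstddef>

// Width of a data pointer in bits for the current compiler invocation:
// typically 32 when built with -m32 and 64 for a default x86-64 build.
constexpr std::size_t kPointerBits = sizeof(void *) * CHAR_BIT;

// True when the translation unit was compiled for a 32-bit target.
constexpr bool isThirtyTwoBitBuild() { return kPointerBits == 32; }
```

Compiling a file containing this with the same CXX value that was passed to configure, plus a static_assert on isThirtyTwoBitBuild(), fails early if the -m32 flag was dropped or the 32-bit runtime libraries are missing.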
From joerg at britannica.bec.de Fri Mar 4 16:13:43 2011 From: joerg at britannica.bec.de (Joerg Sonnenberger) Date: Fri, 4 Mar 2011 23:13:43 +0100 Subject: [LLVMdev] [MC] Removing relaxation control In-Reply-To: <718E02DF-A9E1-4182-B03D-F37C93A1D38E@apple.com> References: <20110224184004.GA22645@britannica.bec.de> <4D6734A9.1050809@gmail.com> <4D680546.5060204@gmail.com> <718E02DF-A9E1-4182-B03D-F37C93A1D38E@apple.com> Message-ID: <20110304221343.GA19728@britannica.bec.de> On Sat, Feb 26, 2011 at 11:51:48AM -0800, Chris Lattner wrote: > That looks like a 1.5% speedup in realtime and 10% speedup in system > time (though I'm not sure I believe that). I think it should stay on > for -O0 for C files. Turning it off at -O0 for .s files makes perfect > sense to me though. I was looking into this and there is a problem with -save-temps. The attached patch works and does the right thing, if that option is not used. With -save-temps, the problem is that at that point, it is impossible to distingiush between C input and (preprocessed) assembler files. For the former, POLA would dictate that -mrelax-all is used for -O0, for the latter it is definitely not desirable. I can't find a clean way to do without making a mess of the action construction. Joerg -------------- next part -------------- A non-text attachment was scrubbed... Name: Tools.cpp.diff Type: text/x-diff Size: 816 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110304/8824c848/attachment.bin From peterl95124 at sbcglobal.net Fri Mar 4 12:21:30 2011 From: peterl95124 at sbcglobal.net (Peter Lawrence) Date: Fri, 4 Mar 2011 10:21:30 -0800 Subject: [LLVMdev] LLVMdev Digest, Vol 81, Issue 5 In-Reply-To: References: Message-ID: <76EE1A23-3309-4BD2-8767-2BFB2DB7BCFB@sbcglobal.net> Renato, On Mar 4, 2011, at 10:00 AM, llvmdev-request at cs.uiuc.edu wrote: > That's what the packed is for. 
> > %Base = type { i32, i8 }; // size = 8 > %POSDerived = type { %Base, i8 }; // i8 offset = 8, size 12 > > %Basep = packed type { i32, i8 }; // size = 5 > %nonPOSDerived = type { %Basep, i8 }; // i8 offset = 5, size 8 > > cheers, > --renato Doesn't the %nonPOSDerived type have to be packed for its non-Natural size 5 member to end up with only 5 bytes within its encompassing struct? Sure, the i8 can be on any boundary, but %Basep can only be accessed within %nonPOSDerived as a non-Natural sized object if %nonPOSDerived is itself declared as packed; otherwise, if %nonPOSDerived is not packed, it can contain only Natural sized fields. So it seems that both structs need to be declared packed: one so the size can be known as less than a whole Natural size, and the other so that its fields can be fit together tightly without padding. That seems like two different but related meanings for the word "packed". Am I confused? Peter Lawrence. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110304/9b2cd2b6/attachment.html From niuqingpeng at gmail.com Sat Mar 5 01:04:59 2011 From: niuqingpeng at gmail.com (Qingpeng Niu) Date: Sat, 5 Mar 2011 02:04:59 -0500 Subject: [LLVMdev] llvm-config example need update Message-ID: Hi, This example, llvm-config --libs engine bcreader scalaropts, is on the website at http://llvm.org/cmds/llvm-config.html But the bcreader component is actually not there anymore. Its new name is bitreader. I think this webpage may need an update. Also, if I run "llvm-config --help", it shows the wrong component name in the examples as well: g++ `llvm-config --cxxflags` -o HowToUseJIT.o -c HowToUseJIT.cpp g++ `llvm-config --ldflags` -o HowToUseJIT HowToUseJIT.o `llvm-config --libs engine bitreader scalaropts` -- Qingpeng Niu Department of Computer Science and Engineering at OSU -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110305/afc87256/attachment.html From niuqingpeng at gmail.com Sat Mar 5 01:10:31 2011 From: niuqingpeng at gmail.com (Qingpeng Niu) Date: Sat, 5 Mar 2011 02:10:31 -0500 Subject: [LLVMdev] llvm-config example need update In-Reply-To: References: Message-ID: Sorry. bitreader may not be the replacement of bcreader niuq at niuq:~/Programming/llvm/MyLli$ llvm-config --libs engine bcreader scalaropts llvm-config: unknown component name: bcreader niuq at niuq:~/Programming/llvm/MyLli$ g++ -o mylli.x `llvm-config --cxxflags --ldflags` mylli.o `llvm-config --libs engine bitreader scalaropts` mylli.o: In function `global constructors keyed to mylli.cpp': mylli.cpp:(.text+0x991): undefined reference to `llvm::createPBQPRegisterAllocator()' mylli.cpp:(.text+0xa01): undefined reference to `LLVMLinkInInterpreter' mylli.o: In function `main': mylli.cpp:(.text+0xb0e): undefined reference to `llvm::MemoryBuffer::getFileOrSTDIN(llvm::StringRef, std::basic_string, std::allocator >*, long, stat*)' collect2: ld returned 1 exit status Anything wrong with my configuration? Why no bcreader components? On Sat, Mar 5, 2011 at 2:04 AM, Qingpeng Niu wrote: > Hi > > This > > llvm-config --libs engine bcreader scalaropts in website > > http://llvm.org/cmds/llvm-config.html > > But actually the bcreader component is not there anymore. The new name of it is bitreader. > > I think this webpage may need to be updated. Also, if I do "llvm-config --help", it shows the wrong component name in its examples > > g++ `llvm-config --cxxflags` -o HowToUseJIT.o -c HowToUseJIT.cpp > > g++ `llvm-config --ldflags` -o HowToUseJIT HowToUseJIT.o `llvm-config > --libs engine bitreader scalaropts` > > -- > Qingpeng Niu > Department of Computer Science and Engineering at OSU > -- Qingpeng Niu Department of Computer Science and Engineering at OSU -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110305/0d0a8318/attachment.html From andrew.pennebaker at gmail.com Fri Mar 4 18:33:52 2011 From: andrew.pennebaker at gmail.com (Andrew Pennebaker) Date: Fri, 4 Mar 2011 19:33:52 -0500 Subject: [LLVMdev] Building LLVM on MinGW32 / Windows 7 Professional x64 Message-ID: When I try to compile LLVM, a dialog pops up halfway through the process: "tblgen.exe has stopped working" and the build quits. I would really appreciate a binary pack for llvm-as, llc, etc. Specs: GCC 4.5.2 MinGW-Get 0.1-alpha-5.1 Windows 7 Professional x64 Steps: Install MinGW ( http://sourceforge.net/projects/mingw/files/Automated%20MinGW%20Installer/mingw-get-inst/mingw-get-inst-20110211/ ). Open Start -> Programs -> MinGW -> MinGW Shell. Run mingw-get install binutils. Run mingw-get install gcc. Download LLVM-GCC Front End Binaries for Mingw32/x86 ( http://llvm.org/releases/download.html#2.8). Move llvm-gcc*.tar.bz2 to C:\MinGW. Run bunzip2 llvm-*.bz2 Run tar xvf llvm-*.tar Install msysGit (http://code.google.com/p/msysgit/). Command Prompt: cd C:\Users\andrew\Desktop git clone http://llvm.org/git/llvm.git MinGW Shell: cd c:/users/andrew/desktop ./configure make ... A dialog pops up: "tblgen.exe has stopped working" make[1]: Building Intrinsics.gen.tmp from Intrinsics.td make[1]: *** [/c/users/andrew/desktop/src/llvm/lib/VMCore/Debug+Asserts/Intrinics.gen.tmp] Error 255 make[1]: Leaving directory '/c/users/andrew/desktop/src/llvm/lib/VMCore' make: *** [all] Error 1 Close program Cheers, Andrew Pennebaker www.yellosoft.us -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110304/0c397d78/attachment.html From rdivacky at freebsd.org Sat Mar 5 02:55:11 2011 From: rdivacky at freebsd.org (Roman Divacky) Date: Sat, 5 Mar 2011 09:55:11 +0100 Subject: [LLVMdev] Ask for help with FreeBSD bootloader, aka #6627 Message-ID: <20110305085511.GA26794@freebsd.org> Hi, In FreeBSD, we aim for replacing gcc with clang. We can compile all of the base system except the loader which does not fit within the hardware limits (7680 bytes). Currently (r127066) clang/llvm is missing the target by 5 bytes. Thus it's 5 bytes from us being able to compile all of FreeBSD with clang (and replace gcc with clang). There's #6627 (http://llvm.org/bugs/show_bug.cgi?id=6627) that if fixed would get us well over the limit (by saving ~30 bytes). Can someone please help us and fix this bug? thank you! roman From rengolin at systemcall.org Sat Mar 5 04:28:32 2011 From: rengolin at systemcall.org (Renato Golin) Date: Sat, 5 Mar 2011 10:28:32 +0000 Subject: [LLVMdev] LLVMdev Digest, Vol 81, Issue 5 In-Reply-To: <76EE1A23-3309-4BD2-8767-2BFB2DB7BCFB@sbcglobal.net> References: <76EE1A23-3309-4BD2-8767-2BFB2DB7BCFB@sbcglobal.net> Message-ID: On 4 March 2011 18:21, Peter Lawrence wrote: > seems like two different but related meanings for the word "packed". > > am I confused ? Hi Peter, You're absolutely right. Using the packed attribute was an idea presented by John in the beginning of the thread and I found elegant and simple, and since then I'm trying to stress the boundaries of it to make sure if I ever go in that direction, I'll not end up with yet another load of kludge as I have now. Some people have shown counter-examples that some kludge will be required (including having non-packed structures with packed structures inside, or a special type of packed that only packs the last member). I'm losing confidence in this idea already. 
;) Jochen proposed a keyword "inherit" to stress the packing of the tail-pad only, that would solve both problems, but introducing new keywords to LLVM IR is always a dangerous enterprise. While this is obviously a benefit for C++ (and I'm biased to request that keyword), it might be used by other languages with slightly different semantics and it'll be difficult to not add kludge to whatever part that does the magic. The Itanium C++ ABI is full of magic and we're bound to produce kludge somewhere if we are to support C++ along with other languages. Today we all do that kludge in the front-end, the idea was to simplify it, but at what cost? It depends on what the IR is for in the long run. cheers, --renato From rengolin at systemcall.org Sat Mar 5 04:42:31 2011 From: rengolin at systemcall.org (Renato Golin) Date: Sat, 5 Mar 2011 10:42:31 +0000 Subject: [LLVMdev] Two languages in the same IR Message-ID: Hi all, Is it possible to merge two different languages in the same IR? With Java, JNI specifies a whole lot of rules to make C structures and PCS work with Java classes, if we were to do the same thing in IR, would that work? Is there anyone doing this today (with any language)? -- cheers, --renato http://systemcall.org/ Reclaim your digital rights, eliminate DRM, learn more at http://www.defectivebydesign.org/what_is_drm From baldrick at free.fr Sat Mar 5 04:48:35 2011 From: baldrick at free.fr (Duncan Sands) Date: Sat, 05 Mar 2011 11:48:35 +0100 Subject: [LLVMdev] Building LLVM on MinGW32 / Windows 7 Professional x64 In-Reply-To: References: Message-ID: <4D721503.2000302@free.fr> Hi Andrew, > When I try to compile LLVM, a dialog pops up halfway through the process: > "tblgen.exe has stopped working" and the build quits. > > I would really appreciate a binary pack for llvm-as, llc, etc. indeed windows is the only platform for which the LLVM core tools like llvm-as, llc etc are not provided in the binary distribution. 
I think it would be better if they were included. Ciao, Duncan. > > Specs: > > GCC 4.5.2 > MinGW-Get 0.1-alpha-5.1 > Windows 7 Professional x64 > > Steps: > > Install MinGW > (http://sourceforge.net/projects/mingw/files/Automated%20MinGW%20Installer/mingw-get-inst/mingw-get-inst-20110211/). > > Open Start -> Programs -> MinGW -> MinGW Shell. > > Run mingw-get install binutils. > > Run mingw-get install gcc. > > Download LLVM-GCC Front End Binaries for Mingw32/x86 > (http://llvm.org/releases/download.html#2.8). > > Move llvm-gcc*.tar.bz2 to C:\MinGW. > > Run bunzip2 llvm-*.bz2 > > Run tar xvf llvm-*.tar > > Install msysGit (http://code.google.com/p/msysgit/). > > Command Prompt: > > cd C:\Users\andrew\Desktop > > git clone http://llvm.org/git/llvm.git > > MinGW Shell: > > cd c:/users/andrew/desktop > > ./configure > > make > > ... > > A dialog pops up: "tblgen.exe has stopped working" > > make[1]: Building Intrinsics.gen.tmp from Intrinsics.td > make[1]: *** > [/c/users/andrew/desktop/src/llvm/lib/VMCore/Debug+Asserts/Intrinics.gen.tmp] > Error 255 > make[1]: Leaving directory '/c/users/andrew/desktop/src/llvm/lib/VMCore' > make: *** [all] Error 1 > > Close program > > Cheers, > > Andrew Pennebaker > www.yellosoft.us > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From baldrick at free.fr Sat Mar 5 05:00:42 2011 From: baldrick at free.fr (Duncan Sands) Date: Sat, 05 Mar 2011 12:00:42 +0100 Subject: [LLVMdev] Two languages in the same IR In-Reply-To: References: Message-ID: <4D7217DA.8090601@free.fr> Hi Renato, > Is it possible to merge two different languages in the same IR? it is perfectly possible to compile C++ and Ada (and other languages) to bitcode and link the modules together, resulting in a mixed language module. 
The only thing I know of that doesn't work well is if exception handling constructs from different languages get inlined into the same function. But maybe this is not what you mean? > With Java, JNI specifies a whole lot of rules to make C structures and > PCS work with Java classes, if we were to do the same thing in IR, > would that work? I'm not entirely sure what you are imagining but it sounds like a job for the front-end to me. > Is there anyone doing this today (with any language)? In Ada you can import functions from other languages. I regularly do this to use C, C++ and Fortran functions from Ada. The frontend takes care of ensuring that the right calling conventions etc are used, so everything is sorted out before it gets to the LLVM IR generation stage. Ciao, Duncan. From Chareos at gmx.de Sat Mar 5 05:26:59 2011 From: Chareos at gmx.de (Ralf Karrenberg) Date: Sat, 05 Mar 2011 12:26:59 +0100 Subject: [LLVMdev] Two languages in the same IR In-Reply-To: References: Message-ID: <4D721E03.3080600@gmx.de> Hey, we are intermixing LLVM IR generated from C code with IR generated from our custom RenderMan frontend. I am not sure whether this is relevant for you, but in our restricted setting (the C code does not change frequently and we have full control at link time) this works flawless. Best, Ralf Am 05.03.2011 11:42, schrieb Renato Golin: > Hi all, > > Is it possible to merge two different languages in the same IR? > > With Java, JNI specifies a whole lot of rules to make C structures and > PCS work with Java classes, if we were to do the same thing in IR, > would that work? > > Is there anyone doing this today (with any language)? 
> From rengolin at systemcall.org Sat Mar 5 05:37:00 2011 From: rengolin at systemcall.org (Renato Golin) Date: Sat, 5 Mar 2011 11:37:00 +0000 Subject: [LLVMdev] Two languages in the same IR In-Reply-To: <4D7217DA.8090601@free.fr> References: <4D7217DA.8090601@free.fr> Message-ID: On 5 March 2011 11:00, Duncan Sands wrote: > it is perfectly possible to compile C++ and Ada (and other languages) to > bitcode and link the modules together, resulting in a mixed language module. > The only thing I know of that doesn't work well is if exception handling > constructs from different languages get inlined into the same function. But > maybe this is not what you mean? Hi Duncan, Exception handling is another beast, not concerned about it now... ;) > In Ada you can import functions from other languages. I regularly do this > to use C, C++ and Fortran functions from Ada. The frontend takes care of > ensuring that the right calling conventions etc are used, so everything is > sorted out before it gets to the LLVM IR generation stage. So the Ada front-end "understands" C? And generate IR from both? I was thinking more along the lines of linking IR generated by two (or more) different front-ends. One would need some kind of ABI (like JNI) to make sure types and PCS were retained. cheers, --renato From rengolin at systemcall.org Sat Mar 5 05:48:07 2011 From: rengolin at systemcall.org (Renato Golin) Date: Sat, 5 Mar 2011 11:48:07 +0000 Subject: [LLVMdev] Two languages in the same IR In-Reply-To: <4D721E03.3080600@gmx.de> References: <4D721E03.3080600@gmx.de> Message-ID: On 5 March 2011 11:26, Ralf Karrenberg wrote: > we are intermixing LLVM IR generated from C code with IR generated from > our custom RenderMan frontend. I am not sure whether this is relevant > for you, but in our restricted setting (the C code does not change > frequently and we have full control at link time) this works flawless. Hi Ralf, This is more along the lines of what I was thinking... 
But as you say, you have full control at link time and you can enforce the "ABI" yourself. I was wondering how relevant is this intermixing of languages and how many people were doing this. Mapping between any two (or more) languages' ABIs would be a fun exercise... ;) cheers, --renato From ofv at wanadoo.es Sat Mar 5 08:01:37 2011 From: ofv at wanadoo.es (=?utf-8?Q?=C3=93scar_Fuentes?=) Date: Sat, 05 Mar 2011 15:01:37 +0100 Subject: [LLVMdev] Building LLVM on MinGW32 / Windows 7 Professional x64 References: <4D721503.2000302@free.fr> Message-ID: <87pqq5wtry.fsf@wanadoo.es> Duncan Sands writes: >> I would really appreciate a binary pack for llvm-as, llc, etc. > > indeed windows is the only platform for which the LLVM core tools like > llvm-as, llc etc are not provided in the binary distribution. I think > it would be better if they were included. The only binary distribution related to LLVM 2.8 and Windows on the Download page is "LLVM-GCC 4.2 Front End Binaries for Mingw32/x86". It doesn't include llvm-as & co. but the other LLVM-GCC binary distributions doesn't either (I checked LLVM-GCC 4.2 Front End Binaries for Linux/x86_64). From anton at korobeynikov.info Sat Mar 5 08:21:22 2011 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Sat, 5 Mar 2011 17:21:22 +0300 Subject: [LLVMdev] Building LLVM on MinGW32 / Windows 7 Professional x64 In-Reply-To: References: Message-ID: > A dialog pops up: "tblgen.exe has stopped working" > make[1]: Building Intrinsics.gen.tmp from Intrinsics.td > make[1]: *** Looks like tablegen was miscompiled by your system gcc. Try other version. 
-- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From baldrick at free.fr Sat Mar 5 08:22:02 2011 From: baldrick at free.fr (Duncan Sands) Date: Sat, 05 Mar 2011 15:22:02 +0100 Subject: [LLVMdev] Building LLVM on MinGW32 / Windows 7 Professional x64 In-Reply-To: <87pqq5wtry.fsf@wanadoo.es> References: <4D721503.2000302@free.fr> <87pqq5wtry.fsf@wanadoo.es> Message-ID: <4D72470A.2050209@free.fr> On 05/03/11 15:01, Óscar Fuentes wrote: > Duncan Sands writes: > >>> I would really appreciate a binary pack for llvm-as, llc, etc. >> >> indeed windows is the only platform for which the LLVM core tools like >> llvm-as, llc etc are not provided in the binary distribution. I think >> it would be better if they were included. > > The only binary distribution related to LLVM 2.8 and Windows on the > Download page is "LLVM-GCC 4.2 Front End Binaries for Mingw32/x86". It > doesn't include llvm-as & co. but the other LLVM-GCC binary > distributions doesn't either (I checked LLVM-GCC 4.2 Front End Binaries > for Linux/x86_64). They are bundled in with clang. Ciao, Duncan. From ofv at wanadoo.es Sat Mar 5 08:30:24 2011 From: ofv at wanadoo.es (=?utf-8?Q?=C3=93scar_Fuentes?=) Date: Sat, 05 Mar 2011 15:30:24 +0100 Subject: [LLVMdev] Building LLVM on MinGW32 / Windows 7 Professional x64 References: <4D721503.2000302@free.fr> <87pqq5wtry.fsf@wanadoo.es> <4D72470A.2050209@free.fr> Message-ID: <87lj0twsfz.fsf@wanadoo.es> Duncan Sands writes: >>> indeed windows is the only platform for which the LLVM core tools like >>> llvm-as, llc etc are not provided in the binary distribution. I think >>> it would be better if they were included. >> >> The only binary distribution related to LLVM 2.8 and Windows on the >> Download page is "LLVM-GCC 4.2 Front End Binaries for Mingw32/x86". It >> doesn't include llvm-as & co. 
but the other LLVM-GCC binary >> distributions doesn't either (I checked LLVM-GCC 4.2 Front End Binaries >> for Linux/x86_64). > > They are bundled in with clang. There is no binary distribution of clang 2.8 for Windows. From baldrick at free.fr Sat Mar 5 10:15:13 2011 From: baldrick at free.fr (Duncan Sands) Date: Sat, 05 Mar 2011 17:15:13 +0100 Subject: [LLVMdev] Building LLVM on MinGW32 / Windows 7 Professional x64 In-Reply-To: <87lj0twsfz.fsf@wanadoo.es> References: <4D721503.2000302@free.fr> <87pqq5wtry.fsf@wanadoo.es> <4D72470A.2050209@free.fr> <87lj0twsfz.fsf@wanadoo.es> Message-ID: <4D726191.6080106@free.fr> >>>> indeed windows is the only platform for which the LLVM core tools like >>>> llvm-as, llc etc are not provided in the binary distribution. I think >>>> it would be better if they were included. >>> >>> The only binary distribution related to LLVM 2.8 and Windows on the >>> Download page is "LLVM-GCC 4.2 Front End Binaries for Mingw32/x86". It >>> doesn't include llvm-as & co. but the other LLVM-GCC binary >>> distributions doesn't either (I checked LLVM-GCC 4.2 Front End Binaries >>> for Linux/x86_64). >> >> They are bundled in with clang. > > There is no binary distribution of clang 2.8 for Windows. Exactly, that is the problem! Ciao, Duncan. From jgu222 at gmail.com Sat Mar 5 10:38:51 2011 From: jgu222 at gmail.com (Junjie Gu) Date: Sat, 5 Mar 2011 08:38:51 -0800 Subject: [LLVMdev] configure llvm for 32-bit build on a 64-bit system In-Reply-To: <4D714DAB.9050604@free.fr> References: <4D714DAB.9050604@free.fr> Message-ID: On Fri, Mar 4, 2011 at 12:38 PM, Duncan Sands wrote: > Hi Junjie, > >> I have TOT of llvm and it builds 64-bit without issues on my 64-bit >> ubuntu. My question is how to build 32-bit llvm on my 64-bit ubuntu ? >> I've not found any configure options to specify that. > > as well as "gcc -m32" and "g++ -m32" you may want to configure with > --build=i686-pc-linux-gnu > make CC="gcc -m32" CXX="g++ -m32" works, but configure ... 
--build==i686-pc-linux-gnu does not. Thanks Junjie From viridia at gmail.com Sat Mar 5 11:42:42 2011 From: viridia at gmail.com (Talin) Date: Sat, 5 Mar 2011 09:42:42 -0800 Subject: [LLVMdev] llvm.gcroot suggestion In-Reply-To: References: Message-ID: On Mon, Feb 21, 2011 at 1:50 AM, nicolas geoffray < nicolas.geoffray at gmail.com> wrote: > Hi Talin, > > On Fri, Feb 18, 2011 at 5:50 PM, Talin wrote: >> >> >> In the current scheme, the way you tell LLVM that a root is no longer >> needed is by assigning NULL to it. However, that assumes that all roots are >> pointers, which is not true in my world - a root can be a struct containing >> pointers inside of it. (In my current frontend, a non-pointer root is >> indicated by passing a non-NULL metadata argument to llvm.gcroot, which >> contains information about which fields in the struct are roots. This is >> especially important in the case of tagged unions, where the garbage >> collector may have to examine the union tag field in order to determine if >> the pointer field is indeed a pointer - passing the pointer alone would be >> insufficient to determine this.) >> > > For a tagged union, I guess you are currently using the second argument of > llvm.gcroot to provide the information? I guess keeping an intrinsic for this > kind of code is the best way to go. > > >> Putting GC roots in a different address space works OK for me, as long as >> I can have SSA values that are structs that have pointers embedded in them >> that are in this different address space. In other words, if I have an SSA >> value that is a struct containing pointers which are roots, I need for the >> garbage collector to see the entire struct, not just the pointers. >> > > That's entirely fine with a different address space. The roots given by the > LLVM GC pass should contain the location of these embedded pointers. 
> > >> >> What I'm primarily asking for is to have the LLVM code generator >> automatically spill roots from SSA values to memory during a sync point, and >> reload them afterward, >> > > > I don't think that's even needed: long term, LLVM should return the > location of all roots for a given sync point (typically method call). By all > roots, I mean register roots and stack roots. The frontend should then be > responsible for updating those roots. > > >> instead of my frontend having to generate code to do this. As I mentioned, >> the current scheme results in the frontend having to generate very >> inefficient IR because of the need to be conservative about root liveness. >> > > Agree. > > >> The frontend can't know anything about the optimization passes that LLVM >> will perform on the function. >> > > Sure. And I think the way to go is to remove the llvm.gcroot intrinsic (and > the hackish way it currently works: right now, because we take the address > of the alloca, the LLVM optimizers won't try to optimize an alloca that may > escape through the llvm.gcroot function call). By having an address space > for GC roots, optimizers don't need to care about anything. After the > optimizers and the register allocator, a final LLVM pass should compute the > root lists of all sync points. > > Nicolas > So I've been thinking about your proposal, that of using a special address space to indicate garbage collection roots instead of intrinsics. I want to point out some of the downsides of this approach: 1) The biggest drawback that I see is that it still requires frontends to signal that a root is no longer being used by assigning NULL to the pointer. This turns out to be hard to do in some cases, for example: while (true) { String s = someFunc(); if (s->equals("foo")) { break; } else { // call some function with s } } In the above example, where would you put the code to zero out the root 's'? In this case, the answer is, after the body of the loop. 
Now, normally one would zero out roots at the end of the block in which the variable is declared, but in the case of this while loop, it's wasted effort to do that since the root is just going to get assigned again at the top of the loop. However, sometimes we never make it to the end of the block, in this case due to a break statement, and so we have to either null out 's' either just before or just after the break. This example is relatively simple - if we start getting into scenarios involving switch statements, try/catch, and so on, I can easily construct complex examples that would make your head spin. Worse, there are cases where there are multiple code paths exiting from a block, and you end up having to generate the code to null out a given root on each of those paths. Remember what I said about frontends having to generate inefficient code to handle llvm.gcroot? This is one of the reasons. To address this, we need a better way of telling LLVM that a given variable is no longer a root. Of course, in the end it doesn't matter what the lifetime of the variable is, the only thing that matters is the state of the variable at each safe point. If there was a way to tell LLVM 'stop tracking this variable as a root', and then let it worry about whether the value in the variable is live or not, the generated code could be much more efficient. Of course, frontends would still have to deal with multiple exits from a block, but they can afford, I think, to be somewhat more lax about it. For example, assume in the example above that we insert a call to llvm-gcroot-end (or whatever we want to call it) after the while loop. The compiler knows in this case that all paths originating from the definition of s must pass through that point. Further, LLVM knows that there's only one safe point in the while loop, which is in the 'else' block. Thus the only time the GCStrategy ever 'sees' the variable 's' is at that one point, which means that there's no need to zero it out. 
In other words, LLVM knows that the range over which the variable is live is smaller than the range over which it is declared a root. 2) As I mentioned, my language supports tagged unions and other "value" types. Another example is a tuple type, such as (String, String). Such types are never allocated on the heap by themselves, because they don't have the object header structure that holds the type information needed by the garbage collector. Instead, these values can live in SSA variables, or in allocas, or they can be embedded inside larger types which do live on the heap. The way I currently handle such objects is by passing the trace table for the type as the metadata argument to llvm.gcroot(). A NULL metadata argument means that the root is a simple pointer, a non-NULL argument means that the root is a struct. How do we signal that a struct is no longer a root? Currently I do so by zeroing out the entire structure, but again that's wasteful. It would be better to simply tell LLVM that the struct is no longer a root. 3) I've been following the discussions on llvm-dev about the use of the address-space property of pointers to signal different kinds of memory pools for things like shared address spaces. If we try to use that same variable to indicate garbage collection, now we have to multiplex both meanings onto the same field. We can't just dedicate one special ID for the garbage collected heap, because there could be multiple such heaps. As you add additional orthogonal meanings to the address-space field, you end up with a combinatorial explosion of possible values for it. -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110305/31f38bf4/attachment-0001.html From baldrick at free.fr Sat Mar 5 12:09:47 2011 From: baldrick at free.fr (Duncan Sands) Date: Sat, 05 Mar 2011 19:09:47 +0100 Subject: [LLVMdev] configure llvm for 32-bit build on a 64-bit system In-Reply-To: References: <4D714DAB.9050604@free.fr> Message-ID: <4D727C6B.6030106@free.fr> Hi Junjie, > On Fri, Mar 4, 2011 at 12:38 PM, Duncan Sands wrote: >> Hi Junjie, >> >>> I have TOT of llvm and it builds 64-bit without issues on my 64-bit >>> ubuntu. My question is how to build 32-bit llvm on my 64-bit ubuntu ? >>> I've not found any configure options to specify that. >> >> as well as "gcc -m32" and "g++ -m32" you may want to configure with >> --build=i686-pc-linux-gnu >> > > make CC="gcc -m32" CXX="g++ -m32" works, but configure ... > --build==i686-pc-linux-gnu does not. I meant that this may be needed in addition to -m32. For example, it is needed when doing a 32 bit self-host llvm-gcc build on a 64 bit machine. Ciao, Duncan. From pangan at gmail.com Sat Mar 5 14:41:28 2011 From: pangan at gmail.com (Amir Mofakhar) Date: Sat, 5 Mar 2011 21:41:28 +0100 Subject: [LLVMdev] how to use external function? Message-ID: Hi there, It will be appreciated if you help me : I need to call an external function in LLVM bitcode but don't know how : ;ModuleID = 'm1' define i32 @main() { entry: %tmp0 = call i32 @MyOwnFunction() ret i32 0 } declare i32 @MyOwnFunction() I use below codes to run it $ llvm-as -f m1 -o m1.bc $ lli 1.bc (before it I have compiled MyOwnFunction module) when I run this program i receive this error : LLVM ERROR: Program used external function 'MyOwnFunction' which could not be resolved! I know it can not find this external function but I don't know how to define this function in another module and use it in my modules. 
i used below code for the other module : ;ModuleID = 'MyOwnFunction' define i32 @MyOwnFunction() { entry: ret i32 55 } should i use any especial switch to compile or what should i change in my codes? I received error by below command also : $ lli -load MyOwnFunction.bc m1.bc -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110305/23845983/attachment.html From eli.friedman at gmail.com Sat Mar 5 15:02:28 2011 From: eli.friedman at gmail.com (Eli Friedman) Date: Sat, 5 Mar 2011 13:02:28 -0800 Subject: [LLVMdev] how to use external function? In-Reply-To: References: Message-ID: On Sat, Mar 5, 2011 at 12:41 PM, Amir Mofakhar wrote: > Hi there, > > It will be appreciated if you help me : > > I need to call an external function in LLVM bitcode but don't know how : > > ;ModuleID = 'm1' > define i32 @main() { > entry: > %tmp0 = call i32 @MyOwnFunction() > ret i32 0 > } > declare i32 @MyOwnFunction() > > I use below codes to run it > > $ llvm-as -f m1 -o m1.bc > $ lli 1.bc > > (before it I have compiled MyOwnFunction module) > > when I run this program i receive this error : > LLVM ERROR: Program used external function 'MyOwnFunction' which could not > be resolved! > > I know it can not find this external function but I don't know how to define > this function in another module and use it in my modules. i used below code > for the other module : > > ;ModuleID = 'MyOwnFunction' > define i32 @MyOwnFunction() { > entry: > ret i32 55 > } > > should i use any especial switch to compile or what should i change in my > codes? I received error by below command also : > > $ lli -load MyOwnFunction.bc m1.bc Try something like the following? 
llvm-link MyOwnFunction.bc m1.bc -o - | lli -Eli From sanjoy at playingwithpointers.com Sun Mar 6 03:11:21 2011 From: sanjoy at playingwithpointers.com (Sanjoy Das) Date: Sun, 06 Mar 2011 14:41:21 +0530 Subject: [LLVMdev] First Patch In-Reply-To: References: Message-ID: <4D734FB9.7030209@playingwithpointers.com> Hi all! I've been tinkering with LLVM's code-base for a few days, hoping to start on one of the ideas mentioned in the "Open Projects" page (I was told 'Improving the current system'/'Miscellaneous Improvements'/5 would be a good start). While I was at it, I also took a stab at finishing up one of the TODOs. I've attached the patch for review. -- Sanjoy Das http://playingwithpointers.com -------------- next part -------------- A non-text attachment was scrubbed... Name: 0.diff Type: text/x-diff Size: 1965 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110306/6d638a72/attachment.bin From fvbommel at gmail.com Sun Mar 6 06:10:35 2011 From: fvbommel at gmail.com (Frits van Bommel) Date: Sun, 6 Mar 2011 13:10:35 +0100 Subject: [LLVMdev] First Patch In-Reply-To: <4D734FB9.7030209@playingwithpointers.com> References: <4D734FB9.7030209@playingwithpointers.com> Message-ID: On Sun, Mar 6, 2011 at 10:11 AM, Sanjoy Das wrote: > While I was at it, I also took a stab at finishing up one of the TODOs. I've > attached the patch for review. Comments inline. For those of you following at home, this code is in InstCombiner::WillNotOverflowSignedAdd(), and the first line of the initial comment is: // If one of the operands only has one non-zero bit, and if the other operand > --- lib/Transforms/InstCombine/InstCombineAddSub.cpp (revision 126747) > +++ lib/Transforms/InstCombine/InstCombineAddSub.cpp (working copy) > @@ -77,9 +77,55 @@ > // has a known-zero bit in a more significant place than it (not including the > // sign bit) the ripple may go up to and fill the zero, but won't change the > // sign. For example, (X & ~4) + 1. 
> - > - // TODO: Implement. > - > + > + int32_t power; > + > + { > + int width = LHS->getType()->getScalarSizeInBits(); This should be an unsigned, like the result type of getScalarSizeInBits(). > + APInt mask(width, 0), zeroes(width, 0), ones(width, 0); > + mask.setAllBits(); An easier way to get an all-one mask would be to construct it as 'APInt mask(width, -1, true)'. > + ComputeMaskedBits(LHS, mask, zeroes, ones); > + zeroes.flipAllBits(); > + > + if ((power = ones.exactLogBase2()) != -1 && zeroes == ones) { It would probably be cleaner to assign 'power' outside the condition. I don't think you're handling the case where the only set bit is the sign bit correctly; in that case the value is the most negative number possible, so you only need to check that the other side is non-negative. See below for how to handle this. This is also the only place where above 'zeroes' is used at the moment. It might be better not change zeroes itself here so it can be reused in the next block (see below). Use something like '~zeroes == ones'. Alternatively something like '(zeroes | ones).isAllOnesValue()' would work too, since they won't share any bits. > + int width = RHS->getType()->getScalarSizeInBits(); This is the same as the previous width since the types of the LHS and RHS are required to be equal for instructions like add. Just delete this line. > + APInt mask(width, 0), zeroes(width, 0), ones(width, 0); These shadow (have the same name as) the ones before. It's better to reuse the mask and rename the other two. The earlier ones should be LHSKnownZero and LHSKnownOne and these should be RHSKnownZero and RHSKnownOne. This is more consistent with the way they're named elsewhere in LLVM. > + mask.clearAllBits(); This is redundant here because it's already been initialized to zero in the constructor. If you'd reuse the old mask variable as suggested above you'd need this, but see below. 
> + > + // Disregarding the sign bit > + for (int i = (width - 2); i > power; i--) > + mask.setBit(i); I think this is equivalent to if (power < width-2) mask = APInt::getBitsSet(width, power+1, width-2); else mask.clearAllBits(); (This would mean the clearAllBits() above would again be redundant) However, a nice way to handle the signbit-only case would be to wrap that in an extra if as follows: if (power == width - 1) mask = APInt::getSignBit(width); // Alternatively: LHSKnownOne, which should be equivalent. else if // ... the code above ... so that for signbit-only LHS the check below tests whether the RHS is non-negative. > + ComputeMaskedBits(RHS, mask, zeroes, ones); > + > + // At least one 0 > + if (zeroes.countPopulation()) This should be 'if (RHSKnownZero.getBoolValue())' / 'if (!!RHSKnownZero)' or similar. You don't need the actual number of set bits here, you just want to know whether it's zero. > + return true; > + } > + } > + > + { > + int width = RHS->getType()->getScalarSizeInBits(); This has already been calculated in the previous block. Reuse it. > + APInt mask(width, 0), zeroes(width, 0), ones(width, 0); > + mask.setAllBits(); > + ComputeMaskedBits(RHS, mask, zeroes, ones); If you calculated this beforehand and stored it in RHSKnownZero and RHSKnownOne, you'd only need two ComputeMaskedBits calls instead of four in this code. Inside the 'if's you can use e.g. (Mask & RHSKnownZero) to get the values you're currently getting there. 
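[Editorial sketch, not part of the original message.] The mask loop and the rule it serves can be modeled with plain integers. This is a minimal sketch, not the InstCombine code: maskByLoop, bitsSet, maskByRange and ruleHoldsFor8Bit are hypothetical helper names, and it assumes that APInt::getBitsSet(numBits, loBit, hiBit) treats hiBit as exclusive. If that assumption holds, reproducing the loop exactly needs an upper bound of width-1.

```cpp
#include <cassert>
#include <cstdint>

// Mask built the way the patch builds it: set every bit strictly above
// `power`, up to and including bit width-2 (the sign bit, width-1, is skipped).
uint32_t maskByLoop(unsigned width, int power) {
  uint32_t mask = 0;
  for (int i = static_cast<int>(width) - 2; i > power; --i)
    mask |= uint32_t(1) << i;
  return mask;
}

// Hypothetical stand-in for a "set bits in a range" helper, using a
// half-open range [lo, hi). ASSUMPTION: this mirrors APInt::getBitsSet.
uint32_t bitsSet(unsigned lo, unsigned hi) {
  uint32_t mask = 0;
  for (unsigned i = lo; i < hi; ++i)
    mask |= uint32_t(1) << i;
  return mask;
}

// Same mask as maskByLoop, expressed as a range instead of a loop.
uint32_t maskByRange(unsigned width, int power) {
  if (power + 1 >= static_cast<int>(width) - 1)
    return 0; // no bits between `power` and the sign bit
  return bitsSet(static_cast<unsigned>(power + 1), width - 1);
}

// Brute-force check of the rule from the source comment on 8-bit values:
// if one operand is a single non-sign bit and the other has a zero in a
// more significant non-sign position, the signed add cannot overflow.
bool ruleHoldsFor8Bit() {
  const unsigned width = 8;
  for (int power = 0; power < 7; ++power) {
    const int lhs = 1 << power;
    const uint32_t mask = maskByLoop(width, power);
    for (int rhs = -128; rhs <= 127; ++rhs) {
      const uint32_t rhsBits = static_cast<uint8_t>(rhs);
      if ((~rhsBits & mask) == 0)
        continue; // no zero bit above `power`, rule does not apply
      const int sum = lhs + rhs;
      if (sum < -128 || sum > 127)
        return false; // would be a signed overflow in 8 bits
    }
  }
  return true;
}
```

With width 8 and power 2, both mask helpers produce bits 3 through 6 and nothing else; the brute-force check is only there to make the carry-ripple argument concrete for one small width.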
> + zeroes.flipAllBits(); > + > + if ((power = ones.exactLogBase2()) != -1 && zeroes == ones) { > + int width = LHS->getType()->getScalarSizeInBits(); > + APInt mask(width, 0), zeroes(width, 0), ones(width, 0); > + mask.clearAllBits(); > + > + // Disregarding the sign bit > + for (int i = (width - 2); i > power; i--) > + mask.setBit(i); > + ComputeMaskedBits(LHS, mask, zeroes, ones); > + > + // At least one 0 > + if (zeroes.countPopulation()) > + return true; > + } > + } Many of the same comments apply to this block as to the one above it since it appears to be duplicated code with LHS and RHS switched. Depending on how much code remains in the end, it might be better to factor it out to a static function taking the changing values as parameters. > + > return false; > } You're missing tests for this new functionality. Add some to test/Transforms/InstCombine/ after searching for 'WillNotOverflowSignedAdd' calls to see the patterns it's used for. From j.wilhelmy at arcor.de Sun Mar 6 09:18:58 2011 From: j.wilhelmy at arcor.de (Jochen Wilhelmy) Date: Sun, 06 Mar 2011 16:18:58 +0100 Subject: [LLVMdev] description of llvm::Value correct? Message-ID: <4D73A5E2.40102@arcor.de> Hi! in the detailed description of llvm::Value it says: All _types_ can have a name and they should belong to some Module Is this correct or is it rather All _values_ can have a name and they should belong to some Module? Is it correct to use types across modules (in the same context)? -Jochen From j.wilhelmy at arcor.de Sun Mar 6 12:45:44 2011 From: j.wilhelmy at arcor.de (Jochen Wilhelmy) Date: Sun, 06 Mar 2011 19:45:44 +0100 Subject: [LLVMdev] sharing of constants across modules allowed? Message-ID: <4D73D658.5040200@arcor.de> Hi! is it allowed to create a constant and use it as operand for instructions in different modules (but same context)?
-Jochen From fvbommel at gmail.com Sun Mar 6 12:53:09 2011 From: fvbommel at gmail.com (Frits van Bommel) Date: Sun, 6 Mar 2011 19:53:09 +0100 Subject: [LLVMdev] sharing of constants across modules allowed? In-Reply-To: <4D73D658.5040200@arcor.de> References: <4D73D658.5040200@arcor.de> Message-ID: On Sun, Mar 6, 2011 at 7:45 PM, Jochen Wilhelmy wrote: > is it allowed to create a constant and use it as operand for > instructions in different modules (but same context)? As long as it doesn't contain any references to global values (functions, global variables & constants and aliases), yes. For example: 'i32 0' is fine, as is 'i8* inttoptr (i32 1234 to i8*)', but not 'i32(i8*, ...)* @printf'. Basically, if it contains an '@' (i.e. anything with a named memory address) anywhere then it's not allowed, otherwise it's fine. From viridia at gmail.com Sun Mar 6 13:01:15 2011 From: viridia at gmail.com (Talin) Date: Sun, 6 Mar 2011 11:01:15 -0800 Subject: [LLVMdev] _Unwind_Exception and _Unwind_Resume Message-ID: Here's an interesting problem - is it legal to copy the _Unwind_Exception struct to a different address in memory before calling _Unwind_Resume? I'm thinking of the scenario in which a garbage collection run is triggered in the middle of a "finally" block. If it's a copying collector, it might relocate the exception object, which has the _Unwind_Exception structure embedded in the middle of it. I don't see why this wouldn't work, but I thought I'd ask around to be sure. -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110306/258b3952/attachment.html From j.wilhelmy at arcor.de Sun Mar 6 13:26:21 2011 From: j.wilhelmy at arcor.de (Jochen Wilhelmy) Date: Sun, 06 Mar 2011 20:26:21 +0100 Subject: [LLVMdev] sharing of constants across modules allowed?
In-Reply-To: References: <4D73D658.5040200@arcor.de> Message-ID: <4D73DFDD.8010601@arcor.de> thanks From j.wilhelmy at arcor.de Sun Mar 6 14:52:16 2011 From: j.wilhelmy at arcor.de (Jochen Wilhelmy) Date: Sun, 06 Mar 2011 21:52:16 +0100 Subject: [LLVMdev] how to zero-init a global Message-ID: <4D73F400.40901@arcor.de> Hi! I have a module containing a constant e.g. @input = global %0 zeroinitializer, align 16 when I copy the global into another module I use newGlobal->copyAttributesFrom(global); but the new module now has @input = external global %0, align 16 i.e. the zeroinitializer is missing. how do I set it or copy it from the other global? -Jochen From fvbommel at gmail.com Sun Mar 6 15:13:09 2011 From: fvbommel at gmail.com (Frits van Bommel) Date: Sun, 6 Mar 2011 22:13:09 +0100 Subject: [LLVMdev] how to zero-init a global In-Reply-To: <4D73F400.40901@arcor.de> References: <4D73F400.40901@arcor.de> Message-ID: On Sun, Mar 6, 2011 at 9:52 PM, Jochen Wilhelmy wrote: > I have a module containing a constant e.g. > > @input = global %0 zeroinitializer, align 16 > > when I copy the global into another module I use > newGlobal->copyAttributesFrom(global); > but the new module now has > > @input = external global %0, align 16 > > i.e. the zeroinitializer is missing. how do I set it or > copy it from the other global? The initializer doesn't count as an attribute, I guess. if (global->hasInitializer()) newGlobal->setInitializer(global->getInitializer()); Note that, as I mentioned in response to your earlier question, not all constants are safe to use in a different module. So simply copying the initializer may be unsafe... Note that linkage and constness don't seem to count either. newGlobal->setConstant(global->isConstant()); newGlobal->setLinkage(global->getLinkage()); All three of these can also be passed to the constructor for newGlobal, by the way. 
According to the comment on GlobalValue::copyAttributesFrom() (as well as its GlobalVariable override) this seems to be the defining characteristic that qualifies them as "attributes"; it considers any property you can't pass to the constructor an attribute, apparently. From j.wilhelmy at arcor.de Sun Mar 6 15:23:33 2011 From: j.wilhelmy at arcor.de (Jochen Wilhelmy) Date: Sun, 06 Mar 2011 22:23:33 +0100 Subject: [LLVMdev] how to zero-init a global In-Reply-To: References: <4D73F400.40901@arcor.de> Message-ID: <4D73FB55.5020808@arcor.de> thanks. the reason why I didn't find getInitializer was I was searching in GlobalValue, not GlobalVariable -Jochen From rjmccall at apple.com Sun Mar 6 21:37:46 2011 From: rjmccall at apple.com (John McCall) Date: Sun, 6 Mar 2011 19:37:46 -0800 Subject: [LLVMdev] _Unwind_Exception and _Unwind_Resume In-Reply-To: References: Message-ID: <38063036-F542-4F8B-AC21-64E3B057E143@apple.com> On Mar 6, 2011, at 11:01 AM, Talin wrote: > Here's an interesting problem - is it legal to copy the _Unwind_Exception struct to a different address in memory before calling _Unwind_Resume? > > I'm thinking of the scenario in which a garbage collection run is triggered in the middle of a "finally" block. If it's a copying collector, it might relocate the exception object, which has the _Unwind_Exception structure embedded in the middle of it. I don't see why this wouldn't work, but I thought I'd ask around to be sure. This is really a question about the Itanium EH ABI, so I'm not sure why you're asking it here. That said, the ABI does not permit you to move exception objects in general, because even if you're working in a language where every object can be trivially moved, any given exception might be a "foreign" exception from a language (like C++) which does not provide this guarantee. 
Also, the ABI basically requires the EH implementation to internally maintain references to active exception objects under certain circumstances; you'll need to treat those as GC roots, which means hard-coding knowledge of the implementation into your collector / runtime. John. From viridia at gmail.com Sun Mar 6 22:01:25 2011 From: viridia at gmail.com (Talin) Date: Sun, 6 Mar 2011 20:01:25 -0800 Subject: [LLVMdev] _Unwind_Exception and _Unwind_Resume In-Reply-To: <38063036-F542-4F8B-AC21-64E3B057E143@apple.com> References: <38063036-F542-4F8B-AC21-64E3B057E143@apple.com> Message-ID: On Sun, Mar 6, 2011 at 7:37 PM, John McCall wrote: > On Mar 6, 2011, at 11:01 AM, Talin wrote: > > Here's an interesting problem - is it legal to copy the _Unwind_Exception > struct to a different address in memory before calling _Unwind_Resume? > > > > I'm thinking of the scenario in which a garbage collection run is > triggered in the middle of a "finally" block. If it's a copying collector, > it might relocate the exception object, which has the _Unwind_Exception > structure embedded in the middle of it. I don't see why this wouldn't work, > but I thought I'd ask around to be sure. > > This is really a question about the Itanium EH ABI, so I'm not sure why > you're asking it here. Only because I couldn't think of where else to ask :) > That said, the ABI does not permit you to move exception objects in > general, because even if you're working in a language where every object can > be trivially moved, any given exception might be a "foreign" exception from > a language (like C++) which does not provide this guarantee. > > Foreign exceptions would be in a different heap, so the garbage collector wouldn't attempt to move them or even be aware of their existence. 
I can see that there might be a problem if C++ code caught one of *my* exceptions which subsequently moved, although the exception wouldn't be moved until sometime after control passed back to my code, since the garbage collector only moves objects during safe points, and the foreign code wouldn't have safe point calls. This just means that people who call methods written in my language from other languages will have to be careful about exception handling. > Also, the ABI basically requires the EH implementation to internally > maintain references to active exception objects under certain circumstances; > you'll need to treat those as GC roots, which means hard-coding knowledge > of the implementation into your collector / runtime. > > That's not a problem, my code already does that. > John. -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110306/bfbc5fbc/attachment.html From rengolin at systemcall.org Mon Mar 7 03:19:05 2011 From: rengolin at systemcall.org (Renato Golin) Date: Mon, 7 Mar 2011 09:19:05 +0000 Subject: [LLVMdev] _Unwind_Exception and _Unwind_Resume In-Reply-To: References: <38063036-F542-4F8B-AC21-64E3B057E143@apple.com> Message-ID: On 7 March 2011 04:01, Talin wrote: > This just means that people who call methods written in my > language from other languages will have to be careful about exception > handling. Always. This is why following the ABI is so important if you want compatibility. However, the ABI doesn't mention GC (that I've seen), so you have to read between the lines... cheers, --renato From blackfin.kang at gmail.com Mon Mar 7 03:57:02 2011 From: blackfin.kang at gmail.com (Michael.Kang) Date: Mon, 7 Mar 2011 17:57:02 +0800 Subject: [LLVMdev] The size of native code for a given JIT function Message-ID: Now I try to compare the native code block size generated by llvm and qemu. 
But I do not know how to get the native code block size of JIT function in llvm? Which API should be used? Thanks MK -- www.skyeye.org From nicolas.geoffray at gmail.com Mon Mar 7 06:08:12 2011 From: nicolas.geoffray at gmail.com (nicolas geoffray) Date: Mon, 7 Mar 2011 13:08:12 +0100 Subject: [LLVMdev] llvm.gcroot suggestion In-Reply-To: References: Message-ID: Hi Talin, On Sat, Mar 5, 2011 at 6:42 PM, Talin wrote: > > > So I've been thinking about your proposal, that of using a special address > space to indicate garbage collection roots instead of intrinsics. Great! > > To address this, we need a better way of telling LLVM that a given variable > is no longer a root. > Live variable analysis is already in LLVM and for me that's enough to know whether a given variable is no longer a root. Note that each safe point has its own set of root locations, and these locations all contain live variables. Dead variables may still be in register or stack, but the GC will not visit them. > 2) As I mentioned, my language supports tagged unions and other "value" > types. Another example is a tuple type, such as (String, String). Such types > are never allocated on the heap by themselves, because they don't have the > object header structure that holds the type information needed by the > garbage collector. Instead, these values can live in SSA variables, or in > allocas, or they can be embedded inside larger types which do live on the > heap. > If you know, at compile-time, whether you are dealing with a struct or a heap, what prevents you from emitting code that won't need such tagged unions in the IR. Same for structs: if they contain pointers to heap objects, those will be in that special address space. 3) I've been following the discussions on llvm-dev about the use of the > address-space property of pointers to signal different kinds of memory pools > for things like shared address spaces. 
If we try to use that same variable > to indicate garbage collection, now we have to multiplex both meanings onto > the same field. We can't just dedicate one special ID for the garbage > collected heap, because there could be multiple such heaps. As you add > additional orthogonal meanings to the address-space field, you end up with a > combinatorial explosion of possible values for it. > > I think there exist already some convention between an ID and some codegen. Having one additional seems fine to me, even if you need to play with bits in case you need different IDs for a single pointer. I'm also fine with the intrinsic way of declaring a GC root. But I think it is cumbersome, and error-prone in the presence of optimizers that may try to move away that intrinsic (I remember similar issues with the current EH intrinsics). Nicolas > -- > -- Talin > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110307/4b2ba122/attachment.html From eliben at gmail.com Mon Mar 7 06:16:47 2011 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 7 Mar 2011 14:16:47 +0200 Subject: [LLVMdev] DW_TAG_lexical_block structure in debug information Message-ID: Hello, The documentation for debug information (http://llvm.org/docs/SourceLevelDebugging.html) says the structure of block descriptors metadata is: !3 = metadata !{ i32, ;; Tag = 11 + LLVMDebugVersion (DW_TAG_lexical_block) metadata,;; Reference to context descriptor i32, ;; Line number i32 ;; Column number } However, looking at the generated metadata, there are 2 extra fields not documented here. From the source code it appears to be a link to the function holding the block, and a unique integer ID: DILexicalBlock DIBuilder::createLexicalBlock(DIDescriptor Scope, DIFile File, unsigned Line, unsigned Col) { // Defeat MDNode uniqing for lexical blocks by using unique id. 
static unsigned int unique_id = 0; Value *Elts[] = { GetTagConstant(VMContext, dwarf::DW_TAG_lexical_block), Scope, ConstantInt::get(Type::getInt32Ty(VMContext), Line), ConstantInt::get(Type::getInt32Ty(VMContext), Col), File, ConstantInt::get(Type::getInt32Ty(VMContext), unique_id++) }; return DILexicalBlock(MDNode::get(VMContext, &Elts[0], array_lengthof(Elts))); } Is this an error in the documentation? Thanks in advance, Eli From harip at vt.edu Mon Mar 7 07:08:40 2011 From: harip at vt.edu (Hari Pyla) Date: Mon, 7 Mar 2011 08:08:40 -0500 Subject: [LLVMdev] matching function call arguments Message-ID: Hi, I am trying to identify if two functions were called with exactly the same argument. For instance, in the below example, assuming both entry() and exit() functions take a single argument, I would like to know if arg1 in entry() is the same as arg1 in exit(). int a; struct sa { int b; int c; }; int main () { struct sa s; entry (arg1); ... exit (arg1); return 0; } In instances such as entry(a) and exit(a), I am able to determine that it is the same variable 'a' using '==' on 'callinst->getOperand(1)'. However, if I pass a member variable of a structure, say s.b, to both entry and exit, I am unable to use '==' since the operands are GEP instructions. How can I compare such arguments to check whether they are identical, and what is the best approach for doing so? Thanks in advance. Best, --Hari From viridia at gmail.com Mon Mar 7 11:35:51 2011 From: viridia at gmail.com (Talin) Date: Mon, 7 Mar 2011 09:35:51 -0800 Subject: [LLVMdev] llvm.gcroot suggestion In-Reply-To: References: Message-ID: On Mon, Mar 7, 2011 at 4:08 AM, nicolas geoffray wrote: > Hi Talin, > > On Sat, Mar 5, 2011 at 6:42 PM, Talin wrote: >> >> >> So I've been thinking about your proposal, that of using a special address >> space to indicate garbage collection roots instead of intrinsics. > > > Great!
> > >> >> To address this, we need a better way of telling LLVM that a given >> variable is no longer a root. >> > > Live variable analysis is already in LLVM and for me that's enough to know > whether a given variable is no longer a root. Note that each safe point has > its own set of root locations, and these locations all contain live > variables. Dead variables may still be in register or stack, but the GC will > not visit them. > > >> 2) As I mentioned, my language supports tagged unions and other "value" >> types. Another example is a tuple type, such as (String, String). Such types >> are never allocated on the heap by themselves, because they don't have the >> object header structure that holds the type information needed by the >> garbage collector. Instead, these values can live in SSA variables, or in >> allocas, or they can be embedded inside larger types which do live on the >> heap. >> > > If you know, at compile-time, whether you are dealing with a struct or a > heap, what prevents you from emitting code that won't need such tagged > unions in the IR. Same for structs: if they contain pointers to heap > objects, those will be in that special address space. > I'm not sure what you mean by this. Take for example a union of a String (which is a pointer) and a float. The union is either { i1; String * } or { i1; float }. The garbage collector needs to see that i1 in order to know whether the second field of the struct is a pointer - if it attempted to dereference the pointer when the field actually contains a float, the program would crash. The metadata argument that I pass to llvm.gcroot informs the garbage collector about the structure of the union. > > 3) I've been following the discussions on llvm-dev about the use of the >> address-space property of pointers to signal different kinds of memory pools >> for things like shared address spaces. 
If we try to use that same variable >> to indicate garbage collection, now we have to multiplex both meanings onto >> the same field. We can't just dedicate one special ID for the garbage >> collected heap, because there could be multiple such heaps. As you add >> additional orthogonal meanings to the address-space field, you end up with a >> combinatorial explosion of possible values for it. >> >> > I think there exist already some convention between an ID and some codegen. > Having one additional seems fine to me, even if you need to play with bits > in case you need different IDs for a single pointer. > > I'm also fine with the intrinsic way of declaring a GC root. But I think it > is cumbersome, and error-prone in the presence of optimizers that may try to > move away that intrinsic (I remember similar issues with the current EH > intrinsics). > > Nicolas > > >> -- >> -- Talin >> > > -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110307/e32535e9/attachment.html From viridia at gmail.com Mon Mar 7 11:46:20 2011 From: viridia at gmail.com (Talin) Date: Mon, 7 Mar 2011 09:46:20 -0800 Subject: [LLVMdev] llvm.gcroot suggestion In-Reply-To: References: Message-ID: On Mon, Mar 7, 2011 at 9:35 AM, Talin wrote: > On Mon, Mar 7, 2011 at 4:08 AM, nicolas geoffray < > nicolas.geoffray at gmail.com> wrote: > >> Hi Talin, >> >> On Sat, Mar 5, 2011 at 6:42 PM, Talin wrote: >>> >>> >>> So I've been thinking about your proposal, that of using a special >>> address space to indicate garbage collection roots instead of intrinsics. >> >> >> Great! >> >> >>> >>> To address this, we need a better way of telling LLVM that a given >>> variable is no longer a root. >>> >> >> Live variable analysis is already in LLVM and for me that's enough to know >> whether a given variable is no longer a root. 
Note that each safe point has >> its own set of root locations, and these locations all contain live >> variables. Dead variables may still be in register or stack, but the GC will >> not visit them. >> >> >>> 2) As I mentioned, my language supports tagged unions and other "value" >>> types. Another example is a tuple type, such as (String, String). Such types >>> are never allocated on the heap by themselves, because they don't have the >>> object header structure that holds the type information needed by the >>> garbage collector. Instead, these values can live in SSA variables, or in >>> allocas, or they can be embedded inside larger types which do live on the >>> heap. >>> >> >> If you know, at compile-time, whether you are dealing with a struct or a >> heap, what prevents you from emitting code that won't need such tagged >> unions in the IR. Same for structs: if they contain pointers to heap >> objects, those will be in that special address space. >> > > I'm not sure what you mean by this. > > Take for example a union of a String (which is a pointer) and a float. The > union is either { i1; String * } or { i1; float }. The garbage collector > needs to see that i1 in order to know whether the second field of the struct > is a pointer - if it attempted to dereference the pointer when the field > actually contains a float, the program would crash. The metadata argument > that I pass to llvm.gcroot informs the garbage collector about the structure > of the union. > Sorry, I left a part out. The way that my garbage collector works currently is that the collector gets a pointer to the entire union struct, not just the pointer field within the union. In other words, the entire union struct is considered a "root". In fact, there might not even be a pointer in the struct. You see, because LLVM doesn't directly support unions, I have to simulate that support by casting pointers.
That is, for each different type contained in the union, I have a different struct type, and when I want to extract data from the union I cast the pointer to the appropriate type and then use GEP to get the data out. However, when allocating storage for the union, I have to use the largest data type, which might not be a pointer. For example, suppose I have a type "String or (float, float, float)" - that is, a union of a string and a 3-tuple of floats. Most of the time what LLVM will see is { i1; { float; float; float; } } because that's bigger than { i1; String* }. LLVM won't even know there's a pointer in there, except during those brief times when I'm accessing the pointer field. So tagging the pointer in a different address space won't help at all here. >> 3) I've been following the discussions on llvm-dev about the use of the >>> address-space property of pointers to signal different kinds of memory pools >>> for things like shared address spaces. If we try to use that same variable >>> to indicate garbage collection, now we have to multiplex both meanings onto >>> the same field. We can't just dedicate one special ID for the garbage >>> collected heap, because there could be multiple such heaps. As you add >>> additional orthogonal meanings to the address-space field, you end up with a >>> combinatorial explosion of possible values for it. >>> >>> >> I think there exist already some convention between an ID and some >> codegen. Having one additional seems fine to me, even if you need to play >> with bits in case you need different IDs for a single pointer. >> >> I'm also fine with the intrinsic way of declaring a GC root. But I think >> it is cumbersome, and error-prone in the presence of optimizers that may try >> to move away that intrinsic (I remember similar issues with the current EH >> intrinsics). 
>> >> Nicolas >> >> >>> -- >>> -- Talin >>> >> >> > > > -- > -- Talin > -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110307/b38365b6/attachment.html From reid.kleckner at gmail.com Mon Mar 7 12:04:39 2011 From: reid.kleckner at gmail.com (Reid Kleckner) Date: Mon, 7 Mar 2011 13:04:39 -0500 Subject: [LLVMdev] matching function call arguments In-Reply-To: References: Message-ID: Could you be more precise about what you mean by "identical"? Would entry(2) and entry(1+1) be considered equivalent? If the same Value* is passed to entry and exit, then pointer equality (==) will detect that. Reid On Mon, Mar 7, 2011 at 8:08 AM, Hari Pyla wrote: > Hi, > I am trying to identify if two functions were called with exactly the same argument. For instance, in the below example, assuming both entry() and exit() functions take a single argument, I would like to know if arg1 in entry() is same as arg1 in exit(). > > int a; > struct sa > { > int b; > int c; > }; > > int main () > { > struct sa s; > > entry (arg1); > ... > exit (arg1); > > return 0; > } > > In instances such as entry(a) and exit (a). I am able to determine that it is the same variable 'a' using '==' on the callinst->getOperand(1)'. However, if I pass a member variable of a structure say (s.b) to both entry and exit, I am unable to use '==' since the operands are GEP instructions. How can I compare such arguments to check if they are identical and also I was wondering as to what is best approach to determine if these arguments are exactly identical. Thanks in advance. > > Best, > --Hari > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu
http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From viridia at gmail.com Mon Mar 7 12:20:03 2011 From: viridia at gmail.com (Talin) Date: Mon, 7 Mar 2011 10:20:03 -0800 Subject: [LLVMdev] File timestamps Message-ID: I notice that the new path functions don't have a way to query the timestamp of a file, and the old path class is deprecated. Are there plans to add this? -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110307/5fa5e95a/attachment.html From sanjivkumargupta at yahoo.com Mon Mar 7 09:36:22 2011 From: sanjivkumargupta at yahoo.com (Sanjiv Kumar Gupta) Date: Mon, 7 Mar 2011 21:06:22 +0530 (IST) Subject: [LLVMdev] LLVM developer available for short term contract work. Message-ID: <333684.24909.qm@web94907.mail.in2.yahoo.com> Hi, I have been working on compilers since 2001 and have been developing software since 1997. I have strong experience in compiler backends and LLVM, especially for microcontrollers. I am open for short term contract work for LLVM related jobs. I am based in Bangalore, India. Please get in touch with me if you need more info. Thanks, Sanjiv -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110307/26bbc471/attachment.html From joshuawarner32 at gmail.com Mon Mar 7 12:58:57 2011 From: joshuawarner32 at gmail.com (Joshua Warner) Date: Mon, 7 Mar 2011 11:58:57 -0700 Subject: [LLVMdev] llvm.gcroot suggestion In-Reply-To: References: Message-ID: Hi Talin, Sorry to interject - > For example, suppose I have a type "String or (float, float, float)" - that > is, a union of a string and a 3-tuple of floats. Most of the time what LLVM > will see is { i1; { float; float; float; } } because that's bigger than { > i1; String* }.
LLVM won't even know there's a pointer in there, except > during those brief times when I'm accessing the pointer field. So tagging > the pointer in a different address space won't help at all here. > > I think this is a fairly uncommon use case that will be tricky to deal with no matter what method is used to track GC roots. That said, why not do something like make the pointer representation (the {i1, String*}) the long-term storage format, and only bitcast *just* before loading the floats? You could even use another address space to indicate that something is *sometimes* a pointer, dependent upon some other value (the i1, perhaps indicated with metadata). My vote (not that it really counts for much) would be the address-space method. It seems much more elegant. The only thing that I think would be unusually difficult for the address-space method to handle would be alternative pointer representations, such as those used in the latest version of Hotspot (see http://wikis.sun.com/display/HotSpotInternals/CompressedOops). Essentially, a 64-bit pointer is packed into 32-bits by assuming 8-byte alignment and restricting the heap size to 32GB. I've seen similar object-reference bitfields used in game engines. In this case, there is no "pointer" to attach the address space to. (Yes, I know that Hotspot currently uses CompressedOops ONLY in the heap, decompressing them when stored in locals, but it is not inconceivable to avoid decompressing them if the code is just moving them around, as an optimization.) Just my few thoughts. -Joshua -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110307/868992ab/attachment.html From harip at vt.edu Mon Mar 7 13:03:23 2011 From: harip at vt.edu (Hari Pyla) Date: Mon, 07 Mar 2011 14:03:23 -0500 Subject: [LLVMdev] matching function call arguments In-Reply-To: References: Message-ID: <4D752BFB.7010501@vt.edu> Hi Reid, Thank you for your response. 
In my analysis, I will always have entry(2) and exit(2). I will not run into cases involving entry (1+1) or entry (fn return values). I am having trouble trying to compare the arguments of entry and exit in the following scenario. #include #include #include struct sa { int a; pthread_mutex_t *mutex1; }; struct sa *s; pthread_mutex_t mutex1; int main() { s = (struct sa *)malloc(sizeof(struct sa)); s->mutex1 = (pthread_mutex_t *) malloc(sizeof(pthread_mutex_t)); entry(s->mutex1); s->a++; exit(s->mutex1); return 0; } Thanks in advance, Best, --Hari On 03/07/2011 01:04 PM, Reid Kleckner wrote: > Could you be more precise about what you mean by "identical"? Would > entry(2) and entry(1+1) be considered equivalent? > > If the same Value* is passed to entry and exit, then pointer equality > (==) will detect that. > > Reid > > On Mon, Mar 7, 2011 at 8:08 AM, Hari Pyla wrote: >> Hi, >> I am trying to identify if two functions were called with exactly the same argument. For instance, in the below example, assuming both entry() and exit() functions take a single argument, I would like to know if arg1 in entry() is same as arg1in exit(). >> >> int a; >> struct sa >> { >> int b; >> int c; >> }; >> >> int main () >> { >> struct sa s; >> >> entry (arg1); >> ... >> exit (arg1); >> >> return 0; >> } >> >> In instances such as entry(a) and exit (a). I am able to determine that it is the same variable 'a' using '==' on the callinst->getOperand(1)'. However, if I pass a member variable of a structure say (s.b) to both entry and exit, I am unable to use '==' since the operands are GEP instructions. How can I compare such arguments to check if they are identical and also I was wondering as to what is best approach to determine if these arguments are exactly identical. Thanks in advance. 
>> >> Best, >> --Hari >> >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> From wendling at apple.com Mon Mar 7 13:04:07 2011 From: wendling at apple.com (Bill Wendling) Date: Mon, 7 Mar 2011 11:04:07 -0800 Subject: [LLVMdev] Reminder: LLVM 2.9 Branching in One Week In-Reply-To: <2E128B2A-16F5-40D4-B4EC-ADD60A49E015@apple.com> References: <2E128B2A-16F5-40D4-B4EC-ADD60A49E015@apple.com> Message-ID: <27D4C7BA-2E22-4CFE-A1E9-73D6DE6FE6C7@apple.com> Small correction: The branching will happen at 7PM today. Please watch the build bots like a hawk. :-) -bw On Feb 27, 2011, at 5:43 PM, Bill Wendling wrote: > This is a reminder that we will be branching for LLVM 2.9 in one week! > > 07:00:00 p.m. Sunday March 6, 2011 PST / 03:00:00 a.m. Monday March 7, 2011 GMT > > What this means for you: > > Please keep a watch on all of your patches going into mainline. And pay close attention to the buildbots and fix any issues quickly. > > Also, please try to finish up any last minute feature work. While it won't be the last time to submit a patch for a work-in-progress feature, the more work and testing that can be done before the branching means that the release verification process will go that much more smoothly. > > LLVM 2.9 is going to be an exciting release. It marks the end of an era in some respects, but the beginning of many new ones to come! :-) > > -bw > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From ofv at wanadoo.es Mon Mar 7 13:45:14 2011 From: ofv at wanadoo.es (=?utf-8?Q?=C3=93scar_Fuentes?=) Date: Mon, 07 Mar 2011 20:45:14 +0100 Subject: [LLVMdev] Using result of config-ix in custom tool. 
In-Reply-To: (arrowdodger's message of "Mon, 7 Mar 2011 14:04:30 +0300")
References:
Message-ID: <8762ruww8l.fsf@wanadoo.es>

[CCing llvm-dev]

arrowdodger <6yearold at gmail.com> writes:

> Hello. Suppose I'm writing a tool which uses LLVM and I want it to be
> buildable like clang - from tools/ subdir or standalone build. Since I'm not
> very experienced with CMake, I took your code from clang and tried to
> adapt it. Now I stumbled upon this problem - I can't use the result of running
> config-ix.cmake if I'm building my tool standalone.
> The reason I want to do this is that my config.h shares quite a lot of defines
> with LLVM's one. But as I can see from your code, clang doesn't use them either,
> it just defines some of its own.
> So, should I write my own config-ix.cmake for my tool, or have I missed
> something?

In the long term, I think that the best route is to have your own config-ix.cmake (the -ix part is a remnant from the time when there was a config-w32.cmake too). LLVM's config-ix.cmake can change at any time: tests may be added, changed or removed, there are plenty of chances for name clashes with your cmake files, etc.

OTOH, if you just want to use the macros defined by the platform tests in your C++ source code, #including llvm/Config/config.h will work, if you don't care about the occasional breakage introduced by the changes mentioned above.

From viridia at gmail.com Mon Mar 7 13:48:32 2011
From: viridia at gmail.com (Talin)
Date: Mon, 7 Mar 2011 11:48:32 -0800
Subject: [LLVMdev] llvm.gcroot suggestion
In-Reply-To:
References:
Message-ID:

On Mon, Mar 7, 2011 at 10:58 AM, Joshua Warner wrote:

> Hi Talin,
>
> Sorry to interject -
>
>
>> For example, suppose I have a type "String or (float, float, float)" -
>> that is, a union of a string and a 3-tuple of floats. Most of the time what
>> LLVM will see is { i1; { float; float; float; } } because that's bigger than
>> { i1; String* }.
LLVM won't even know there's a pointer in there, except >> during those brief times when I'm accessing the pointer field. So tagging >> the pointer in a different address space won't help at all here. >> >> > I think this is a fairly uncommon use case that will be tricky to deal with > no matter what method is used to track GC roots. That said, why not do > something like make the pointer representation (the {i1, String*}) the > long-term storage format, and only bitcast *just* before loading the > floats? You could even use another address space to indicate that something > is *sometimes* a pointer, dependent upon some other value (the i1, perhaps > indicated with metadata). > I don't know if it's an uncommon use case or not, but it is something that I handle already in my frontend. (I suppose it's uncommon in the sense that almost no one uses the garbage collection features of LLVM, but part of the goal of this discussion is to change that.) The problem with making { i1, String* } the long-term storage format is that it isn't large enough in the example I gave, so you'll overwrite other fields if you try to store the three floats. The more general issue is that the concepts we're talking about simply aren't expressible in IR as it exists today. > > My vote (not that it really counts for much) would be the address-space > method. It seems much more elegant. > I agree that the current solution isn't the best. The problem I have is that the solutions that are being suggested are going to break my code badly, and with no way to fix it. The *real* solution is to make root-ness a function of type. In other words, you can mark any type as being a root, which exposes the base address of all objects of that type to the garbage collector. This is essentially the same as the pointer-address-space suggestion, except that it's not limited to pointers. (In practice, it would only ever apply to pointers and structs.) 
(Heck, I'd even be willing to go with a solution where only structs and not pointers could be roots - it means I'd have to wrap every pointer in a struct, which would be a royal pain, but it would at least work.)

>
> The only thing that I think would be unusually difficult for the
> address-space method to handle would be alternative pointer representations,
> such as those used in the latest version of Hotspot (see
> http://wikis.sun.com/display/HotSpotInternals/CompressedOops).
> Essentially, a 64-bit pointer is packed into 32-bits by assuming 8-byte
> alignment and restricting the heap size to 32GB. I've seen similar
> object-reference bitfields used in game engines. In this case, there is no
> "pointer" to attach the address space to.
>
> (Yes, I know that Hotspot currently uses CompressedOops ONLY in the heap,
> decompressing them when stored in locals, but it is not inconceivable to
> avoid decompressing them if the code is just moving them around, as an
> optimization.)
>
> Just my few thoughts.
>
> -Joshua
>

--
-- Talin

From fvbommel at gmail.com Mon Mar 7 14:02:44 2011
From: fvbommel at gmail.com (Frits van Bommel)
Date: Mon, 7 Mar 2011 21:02:44 +0100
Subject: [LLVMdev] matching function call arguments
In-Reply-To: <4D752BFB.7010501@vt.edu>
References: <4D752BFB.7010501@vt.edu>
Message-ID:

On Mon, Mar 7, 2011 at 8:03 PM, Hari Pyla wrote:
> Thank you for your response. In my analysis, I will always have
> entry(2) and exit(2). I will not run into cases involving entry (1+1) or
> entry (fn return values). I am having trouble trying to compare the
> arguments of entry and exit in the following scenario.
>
> #include
> #include
> #include
>
> struct sa
> {
>   int a;
>   pthread_mutex_t *mutex1;
> };
> struct sa *s;
>
> pthread_mutex_t mutex1;
> int main()
> {
>   s = (struct sa *)malloc(sizeof(struct sa));
>   s->mutex1 = (pthread_mutex_t *) malloc(sizeof(pthread_mutex_t));
>
>   entry(s->mutex1);
>
>   s->a++;
>
>   exit(s->mutex1);
>
>   return 0;
> }

Have you tried just running something like -earlycse before your pass? It might be overkill, but it'd probably get the job done.

If you want a "lazy" check (as in, just for the values you want to compare) though, AFAIK there's no public interface for that kind of functionality. A pass like -earlycse or -mergefunc might have something you can factor out though.

Other than that, you can probably hack up something with a recursive function using Instruction::isSameOperationAs(), which checks everything except operand values (so you'd need to check equivalence of operands recursively if you want to go more than one instruction deep).
You'll probably want to keep a mapping of already-checked equivalences (for performance), and you should definitely be careful not to go into infinite recursion in the face of PHI nodes.
Also, be careful of side effects, both in the instructions you're comparing and in anything in between; for example you don't want two loads to be considered equal if there's a store to that memory in between them. Instruction::mayHaveSideEffects() and Instruction::mayReadFromMemory() are probably useful here.

If you go with either of the last two options and create a function to figure out whether two values are equivalent, it might be interesting if you submitted a patch for your changes so others can use them too. Also, you'd likely get some feedback on whether you've made any mistakes :).

From arushi987 at gmail.com Mon Mar 7 15:51:07 2011
From: arushi987 at gmail.com (Arushi Aggarwal)
Date: Mon, 7 Mar 2011 15:51:07 -0600
Subject: [LLVMdev] 64 bit MRV problem; Missed optimizations.
Message-ID:

Hi,

I was tracking the issue discussed earlier, and I was wondering if a bug for the missed optimizations, was ever filed, and if it has been fixed since?
If so in which llvm version, and more specifically which optimization pass. http://lists.cs.uiuc.edu/pipermail/llvmdev/2010-January/028877.html Thanks, Regards, Arushi -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110307/2be61c20/attachment.html From joshuawarner32 at gmail.com Mon Mar 7 16:05:08 2011 From: joshuawarner32 at gmail.com (Joshua Warner) Date: Mon, 7 Mar 2011 15:05:08 -0700 Subject: [LLVMdev] llvm.gcroot suggestion In-Reply-To: References: Message-ID: On Mon, Mar 7, 2011 at 12:48 PM, Talin wrote: > On Mon, Mar 7, 2011 at 10:58 AM, Joshua Warner wrote: > >> Hi Talin, >> >> Sorry to interject - >> >> >>> For example, suppose I have a type "String or (float, float, float)" - >>> that is, a union of a string and a 3-tuple of floats. Most of the time what >>> LLVM will see is { i1; { float; float; float; } } because that's bigger than >>> { i1; String* }. LLVM won't even know there's a pointer in there, except >>> during those brief times when I'm accessing the pointer field. So tagging >>> the pointer in a different address space won't help at all here. >>> >>> >> I think this is a fairly uncommon use case that will be tricky to deal >> with no matter what method is used to track GC roots. That said, why not do >> something like make the pointer representation (the {i1, String*}) the >> long-term storage format, and only bitcast *just* before loading the >> floats? You could even use another address space to indicate that something >> is *sometimes* a pointer, dependent upon some other value (the i1, perhaps >> indicated with metadata). >> > > I don't know if it's an uncommon use case or not, but it is something that > I handle already in my frontend. (I suppose it's uncommon in the sense that > almost no one uses the garbage collection features of LLVM, but part of the > goal of this discussion is to change that.) 
> I actually meant uncommon in the sense of having stack-allocated unions that participate in garbage collection. Off the top of my head, I could only name one language (ML) that might use a feature like that. Even then, I suspect most ML implementations would actually push that stuff onto the heap. > The problem with making { i1, String* } the long-term storage format is > that it isn't large enough in the example I gave, so you'll overwrite other > fields if you try to store the three floats. The more general issue is that > the concepts we're talking about simply aren't expressible in IR as it > exists today. > Good catch - what I actually intended to indicate was the String "half" of the union, properly padded - so something more like {i1, String*, float} (for 64-bit pointers). > >> My vote (not that it really counts for much) would be the address-space >> method. It seems much more elegant. >> > > I agree that the current solution isn't the best. The problem I have is > that the solutions that are being suggested are going to break my code > badly, and with no way to fix it. > > The *real* solution is to make root-ness a function of type. In other > words, you can mark any type as being a root, which exposes the base address > of all objects of that type to the garbage collector. This is essentially > the same as the pointer-address-space suggestion, except that it's not > limited to pointers. (In practice, it would only ever apply to pointers and > structs.) > > (Heck, I'd even be willing to go with a solution where only structs and not > pointers could be roots - it means I'd have to wrap every pointer in a > struct, which would be a royal pain, but it would at least work.) > Hmm... do you mean something like a "marked" bit (or maybe a vector of mark_ids) in every type, where you could query a function for values of "marked" types at particular safe points? 
This sounds like something that might solve the problem described below with compressed pointers (not that I am actually encountering this problem) - but in the near-term, it seems to me that everything that you could conceivably mark as a GC root would somehow contain a pointer value. In this case, union support in LLVM would make the generated IR cleaner, but not necessarily any more correct. Being able to make a "marked" version of every type seems unnecessary, and in some cases, somewhat non-intuitive. Take for instance, making a "marked" float type - which I can't think of any good use for. I like the idea of using address spaces because it keeps the concepts in IR largely orthogonal, rather than having features that overlap in purpose in many cases. That, and IMO it just makes sense for pointers into the (or, in general, a) heap be considered in a different address space from "normal" pointers. This could extend well to tracking pointers onto the stack (as seen in C# out/ref) for the purpose of generating closures (in .NET - which doesn't currently have this feature). > >> The only thing that I think would be unusually difficult for the >> address-space method to handle would be alternative pointer representations, >> such as those used in the latest version of Hotspot (see >> http://wikis.sun.com/display/HotSpotInternals/CompressedOops). >> Essentially, a 64-bit pointer is packed into 32-bits by assuming 8-byte >> alignment and restricting the heap size to 32GB. I've seen similar >> object-reference bitfields used in game engines. In this case, there is no >> "pointer" to attach the address space to. >> >> (Yes, I know that Hotspot currently uses CompressedOops ONLY in the heap, >> decompressing them when stored in locals, but it is not inconceivable to >> avoid decompressing them if the code is just moving them around, as an >> optimization.) >> >> Just my few thoughts. 
>> >> -Joshua >> > > > > -- > -- Talin > -Joshua -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110307/9d1e8706/attachment.html From eli.friedman at gmail.com Mon Mar 7 16:24:00 2011 From: eli.friedman at gmail.com (Eli Friedman) Date: Mon, 7 Mar 2011 17:24:00 -0500 Subject: [LLVMdev] 64 bit MRV problem; Missed optimizations. In-Reply-To: References: Message-ID: On Mon, Mar 7, 2011 at 1:51 PM, Arushi Aggarwal wrote: > Hi, > > I was tracking the issue discussed earlier, and I was wondering if a bug for > the missed optimizations, was ever filed, and if it has been fixed since? If > so in which llvm version, and more specifically which optimization pass. > > http://lists.cs.uiuc.edu/pipermail/llvmdev/2010-January/028877.html clang has been fixed to generate code which is generally more friendly to the optimizers for situations like that. -Eli From harip at vt.edu Mon Mar 7 17:10:39 2011 From: harip at vt.edu (Hari Pyla) Date: Mon, 07 Mar 2011 18:10:39 -0500 Subject: [LLVMdev] matching function call arguments In-Reply-To: References: <4D752BFB.7010501@vt.edu> Message-ID: <4D7565EF.7020003@vt.edu> Hi, I downloaded the latest llvm (dev version) and I tried -earlycse with opt before my pass. However, it still does not solve my problem. I will try to explore the recursive function option and I will certainly send you the code once I implement it. Thanks, --Hari On 03/07/2011 03:02 PM, Frits van Bommel wrote: > On Mon, Mar 7, 2011 at 8:03 PM, Hari Pyla wrote: >> Thank you for your response. In my analysis, I will always have >> entry(2) and exit(2). I will not run into cases involving entry (1+1) or >> entry (fn return values). I am having trouble trying to compare the >> arguments of entry and exit in the following scenario. 
>> >> #include >> #include >> #include >> >> struct sa >> { >> int a; >> pthread_mutex_t *mutex1; >> }; >> struct sa *s; >> >> pthread_mutex_t mutex1; >> int main() >> { >> s = (struct sa *)malloc(sizeof(struct sa)); >> s->mutex1 = (pthread_mutex_t *) malloc(sizeof(pthread_mutex_t)); >> >> entry(s->mutex1); >> >> s->a++; >> >> exit(s->mutex1); >> >> return 0; >> } > Have you tried just running something like -earlycse before your pass? > It might be overkill, but it'd probably get the job done. > > If you want a "lazy" check (as in, just for the values you want to > compare) though, AFAIK there's no public interface for that kind of > functionality. A pass like -earlycse or -mergefunc might have > something you can factor out though. > > Other than that, you can probably hack up something with a recursive > function using Instruction::isSameOperationAs(), which checks > everything except operand values (so you'd need to check equivalence > of operands recursively if you want to go more than one instruction > deep). > You'll probably want to keep a mapping of already-checked equivalences > (for performance), and you should definitely be careful not to go into > infinite recursion in the face of PHI nodes. > Also, be careful of side effects, both in the instructions you're > comparing and in anything in between; for example you don't want two > loads to be considered equal if there's a store to that memory in > between them. Instruction::mayHaveSideEffects() and > Instruction::mayReadFromMemory() are probably useful here. > > If you go with either of the last two options and create a function to > figure out whether two values are equivalent, it might be interesting > if you submitted a patch for your changes so others can use them too. > Also, you'd likely get some feedback on whether you've made any > mistakes :). 
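
For anyone following the thread, here is a rough, self-contained sketch of the recursive check Frits describes above. It deliberately does not use the LLVM API: Node, EquivChecker and the touchesMemory flag are hypothetical stand-ins invented for this illustration. The opcode comparison only plays the role that Instruction::isSameOperationAs() would play, and the flag is a crude substitute for asking mayHaveSideEffects() / mayReadFromMemory(). It assumes that refusing to match anything that touches memory is acceptable, since it makes no attempt to look at the instructions in between.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an IR value. This is NOT llvm::Value; every name here is
// hypothetical and exists only for this sketch.
struct Node {
    std::string opcode;                 // e.g. "gep", "load", "const:0"
    std::vector<const Node*> operands;  // values this node is computed from
    bool touchesMemory;                 // loads, stores, calls: never merged
};

typedef std::pair<const Node*, const Node*> NodePair;

struct EquivChecker {
    std::map<NodePair, bool> known;  // already-decided pairs (memoisation)
    std::set<NodePair> inProgress;   // pairs currently on the recursion stack

    bool equivalent(const Node* a, const Node* b) {
        if (a == b) return true;     // the plain pointer-equality case
        NodePair key(a, b);
        std::map<NodePair, bool>::const_iterator it = known.find(key);
        if (it != known.end()) return it->second;
        // Seeing the same pair again means we walked around a cycle (as can
        // happen through PHI nodes); conservatively answer "not equivalent".
        if (!inProgress.insert(key).second) return false;

        // Same operation, same arity, and neither side touches memory.
        bool same = a->opcode == b->opcode &&
                    a->operands.size() == b->operands.size() &&
                    !a->touchesMemory && !b->touchesMemory;
        // Operands must be pairwise equivalent, checked recursively.
        for (std::size_t i = 0; same && i < a->operands.size(); ++i)
            same = equivalent(a->operands[i], b->operands[i]);

        inProgress.erase(key);
        known[key] = same;
        return same;
    }
};
```

With this toy model, two separate "gep" nodes over the same base and equal constant indices compare as equivalent, which is the s.b case from the original question. Two distinct "load" nodes never do, so the s->mutex1 case would still need the store-in-between analysis that the sketch leaves out.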
From xerox.time at gmail.com Mon Mar 7 18:49:48 2011
From: xerox.time at gmail.com (Xin Tong)
Date: Mon, 7 Mar 2011 19:49:48 -0500
Subject: [LLVMdev] LLVM Static & Dynamic Compiler Benchmarks
Message-ID:

I come from a Java JIT background. I would like to know what the typical benchmarks are for the static LLVM compiler and for LLVM JIT performance.

--
Kind Regards

Xin Tong

From x.tong at utoronto.ca Mon Mar 7 19:10:27 2011
From: x.tong at utoronto.ca (Xin Tong Utoronto)
Date: Mon, 7 Mar 2011 20:10:27 -0500
Subject: [LLVMdev] LLVM Benchmarks
Message-ID:

I come from a Java JIT background. I am wondering what kind of benchmarks we use for the LLVM static compiler and JIT?

--
Kind Regards

Xin Tong

From anton at korobeynikov.info Mon Mar 7 19:30:47 2011
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Tue, 8 Mar 2011 04:30:47 +0300
Subject: [LLVMdev] [cfe-dev] Reminder: LLVM 2.9 Branching in One Week
In-Reply-To: <87mxlei5be.fsf@smith.obbligato.org>
References: <2E128B2A-16F5-40D4-B4EC-ADD60A49E015@apple.com> <17885E39-D839-4185-BD04-9C585B1D3CCD@mac.com> <87mxlei5be.fsf@smith.obbligato.org>
Message-ID:

Hi David

> I think the trouble with branches is the lockdown of the root repository
> directory.
Surely not (at the server)

> git svn init --stdlayout https://@llvm.org/svn/llvm-project/llvm \
>   --ignore-paths="^.*(Apple|PowerPC.*|SVA|eh-experimental|ggreif|non-call-eh|parallel|release_.*|vector_llvm|wendling|May2007|checker|cremebrulee|start|RELEASE_1.*|RELEASE_2[0-7])"
Several problems here:
1. A bunch of additional branches / tags are created due to multiple branch points.
I don't recall for llvm, but for clang we'll end with two tags per each release. Something like: $ git branch -r trunk tags/RELEASE_26 tags/RELEASE_26 at 84939 tags/RELEASE_27 tags/RELEASE_27 at 102415 tags/RELEASE_28 tags/RELEASE_28 at 115869 The problem will be much worse with new release branch scheme, basically we'll need to add each branch by hand, etc... 2. We really don't want to push arbitrary branches to git repository. It's really easy to add branch by an accident, so it will be much better not to ignore stuff, but except - add by some pattern. Unfortunately, git-svn does not allow this yet. So, right now I'm experimenting with various ways of doing stuff, but the results looks not pretty good. If anything would give me a working all-in-one cmdline / .git/config entry - I'd really appreciate this :) -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From greened at obbligato.org Mon Mar 7 19:57:21 2011 From: greened at obbligato.org (David A. Greene) Date: Mon, 07 Mar 2011 19:57:21 -0600 Subject: [LLVMdev] [cfe-dev] Reminder: LLVM 2.9 Branching in One Week In-Reply-To: (Anton Korobeynikov's message of "Tue, 8 Mar 2011 04:30:47 +0300") References: <2E128B2A-16F5-40D4-B4EC-ADD60A49E015@apple.com> <17885E39-D839-4185-BD04-9C585B1D3CCD@mac.com> <87mxlei5be.fsf@smith.obbligato.org> Message-ID: Anton Korobeynikov writes: > Hi David > >> I think the trouble with branches is the lockdown of the root repository >> directory. > Surely not (at the server) Yes, but ordinary users can't try to experiment and find a set of options that work as long as the server is locked down. So we have to go through server admins, which seems inefficient. But, I'm ok with that if we're making progress. 
>> git svn init --stdlayout https://@llvm.org/svn/llvm-project/llvm \
>>   --ignore-paths="^.*(Apple|PowerPC.*|SVA|eh-experimental|ggreif|non-call-eh|parallel|release_.*|vector_llvm|wendling|May2007|checker|cremebrulee|start|RELEASE_1.*|RELEASE_2[0-7])"
> Several problems here:
> 1. Bunch of additional branches / tags are created due to multiple
> branch points. I don't recall for llvm, but for clang we'll end with
> two tags per each release. Something like:
> $ git branch -r
>   trunk
>   tags/RELEASE_26
>   tags/RELEASE_26@84939
>   tags/RELEASE_27
>   tags/RELEASE_27@102415
>   tags/RELEASE_28
>   tags/RELEASE_28@115869

Yep. But is this really a problem? All of these tags and branches must be useful, or why create them? If they aren't useful, add them to the --ignore-paths list.

> The problem will be much worse with new release branch scheme,
> basically we'll need to add each branch by hand, etc...

Why? Ultimately the top level of llvm looks like the standard subversion layout:

http://llvm.org/svn/llvm-project/llvm/

# branches/
# tags/
# trunk/

> 2. We really don't want to push arbitrary branches to git repository.
> It's really easy to add branch by an accident, so it will be much

You mean add a branch through the git-svn mirror via "git svn branch"? How can someone do that without having permission on the server? This seems like an svn permissions issue to me.

Or are you worried about some svn user adding a branch? Again, this seems like a server permissions issue. The LLVM policy is "no branches", so why don't we enforce it and only allow the release manager to create them?

> better not to ignore stuff, but except - add by some pattern.
> Unfortunately, git-svn does not allow this yet.

That certainly would be a useful feature. It shouldn't be hard to implement as it's just a perl script. Getting it accepted upstream and pushed out to clients is the bigger problem.
> So, right now I'm experimenting with various ways of doing stuff, but
> the results looks not pretty good.

What's not good about them? So far my experience is that git svn does a really good job following svn branches and making them available via git. See for example this posting:

http://www.jukie.net/bart/blog/svn-branches-in-git

More experimentation is necessary, though. I'm still pretty new at this.

> If anything would give me a working all-in-one cmdline / .git/config
> entry - I'd really appreciate this :)

I just blew mine away to try something else. :) But it did work well before. I'll send the new one to you when it's ready.

I'm playing around with some git-svn stuff here with our repositories. If I find anything interesting I'll let you know.

I believe the key to making this work is to have a separate "svn commit" branch in the user's git clone so that people who do dcommit always do it from there. That way git svn rebase won't screw up their "normal" git branches.

I've tried a bit of this and it seems if one isolates the git svn stuff to its own branch, things work pretty smoothly.

Graphically:

       svn
        |   git svn init/fetch
        V
  git-svn clone
        |   git clone
        V
   user's clone
      /    \
     /      \
  master   commit

The user does a "git pull" into their local master, works on it, creates local branches, etc. just as a normal git user would. Once something is ready to go upstream, the user does the following:

git commit -a (commit to local git master)
git checkout commit
git merge master
git svn rebase
git svn dcommit

Then the next git clone to master will pick up the change history and not conflict with the change as it exists in the user's local master.

I did this with the existing LLVM git mirror on my most recent commit to verify that there were no issues. It was a simple change to a component that doesn't change much, but I'm going to get some more experience with this in the coming days and weeks.
I set things up as suggested by Tobias Grosser:

http://permalink.gmane.org/gmane.comp.compilers.clang.devel/12843

It works well for trunk. I think it should work equally well if/when branches are added to the git mirror. Since no one should be committing to branches except the release manager, everything will always go upstream through the local "commit" branch. This keeps the svn metadata sane.

From my point of view, the main reason to add the branches to the git mirror is to *greatly* ease the burden for third parties when they upgrade to a new release AND when they send patches upstream. The git branch/merge/conflict resolution process is just killer for this.

I imagine most third parties don't work off of trunk. We certainly don't.

-Dave

From wendling at apple.com Mon Mar 7 21:01:03 2011
From: wendling at apple.com (Bill Wendling)
Date: Mon, 7 Mar 2011 19:01:03 -0800
Subject: [LLVMdev] Announcing LLVM 2.9 Testing!
Message-ID: <3BF72DAA-22AA-442A-830F-81630E3BAB39@apple.com>

It's that time again! (Well, it was that time yesterday, but I made a mistake.) The LLVM 2.9 release is now underway!

2.9 Will Be The Last llvm-gcc Release!

That's right! It's the end of an era. The llvm-gcc front-end has served us very well over the years, but with the advent of Clang and DragonEgg it is starting to suffer from bit rot. Starting with the 3.0 release, Clang will be the main compiler for most people. For those who wish to use a GCC-compatible front-end or who use non-C languages, there is Duncan's DragonEgg project.

Release Timeline:

* No new features will be accepted after branch creation. This is a firm requirement for the release.
* Phase 1 testing will start up immediately after branch creation. It will last until the 14th. During phase 1, we will accept only patches for regressions from the 2.8 release and any clean-up work for existing features. All features must be completed before phase 2 starts.
  If a feature is not completed by the beginning of phase 2, it will be disabled by default.
* Phase 2 testing will start up on the 21st. During phase 2, we will accept only patches for critical bugs.
* There will be a third phase of testing only if phase 2 testing unveils critical bugs or regressions from the 2.8 release.
* The release is scheduled for April 3rd!

Developers:

Top-of-tree is now open for submissions. The 2.9 release branch and tags are available for you to check out and test. However, please do not commit patches to the 2.9 release branch. All patches must be approved by the "code owners" before they are accepted into the branch. See this website for who to contact regarding a patch you feel is necessary for the release:

http://llvm.org/docs/DeveloperPolicy.html#owners

Please grab the release sources, build them, and start compiling things. File bugs for any errors you see. Once binaries are available, we will be posting them for people to use.

There have been discussions about creating a mirror branch in the git repository. However, my git-fu is bad enough that I would cause the moon to fall from the sky if I were to attempt it (and that's if I did nothing wrong). I encourage someone in the community who knows git better than me to create the branches.

Testers:

If you wish to volunteer to be a tester, please let me know! :-)

The release tags are now available for check-out. The branch and tag structure in SVN is detailed on this webpage:

http://llvm.org/docs/HowToReleaseLLVM.html

There is a script which you can run to build the compiler and have it ready to run the nightly tests:

utils/release/test-release.sh

Testers need to verify that there are no regressions with respect to LLVM 2.8. Please run the test suite to verify this. (Tanya, I lost the script that does this comparison. Do you have it still?)

Bug Reports:

If you file a bug report that you think is necessary for the 2.9 release, please tag that bug with the 2.9 version.
The code owners will perform triage on the bugs and select severity. Unfortunately, not all bugs may be resolved by the release date, but we should address all critical bugs. Code Owners: Please review any patches you feel are necessary for the release. As we go further into the release process, be more and more conservative in your choices. Also, please keep an eye on the Bugzilla database. If any issues arise they need to be triaged and classified accordingly. Once you've determined that a patch is good for the release, please forward it to me and I will patch the branch. Share and enjoy! -bw -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110307/cd7bf369/attachment-0001.html From grosser at fim.uni-passau.de Mon Mar 7 21:57:37 2011 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Mon, 07 Mar 2011 22:57:37 -0500 Subject: [LLVMdev] How to make release branch available in git (topic changed) In-Reply-To: References: <2E128B2A-16F5-40D4-B4EC-ADD60A49E015@apple.com> <17885E39-D839-4185-BD04-9C585B1D3CCD@mac.com> <87mxlei5be.fsf@smith.obbligato.org> Message-ID: <4D75A931.3030306@fim.uni-passau.de> On 03/07/2011 08:30 PM, Anton Korobeynikov wrote: > Hi David > >> I think the trouble with branches is the lockdown of the root repository >> directory. > Surely not (at the server) > >> git svn init --stdlayout https://@llvm.org/svn/llvm-project/llvm \ >> --ignore-paths="^.*(Apple|PowerPC.*|SVA|eh-experimental|ggreif|non-call-eh|parallel|release_.*|vector_llvm|wendling|May2007|checker|cremebrulee|start|RELEASE_1.*|RELEASE_2[0-7])" > Several problems here: > 1. Bunch of additional branches / tags are created due to multiple > branch points. I don't recall for llvm, but for clang we'll end with > two tags per each release. 
Something like: > $ git branch -r > trunk > tags/RELEASE_26 > tags/RELEASE_26 at 84939 > tags/RELEASE_27 > tags/RELEASE_27 at 102415 > tags/RELEASE_28 > tags/RELEASE_28 at 115869 > > The problem will be much worse with new release branch scheme, > basically we'll need to add each branch by hand, etc... > 2. We really don't want to push arbitrary branches to git repository. > It's really easy to add branch by an accident, so it will be much > better not to ignore stuff, but except - add by some pattern. > Unfortunately, git-svn does not allow this yet. Why not? As far as I understand --ignore-paths takes a perl regular expression. So we could just provide a regular expression that matches on all paths except the ones we want to keep. The following expression e.g. /^.*(? So, right now I'm experimenting with various ways of doing stuff, but > the results looks not pretty good. > If anything would give me a working all-in-one cmdline / .git/config > entry - I'd really appreciate this :) > I would love to. However, as David pointed out, this is difficult with the blocked svn access. Cheers Tobi From sanjoy at playingwithpointers.com Mon Mar 7 23:19:19 2011 From: sanjoy at playingwithpointers.com (Sanjoy Das) Date: Tue, 08 Mar 2011 10:49:19 +0530 Subject: [LLVMdev] First Patch In-Reply-To: References: <4D734FB9.7030209@playingwithpointers.com> Message-ID: <4D75BC57.9000700@playingwithpointers.com> Hi! I've attached a patch which takes care of the issues mentioned (and adds two tests). -- Sanjoy Das http://playingwithpointers.com -------------- next part -------------- A non-text attachment was scrubbed... Name: ripple-bucket.diff Type: text/x-diff Size: 3318 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110308/0814e3e8/attachment.bin From baldrick at free.fr Tue Mar 8 01:37:12 2011 From: baldrick at free.fr (Duncan Sands) Date: Tue, 08 Mar 2011 08:37:12 +0100 Subject: [LLVMdev] 64 bit MRV problem; Missed optimizations. 
In-Reply-To: References: Message-ID: <4D75DCA8.4040804@free.fr> Hi Arushi, > I was tracking the issue discussed earlier, and I was wondering if a bug for the > missed optimizations, was ever filed, and if it has been fixed since? If so in > which llvm version, and more specifically which optimization pass. > > http://lists.cs.uiuc.edu/pipermail/llvmdev/2010-January/028877.html this doesn't really sound like an LLVM optimizer issue, it looks more like an issue with the llvm-g++ front-end to LLVM. In order to conform to the platform ABI (x86-64 in this case), the front-end has to carefully arrange how parameters are passed to functions and return values handled. This can result in nasty code that picks parameters apart and puts one bit in a float, another in an int etc. It's often hard for the optimizers to do much about this - so the front-end needs to carefully do things in such a way as to help the optimizers as much as possible while maintaining ABI conformance. I don't think this will ever be improved in llvm-g++ (which is now deprecated). Hopefully clang does a better job. I plan to rewrite the ABI stuff completely in dragonegg, so it may end up being fixed there one day. Ciao, Duncan. From cessu at iki.fi Tue Mar 8 02:41:09 2011 From: cessu at iki.fi (Kenneth Oksanen) Date: Tue, 08 Mar 2011 10:41:09 +0200 Subject: [LLVMdev] llvm.gcroot suggestion In-Reply-To: References: Message-ID: <1299573669.29836.65.camel@salamander> On Mon, 2011-03-07 at 15:05 -0700, Joshua Warner wrote: > I actually meant uncommon in the sense of having stack-allocated > unions that participate in garbage collection. Off the top of my > head, I could only name one language (ML) that might use a feature > like that. Even then, I suspect most ML implementations would > actually push that stuff onto the heap. Common Lisp has (declare (dynamic-extent ..)). But IMHO this is not a language-dependent issue. 
Rather, whenever any language front-end using LLVM recognizes some (union) object can not outlive the call, it would be a significant optimization if LLVM would support stack-allocating the object. > The *real* solution is to make root-ness a function of type. > In other words, you can mark any type as being a root, which > exposes the base address of all objects of that type to the > garbage collector. This is essentially the same as the > pointer-address-space suggestion, except that it's not limited > to pointers. (In practice, it would only ever apply to > pointers and structs.) Yes, that would be the most intuitive solution. However, note that there may be several garbage collected heaps using different garbage collectors. Therefore the indicator for "rootness" is not merely binary. > Being able to make a "marked" version of every type seems unnecessary, > and in some cases, somewhat non-intuitive. Take for instance, making > a "marked" float type - which I can't think of any good use for. Such cases may sound exotic, but perhaps not non-existing. For example, assume one wants to write a heap that supports generating statistics of all live values, say, for the benefit of testing for memory leaks in long-running servers. Or assume taking a snapshot of the computation onto disk and recovering it in another machine with a different representation of the (non-pointer) data type. Or checking (by checksum exchange) whether the computational states match in a set of mutually replicating computers running in lockstep. (I've actually done all of these, although without the involvement of LLVM.) -- ; mailto:cessu at iki.fi http://www.iki.fi/~cessu http://cessu.blogspot.com ((lambda(a) (a a((lambda(a)(lambda()(set! a(+ a 1))a))1)))(lambda(a c) ((lambda(b) (newline)(write b)(a a((lambda(c)(lambda()(c c)))(lambda(a) ((lambda(c) (if(=(modulo c b)0)(a a)c))(c))))))(c)))) ; Scheme me! 
From Olaf.Krzikalla at tu-dresden.de Tue Mar 8 03:07:17 2011 From: Olaf.Krzikalla at tu-dresden.de (Olaf Krzikalla) Date: Tue, 08 Mar 2011 10:07:17 +0100 Subject: [LLVMdev] MSVC compiling issue Message-ID: <4D75F1C5.6040900@tu-dresden.de> Hi @llvm, building a debug version under MSVC 9 leads to a compiler error due to a mix of different types in a call to upper_bound. I have attached a hot-fix but I'm rather unsure if it should be applied as it is, since IMHO the reason is a MSVC library bug ("IMHO", because I don't know the requirements imposed to the predicate by the standard). Best regards Olaf Krzikalla Index: lib/CodeGen/LiveInterval.cpp =================================================================== --- lib/CodeGen/LiveInterval.cpp (revision 127221) +++ lib/CodeGen/LiveInterval.cpp (working copy) @@ -39,6 +39,9 @@ bool operator()(const LiveRange &A, SlotIndex B) const { return A.end < B; } + bool operator()(const LiveRange &A, const LiveRange &B) const { + return A.end < B.end; + } }; } From fvbommel at gmail.com Tue Mar 8 03:12:15 2011 From: fvbommel at gmail.com (Frits van Bommel) Date: Tue, 8 Mar 2011 10:12:15 +0100 Subject: [LLVMdev] 64 bit MRV problem; Missed optimizations. In-Reply-To: <4D75DCA8.4040804@free.fr> References: <4D75DCA8.4040804@free.fr> Message-ID: On Tue, Mar 8, 2011 at 8:37 AM, Duncan Sands wrote: >> I was tracking the issue discussed earlier, and I was wondering if a bug for the >> missed optimizations, was ever filed, and if it has been fixed since? If so in >> which llvm version, and more specifically which optimization pass. >> >> http://lists.cs.uiuc.edu/pipermail/llvmdev/2010-January/028877.html > > this doesn't really sound like an LLVM optimizer issue, it looks more like an > issue with the llvm-g++ front-end to LLVM. In order to conform to the platform > ABI (x86-64 in this case), the front-end has to carefully arrange how parameters > are passed to functions and return values handled. 
This can result in nasty > code that picks parameters apart and puts one bit in a float, another in an > int etc. It's often hard for the optimizers to do much about this - so the > front-end needs to carefully do things in such a way as to help the optimizers > as much as possible while maintaining ABI conformance. I don't think this will > ever be improved in llvm-g++ (which is now deprecated). Hopefully clang does a > better job. I plan to rewrite the ABI stuff completely in dragonegg, so it may > end up being fixed there one day. When I wrote some ABI stuff for LDC, I noticed that a double is passed the same way as <2 x float> on x86-64. My guess is that if what you're passing is actually two floats instead of a double (and even if you're passing a union that might be either) then passing it as <2 x float> would result in much nicer code since LLVM should be able to figure out how to take vectors apart if that's needed (or how to bitcast if you want the double member of the union). From baldrick at free.fr Tue Mar 8 03:19:46 2011 From: baldrick at free.fr (Duncan Sands) Date: Tue, 08 Mar 2011 10:19:46 +0100 Subject: [LLVMdev] Announcing LLVM 2.9 Testing! In-Reply-To: <3BF72DAA-22AA-442A-830F-81630E3BAB39@apple.com> References: <3BF72DAA-22AA-442A-830F-81630E3BAB39@apple.com> Message-ID: <4D75F4B2.5090304@free.fr> Hi Bill, > The release tags are now available for check-out. The branch and tag structure > in SVN is detailed on this webpage: > > http://llvm.org/docs/HowToReleaseLLVM.html can you please explain explicitly where to get hold of the branch and tags. The branch seems to live at http://llvm.org/svn/llvm-project/llvm/branches/release_29 but there seems to be nothing relevant under http://llvm.org/svn/llvm-project/llvm/tags Thanks, Duncan. From wendling at apple.com Tue Mar 8 04:12:43 2011 From: wendling at apple.com (Bill Wendling) Date: Tue, 8 Mar 2011 02:12:43 -0800 Subject: [LLVMdev] Announcing LLVM 2.9 Testing! 
In-Reply-To: <4D75F4B2.5090304@free.fr> References: <3BF72DAA-22AA-442A-830F-81630E3BAB39@apple.com> <4D75F4B2.5090304@free.fr> Message-ID: On Mar 8, 2011, at 1:19 AM, Duncan Sands wrote: > Hi Bill, > >> The release tags are now available for check-out. The branch and tag structure >> in SVN is detailed on this webpage: >> >> http://llvm.org/docs/HowToReleaseLLVM.html > > can you please explain explicitly where to get hold of the branch and tags. The > branch seems to live at > http://llvm.org/svn/llvm-project/llvm/branches/release_29 > but there seems to be nothing relevant under > http://llvm.org/svn/llvm-project/llvm/tags > Hi Duncan, My apologies for that. I created the branch and was going to do a quick build test to make sure that it was ready for testing before creating the tags. The tags should be available now. However, I had trouble branching the test-suite sources. So there isn't a branch or tag for it just yet. I hope to have that fixed in the near future, but for now it should be fine to use the test-suite ToT. -bw From justin.holewinski at gmail.com Tue Mar 8 06:30:28 2011 From: justin.holewinski at gmail.com (Justin Holewinski) Date: Tue, 8 Mar 2011 07:30:28 -0500 Subject: [LLVMdev] PTX Backend in 2.9 Message-ID: What is the LLVM policy regarding distribution of incomplete back-ends? I ask because I am working on the PTX back-end and know that it is vastly incomplete at the moment, yet it is included in the 2.9 release branch and builds by default. I do not necessarily have a problem with it being distributed, I just want to make sure that the people in charge of the 2.9 release know that it would be a stretch to even call it "experimental." -- Thanks, Justin Holewinski -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110308/9c338d5c/attachment.html From fvbommel at gmail.com Tue Mar 8 06:58:49 2011 From: fvbommel at gmail.com (Frits van Bommel) Date: Tue, 8 Mar 2011 13:58:49 +0100 Subject: [LLVMdev] First Patch In-Reply-To: <4D75BC57.9000700@playingwithpointers.com> References: <4D734FB9.7030209@playingwithpointers.com> <4D75BC57.9000700@playingwithpointers.com> Message-ID: On Tue, Mar 8, 2011 at 6:19 AM, Sanjoy Das wrote: > Hi! > > I've attached a patch which takes care of the issues mentioned (and adds two > tests). > Index: test/Transforms/InstCombine/sext.ll > =================================================================== > --- test/Transforms/InstCombine/sext.ll (revision 127153) > +++ test/Transforms/InstCombine/sext.ll (working copy) > @@ -126,3 +126,16 @@ > ; CHECK-NEXT: store <2 x i16> > ; CHECK-NEXT: ret > } > + > +define i64 @test12(i32 %x) { > + %a = and i32 %x, -5 > + %b = sext i32 %a to i64 > + %c = add i64 %b, 1 > + ret i64 %c > + > +; CHECK: @test12 > +; CHECK-NEXT: and i32 > +; CHECK-NEXT: add nsw Why not check for 'i32' after the add? Isn't the entire point of this patch to shrink the add in cases like this? > +; CHECK-NEXT: sext i32 > +; CHECK-NEXT: ret i64 > +} > Index: test/Transforms/InstCombine/add-sitofp.ll > =================================================================== > --- test/Transforms/InstCombine/add-sitofp.ll (revision 127153) > +++ test/Transforms/InstCombine/add-sitofp.ll (working copy) > @@ -1,4 +1,5 @@ > ; RUN: opt < %s -instcombine -S | grep {add nsw i32} > +; RUN: opt < %s -instcombine -S | grep sitofp | count 2 When adding to old tests like this one it's better to migrate them from grep to FileCheck. 
RUN: opt < %s -instcombine -S | FileCheck > > define double @x(i32 %a, i32 %b) nounwind { CHECK: @x CHECK: add nsw i32 > %m = lshr i32 %a, 24 > @@ -7,3 +8,12 @@ > %p = fadd double %o, 1.0 > ret double %p > } > + > +define double @y(i32 %x, i32 %y) { CHECK: @y CHECK: Note that FileCheck is a nice way to perform more precise checking here :). In particular, if I recall the transformation this is being used for correctly, you want to check this does an integer addition instead of a floating-point one? Though in this case the 'add' later gets optimized to an 'or', so maybe you should replace -4 to -5 to keep the test clearer (and be able to check for the 'nsw' flag). > + %p = and i32 %x, -4 > + %q = and i32 %y, 1 CHECK: add nsw i32 CHECK: sitofp CHECK-NOT: sitofp CHECK-NOT: fadd (Those last two are to confirm you removed the other sitofp and the floating-point add) > + %a = sitofp i32 %p to double > + %b = sitofp i32 %q to double > + %result = fadd double %a, %b > + ret double %result > +} > Index: lib/Transforms/InstCombine/InstCombineAddSub.cpp > =================================================================== > --- lib/Transforms/InstCombine/InstCombineAddSub.cpp (revision 127153) > +++ lib/Transforms/InstCombine/InstCombineAddSub.cpp (working copy) > @@ -57,6 +57,30 @@ > } > > > +static bool RippleBucketExists(APInt &thisKnownZero, APInt &thisKnownOne, > + APInt &otherKnownZero, unsigned width) { > + APInt mask; > + // First try to to take care of the case > + // (X & ~4) + (Y & 1) > + int32_t power = (~thisKnownZero).exactLogBase2(); > + if (power == -1) { > + if (~thisKnownZero == thisKnownOne) > + power = thisKnownOne.exactLogBase2(); Why did you introduce the extra log here? This assignment is a no-op when ~thisKnownZero == thisKnownOne... > + if (power == -1) ... which means this check is redundant. > + return false; > + } > + > + if (power == (width - 1)) Hmm.. 
I know I said 'width' should be unsigned, but now I get a warning here for comparing a signed integer to an unsigned one (and LLVM is supposed to compile without warnings). So I guess changing width back to int isn't a disaster. (Who needs integers with over 2 million bits anyway?) Alternatively, since you know 'power' is non-negative here you can safely cast it to unsigned for the comparison. > + mask = APInt::getSignBit(width); > + else > + mask = APInt::getBitsSet(width, power + 1, width - 2); You removed the 'if (power < width-2)' case from the code I mentioned as equivalent to your loop. Because of this, if power + 1 > width - 2 (or equivalently in this case, if power == width - 2) this will create a "wrapped" bit set: the upper bit and the lower width-2 bits will be set, leaving only bit width-2 zero. That isn't what you want, right? > + > + if ((mask & otherKnownZero).getBoolValue()) > + return true; > + > + return false; This can be more concisely written as just return (mask & otherKnownZero).getBoolValue(); > +} > + > /// WillNotOverflowSignedAdd - Return true if we can prove that: > /// (sext (add LHS, RHS)) === (add (sext LHS), (sext RHS)) > /// This basically requires proving that the add in the original type would not > @@ -77,9 +101,19 @@ > // has a known-zero bit in a more significant place than it (not including the > // sign bit) the ripple may go up to and fill the zero, but won't change the > // sign. For example, (X & ~4) + 1. > - > - // TODO: Implement. 
> - > + > + unsigned width = LHS->getType()->getScalarSizeInBits(); > + APInt mask(width, -1, true), LHSKnownZero(width, 0), LHSKnownOne(width, 0), > + RHSKnownZero(width, 0), RHSKnownOne(width, 0); > + > + ComputeMaskedBits(LHS, mask, LHSKnownZero, LHSKnownOne); > + ComputeMaskedBits(RHS, mask, RHSKnownZero, RHSKnownOne); > + > + if (RippleBucketExists(LHSKnownZero, LHSKnownOne, RHSKnownZero, width)) > + return true; > + if (RippleBucketExists(RHSKnownZero, RHSKnownOne, LHSKnownZero, width)) > + return true; > + > return false; > } From fabian.scheler at gmail.com Tue Mar 8 07:29:35 2011 From: fabian.scheler at gmail.com (Fabian Scheler) Date: Tue, 8 Mar 2011 14:29:35 +0100 Subject: [LLVMdev] TargetData::getPreferredAlignment(const GlobalVariable *GV) is strange ... Message-ID: Hello everybody, I am somewhat confused by the following method within the LLVM, especially the lines "confusion starts" -> "confusion ends" are hard to follow. Maybe the idea is that if there are such big data structures one does not waste much memory anyway if they are aligned to a 16-byte boundary. However, my assembler complains here because it only supports 1-, 2-, 4- and 8-byte boundaries :-( I checked the svn log but I didn't find any explanation, the doxygen docu is not very helpful here, too. So, any help on this issue is highly appreciated. Thanks in advance! Ciao, Fabian unsigned TargetData::getPreferredAlignment(const GlobalVariable *GV) const { const Type *ElemType = GV->getType()->getElementType(); unsigned Alignment = getPrefTypeAlignment(ElemType); if (GV->getAlignment() > Alignment) Alignment = GV->getAlignment(); ==================== confusion starts ======================== if (GV->hasInitializer()) { if (Alignment < 16) { // If the global is not external, see if it is large. If so, give it a // larger alignment. if (getTypeSizeInBits(ElemType) > 128) Alignment = 16; // 16-byte alignment. 
} } ==================== confusion ends ========================= return Alignment; } From ofv at wanadoo.es Tue Mar 8 07:57:05 2011 From: ofv at wanadoo.es (=?utf-8?Q?=C3=93scar_Fuentes?=) Date: Tue, 08 Mar 2011 14:57:05 +0100 Subject: [LLVMdev] MSVC compiling issue References: <4D75F1C5.6040900@tu-dresden.de> Message-ID: <87mxl5vhou.fsf@wanadoo.es> Olaf Krzikalla writes: > Hi @llvm, > > building a debug version under MSVC 9 leads to a compiler error due to a > mix of different types in a call to upper_bound. I have attached a > hot-fix but I'm rather unsure if it should be applied as it is, since > IMHO the reason is a MSVC library bug ("IMHO", because I don't know the > requirements imposed to the predicate by the standard). I think that the original author just missed a `const'. Fixed in r127245. From Jacques.VanDamme at synopsys.com Tue Mar 8 04:14:55 2011 From: Jacques.VanDamme at synopsys.com (Jacques Van Damme) Date: Tue, 8 Mar 2011 11:14:55 +0100 Subject: [LLVMdev] backend question Message-ID: Hi All, I am writing a backend for an architecture that has only 16-bit word addressing (No byte addresses ever. All data are always 16-bit). How can I specify this in the backend? As an example, consider the following instruction: %arrayidx = getelementptr [129 x i16]* @flags, i16 0, i16 %i.043 When I generate assembler code, this now results in %i.043 being multiplied by 2 in the address calculation which result in a shift being emitted. How can I avoid this? Any help would be greatly appreciated. Thanks in advance, Jacques Van Damme. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110308/e06ac6e9/attachment.html From rengolin at systemcall.org Tue Mar 8 10:15:10 2011 From: rengolin at systemcall.org (Renato Golin) Date: Tue, 8 Mar 2011 16:15:10 +0000 Subject: [LLVMdev] llvm-diff In-Reply-To: References: Message-ID: Hi John, I believe my refactoring went well. 
I've changed the source considerably to spread out the classes a bit so I can re-use the log builder and consumer. I'll also put the priority queue and the graph I just did on separate headers (as they are templates). I ran all tests and they pass, but I'm not sure how much llvm-diff relies on the check-all tests and how much you have your own tests... Is there anything else I can test on before sending the patches? I haven't implemented the metadata diff yet, but would be good to send the refactoring first, and then the graph and metadata diff. Would also be good to wait until next week to do so, as 2.9 is forking this week. cheers, --renato From viridia at gmail.com Tue Mar 8 10:26:01 2011 From: viridia at gmail.com (Talin) Date: Tue, 8 Mar 2011 08:26:01 -0800 Subject: [LLVMdev] llvm.gcroot suggestion In-Reply-To: References: Message-ID: On Mon, Mar 7, 2011 at 2:05 PM, Joshua Warner wrote: > > > On Mon, Mar 7, 2011 at 12:48 PM, Talin wrote: > >> On Mon, Mar 7, 2011 at 10:58 AM, Joshua Warner wrote: >> >>> Hi Talin, >>> >>> Sorry to interject - >>> >>> >>>> For example, suppose I have a type "String or (float, float, float)" - >>>> that is, a union of a string and a 3-tuple of floats. Most of the time what >>>> LLVM will see is { i1; { float; float; float; } } because that's bigger than >>>> { i1; String* }. LLVM won't even know there's a pointer in there, except >>>> during those brief times when I'm accessing the pointer field. So tagging >>>> the pointer in a different address space won't help at all here. >>>> >>>> >>> I think this is a fairly uncommon use case that will be tricky to deal >>> with no matter what method is used to track GC roots. That said, why not do >>> something like make the pointer representation (the {i1, String*}) the >>> long-term storage format, and only bitcast *just* before loading the >>> floats? 
You could even use another address space to indicate that something >>> is *sometimes* a pointer, dependent upon some other value (the i1, perhaps >>> indicated with metadata). >>> >> >> I don't know if it's an uncommon use case or not, but it is something that >> I handle already in my frontend. (I suppose it's uncommon in the sense that >> almost no one uses the garbage collection features of LLVM, but part of the >> goal of this discussion is to change that.) >> > > I actually meant uncommon in the sense of having stack-allocated unions > that participate in garbage collection. Off the top of my head, I could > only name one language (ML) that might use a feature like that. Even then, > I suspect most ML implementations would actually push that stuff onto the > heap. > > >> The problem with making { i1, String* } the long-term storage format is >> that it isn't large enough in the example I gave, so you'll overwrite other >> fields if you try to store the three floats. The more general issue is that >> the concepts we're talking about simply aren't expressible in IR as it >> exists today. >> > > Good catch - what I actually intended to indicate was the String "half" of > the union, properly padded - so something more like {i1, String*, float} > (for 64-bit pointers). > > >> >>> My vote (not that it really counts for much) would be the address-space >>> method. It seems much more elegant. >>> >> >> I agree that the current solution isn't the best. The problem I have is >> that the solutions that are being suggested are going to break my code >> badly, and with no way to fix it. >> >> The *real* solution is to make root-ness a function of type. In other >> words, you can mark any type as being a root, which exposes the base address >> of all objects of that type to the garbage collector. This is essentially >> the same as the pointer-address-space suggestion, except that it's not >> limited to pointers. 
(In practice, it would only ever apply to pointers and >> structs.) >> >> (Heck, I'd even be willing to go with a solution where only structs and >> not pointers could be roots - it means I'd have to wrap every pointer in a >> struct, which would be a royal pain, but it would at least work.) >> > > Hmm... do you mean something like a "marked" bit (or maybe a vector of > mark_ids) in every type, where you could query a function for values of > "marked" types at particular safe points? This sounds like something that > might solve the problem described below with compressed pointers (not that I > am actually encountering this problem) - but in the near-term, it seems to > me that everything that you could conceivably mark as a GC root would > somehow contain a pointer value. In this case, union support in LLVM would > make the generated IR cleaner, but not necessarily any more correct. > > Being able to make a "marked" version of every type seems unnecessary, and > in some cases, somewhat non-intuitive. Take for instance, making a "marked" > float type - which I can't think of any good use for. I like the idea of > using address spaces because it keeps the concepts in IR largely orthogonal, > rather than having features that overlap in purpose in many cases. That, > and IMO it just makes sense for pointers into the (or, in general, a) heap > be considered in a different address space from "normal" pointers. This > could extend well to tracking pointers onto the stack (as seen in C# > out/ref) for the purpose of generating closures (in .NET - which doesn't > currently have this feature). > > Let me ask a question before we go too much further. Currently the argument to llvm.gcroot must be an alloca instruction. You cannot GEP an internal field within the alloca and pass it to the gcroot intrinsic. So the entire alloca is considered a root, even if it has non-pointer fields. 
My question is, in this new address-space proposal, are we talking about changing this so that the garbage collector only "sees" the internal pointer fields within the alloca, or will it still be able to "see" the entire alloca? This is the crucial point for me - I've written my GC strategy to deal with complex allocas, and there are several data types - such as unions - which depend on this. I can probably work around the union issue using methods like you suggest - that is building some "dummy" type containing a pointer as the long-term storage format - as long as the GC can still see the entire struct. It's ugly because it means that my frontend has to know about padding and alignment and such, issues which I prefer to leave to LLVM to figure out. But if we change it so that the GC only sees pointers, then I'm dead in the water. As far as my suggestion of marking types go, you are right, it doesn't make sense for most types. It really only matters for structs and pointers. Imagine if structs had an "isRoot" flag that lived next to "isPacked", which makes the struct a distinct type. This would be written in IR as "gcroot { i1, float }" or something like that. The presence of this flag has the same effect as marking a pointer in the GC address space. Combine that with the ability to mark SSA values as roots, and my life would get vastly simpler and my generated code would get about 20% faster :) > >>> The only thing that I think would be unusually difficult for the >>> address-space method to handle would be alternative pointer representations, >>> such as those used in the latest version of Hotspot (see >>> http://wikis.sun.com/display/HotSpotInternals/CompressedOops). >>> Essentially, a 64-bit pointer is packed into 32-bits by assuming 8-byte >>> alignment and restricting the heap size to 32GB. I've seen similar >>> object-reference bitfields used in game engines. In this case, there is no >>> "pointer" to attach the address space to. 
>>> >>> (Yes, I know that Hotspot currently uses CompressedOops ONLY in the heap, >>> decompressing them when stored in locals, but it is not inconceivable to >>> avoid decompressing them if the code is just moving them around, as an >>> optimization.) >>> >>> Just my few thoughts. >>> >>> -Joshua >>> >> >> >> >> -- >> -- Talin >> > > -Joshua > > -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110308/11823f0f/attachment.html From dpatel at apple.com Tue Mar 8 10:30:12 2011 From: dpatel at apple.com (Devang Patel) Date: Tue, 8 Mar 2011 08:30:12 -0800 Subject: [LLVMdev] DW_TAG_lexical_block structure in debug information In-Reply-To: References: Message-ID: On Mar 7, 2011, at 4:16 AM, Eli Bendersky wrote: > Hello, > > The documentation for debug information > (http://llvm.org/docs/SourceLevelDebugging.html) says the structure of > block descriptors metadata is: > > !3 = metadata !{ > i32, ;; Tag = 11 + LLVMDebugVersion (DW_TAG_lexical_block) > metadata,;; Reference to context descriptor > i32, ;; Line number > i32 ;; Column number > } > > However, looking at the generated metadata, there are 2 extra fields > not documented here. From the source code it appears to be a link to > the function holding the block, and a unique integer ID: > > DILexicalBlock DIBuilder::createLexicalBlock(DIDescriptor Scope, DIFile File, > unsigned Line, unsigned Col) { > // Defeat MDNode uniqing for lexical blocks by using unique id. > static unsigned int unique_id = 0; > Value *Elts[] = { > GetTagConstant(VMContext, dwarf::DW_TAG_lexical_block), > Scope, > ConstantInt::get(Type::getInt32Ty(VMContext), Line), > ConstantInt::get(Type::getInt32Ty(VMContext), Col), > File, > ConstantInt::get(Type::getInt32Ty(VMContext), unique_id++) > }; > return DILexicalBlock(MDNode::get(VMContext, &Elts[0], array_lengthof(Elts))); > } > > Is this an error in the documentation? 
> These two fields were added to support scopes inside template functions (r107919). I updated the docs today in r127249. Thanks! - Devang From clattner at apple.com Tue Mar 8 11:01:02 2011 From: clattner at apple.com (Chris Lattner) Date: Tue, 8 Mar 2011 09:01:02 -0800 Subject: [LLVMdev] PTX Backend in 2.9 In-Reply-To: References: Message-ID: <90481C49-A0CD-4874-9E0B-2FF36EB5F68A@apple.com> On Mar 8, 2011, at 4:30 AM, Justin Holewinski wrote: > What is the LLVM policy regarding distribution of incomplete back-ends? I ask because I am working on the PTX back-end and know that it is vastly incomplete at the moment, yet it is included in the 2.9 release branch and builds by default. I do not necessarily have a problem with it being distributed, I just want to make sure that the people in charge of the 2.9 release know that it would be a stretch to even call it "experimental." Hi Justin, Don't worry about it, we'll just add an appropriate comment to the release notes. -Chris From clattner at apple.com Tue Mar 8 11:03:41 2011 From: clattner at apple.com (Chris Lattner) Date: Tue, 8 Mar 2011 09:03:41 -0800 Subject: [LLVMdev] description of llvm::Value correct? In-Reply-To: <4D73A5E2.40102@arcor.de> References: <4D73A5E2.40102@arcor.de> Message-ID: <03263913-E5EC-4EC8-A762-0849B8356A88@apple.com> On Mar 6, 2011, at 7:18 AM, Jochen Wilhelmy wrote: > Hi! > > in the detailed description of llvm::Value it says: > > All _types_ can have a name and they should belong to some Module > > Is this correct or is it rather > > All _values_ can have a name and they should belong to some Module? Yes, that was wrong. I updated the comment in r127252. Also, ConstantInt can't have a name, so the comment was wrong for other reasons as well. > Is it correct to use types across modules (in the same context)? Yep, that is correct. 
-Chris From clattner at apple.com Tue Mar 8 11:28:36 2011 From: clattner at apple.com (Chris Lattner) Date: Tue, 8 Mar 2011 09:28:36 -0800 Subject: [LLVMdev] Full Time LLVM Compiler position In-Reply-To: References: Message-ID: <200C21B6-72FD-438A-AF18-0E410F962AD2@apple.com> On Mar 3, 2011, at 4:07 PM, Villmow, Micah wrote: > Compiler Engineer, Stream Computing > > > We are currently looking for a software engineer as part of the core team developing OpenCL, a new open standard for heterogonous general purpose programming, compilers for multi-core CPU and many-core graphics systems. Hi Micah, Job postings on this list are ok, but only if they are specifically related to LLVM. If you post any more in the future, please make it clear how any future postings are related to LLVM. -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110308/0c9aced5/attachment.html From Micah.Villmow at amd.com Tue Mar 8 11:32:40 2011 From: Micah.Villmow at amd.com (Villmow, Micah) Date: Tue, 8 Mar 2011 11:32:40 -0600 Subject: [LLVMdev] Full Time LLVM Compiler position In-Reply-To: <200C21B6-72FD-438A-AF18-0E410F962AD2@apple.com> References: <200C21B6-72FD-438A-AF18-0E410F962AD2@apple.com> Message-ID: Ok, will do. Just to clarify. AMD uses LLVM for our OpenCL compiler for everything except the frontend, so it is a heavily LLVM reliant position. Micah From: Chris Lattner [mailto:clattner at apple.com] Sent: Tuesday, March 08, 2011 9:29 AM To: Villmow, Micah Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Full Time LLVM Compiler position On Mar 3, 2011, at 4:07 PM, Villmow, Micah wrote: Compiler Engineer, Stream Computing We are currently looking for a software engineer as part of the core team developing OpenCL, a new open standard for heterogonous general purpose programming, compilers for multi-core CPU and many-core graphics systems. 
Hi Micah, Job postings on this list are ok, but only if they are specifically related to LLVM. If you post any more in the future, please make it clear how any future postings are related to LLVM. -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110308/4d2338c2/attachment-0001.html From rjmccall at apple.com Tue Mar 8 11:39:43 2011 From: rjmccall at apple.com (John McCall) Date: Tue, 8 Mar 2011 09:39:43 -0800 Subject: [LLVMdev] llvm-diff In-Reply-To: References: Message-ID: <6A787350-DA57-442D-B24F-C36F6D1746CB@apple.com> On Mar 8, 2011, at 8:15 AM, Renato Golin wrote: > I believe my refactoring went well. I've changed the source > considerably to spread out the classes a bit so I can re-use the log > builder and consumer. I'll also put the priority queue and the graph I > just did on separate headers (as they are templates). Excellent. > I ran all tests and they pass, but I'm not sure how much llvm-diff > relies on the check-all tests and how much you have your own tests... > Is there anything else I can test on before sending the patches? I'm sad to say that there isn't any formalized testing yet at all; I just use it and see whether it still works. > I haven't implemented the metadata diff yet, but would be good to send > the refactoring first, and then the graph and metadata diff. Would > also be good to wait until next week to do so, as 2.9 is forking this > week. 2.9 has forked, feel free to send it. John. From rengolin at systemcall.org Tue Mar 8 12:03:34 2011 From: rengolin at systemcall.org (Renato Golin) Date: Tue, 8 Mar 2011 18:03:34 +0000 Subject: [LLVMdev] llvm-diff In-Reply-To: <6A787350-DA57-442D-B24F-C36F6D1746CB@apple.com> References: <6A787350-DA57-442D-B24F-C36F6D1746CB@apple.com> Message-ID: On 8 March 2011 17:39, John McCall wrote: > I'm sad to say that there isn't any formalized testing yet at all; I just use > it and see whether it still works.
Ok, I'll send you first, then if all is well, commit. Would be good to have some tests, though... ;) cheers, --renato From kd at kendyck.com Tue Mar 8 12:59:04 2011 From: kd at kendyck.com (Ken Dyck) Date: Tue, 8 Mar 2011 13:59:04 -0500 Subject: [LLVMdev] backend question In-Reply-To: References: Message-ID: On Tue, Mar 8, 2011 at 5:14 AM, Jacques Van Damme wrote: > I am writing a backend for an architecture that has only 16-bit word > addressing (No byte addresses ever. All data are always 16-bit). > > How can I specify this in the backend? In short, you can't. Word-addressable memory is not currently supported in LLVM (or Clang, for that matter). > As an example, consider the following instruction: > > %arrayidx = getelementptr [129 x i16]* @flags, i16 0, i16 %i.043 > > When I generate assembler code, this now results in %i.043 being multiplied > by 2 in the address calculation which results in a shift being emitted. > > How can I avoid this? You'll need to modify LLVM (and Clang, if that's what you are using as your front end). If you are interested, I can send you a patch of the changes that I made to the 2.8 release for a backend that targets a 24-bit word-addressable DSP, but it is quite rough and it includes changes in which you probably aren't interested (support for non-power-of-2 integer sizes and some other bug fixes). FWIW, I'm working (albeit at a glacial pace) on improving support for word-addressable memory in Clang, with the plan of eventually working my way down to LLVM. But I expect it will be a while until it is ready for production use. -Ken From Jacques.VanDamme at synopsys.com Tue Mar 8 13:09:31 2011 From: Jacques.VanDamme at synopsys.com (Jacques Van Damme) Date: Tue, 8 Mar 2011 20:09:31 +0100 Subject: [LLVMdev] backend question In-Reply-To: References: Message-ID: Hi Ken, Thanks for the quick reply. Since I always emit assembly code, and I own the assembler, I will look into solving the problem there. Thanks anyway, Jacques.
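A minimal sketch of the scaling arithmetic discussed in this thread, not LLVM code. It assumes the offset of an array element is the index times the element width divided by the width of the smallest addressable unit; `elementOffset` and its parameters are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: offset of element `index`, measured in addressable
// units, for elements `elemBits` wide on a machine whose smallest
// addressable unit is `unitBits` wide.
inline std::uint64_t elementOffset(std::uint64_t index, unsigned elemBits,
                                   unsigned unitBits) {
  // Assumption of this sketch: an element occupies a whole number of units.
  assert(unitBits != 0 && elemBits % unitBits == 0);
  return index * (elemBits / unitBits);
}
```

With 8-bit units, index 5 into an array of i16 gives offset 10, which is the multiply by 2 seen in the generated code; with 16-bit units the same index would give offset 5.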
-----Original Message----- From: kjdyck at gmail.com [mailto:kjdyck at gmail.com] On Behalf Of Ken Dyck Sent: Tuesday, March 08, 2011 7:59 PM To: Jacques Van Damme Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] backend question On Tue, Mar 8, 2011 at 5:14 AM, Jacques Van Damme wrote: > I am writing a backend for an architecture that has only 16-bit word > addressing (No byte addresses ever. All data are always 16-bit). > > How can I specify this in the backend? In short, you can't. Word-addressable memory is not currently supported in LLVM (or Clang, for that matter). > As an example, consider the following instruction: > > %arrayidx = getelementptr [129 x i16]* @flags, i16 0, i16 %i.043 > > When I generate assembler code, this now results in %i.043 being multiplied > by 2 in the address calculation which results in a shift being emitted. > > How can I avoid this? You'll need to modify LLVM (and Clang, if that's what you are using as your front end). If you are interested, I can send you a patch of the changes that I made to the 2.8 release for a backend that targets a 24-bit word-addressable DSP, but it is quite rough and it includes changes in which you probably aren't interested (support for non-power-of-2 integer sizes and some other bug fixes). FWIW, I'm working (albeit at a glacial pace) on improving support for word-addressable memory in Clang, with the plan of eventually working my way down to LLVM. But I expect it will be a while until it is ready for production use. -Ken From bijoy123_8 at yahoo.com Tue Mar 8 13:11:21 2011 From: bijoy123_8 at yahoo.com (akramul azim) Date: Tue, 8 Mar 2011 11:11:21 -0800 (PST) Subject: [LLVMdev] Clang Static Analyzer Message-ID: <727100.68715.qm@web121710.mail.ne1.yahoo.com> Hi, I have installed Clang in Windows XP 32 bit using MinGW. I want to use the Clang Static Analyzer i.e., scan-build. Can anyone please guide me how to make scan-build executable for Windows and use it.
Thanks, Akramul -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110308/4589235d/attachment.html From carl.norum at apple.com Tue Mar 8 13:12:21 2011 From: carl.norum at apple.com (Carl Norum) Date: Tue, 8 Mar 2011 11:12:21 -0800 Subject: [LLVMdev] static analyzer & ubigraph visualization Message-ID: <06DE3D7A-4D92-4FD9-A233-1407508C4708@apple.com> I updated our project recently to use a newer version of clang (we're at r127188 now). That version made our modified ccc-analyze script stop working, so updated that from TOT clang as well. I noticed in the new script a new environment variable check "CCC_UBI" that uses Ubigraph to visualize something... but what? I tried making a simple project with a few kinds of static analyzer errors, but nothing ever shows up in my ubigraph window. What's this feature for and what should I see graphed? -- Carl From stoklund at 2pi.dk Tue Mar 8 13:14:11 2011 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Tue, 8 Mar 2011 11:14:11 -0800 Subject: [LLVMdev] MSVC compiling issue In-Reply-To: <4D75F1C5.6040900@tu-dresden.de> References: <4D75F1C5.6040900@tu-dresden.de> Message-ID: On Mar 8, 2011, at 1:07 AM, Olaf Krzikalla wrote: > Hi @llvm, > > building a debug version under MSVC 9 leads to a compiler error due to a > mix of different types in a call to upper_bound. I have attached a > hot-fix but I'm rather unsure if it should be applied as it is, since > IMHO the reason is a MSVC library bug ("IMHO", because I don't know the > requirements imposed to the predicate by the standard). I hoped the symmetric methods would be enough to trick MSVC into compiling it. Is that extra method getting called? What happens if you stick assert(0) in there? 
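A minimal sketch of the kind of comparator under discussion, assuming a simplified `Range` struct in place of LiveRange and a plain `int` in place of SlotIndex; `EndLess` and `firstEndingAfter` are hypothetical names, not LLVM's. std::upper_bound itself only calls the (value, element) form; the element/element overload mirrors what the hot-fix in this thread adds so that a library performing extra ordering checks in debug mode has something to call.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Stand-in for a live range: only the end point matters for this sketch.
struct Range {
  int start;
  int end;
};

// Comparator that orders positions against range end points.
struct EndLess {
  // Form used by std::upper_bound: comp(value, element).
  bool operator()(int pos, const Range &r) const { return pos < r.end; }
  // Symmetric form, as used by std::lower_bound: comp(element, value).
  bool operator()(const Range &r, int pos) const { return r.end < pos; }
  // Element/element form, analogous to the overload added by the patch.
  bool operator()(const Range &a, const Range &b) const {
    return a.end < b.end;
  }
};

// Returns the first range whose end lies strictly after `pos`.
// Precondition: `ranges` is sorted by ascending end point.
inline std::vector<Range>::const_iterator
firstEndingAfter(const std::vector<Range> &ranges, int pos) {
  return std::upper_bound(ranges.begin(), ranges.end(), pos, EndLess());
}
```

For ranges ending at 4, 8 and 12, a query at position 4 yields the second range, and a query at position 12 yields the end iterator.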
> > Index: lib/CodeGen/LiveInterval.cpp > =================================================================== > --- lib/CodeGen/LiveInterval.cpp (revision 127221) > +++ lib/CodeGen/LiveInterval.cpp (working copy) > @@ -39,6 +39,9 @@ > bool operator()(const LiveRange &A, SlotIndex B) const { > return A.end < B; > } > + bool operator()(const LiveRange &A, const LiveRange &B) const { > + return A.end < B.end; > + } > }; > } > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From kremenek at apple.com Tue Mar 8 13:21:56 2011 From: kremenek at apple.com (Ted Kremenek) Date: Tue, 8 Mar 2011 11:21:56 -0800 Subject: [LLVMdev] Clang Static Analyzer In-Reply-To: <727100.68715.qm@web121710.mail.ne1.yahoo.com> References: <727100.68715.qm@web121710.mail.ne1.yahoo.com> Message-ID: <1F8D981E-EA56-4DB1-AAE0-125BE123932E@apple.com> Hi Akramul, For future reference, the correct list for questions on the analyzer is cfe-dev, not llvmdev. There is no active maintainer of scan-build (or the analyzer) on Windows. I personally have never tried to make scan-build work on Windows, although I believe some have gotten it to work in a cygwin environment. scan-build depends on Perl, and possibly makes assumptions that are not valid on Windows. If you manage to get it to work, I'd appreciate hearing about it. Ted On Mar 8, 2011, at 11:11 AM, akramul azim wrote: > Hi, > I have installed Clang in Windows XP 32 bit using MinGW. I want to use the Clang Static Analyzer i.e., scan-build. Can anyone please guide me how to make scan-build executable for Windows and use it. > > Thanks, > Akramul > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110308/2e343972/attachment.html From wendling at apple.com Tue Mar 8 13:24:47 2011 From: wendling at apple.com (Bill Wendling) Date: Tue, 8 Mar 2011 11:24:47 -0800 Subject: [LLVMdev] [cfe-dev] Announcing LLVM 2.9 Testing! In-Reply-To: References: <3BF72DAA-22AA-442A-830F-81630E3BAB39@apple.com> Message-ID: <81F9A133-1C57-4B03-8E77-02953F56B17B@apple.com> On Mar 8, 2011, at 10:34 AM, Michel Alexandre Salim wrote: > On 03/08/11 04:01, Bill Wendling wrote: >> It's that time again! (Well, it was that time yesterday, but I made a mistake.) The LLVM 2.9 release is now underway! > >> Developers: >> >> Top-of-tree is now open for submissions. The 2.9 release branch and tags are available for you to check out and test. However, please do not commit patches to the 2.9 release branch. All patches must be approved by the "code owners" before they are accepted into the branch. See this website for who to contact regarding a patch you feel is necessary for the release: >> >> http://llvm.org/docs/DeveloperPolicy.html#owners >> >> Please grab the release sources, build them, and start compiling things. File bugs for any errors you see. Once binaries are available, we will be posting them for people to use. >> > Will there be official pre-release tarballs, like there was for 2.8 (and > IIRC, 2.7)? I'd be tracking the pre-releases for Fedora's development > branch (and if they look stable, push them to our branch for the > upcoming Fedora 15 release in May). > > Otherwise, I can create them from SVN tag. > Either way is fine. For convenience, I will put up source tar balls for RC1 soon. When we get binary tar balls, we'll post them as well. 
-bw From kremenek at apple.com Tue Mar 8 13:26:55 2011 From: kremenek at apple.com (Ted Kremenek) Date: Tue, 8 Mar 2011 11:26:55 -0800 Subject: [LLVMdev] static analyzer & ubigraph visualization In-Reply-To: <06DE3D7A-4D92-4FD9-A233-1407508C4708@apple.com> References: <06DE3D7A-4D92-4FD9-A233-1407508C4708@apple.com> Message-ID: <39C9A1A8-AA28-4F00-8E7E-D263241F7B1D@apple.com> Hi Carl, For future reference, the correct list for questions on the analyzer is cfe-dev, not llvmdev. CCC_UBI is meant to display the analysis path graph (aka "exploded graph") as it gets explored by the analyzer. I haven't run it in a while, so it may be broken. I will investigate. Ted On Mar 8, 2011, at 11:12 AM, Carl Norum wrote: > I updated our project recently to use a newer version of clang (we're at r127188 now). That version made our modified ccc-analyze script stop working, so updated that from TOT clang as well. I noticed in the new script a new environment variable check "CCC_UBI" that uses Ubigraph to visualize something... but what? I tried making a simple project with a few kinds of static analyzer errors, but nothing ever shows up in my ubigraph window. What's this feature for and what should I see graphed? > > -- Carl > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From stoklund at 2pi.dk Tue Mar 8 13:42:55 2011 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Tue, 8 Mar 2011 11:42:55 -0800 Subject: [LLVMdev] MSVC compiling issue In-Reply-To: <4D75F1C5.6040900@tu-dresden.de> References: <4D75F1C5.6040900@tu-dresden.de> Message-ID: <4E9DB728-AB90-44B7-A4F7-191FDD55E47A@2pi.dk> On Mar 8, 2011, at 1:07 AM, Olaf Krzikalla wrote: > Hi @llvm, > > building a debug version under MSVC 9 leads to a compiler error due to a > mix of different types in a call to upper_bound. 
I have attached a > hot-fix but I'm rather unsure if it should be applied as it is, since > IMHO the reason is a MSVC library bug ("IMHO", because I don't know the > requirements imposed to the predicate by the standard). Thanks, applied as r127264. Does it pass the unit tests with the patch? /jakob From nadav.rotem at intel.com Tue Mar 8 13:46:45 2011 From: nadav.rotem at intel.com (Rotem, Nadav) Date: Tue, 8 Mar 2011 21:46:45 +0200 Subject: [LLVMdev] Vector select/compare support in LLVM Message-ID: <6594DDFF12B03D4E89690887C2486994027129F2A2@hasmsx504.ger.corp.intel.com> Hello, I started working on adding vector support for the SELECT and CMP instructions in the codegen (bugs: 3384, 1784, 2314). Currently, the codegen scalarizes vector CMPs into multiple scalar CMPs. It is easy to add similar scalarization support to the SELECT instruction. However, using multiple scalar operations is slower than using vector operations. In LLVM, vector-compare operations generate a vector of i1s, and the vector-select instruction uses these vectors. In between, these values (masks) can be manipulated (xor-ed, and-ed, etc). For x86, I would like the codegen to generate the "pcmpeq" and "blend" family of instructions. SSE masks are implemented using a 32bit word per item, where the MSB bit is used as a predicate and the rest of the bits are ignored. I believe that PPC Altivec and ARM Neon are also implemented this way. I can think of two ways to represent masks in x86: sparse and packed. In the sparse method, the masks are kept in <4 x 32bit> registers, which are mapped to xmm registers. This is the "native" way of using masks. In the second representation, the packed method, the MSB bits are collected from the xmm register into a packed general purpose register. Luckily, SSE has the MOVMSKPS instruction, which converts sparse masks to packed masks. I am not sure which representation is better, but both are reasonable.
The former may cause register pressure in some cases, while the latter may add the packing-unpacking overhead. _Sparse_ After my discussion with Duncan, last week, I started working on the promotion of type <4 x i1> to <4 x i32>, and I ran into a problem. It looks like the codegen term "promote" is overloaded. For scalars, the "promote" operation converts scalars to larger bit-width scalars. For vectors, the "promote" operation widens the vector to the next power of two. This is reasonable for types such as "<3 x float>". Maybe we need to add another legalization operation which will mean widening the vectors? In any case, I estimated that implementing this per-element promotion would require major changes and decided that this is not the way to go. _Packed_ I followed Duncan's original suggestion which was packing vectors of i1s into general purpose registers. I started by adding several new types to ValueTypes (td and h). I added "4vi1, 8vi1, 16vi1 ... 64vi1". For x86, I mapped the v8i1 .. v8i64 to general purpose x86 registers. I started playing with a small program, which performed a vector CMP on 4 elements. The legalizer promoted the v4i1 to the next legal pow-of-two type, which was v8i1. I changed WidenVecRes_SETCC and added a new method WidenVecOp_Select to handle the legalization of these types. The widening of the Select and SETCC ops was simple since I only widened the operands which needed widening. I am not sure if this is correct, but I ran into more problems before I could test it. Another problem that I had was that i1 types are still promoted to i8 types. So if I have a vector such as "4 x i1: <0, 0, 1, 1>", it will be mapped to DAG node "BUILD_VECTOR" which accepts 4 i8s and returns a single v4i1. This fails somewhere because the cast is illegal. The desired result should be that the above vector would be translated to the (packed) scalar value "3". I hacked TargetLowering::ReplaceNodeResults and added a minimal support for BUILD_VECTOR.
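A minimal, portable sketch of the two mask representations described above, written as plain C++ rather than SSE intrinsics or LLVM code. It assumes four 32-bit lanes and that lane i maps to bit i of the packed value; which bit each lane lands in is a convention, and this choice is an assumption of the sketch. `SparseMask`, `packMask`, `unpackMask` and `selectLanes` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Sparse form: one 32-bit word per lane; only the MSB acts as the predicate.
using SparseMask = std::array<std::uint32_t, 4>;

// Sparse -> packed: collect the MSB of each lane into one small integer.
// Assumption: lane i becomes bit i of the result.
inline unsigned packMask(const SparseMask &m) {
  unsigned packed = 0;
  for (unsigned lane = 0; lane < 4; ++lane)
    packed |= (m[lane] >> 31) << lane;
  return packed;
}

// Packed -> sparse: expand each bit back into an all-ones or all-zeros lane.
inline SparseMask unpackMask(unsigned packed) {
  assert(packed < 16u); // only four lanes in this sketch
  SparseMask m{};
  for (unsigned lane = 0; lane < 4; ++lane)
    m[lane] = ((packed >> lane) & 1u) ? 0xFFFFFFFFu : 0u;
  return m;
}

// Per-lane select driven by a sparse mask: take a[i] where the MSB of
// lane i is set, b[i] otherwise.
inline std::array<float, 4> selectLanes(const SparseMask &m,
                                        const std::array<float, 4> &a,
                                        const std::array<float, 4> &b) {
  std::array<float, 4> out{};
  for (unsigned lane = 0; lane < 4; ++lane)
    out[lane] = (m[lane] >> 31) ? a[lane] : b[lane];
  return out;
}
```

The round trip through `packMask` and `unpackMask` is the packing-unpacking overhead mentioned above; keeping the mask in sparse form avoids it at the cost of holding a full vector register per mask.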
I'd be interested in hearing your suggestions in which direction/s to proceed. Thank you, Nadav --------------------------------------------------------------------- Intel Israel (74) Limited This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. From joshuawarner32 at gmail.com Tue Mar 8 14:11:01 2011 From: joshuawarner32 at gmail.com (Joshua Warner) Date: Tue, 8 Mar 2011 13:11:01 -0700 Subject: [LLVMdev] llvm.gcroot suggestion In-Reply-To: References: Message-ID: Hi Talin, Let me ask a question before we go too much further. Currently the argument > to llvm.gcroot must be an alloca instruction. You cannot GEP an internal > field within the alloca and pass it to the gcroot intrinsic. So the entire > alloca is considered a root, even if it has non-pointer fields. My question > is, in this new address-space proposal, are we talking about changing this > so that the garbage collector only "sees" the internal pointer fields within > the alloca, or will it still be able to "see" the entire alloca? This is the > crucial point for me - I've written my GC strategy to deal with complex > allocas, and there are several data types - such as unions - which depend on > this. > > I can probably work around the union issue using methods like you suggest - > that is building some "dummy" type containing a pointer as the long-term > storage format - as long as the GC can still see the entire struct. It's > ugly because it means that my frontend has to know about padding and > alignment and such, issues which I prefer to leave to LLVM to figure out. > Correct me if I am wrong, but to use unions without IR support means you already have to worry about padding. > > But if we change it so that the GC only sees pointers, then I'm dead in the > water.
> In the end, the GC should only be seeing pointers anyway - some of whose "pointer-ness" depends on other values (as in the tagged union). I think your method could still work with the GC only seeing pointers (albeit with a little modification) - the only requirement I see that your method imposes on the design of a address-space based GC strategy is to maintain information about the structure (union) containing the pointer, next to the pointer. For this, metadata should work fine. While it is not particularly elegant, I don't see why you would be "dead in the water" - because it could be made to work. > > As far as my suggestion of marking types go, you are right, it doesn't make > sense for most types. It really only matters for structs and pointers. > Imagine if structs had an "isRoot" flag that lived next to "isPacked", which > makes the struct a distinct type. This would be written in IR as "gcroot { > i1, float }" or something like that. The presence of this flag has the same > effect as marking a pointer in the GC address space. Combine that with the > ability to mark SSA values as roots, and my life would get vastly simpler > and my generated code would get about 20% faster :) > > I'm not saying this approach wouldn't work or that it is in any way worse than the address-space method, but I think it would require many more changes to how LLVM handles types. One problem with how you are envisioning it (though not with the idea itself) is that it will probably be beneficial to be able to track multiple, independent types of roots - for example, roots for a long-term heap (where Method, Class, etc. might live) and the normal heap. The address-space method would handle this, but the isRoot() method would have to be extended to handle distinct roots - more like isRoot(int rootId) - which would *really* complicate the type system. -Joshua -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110308/1e74213c/attachment.html From listiges at arcor.de Tue Mar 8 14:14:20 2011 From: listiges at arcor.de (Nico) Date: Tue, 8 Mar 2011 21:14:20 +0100 Subject: [LLVMdev] Unnamed structure types Message-ID: Hello, is there a method to access unnamed structure types? Maybe something similar to 'TypeSymbolTable'? 'CBackend' uses 'FindUsedTypes'-pass and a lot of glue code around it, but I don't want to use the 'Passmanager' etc. Thank you. Kind regards, Nico From joshuawarner32 at gmail.com Tue Mar 8 14:15:20 2011 From: joshuawarner32 at gmail.com (Joshua Warner) Date: Tue, 8 Mar 2011 13:15:20 -0700 Subject: [LLVMdev] llvm.gcroot suggestion In-Reply-To: <1299573669.29836.65.camel@salamander> References: <1299573669.29836.65.camel@salamander> Message-ID: On Tue, Mar 8, 2011 at 1:41 AM, Kenneth Oksanen wrote: > On Mon, 2011-03-07 at 15:05 -0700, Joshua Warner wrote: > > I actually meant uncommon in the sense of having stack-allocated > > unions that participate in garbage collection. Off the top of my > > head, I could only name one language (ML) that might use a feature > > like that. Even then, I suspect most ML implementations would > > actually push that stuff onto the heap. > > Common Lisp has (declare (dynamic-extent ..)). > > But IMHO this is not a language-dependent issue. Rather, whenever any > language front-end using LLVM recognizes some (union) object can not > outlive the call, it would be a significant optimization if LLVM would > support stack-allocating the object. > Point taken - but I don't think there is anything in the address-space method that would inherently prevent this. > However, note that there may be several garbage collected heaps using > different garbage collectors. Therefore the indicator for "rootness" is > not merely binary. > Exactly. > > > Being able to make a "marked" version of every type seems unnecessary, > > and in some cases, somewhat non-intuitive. 
Take for instance, making > > a "marked" float type - which I can't think of any good use for. > > Such cases may sound exotic, but perhaps not non-existing. For example, > assume one wants to write a heap that supports generating statistics of > all live values, say, for the benefit of testing for memory leaks in > long-running servers. Or assume taking a snapshot of the computation > onto disk and recovering it in another machine with a different > representation of the (non-pointer) data type. Or checking (by checksum > exchange) whether the computational states match in a set of mutually > replicating computers running in lockstep. (I've actually done all of > these, although without the involvement of LLVM.) > > Sounds reasonable, but does LLVM really need to support all of these cases with one big overhaul? -Joshua -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110308/ce506d0d/attachment.html From kremenek at apple.com Tue Mar 8 14:30:49 2011 From: kremenek at apple.com (Ted Kremenek) Date: Tue, 8 Mar 2011 12:30:49 -0800 Subject: [LLVMdev] static analyzer & ubigraph visualization In-Reply-To: <39C9A1A8-AA28-4F00-8E7E-D263241F7B1D@apple.com> References: <06DE3D7A-4D92-4FD9-A233-1407508C4708@apple.com> <39C9A1A8-AA28-4F00-8E7E-D263241F7B1D@apple.com> Message-ID: Hi Carl, I think the trick is to use a debug build of Clang. The ubigraph support is not enabled in a release build as it is in a hot path of the analyzer. I switched to using a debug build and the ubigraph support worked for me. You also need the 'ubiviz' script in your path. Ted On Mar 8, 2011, at 11:26 AM, Ted Kremenek wrote: > Hi Carl, > > For future reference, the correct list for questions on the analyzer is cfe-dev, not llvmdev. > > CCC_UBI is meant to display the analysis path graph (aka "exploded graph") as it gets explored by the analyzer. I haven't run it in a while, so it may be broken. 
I will investigate. > > Ted > > On Mar 8, 2011, at 11:12 AM, Carl Norum wrote: > >> I updated our project recently to use a newer version of clang (we're at r127188 now). That version made our modified ccc-analyze script stop working, so updated that from TOT clang as well. I noticed in the new script a new environment variable check "CCC_UBI" that uses Ubigraph to visualize something... but what? I tried making a simple project with a few kinds of static analyzer errors, but nothing ever shows up in my ubigraph window. What's this feature for and what should I see graphed? >> >> -- Carl >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From krys at polarlights.net Tue Mar 8 14:47:38 2011 From: krys at polarlights.net (krys at polarlights.net) Date: Tue, 8 Mar 2011 15:47:38 -0500 (EST) Subject: [LLVMdev] Using LLVM to convert a language to another language Message-ID: <1299617258.918423143@192.168.4.58> Hi, Sorry for my newbies questions... but here it is... My goal is to have a "shading language", it is very similar to the "C" language but with special tokens. I have a parser and a lexer done with Lex/bison, once I have lexed/parsed my "shading language" I must create 3 new "source code" in OpenCL. It mean that by example for the following : shader matte(float Kd, color c) { Ci = diffuse(Kd * c); } I have to create 3 methods in OpenCL, by example 1 - matte_sampling 2 - matte_pdf 3 - matte_f Each of theses method is in OpenCL (It is like C too). So, I would like to use LLVM to create an optimized version of my shader code. But once I have the LLVM byte code... how can I parse it "easily" to create my shader code ? 
NB: also, you can tell me if it is really a good idea ! Thanks From carl.norum at apple.com Tue Mar 8 15:41:05 2011 From: carl.norum at apple.com (Carl Norum) Date: Tue, 8 Mar 2011 13:41:05 -0800 Subject: [LLVMdev] static analyzer & ubigraph visualization In-Reply-To: References: <06DE3D7A-4D92-4FD9-A233-1407508C4708@apple.com> <39C9A1A8-AA28-4F00-8E7E-D263241F7B1D@apple.com> Message-ID: On Mar 8, 2011, at 12:30 PM, Ted Kremenek wrote: > Hi Carl, > > I think the trick is to use a debug build of Clang. The ubigraph support is not enabled in a release build as it is in a hot path of the analyzer. I switched to using a debug build and the ubigraph support worked for me. You also need the 'ubiviz' script in your path. > > Ted Thanks - I'll give it a try. -- Carl From Micah.Villmow at amd.com Tue Mar 8 16:53:47 2011 From: Micah.Villmow at amd.com (Villmow, Micah) Date: Tue, 8 Mar 2011 16:53:47 -0600 Subject: [LLVMdev] Using LLVM to convert a language to another language In-Reply-To: <1299617258.918423143@192.168.4.58> References: <1299617258.918423143@192.168.4.58> Message-ID: Probably the easiest way is to take the C backend and modify it to handle the special OpenCL constructs that you generate. > -----Original Message----- > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] > On Behalf Of krys at polarlights.net > Sent: Tuesday, March 08, 2011 12:48 PM > To: llvmdev at cs.uiuc.edu > Subject: [LLVMdev] Using LLVM to convert a language to another language > > Hi, > > Sorry for my newbies questions... but here it is... > > My goal is to have a "shading language", it is very similar to the "C" > language but with special tokens. > > I have a parser and a lexer done with Lex/bison, once I have > lexed/parsed my "shading language" I must create 3 new "source code" in > OpenCL. 
> > It mean that by example for the following : > > shader matte(float Kd, color c) > { > Ci = diffuse(Kd * c); > } > > I have to create 3 methods in OpenCL, by example > > 1 - matte_sampling > 2 - matte_pdf > 3 - matte_f > > Each of theses method is in OpenCL (It is like C too). > > So, I would like to use LLVM to create an optimized version of my > shader code. But once I have the LLVM byte code... how can I parse it > "easily" to create my shader code ? > > NB: also, you can tell me if it is really a good idea ! > > Thanks > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From viridia at gmail.com Tue Mar 8 17:13:30 2011 From: viridia at gmail.com (Talin) Date: Tue, 8 Mar 2011 15:13:30 -0800 Subject: [LLVMdev] llvm.gcroot suggestion In-Reply-To: References: Message-ID: On Tue, Mar 8, 2011 at 12:11 PM, Joshua Warner wrote: > > Hi Talin, > > Let me ask a question before we go too much further. Currently the argument >> to llvm.gcroot must be an alloca instruction. You cannot GEP an internal >> field within the alloca and pass it to the gcroot intrinsic. So the entire >> alloca is considered a root, even if it has non-pointer fields. My question >> is, in this new address-space proposal, are we talking about changing this >> so that the garbage collector only "sees" the internal pointer fields within >> the alloca, or will it still be able to "see" the entire alloca? This is the >> crucial point for me - I've written my GC strategy to deal with complex >> allocas, and there are several data types - such as unions - which depend on >> this. >> >> I can probably work around the union issue using methods like you suggest >> - that is building some "dummy" type containing a pointer as the long-term >> storage format - as long as the GC can still see the entire struct. 
It's >> ugly because it means that my frontend has to know about padding and >> alignment and such, issues which I prefer to leave to LLVM to figure out. >> > > Correct me if I am wrong, but to use unions without IR support means you > already have to worry about padding. > Well, I sort of do - I estimate which variant of the union is the largest without knowing its exact size. A long time ago I had hoped to make my frontend generate IR that was completely target-independent, and eventually I had to give up that plan. Unions turned out to be one of the few cases where it's impossible to be target-neutral. However, my frontend is "mostly" target independent in that the only piece of information it currently knows about the target is whether pointers are 32 or 64 bits. I should also mention that tagged unions are surprisingly useful in a statically typed language and I would hope that more languages in the future adopt them. A typical example is how Iterators work in my language: // Iterator that returns the sequence of integers 0..N class CountingIterator : Iterator[int] { var index = 0; var limit; def construct(limit:int) { self.limit = limit; } def next -> int or void { if index < limit { return index++; } // return an int return; // return no value } } > >> But if we change it so that the GC only sees pointers, then I'm dead in >> the water. >> > > In the end, the GC should only be seeing pointers anyway - some of whose > "pointer-ness" depends on other values (as in the tagged union). I think > your method could still work with the GC only seeing pointers (albeit with a > little modification) - the only requirement I see that your method imposes > on the design of a address-space based GC strategy is to maintain > information about the structure (union) containing the pointer, next to the > pointer. For this, metadata should work fine. While it is not particularly > elegant, I don't see why you would be "dead in the water" - because it could > be made to work. 
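
The quoted point, that a collector which only follows pointers can still cope with a tagged union as long as the tag is available next to the payload, can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `Tag`, `TaggedSlot`, `traceSlot` and `tracedPointers` are hypothetical and do not come from LLVM or from either poster's compiler.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical layout for an "int or object reference" tagged union.
// None of these names exist in LLVM; they are for illustration only.
enum class Tag : std::uint8_t { Int, Ref, Void };

struct TaggedSlot {
    Tag tag;
    union {
        std::int64_t asInt;
        void*        asRef;
    };
};

// Report the payload to the collector only when the tag says it is a pointer.
// `visit` stands in for whatever callback a real collector would supply.
template <typename Visitor>
void traceSlot(TaggedSlot& slot, Visitor&& visit) {
    if (slot.tag == Tag::Ref && slot.asRef != nullptr) {
        visit(slot.asRef);  // passed as an lvalue so a moving collector could update it
    }
}

// Demonstration helper: collect every pointer payload that would be traced.
inline std::vector<void*> tracedPointers(std::vector<TaggedSlot>& slots) {
    std::vector<void*> out;
    for (TaggedSlot& s : slots) {
        traceSlot(s, [&out](void*& p) { out.push_back(p); });
    }
    return out;
}
```

The design choice here is that the pointer-ness decision lives in the trace routine, not in the type of the slot, which is why some description of the enclosing structure has to travel with the root.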
> Here's a question - if the only way to identify a root is via pointer address space, then where does the metadata go? Adding a metadata field to the pointer type would also greatly complicate LLVM. My worry is that folks will say "well, since every root is a pointer now, we no longer need the metadata argument to describe it's type." > >> As far as my suggestion of marking types go, you are right, it doesn't >> make sense for most types. It really only matters for structs and pointers. >> Imagine if structs had an "isRoot" flag that lived next to "isPacked", which >> makes the struct a distinct type. This would be written in IR as "gcroot { >> i1, float }" or something like that. The presence of this flag has the same >> effect as marking a pointer in the GC address space. Combine that with the >> ability to mark SSA values as roots, and my life would get vastly simpler >> and my generated code would get about 20% faster :) >> >> > I'm not saying this approach wouldn't work or that it is in any way worse > than the address-space method, but I think it would require many more > changes to how LLVM handles types. One problem with how you are envisioning > it (though not with the idea itself) is that it will probably be beneficial > to be able to track multiple, independent types of roots - for example, > roots for a long-term heap (where Method, Class, etc. might live) and the > normal heap. The address-space method would handle this, but the isRoot() > method would have to be extended to handle distinct roots - more like > isRoot(int rootId) - which would *really* complicate the type system. > > I realize that it has drawbacks. I'm mainly just brainstorming. As far as multiple heaps go: There are two classes of heaps we're talking about. The first class are heaps that have different object lifetime policies. Those kinds of heaps should IMHO be managed entirely by the collector without the need for compiler support. 
In other words, if there's a permgen heap for permanent objects, the collector can store bits in the object header or it can do address comparisons to determine whether an object is in the permgen heap without needing to use the pointer address space field. In fact, using the pointer-address-space property wouldn't work for this, because that's a function of pointer type, and you want the ability to have instances of the same type with different lifetime policies. Objects such as Method and Class will have internal references to instances of String and List that live in the permgen heap along with them, but other objects would have references to String and List in the regular heap. The other class of heap is where you have different classes of memory - RAM and ROM, or NUMA-style shared memory spaces - or where the address space represents some semantic difference that the compiler or optimization passes need to be aware of. These heaps may be garbage collected in addition to whatever other special properties they have. So the property of being garbage collected is (at least for me) a single bit, and orthogonal to whether the object is in a special heap or not. Let me take a step back for a second and think about this thread as a whole. The current LLVM approach to garbage collection requires a division of responsibility between the LLVM libraries and the compilers that call those libraries. As I see it, it's the frontend's job to understand the semantics and structure of data types, and it's LLVM's job to know things like underlying representations and lifetimes of SSA values. The biggest problem that I have with the current system is that garbage collection roots can only live in allocas, not SSA values, so that I am constantly having to load and store values to memory. 
The second biggest problem is related to the first - since the scope of an alloca root is the entire function (because calls to llvm.gcroot have to be in the first block) there's no way for me to tell LLVM that a root is confined to a given lexical block, so I have to generate extra code to zero out the root even if it's dead. In fact, most roots get assigned three times - zeroed out at the beginning of the function, then set to a value sometime later, and then set back to zero when I'm done with the value. What I like about the current system is that the responsibility for interpreting roots is entirely in my hands, and LLVM can treat the entire alloca as an opaque blob if it wants to. In my compiler I treat stack roots exactly like I treat fields within a heap object - the compiler-generated trace tables are exactly the same, except that in the case of stack roots the offsets are negative and the object base address is the frame pointer. The same function that traces the fields of an object also traces the variables on the stack, including handling complex types such as unions and variable length arrays. (Although the latter never occurs on the stack since alloca only takes constant arguments. But I could imagine a SmallVector type situation where you have some number of pointers, only the first N of which have been initialized.) So when we talk about which solution is more elegant, I think we need to look at the elegance of the whole, not just of the LLVM part. That being said, I admit that there is one downside to sticking with the current approach going forward, which I will explain. In the current system, a large stack root has the same cost as a small one, since both are access via pointer. However, in a future version of LLVM in which SSA roots are automatically spilled to memory during a safe point and reloaded after, a large root will be more expensive than a small one. 
This means that there will be pressure to make roots as small as possible, so that if a large struct type has roots confined to just a few fields, it would be more efficient to declare just those fields as roots rather than the entire struct. So I agree with you this much: It would be nice if there was a way for marking of roots to be finer-grained than an entire SSA value, *including* being able to associate metadata with that finer-grained portion. > -Joshua > -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110308/4449b6d1/attachment-0001.html From wendling at apple.com Tue Mar 8 19:51:55 2011 From: wendling at apple.com (Bill Wendling) Date: Tue, 8 Mar 2011 17:51:55 -0800 Subject: [LLVMdev] LLVM 2.9 RC1 Pre-release Tarballs Message-ID: <3EF399DF-E592-470B-98C4-5051A6828EC4@apple.com> There are LLVM 2.9 RC1 pre-release tarballs source available. You can find them here: http://llvm.org/pre-releases/2.9/ Please download them, build them, and compile things to your heart's content. And most importantly file a bunch of bug reports. :-) Share and enjoy! -bw From joshuawarner32 at gmail.com Tue Mar 8 20:45:18 2011 From: joshuawarner32 at gmail.com (Joshua Warner) Date: Tue, 8 Mar 2011 19:45:18 -0700 Subject: [LLVMdev] llvm.gcroot suggestion In-Reply-To: References: Message-ID: Hi Talin, > Well, I sort of do - I estimate which variant of the union is the largest > without knowing its exact size. A long time ago I had hoped to make my > frontend generate IR that was completely target-independent, and eventually > I had to give up that plan. Unions turned out to be one of the few cases > where it's impossible to be target-neutral. However, my frontend is "mostly" > target independent in that the only piece of information it currently knows > about the target is whether pointers are 32 or 64 bits. 
> > I still think complete platform independence is something that *all* systems, including LLVM should aspire to, but (probably) never attain. > I should also mention that tagged unions are surprisingly useful in a > statically typed language and I would hope that more languages in the future > adopt them. A typical example is how Iterators work in my language: > > I completely agree - I included them, albeit in a slightly different form, in my own pet language. > } > >> >>> But if we change it so that the GC only sees pointers, then I'm dead in >>> the water. >>> >> >> In the end, the GC should only be seeing pointers anyway - some of whose >> "pointer-ness" depends on other values (as in the tagged union). I think >> your method could still work with the GC only seeing pointers (albeit with a >> little modification) - the only requirement I see that your method imposes >> on the design of a address-space based GC strategy is to maintain >> information about the structure (union) containing the pointer, next to the >> pointer. For this, metadata should work fine. While it is not particularly >> elegant, I don't see why you would be "dead in the water" - because it could >> be made to work. >> > > Here's a question - if the only way to identify a root is via pointer > address space, then where does the metadata go? Adding a metadata field to > the pointer type would also greatly complicate LLVM. My worry is that folks > will say "well, since every root is a pointer now, we no longer need the > metadata argument to describe it's type." > I'm not quite sure what you are asking. I don't know enough about LLVM to tell you how it should be done off the top of my head - but it should just be a matter of attaching the metadata to the instruction that produces the value. In order for this to work, LLVM would have to include the IR instruction that produced the value in the generated stack-root data. > I realize that it has drawbacks. I'm mainly just brainstorming. 
> Likewise. > > As far as multiple heaps go: There are two classes of heaps we're talking > about. The first class are heaps that have different object lifetime > policies. Those kinds of heaps should IMHO be managed entirely by the > collector without the need for compiler support. > Perhaps that was a bad example - but I'm sure there are valid use cases of where you might want to independently track different types of roots. This is something the address-space solution would handle naturally. > > The other class of heap is where you have different classes of memory - RAM > and ROM, or NUMA-style shared memory spaces - or where the address space > represents some semantic difference that the compiler or optimization passes > need to be aware of. These heaps may be garbage collected in addition to > whatever other special properties they have. So the property of being > garbage collected is (at least for me) a single bit, and orthogonal to > whether the object is in a special heap or not. > > Let me take a step back for a second and think about this thread as a > whole. The current LLVM approach to garbage collection requires a division > of responsibility between the LLVM libraries and the compilers that call > those libraries. As I see it, it's the frontend's job to understand the > semantics and structure of data types, and it's LLVM's job to know things > like underlying representations and lifetimes of SSA values. > > The biggest problem that I have with the current system is that garbage > collection roots can only live in allocas, not SSA values, so that I am > constantly having to load and store values to memory. The second biggest > problem is related to the first - since the scope of an alloca root is the > entire function (because calls to llvm.gcroot have to be in the first block) > there's no way for me to tell LLVM that a root is confined to a given > lexical block, so I have to generate extra code to zero out the root even if > it's dead. 
In fact, most roots get assigned three times - zeroed out at the > beginning of the function, then set to a value sometime later, and then set > back to zero when I'm done with the value. > This is something that LLVM should handle completely, independent of which solution is chosen - just propagate the liveness info into the generated stack maps. Not being very familiar with the internals of LLVM (I've used it a lot, but not hacked on it), I'm not sure how this would work. > > What I like about the current system is that the responsibility for > interpreting roots is entirely in my hands, and LLVM can treat the entire > alloca as an opaque blob if it wants to. In my compiler I treat stack roots > exactly like I treat fields within a heap object - the compiler-generated > trace tables are exactly the same, except that in the case of stack roots > the offsets are negative and the object base address is the frame pointer. > The same function that traces the fields of an object also traces the > variables on the stack, including handling complex types such as unions and > variable length arrays. (Although the latter never occurs on the stack since > alloca only takes constant arguments. But I could imagine a SmallVector type > situation where you have some number of pointers, only the first N of which > have been initialized.) > > So when we talk about which solution is more elegant, I think we need to > look at the elegance of the whole, not just of the LLVM part. > I agree - but when a little effort can get you *most* of what you (in the general sense) want, IMHO it is almost always preferable to putting tons of work into making it *perfect*. > > That being said, I admit that there is one downside to sticking with the > current approach going forward, which I will explain. In the current system, > a large stack root has the same cost as a small one, since both are access > via pointer. 
However, in a future version of LLVM in which SSA roots are > automatically spilled to memory during a safe point and reloaded after, a > large root will be more expensive than a small one. This means that there > will be pressure to make roots as small as possible, so that if a large > struct type has roots confined to just a few fields, it would be more > efficient to declare just those fields as roots rather than the entire > struct. > I don't think it is a matter of efficiency - that should be completely up to the spiller making good choices, and the stack maps (perhaps optionally) being smart enough to recognize things in registers. > So I agree with you this much: It would be nice if there was a way for > marking of roots to be finer-grained than an entire SSA value, *including* > being able to associate metadata with that finer-grained portion. > > The main benefit that the address-space method has is that it should be a big improvement over the current solution and it should (as far as I can tell) be relatively simple to implement. -Joshua -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110308/8fe7583c/attachment.html From rafael.espindola at gmail.com Tue Mar 8 21:50:40 2011 From: rafael.espindola at gmail.com (=?ISO-8859-1?Q?Rafael_=C1vila_de_Esp=EDndola?=) Date: Tue, 08 Mar 2011 22:50:40 -0500 Subject: [LLVMdev] A working garbage collector - finally :) In-Reply-To: References: Message-ID: <4D76F910.202@gmail.com> Since you are using a copying collector, may I ask how do you handle registers holding pointers to intermediate values? 
The example I am considering is something like

void foo(void);
void f(long long *v, long long n) {
  long long int i;
  for (i = 0; i < n; ++i) {
    v[i] = i;
    foo();
  }
}

If *v points to gc memory, we can start the function with

  %v.addr = alloca i64*, align 8
  %foobar = bitcast i64** %v.addr to i8**
  call void @llvm.gcroot(i8** %foobar, i8* null)
  store i64* %v, i64** %v.addr, align 8

but nothing prevents llvm from putting the call to foo in between the computation of &v[i] and the store by using a callee saved register. If GC moves v during the call to foo, the next store will be wrong.

> -- Talin
>

Cheers,
Rafael

From dongsheng.song at gmail.com Tue Mar 8 21:41:38 2011
From: dongsheng.song at gmail.com (Dongsheng Song)
Date: Wed, 9 Mar 2011 11:41:38 +0800
Subject: [LLVMdev] [cfe-dev] LLVM 2.9 RC1 Pre-release Tarballs
In-Reply-To: <3EF399DF-E592-470B-98C4-5051A6828EC4@apple.com>
References: <3EF399DF-E592-470B-98C4-5051A6828EC4@apple.com>
Message-ID:

On Wed, Mar 9, 2011 at 09:51, Bill Wendling wrote:
> There are LLVM 2.9 RC1 pre-release tarballs source available. You can find
> them here:
>
> http://llvm.org/pre-releases/2.9/
>
> Please download them, build them, and compile things to your heart's
> content. And most importantly file a bunch of bug reports. :-)
>
> Share and enjoy!
> -bw
>
>
Your clang-2.9rc1.src.tar.gz is bad, it should be named clang-2.9rc1.src.tar.gz.gz !!!

oracle at vc:~/tmp$ gzip -cd clang-2.9rc1.src.tar.gz > clang-2.9rc1.src.tar
oracle at vc:~/tmp$ file clang-2.9rc1.src.tar
clang-2.9rc1.src.tar: gzip compressed data, from Unix, last modified: Wed Mar 9 09:38:38 2011
oracle at vc:~/tmp$ gzip -cd clang-2.9rc1.src.tar > clang-2.9rc1.src.tar2
oracle at vc:~/tmp$ tar xf clang-2.9rc1.src.tar2

--
Dongsheng
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110309/e57b9037/attachment.html

From dongsheng.song at gmail.com Tue Mar 8 21:49:51 2011
From: dongsheng.song at gmail.com (Dongsheng Song)
Date: Wed, 9 Mar 2011 11:49:51 +0800
Subject: [LLVMdev] [cfe-dev] LLVM 2.9 RC1 Pre-release Tarballs
In-Reply-To: <3EF399DF-E592-470B-98C4-5051A6828EC4@apple.com>
References: <3EF399DF-E592-470B-98C4-5051A6828EC4@apple.com>
Message-ID:

On Wed, Mar 9, 2011 at 09:51, Bill Wendling wrote:
> There are LLVM 2.9 RC1 pre-release tarballs source available. You can find
> them here:
>
> http://llvm.org/pre-releases/2.9/
>
> Please download them, build them, and compile things to your heart's
> content. And most importantly file a bunch of bug reports. :-)
>
> Share and enjoy!
> -bw
>
>
Your clang-2.9rc1.src.tar.gz is bad, it should be named clang-2.9rc1.src.tar.gz.gz !!!

oracle at vc:~/tmp$ gzip -cd clang-2.9rc1.src.tar.gz > clang-2.9rc1.src.tar
oracle at vc:~/tmp$ file clang-2.9rc1.src.tar
clang-2.9rc1.src.tar: gzip compressed data, from Unix, last modified: Wed Mar 9 09:38:38 2011
oracle at vc:~/tmp$ gzip -cd clang-2.9rc1.src.tar > clang-2.9rc1.src.tar2
oracle at vc:~/tmp$ tar xf clang-2.9rc1.src.tar2

--
Dongsheng
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110309/76bcf73a/attachment.html

From charlie.garrett at gmail.com Wed Mar 9 00:16:35 2011
From: charlie.garrett at gmail.com (Charlie Garrett)
Date: Tue, 8 Mar 2011 22:16:35 -0800
Subject: [LLVMdev] Is InstructionSimplify still a good place to contribute?
Message-ID:

I'm looking for some opportunities to contribute to the LLVM core without too big of a learning curve. The open projects page suggests moving logic from the instruction combining transformation to the InstructionSimplify analysis. But it looks like some work has been done there since the projects page was last edited. Is there more benefit to be gained?
Thanks,
--
Charlie Garrett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110308/eb27bda3/attachment.html

From sabre at nondot.org Wed Mar 9 00:49:44 2011
From: sabre at nondot.org (Chris Lattner)
Date: Tue, 8 Mar 2011 22:49:44 -0800
Subject: [LLVMdev] Writing a compiler to use LLVM for code generation
In-Reply-To: <201102141715.23018.jim@jimbocorp.uklinux.net>
References: <201102091425.43840.jim@jimbocorp.uklinux.net> <201102141715.23018.jim@jimbocorp.uklinux.net>
Message-ID:

On Feb 14, 2011, at 9:15 AM, Jim Darby wrote:
> Hi Chris,
> Thanks for the pointer to the LLVM tutorial, it's helped immensely.
> I've been playing around a bit to see how clang generates the IR and it now all makes a lot more sense. However, there's one issue that's causing me a little concern. It all centres around allocating local variables using the alloca operation with block structured code.
> Using C as an example language, consider the following example program.
> void
> bar ()
> {
>   extern void called (char *);
>   {
>     char x1 [10000];
>     called (x1);
>   }
>   {
>     char x2 [15000];
>     called (x2);
>   }
> }
> Now in C it is possible to overlap the storage for x1 and x2. In fact the stack frame can be considered something like union { char x1[10000]; char x2[15000]; }. In the general case the blocks of the same lexical level inside a function can all be considered as forming part of a union. However, I notice that clang generates the space for both x1 and x2 at the same time, meaning the above function allocates 25000 bytes on the stack when it only needs 15000.
> If you put the same code through gcc you'll find it only allocates just over 15000 bytes of local storage (the 15000 plus a little overhead).
> This has no effect on performance but does have an effect on the amount of stack space a program uses.
Now I know that we shouldn't be allocating vast amounts on the stack but the above program is just an example to prove the point.
> Any thoughts on this one? One potential method is to actually form a union to handle the stack frame and use that explicitly. This has the rather novel effect that all functions would only have a single alloca to create all the local variables.
> Many thanks for your help,
> Jim.

Hi Jim,

It's best to email llvmdev with general questions, as I get swamped and backlogged frequently. We don't have a good answer at present to the problem above, but I have some thoughts on the matter here:
http://nondot.org/sabre/LLVMNotes/MemoryUseMarkers.txt

We do currently have a hack in the inliner to reuse stack memory for arrays that are inlined from different callees into a common caller.

-Chris
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110308/e5846085/attachment.html

From geek4civic at gmail.com Wed Mar 9 01:22:26 2011
From: geek4civic at gmail.com (NAKAMURA Takumi)
Date: Wed, 9 Mar 2011 16:22:26 +0900
Subject: [LLVMdev] [RC1] Building clang/llvm on Cygwin-1.7
Message-ID:

Hello guys,

On cygwin-1.7, I can build and test clang successfully with a 3-stage build.

Known issues:
- Binaries from stage2 and stage3 do not match (apart from timestamp and checksum). Investigating.
- I hit some warnings. I have fixes for them.
  - [llvm] r127241
  - [llvm] http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20110307/117725.html
  - [clang] r127283
  - [clang] r127308
- It seems generated binaries are too slow to start up. I guess this is unrelated to the "rebasing issue".

P.S. I will no longer support cygwin-1.5.

...Takumi

CYGWIN_NT-6.1-WOW64 HEAVEN64 1.7.7(0.230/5/3) 2010-08-31 09:58 i686 Cygwin

****stage1
g++ (GCC) 4.3.4 20090804 (release) 1
Copyright (C) 2008 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. llvm config.status 2.9svn configured by ../../llvm/configure, generated by GNU Autoconf 2.60, with options "'-C' '--enable-targets=all' '--enable-optimized' '--prefix=/cygdrive/e/llvm/build/cygwin-static/install/stage1' '--with-optimize-option=-O3 -Werror'" -- Testing: 8767 tests, 1 threads -- Testing Time: 6746.50s Expected Passes : 8149 Expected Failures : 71 Unsupported Tests : 547 ****stage2 llvm config.status 2.9svn configured by ../../../llvm/configure, generated by GNU Autoconf 2.60, with options "'-C' '--enable-targets=all' '--enable-optimized' '--disable-assertions' '--prefix=/cygdrive/e/llvm/build/cygwin-static/stage2/../install/stage2' '--with-optimize-option=-O3 -Werror' 'CC=/cygdrive/e/llvm/build/cygwin-static/stage2/../Release+Asserts/bin/clang.exe' 'CXX=/cygdrive/e/llvm/build/cygwin-static/stage2/../Release+Asserts/bin/clang++.exe'" -- Testing: 8758 tests, 1 threads -- Testing Time: 6047.76s Expected Passes : 8140 Expected Failures : 71 Unsupported Tests : 547 ****stage3 llvm config.status 2.9svn configured by ../../../llvm/configure, generated by GNU Autoconf 2.60, with options "'-C' '--enable-targets=all' '--enable-optimized' '--disable-assertions' '--prefix=/cygdrive/e/llvm/build/cygwin-static/install/stage3' '--with-optimize-option=-O3 -Werror' 'CC=/cygdrive/e/llvm/build/cygwin-static/stage2/Release/bin/clang.exe' 'CXX=/cygdrive/e/llvm/build/cygwin-static/stage2/Release/bin/clang++.exe'" -- Testing: 8758 tests, 1 threads -- Testing Time: 7687.44s Expected Passes : 8140 Expected Failures : 71 Unsupported Tests : 547 From baldrick at free.fr Wed Mar 9 02:04:55 2011 From: baldrick at free.fr (Duncan Sands) Date: Wed, 09 Mar 2011 09:04:55 +0100 Subject: [LLVMdev] Is InstructionSimplify still a good place to contribute? 
In-Reply-To: References: Message-ID: <4D7734A7.5080804@free.fr> Hi Charlie, > I'm looking for some opportunities to contribute to the LLVM core without too > big of a learning curve. The open projects page > suggests moving logic from the > instruction combining transformation to the InstructionSimplify analysis. But > it looks like some work has been done there since the projects page was last > edited. Is there more benefit to be gained? yes. InstSimplify is for transforms that do not create new instructions, for example X-(X-Y) can be simplified to Y by instsimplify because Y was already present. On the other hand X-(X+Y) cannot be simplified to -Y by instsimplify because it would have to create -Y as it wasn't present in the original expression. If you start looking through instcombine you should quickly find some transforms that can be moved to instsimplify. Sometimes entire transforms cannot be moved, but special cases of the transform can be. Of course a lot of important transforms were moved already. Ciao, Duncan. From geek4civic at gmail.com Wed Mar 9 02:57:28 2011 From: geek4civic at gmail.com (NAKAMURA Takumi) Date: Wed, 9 Mar 2011 17:57:28 +0900 Subject: [LLVMdev] [cfe-dev] LLVM 2.9 RC1 Pre-release Tarballs In-Reply-To: References: <3EF399DF-E592-470B-98C4-5051A6828EC4@apple.com> Message-ID: On Wed, Mar 9, 2011 at 5:42 PM, Dongsheng Song wrote: > The following failures is very strange: See Bug 6745 - .ll output is different on windows. http://llvm.org/bugs/show_bug.cgi?id=6745 On mingw32, PRINTF_EXPONENT_DIGITS affects printf and tests would run with PRINTF_EXPONENT_DIGITS=2. An addition, it seems mingw-w64's printf tends to drop minus sign of "-0.0". (is it same on i686? is it fixed in trunk?) 
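
For readers comparing the two outputs in the failing tests ("1.100000e+000" versus "1.100000e+00"), one portable way to make such a comparison independent of the C runtime's exponent width is to normalize the string after formatting. The helper below is a hedged sketch only: `normalizeExponent` is a hypothetical name, and nothing here is claimed to be how LLVM or its test suite handles the issue.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: rewrite a three-digit exponent that has a leading zero
// ("e+000", "e-012") into the two-digit form ("e+00", "e-12").
// Strings without such an exponent are returned unchanged.
std::string normalizeExponent(std::string s) {
    std::size_t e = s.find_last_of("eE");
    if (e == std::string::npos || e + 1 >= s.size()) return s;

    std::size_t digits = e + 1;
    if (s[digits] == '+' || s[digits] == '-') ++digits;

    // Only touch exponents that are exactly three digits and start with '0'.
    if (s.size() - digits == 3 && s[digits] == '0' &&
        std::isdigit(static_cast<unsigned char>(s[digits + 1])) &&
        std::isdigit(static_cast<unsigned char>(s[digits + 2]))) {
        s.erase(digits, 1);
    }
    return s;
}
```

Exponents that genuinely need three digits, such as "e+100", are left alone because their first digit is not zero.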
...Takumi From sanjoy at playingwithpointers.com Wed Mar 9 03:41:16 2011 From: sanjoy at playingwithpointers.com (Sanjoy Das) Date: Wed, 09 Mar 2011 15:11:16 +0530 Subject: [LLVMdev] First Patch In-Reply-To: References: <4D734FB9.7030209@playingwithpointers.com> <4D75BC57.9000700@playingwithpointers.com> Message-ID: <4D774B3C.8000009@playingwithpointers.com> Hi! Thanks for the feedback. Have attached a (smaller / simpler) patch which addresses the issues pointed out. -- Sanjoy Das http://playingwithpointers.com -------------- next part -------------- A non-text attachment was scrubbed... Name: 2.diff Type: text/x-diff Size: 3207 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110309/a1e62a4f/attachment.bin From Olaf.Krzikalla at tu-dresden.de Wed Mar 9 04:32:46 2011 From: Olaf.Krzikalla at tu-dresden.de (Olaf Krzikalla) Date: Wed, 09 Mar 2011 11:32:46 +0100 Subject: [LLVMdev] MSVC compiling issue In-Reply-To: References: <4D75F1C5.6040900@tu-dresden.de> Message-ID: <4D77574E.6010608@tu-dresden.de> Hi @llvm, Am 08.03.2011 20:14, schrieb Jakob Stoklund Olesen: > Is that extra method getting called? What happens if you stick assert(0) in there? That won't work either (that is, the assert fires). In debug mode the MSVC lib tries to test the ordering of the sequence. And it uses the yielded predicate for this (which in this particular case is a very bad idea). > I hoped the symmetric methods would be enough to trick MSVC into compiling it. Does that mean, that gcc actually only needs bool operator()(const LiveRange&A, SlotIndex B) ? According to C++(2003) 25.0.0.8 the answer is "yes", however that section talks about BinaryPredicate and not Compare. The standard is rather unclear at this point and I'm going over to comp.std.c++ to ask. 
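
As an illustration of the situation discussed here, the sketch below shows a comparator with asymmetric argument types used with `std::upper_bound`, which invokes it as `comp(value, element)`. The extra overloads are there only so that a library's debug-mode ordering checks, which may call the predicate with swapped arguments or with two elements, still compile and agree on the ordering. `Range`, `Index`, `CompareEnd` and `firstRangeEndingAfter` are simplified stand-ins, not the actual LLVM `LiveRange` / `SlotIndex` code, and the assumption that the search is an upper_bound-style lookup is mine.

```cpp
#include <algorithm>
#include <vector>

// Simplified stand-ins for the types in the thread.
struct Range { int start; int end; };
using Index = int;

struct CompareEnd {
    // The form std::upper_bound needs: comp(value, element).
    bool operator()(Index a, const Range& b) const { return a < b.end; }
    // Swapped and symmetric forms, consistent with the one above, for
    // implementations that verify the ordering of the sequence.
    bool operator()(const Range& a, Index b) const { return a.end < b; }
    bool operator()(const Range& a, const Range& b) const { return a.end < b.end; }
};

// Returns the first range whose end lies strictly after idx.
// Precondition: ranges is sorted by ascending end.
inline std::vector<Range>::const_iterator
firstRangeEndingAfter(const std::vector<Range>& ranges, Index idx) {
    return std::upper_bound(ranges.begin(), ranges.end(), idx, CompareEnd());
}
```

Keeping all three overloads mutually consistent matters more than which one a given standard library happens to call.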
Best regards Olaf Krzikalla From 6yearold at gmail.com Wed Mar 9 06:18:11 2011 From: 6yearold at gmail.com (arrowdodger) Date: Wed, 9 Mar 2011 15:18:11 +0300 Subject: [LLVMdev] Discrepancies between bin/llvm-config --libs and LLVM_LINK_COMPONENTS in CMake. Message-ID: Hello. When i run llvm-config --libs jit bitreader bitwriter ipo linker engine i get: -lLLVMX86Disassembler -lLLVMX86AsmParser -lLLVMX86CodeGen -lLLVMSelectionDAG -lLLVMAsmPrinter -lLLVMMCParser -lLLVMX86AsmPrinter -lLLVMX86Utils -lLLVMX86Info -lLLVMLinker -lLLVMArchive -lLLVMipo -lLLVMBitWriter -lLLVMBitReader -lLLVMJIT -lLLVMExecutionEngine -lLLVMCodeGen -lLLVMScalarOpts -lLLVMInstCombine -lLLVMTransformUtils -lLLVMipa -lLLVMAnalysis -lLLVMTarget -lLLVMMC -lLLVMCore -lLLVMSupport Now i use in CMakeLists.txt following line: set(LLVM_LINK_COMPONENTS jit bitreader bitwriter ipo linker engine) After running CMake, target's link.txt contains following: /usr/bin/c++ <...> -lLLVMBitWriter -lLLVMipo -lLLVMLinker -lLLVMJIT -lLLVMArchive -lLLVMCodeGen -lLLVMExecutionEngine -lLLVMBitReader -lLLVMScalarOpts -lLLVMInstCombine -lLLVMTransformUtils -lLLVMipa -lLLVMAnalysis -lLLVMTarget -lLLVMCore -lLLVMMC -lLLVMSupport As you can see, this differs from output of llvm-config tool. This causes linking to fail due to unresolved externals. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110309/33c5a7ce/attachment.html From ofv at wanadoo.es Wed Mar 9 08:50:14 2011 From: ofv at wanadoo.es (=?utf-8?Q?=C3=93scar_Fuentes?=) Date: Wed, 09 Mar 2011 15:50:14 +0100 Subject: [LLVMdev] Discrepancies between bin/llvm-config --libs and LLVM_LINK_COMPONENTS in CMake. 
References: Message-ID: <87wrk8tkk9.fsf@wanadoo.es> arrowdodger <6yearold at gmail.com> writes: > llvm-config --libs jit bitreader bitwriter ipo linker engine [snip] > Now i use in CMakeLists.txt following line: > > set(LLVM_LINK_COMPONENTS jit bitreader bitwriter ipo linker engine) [snip] > As you can see, this differs from output of llvm-config tool. This causes > linking to fail due to unresolved externals. See if r127333 fixes the problem for you. From dongsheng.song at gmail.com Wed Mar 9 02:42:03 2011 From: dongsheng.song at gmail.com (Dongsheng Song) Date: Wed, 9 Mar 2011 16:42:03 +0800 Subject: [LLVMdev] [cfe-dev] LLVM 2.9 RC1 Pre-release Tarballs In-Reply-To: References: <3EF399DF-E592-470B-98C4-5051A6828EC4@apple.com> Message-ID: On Wed, Mar 9, 2011 at 11:41, Dongsheng Song wrote: > On Wed, Mar 9, 2011 at 09:51, Bill Wendling wrote: > >> There are LLVM 2.9 RC1 pre-release tarballs source available. You can find >> them here: >> >> http://llvm.org/pre-releases/2.9/ >> >> Please download them, build them, and compile things to your heart's >> content. And most importantly file a bunch of bug reports. :-) >> >> Share and enjoy! >> -bw >> >> > Your clang-2.9rc1.src.tar.gz is bad, it should named to > clang-2.9rc1.src.tar.gz.gz !!! 
>
> oracle at vc:~/tmp$ gzip -cd clang-2.9rc1.src.tar.gz > clang-2.9rc1.src.tar
> oracle at vc:~/tmp$ file clang-2.9rc1.src.tar
> clang-2.9rc1.src.tar: gzip compressed data, from Unix, last modified: Wed Mar 9 09:38:38 2011
> oracle at vc:~/tmp$ gzip -cd clang-2.9rc1.src.tar > clang-2.9rc1.src.tar2
> oracle at vc:~/tmp$ tar xf clang-2.9rc1.src.tar2
>

Test on mingw64-trunk (i686-w64-mingw32, 4.5.3 20110308):

Expected Passes : 5114
Expected Failures : 48
Unsupported Tests : 547
Unexpected Failures: 130

The following failures is very strange:

C:\var\pool\llvm-2.9rc1\unittests\Support\raw_ostream_test.cpp:100: Failure
Value of: printToStringUnbuffered(1.1)
Actual: "1.100000e+000"
Expected: "1.100000e+00"
[ FAILED ] raw_ostreamTest.Types_Unbuffered (0 ms)
C:\var\pool\llvm-2.9rc1\unittests\Support\raw_ostream_test.cpp:69: Failure
Value of: printToString(1.1)
Actual: "1.100000e+000"
Expected: "1.100000e+00"
[ FAILED ] raw_ostreamTest.Types_Buffered (0 ms)
[----------] 1 test from raw_ostreamTest (0 ms total)

--
Dongsheng Song

From dongsheng.song at gmail.com Wed Mar 9 03:54:00 2011
From: dongsheng.song at gmail.com (Dongsheng Song)
Date: Wed, 9 Mar 2011 17:54:00 +0800
Subject: [LLVMdev] [cfe-dev] LLVM 2.9 RC1 Pre-release Tarballs
In-Reply-To:
References: <3EF399DF-E592-470B-98C4-5051A6828EC4@apple.com>
Message-ID:

On Wed, Mar 9, 2011 at 16:57, NAKAMURA Takumi wrote:

> On Wed, Mar 9, 2011 at 5:42 PM, Dongsheng Song
> wrote:
> > The following failures is very strange:
>
> See Bug 6745 - .ll output is different on windows.
> http://llvm.org/bugs/show_bug.cgi?id=6745
> On mingw32, PRINTF_EXPONENT_DIGITS affects printf and tests would run
> with PRINTF_EXPONENT_DIGITS=2.
>
> An addition, it seems mingw-w64's printf tends to drop minus sign of
> "-0.0".
> (is it same on i686? is it fixed in trunk?)
>
>
Yes and Not.

From stoklund at 2pi.dk Wed Mar 9 09:20:31 2011
From: stoklund at 2pi.dk (Jakob Stoklund Olesen)
Date: Wed, 9 Mar 2011 07:20:31 -0800
Subject: [LLVMdev] MSVC compiling issue
In-Reply-To: <4D77574E.6010608@tu-dresden.de>
References: <4D75F1C5.6040900@tu-dresden.de> <4D77574E.6010608@tu-dresden.de>
Message-ID: <1455E269-7531-43CE-BDF4-E0BDCBBAD369@2pi.dk>

On Mar 9, 2011, at 2:32 AM, Olaf Krzikalla wrote:

> Hi @llvm,
>
> On 08.03.2011 20:14, Jakob Stoklund Olesen wrote:
>> Is that extra method getting called? What happens if you stick assert(0) in there?
> That won't work either (that is, the assert fires). In debug mode the MSVC lib tries to test the ordering of the sequence. And it uses the yielded predicate for this (which in this particular case is a very bad idea).

I see. I guess that makes sense if it is written assuming symmetric types.

>
>> I hoped the symmetric methods would be enough to trick MSVC into compiling it.
> Does that mean, that gcc actually only needs
>
> bool operator()(const LiveRange&A, SlotIndex B) ?

Actually, the other way around.

> According to C++(2003) 25.0.0.8 the answer is "yes", however that section talks about BinaryPredicate and not Compare. The standard is rather unclear at this point and I'm going over to comp.std.c++ to ask.

Howard Hinnant was kind enough to clarify this a while back.

http://lists.cs.uiuc.edu/pipermail/cfe-dev/2010-August/010379.html

>

From baldrick at free.fr Wed Mar 9 10:40:30 2011
From: baldrick at free.fr (Duncan Sands)
Date: Wed, 09 Mar 2011 17:40:30 +0100
Subject: [LLVMdev] TargetData::getPreferredAlignment(const GlobalVariable *GV) is strange ...
In-Reply-To:
References:
Message-ID: <4D77AD7E.8050008@free.fr>

Hi Fabian,

> I am somewhat confused by the following method within the LLVM,
> especially the lines
> "confusion starts" -> "confusion ends" are hard to follow.

yes, this seems like a wart. It has been there ever since Chris added the
getPreferredAlignmentLog method in commit 25978. Maybe he can comment on
whether the code to bump up the alignment for big objects is still needed.

Ciao, Duncan.

> Maybe the
> idea is that if there
> are such big data structures one does not waste much memory anyway if
> they are aligned
> to a 16-byte boundary. However, my assembler complains here because
> it only supports
> 1-, 2-, 4- and 8-byte boundaries :-(
>
> I checked the svn log but I didn't find any explanation, the doxygen
> docu is not very helpful
> here, too. So, any help on this issue is highly appreciated.
>
> Thanks in advance!
>
> Ciao, Fabian
>
> unsigned TargetData::getPreferredAlignment(const GlobalVariable *GV) const {
>   const Type *ElemType = GV->getType()->getElementType();
>   unsigned Alignment = getPrefTypeAlignment(ElemType);
>   if (GV->getAlignment() > Alignment)
>     Alignment = GV->getAlignment();
>
> ==================== confusion starts ========================
>   if (GV->hasInitializer()) {
>     if (Alignment < 16) {
>       // If the global is not external, see if it is large. If so, give it a
>       // larger alignment.
>       if (getTypeSizeInBits(ElemType) > 128)
>         Alignment = 16; // 16-byte alignment.
>     }
>   }
> ==================== confusion ends =========================
>
>   return Alignment;
> }

From grosbach at apple.com Wed Mar 9 10:55:50 2011
From: grosbach at apple.com (Jim Grosbach)
Date: Wed, 9 Mar 2011 08:55:50 -0800
Subject: [LLVMdev] Using LLVM to convert a language to another language
In-Reply-To: <1299617258.918423143@192.168.4.58>
References: <1299617258.918423143@192.168.4.58>
Message-ID: <937D110F-9966-4E8D-AF1B-DE27A45D00AC@apple.com>

On Mar 8, 2011, at 12:47 PM, krys at polarlights.net wrote:

> Hi,
>
> Sorry for my newbies questions... but here it is...
>
> My goal is to have a "shading language", it is very similar to the "C" language but with special tokens.
>
> I have a parser and a lexer done with Lex/bison, once I have lexed/parsed my "shading language" I must create 3 new "source code" in OpenCL.
>
> It mean that by example for the following :
>
> shader matte(float Kd, color c)
> {
> Ci = diffuse(Kd * c);
> }
>
> I have to create 3 methods in OpenCL, by example
>
> 1 - matte_sampling
> 2 - matte_pdf
> 3 - matte_f
>
> Each of theses method is in OpenCL (It is like C too).
>
> So, I would like to use LLVM to create an optimized version of my shader code. But once I have the LLVM byte code... how can I parse it "easily" to create my shader code ?

At a high level, this sounds like a task better suited for Clang. I'd suggest asking on cfe-dev about source-to-source rewriting.

-Jim

>
> NB: also, you can tell me if it is really a good idea !
>
> Thanks
>

From sabre at nondot.org Wed Mar 9 12:03:18 2011
From: sabre at nondot.org (Chris Lattner)
Date: Wed, 9 Mar 2011 10:03:18 -0800
Subject: [LLVMdev] TargetData::getPreferredAlignment(const GlobalVariable *GV) is strange ...
In-Reply-To: <4D77AD7E.8050008@free.fr>
References: <4D77AD7E.8050008@free.fr>
Message-ID: <10F0D572-D33D-425D-9124-6DB392462C9E@nondot.org>

On Mar 9, 2011, at 8:40 AM, Duncan Sands wrote:

> Hi Fabian,
>
>> I am somewhat confused by the following method within the LLVM,
>> especially the lines
>> "confusion starts" -> "confusion ends" are hard to follow.
>
> yes, this seems like a wart. It has been there ever since Chris added the
> getPreferredAlignmentLog method in commit 25978. Maybe he can comment on
> whether the code to bump up the alignment for big objects is still needed.

Ah, I really vaguely remember this. IIRC, there was some fortran benchmark (that was running through f2c) which had a large array of doubles. On X86, double is only 4-byte aligned, and the huge array was getting put at a "mod 16=4" offset. This caused really really awful performance.

A reasonable solution to this was to bump up the alignment of stuff proactively, but you don't want to do this for everything, because this ends up wasting lots of memory. GCC has a similar policy IIRC.

Is there some problem that this is causing?

-Chris

> Maybe the
>> idea is that if there
>> are such big data structures one does not waste much memory anyway if
>> they are aligned
>> to a 16-byte boundary. However, my assembler complains here because
>> it only supports
>> 1-, 2-, 4- and 8-byte boundaries :-(
>>
>> I checked the svn log but I didn't find any explanation, the doxygen
>> docu is not very helpful
>> here, too. So, any help on this issue is highly appreciated.
>>
>> Thanks in advance!
>>
>> Ciao, Fabian
>>
>> unsigned TargetData::getPreferredAlignment(const GlobalVariable *GV) const {
>>   const Type *ElemType = GV->getType()->getElementType();
>>   unsigned Alignment = getPrefTypeAlignment(ElemType);
>>   if (GV->getAlignment() > Alignment)
>>     Alignment = GV->getAlignment();
>>
>> ==================== confusion starts ========================
>>   if (GV->hasInitializer()) {
>>     if (Alignment < 16) {
>>       // If the global is not external, see if it is large. If so, give it a
>>       // larger alignment.
>>       if (getTypeSizeInBits(ElemType) > 128)
>>         Alignment = 16; // 16-byte alignment.
>>     }
>>   }
>> ==================== confusion ends =========================
>>
>>   return Alignment;
>> }

From wendling at apple.com Wed Mar 9 13:06:41 2011
From: wendling at apple.com (Bill Wendling)
Date: Wed, 9 Mar 2011 11:06:41 -0800
Subject: [LLVMdev] 2.9: Umbrella PR
Message-ID: <511987EB-B952-418F-B19F-9774337F3798@apple.com>

Hey all,

I filed this umbrella bug report to track issues we found in the 2.9 release:

http://llvm.org/bugs/show_bug.cgi?id=9441

All 2.9 bugs will be listed in there. Also, if there is an open PR, which you think should be fixed in 2.9, please list it in there so that we can review it.

Cheers!
-bw

From johnw at boostpro.com Wed Mar 9 13:27:21 2011
From: johnw at boostpro.com (John Wiegley)
Date: Wed, 09 Mar 2011 14:27:21 -0500
Subject: [LLVMdev] Unable to build latest with Visual Studio 2008
Message-ID:

Hello,

I've been building Clang under Windows 7 and Visual Studio 2008 for a while now, but had not touched it in a few months. Last night I wiped my build tree to do a full rebuild with the latest version, and got the identical error as David Shipman was seeing last September.

Are others able to build under VS9 right now?
Thanks,
John

> Subject: Re: [LLVMdev] MS VS2008 build fails - X86AsmParser
> From: Chris Lattner ("cla... at apple.com)
> Date: Sep 6, 2010 11:11:46 pm
>
> On Sep 6, 2010, at 10:50 PM, David Shipman wrote:
>
> > Hi all,
> >
> > Just tried to build from svn sources with Visual Studio 2008, mostly
> > OK but fails
> > building the X86AsmParser lib -
> >
> > I see a few commits from yesterday that may have something to do with it, but no
> > idea what the solution is.
>
> Wow, that's a pretty terrible diagnostic. Does r113198 help?
>
> -Chris
>
> > See MSVC's beautiful and concise output below;
> >
> > Compiling...
> > X86AsmParser.cpp
> > C:\dev\MSVisualStudio\VC\include\xutility(313) : error C2664: 'bool
> > `anonymous-namespace'::LessOpcode::operator ()(llvm::StringRef,const
> > `anonymous-namespace'::MatchEntry &)' : cannot convert parameter 1
> > from 'const `anonymous-namespace'::MatchEntry' to 'llvm::StringRef'
> > No user-defined-conversion operator available that can perform
> > this conversion, or the operator cannot be called
> > C:\dev\MSVisualStudio\VC\include\xutility(1699) : see
> > reference to function template instantiation 'bool
> > std::_Debug_lt_pred<_Pr,`anonymous-namespace'::MatchEntry,`anonymous-namespace'::MatchEntry>(_Pr,const
> > _Ty1 &,const _Ty2 &,const wchar_t *,unsigned int)' being compiled
> > with
> > [
> > _Pr=`anonymous-namespace'::LessOpcode,
> > _Ty1=`anonymous-namespace'::MatchEntry,
> > _Ty2=`anonymous-namespace'::MatchEntry
> > ]
> > C:\dev\MSVisualStudio\VC\include\xutility(1709) : see
> > reference to function template instantiation 'void
> > std::_Debug_order_single2<_InIt,_Pr>(_FwdIt,_FwdIt,_Pr,bool,const
> > wchar_t *,unsigned int,std::forward_iterator_tag)' being compiled
> > with
> > [
> > _InIt=const `anonymous-namespace'::MatchEntry *,
> > _Pr=`anonymous-namespace'::LessOpcode,
> > _FwdIt=const `anonymous-namespace'::MatchEntry *
> > ]
> > C:\dev\MSVisualStudio\VC\include\algorithm(2444) : see
> > reference to function
template instantiation 'void
> > std::_Debug_order_single<_FwdIt,_Pr>(_InIt,_InIt,_Pr,bool,const
> > wchar_t *,unsigned int)' being compiled
> > with
> > [
> > _FwdIt=const `anonymous-namespace'::MatchEntry *,
> > _Pr=`anonymous-namespace'::LessOpcode,
> > _InIt=const `anonymous-namespace'::MatchEntry *
> > ]
> > C:\dev\MSVisualStudio\VC\include\algorithm(2480) : see
> > reference to function template instantiation 'std::pair<_Ty1,_Ty2>
> > std::_Equal_range<_FwdIt,_Ty,std::iterator_traits<_Iter>::difference_type,_Pr>(_FwdIt,_FwdIt,const
> > _Ty &,_Pr,_Diff *)' being compiled
> > with
> > [
> > _Ty1=const `anonymous-namespace'::MatchEntry *,
> > _Ty2=const `anonymous-namespace'::MatchEntry *,
> > _FwdIt=const `anonymous-namespace'::MatchEntry *,
> > _Ty=llvm::StringRef,
> > _Iter=const `anonymous-namespace'::MatchEntry *,
> > _Pr=`anonymous-namespace'::LessOpcode,
> > _Diff=std::iterator_traits::difference_type
> > ]
> > C:\dev\src\llvm\lib\Target\X86\X86GenAsmMatcher.inc(4583) :
> > see reference to function template instantiation 'std::pair<_Ty1,_Ty2>
> > std::equal_range(_FwdIt,_FwdIt,const _Ty &,_Pr)' being compiled
> > with
> > [
> > _Ty1=const `anonymous-namespace'::MatchEntry *,
> > _Ty2=const `anonymous-namespace'::MatchEntry *,
> > _FwdIt=const `anonymous-namespace'::MatchEntry *,
> > _Ty=llvm::StringRef,
> > _Pr=`anonymous-namespace'::LessOpcode
> > ]
> > C:\dev\MSVisualStudio\VC\include\xutility(315) : error C2664: 'bool
> > `anonymous-namespace'::LessOpcode::operator ()(llvm::StringRef,const
> > `anonymous-namespace'::MatchEntry &)' : cannot convert parameter 1
> > from 'const `anonymous-namespace'::MatchEntry' to 'llvm::StringRef'
> > No user-defined-conversion operator available that can perform
> > this conversion, or the operator cannot be called

From reid.kleckner at gmail.com Wed Mar 9 13:34:29 2011
From: reid.kleckner at gmail.com (Reid Kleckner)
Date: Wed, 9 Mar 2011 14:34:29 -0500
Subject: [LLVMdev] A working garbage collector - finally :)
In-Reply-To: <4D76F910.202@gmail.com>
References: <4D76F910.202@gmail.com>
Message-ID:

Where's the problem? A pointer to v's alloca escapes to llvm.gcroot, so the optimizers should know that foo could modify the value it holds. foo() might also read v[i] through the escaped pointer, so the store will have to happen before the call.

Reid

2011/3/8 Rafael Ávila de Espíndola :
> Since you are using a copying collector, may I ask how do you handle
> registers holding pointers to intermediate values?
>
> The example I am considering is something like
>
> void foo(void);
> void f(long long *v, long long n) {
>   long long int i;
>   for (i = 0; i < n; ++i) {
>     v[i] = i;
>     foo();
>   }
> }
>
> If *v points to gc memory, we can start the function with
>
> %v.addr = alloca i64*, align 8
> %foobar = bitcast i64** %v.addr to i8**
> call void @llvm.gcroot(i8** %foobar, i8* null)
> store i64* %v, i64** %v.addr, align 8
>
> but nothing prevents llvm from putting the call to foo in between the
> computation of &v[i] and the store by using a callee saved register. If
> GC moves moves v during the call to foo, the next store will be wrong.
>
>> -- Talin
>>
>
> Cheers,
> Rafael
>

From pangan at gmail.com Wed Mar 9 13:07:49 2011
From: pangan at gmail.com (Amir Mofakhar)
Date: Wed, 9 Mar 2011 19:07:49 +0000 (UTC)
Subject: [LLVMdev] how to use external function?
References:
Message-ID:

Eli Friedman gmail.com> writes:

>
> Try something like the following?
>
> llvm-link MyOwnFunction.bc m1.bc -o - | lli
>
> -Eli
>

same error again!

From greened at obbligato.org Wed Mar 9 13:58:45 2011
From: greened at obbligato.org (David A.
Greene)
Date: Wed, 09 Mar 2011 13:58:45 -0600
Subject: [LLVMdev] Vector select/compare support in LLVM
In-Reply-To: <6594DDFF12B03D4E89690887C2486994027129F2A2@hasmsx504.ger.corp.intel.com> (Nadav Rotem's message of "Tue, 8 Mar 2011 21:46:45 +0200")
References: <6594DDFF12B03D4E89690887C2486994027129F2A2@hasmsx504.ger.corp.intel.com>
Message-ID:

"Rotem, Nadav" writes:

> I can think of two ways to represent masks in x86: sparse and
> packed. In the sparse method, the masks are kept in <4 x 32bit>
> registers, which are mapped to xmm registers. This is the "native" way
> of using masks.

This argues for the sparse representation, I think.

> _Sparse_ After my discussion with Duncan, last week, I started working
> on the promotion of type <4 x i1> to <4 x i32>, and I ran into a
> problem. It looks like the codegen term "promote" is overloaded.

Heavily. :-/

> For scalars, the "promote" operation converts scalars to larger
> bit-width scalars. For vectors, the "promote" operation widens the
> vector to the next power of two. This is reasonable for types such as
> "<3 x float>". Maybe we need to add another legalization operation which
> will mean widening the vectors?

You mean widening the element type, correct? Yes, that's definitely a useful concept.

> In any case, I estimated that implementing this per-element promotion
> would require major changes and decided that this is not the way to
> go.

What major changes? I think this will end up giving much better code in the end. The pack/unpack operations could be very expensive.

There is another huge cost in using GPRs to hold masks. There will be fewer GPRs to hold addresses, which is a precious resource. We should avoid doing anything that uses more of that resource unnecessarily.
-Dave

From alex_rosenberg at playstation.sony.com Wed Mar 9 14:14:13 2011
From: alex_rosenberg at playstation.sony.com (alex_rosenberg at playstation.sony.com)
Date: Wed, 9 Mar 2011 12:14:13 -0800
Subject: [LLVMdev] Announcement: another Summer 2011 Internship with Sony PlayStation
Message-ID: <0810929A-9D9F-498D-9C45-BF29810214AF@playstation.sony.com>

Sony Computer Entertainment America is looking for an intern to work on LLVM and/or Clang. The responses to our earlier internship posting were amazing and we've decided to open this additional internship.

Please see the official listing for more details:

-----------------------------------------
Alex Rosenberg
Manager, Platform Architecture
Sony Computer Entertainment America, Inc.

From rafael.espindola at gmail.com Wed Mar 9 15:03:52 2011
From: rafael.espindola at gmail.com (Rafael Avila de Espindola)
Date: Wed, 09 Mar 2011 16:03:52 -0500
Subject: [LLVMdev] A working garbage collector - finally :)
In-Reply-To:
References: <4D76F910.202@gmail.com>
Message-ID: <4D77EB38.4070805@gmail.com>

On 11-03-09 02:34 PM, Reid Kleckner wrote:
> Where's the problem? A pointer to v's alloca escapes to llvm.gcroot,
> so the optimizers should know that foo could modify the value it
> holds. foo() might also read v[i] through the escaped pointer, so the
> store will have to happen before the call.

It is fine, sorry. I had a bug in the example IL I wrote, I had added the call after the load of 'v' but before the getelement pointer. If I put it in the right place I get a load of 'v' in the correct side of the call.

> Reid

Thanks,
Rafael

From king19880326 at gmail.com Wed Mar 9 16:13:28 2011
From: king19880326 at gmail.com (Lu Mitnick)
Date: Thu, 10 Mar 2011 06:13:28 +0800
Subject: [LLVMdev] Question about TableGen when adding LLVM Backend.
Message-ID:

Hello all,

I have some question about usage of TableGen when adding a new LLVM Backend.
There are three place to use TableGen in basic steps of document "Writing an LLVM Compiler Backend":

2. Describe the register set of the target. Use "TableGen" to generate code for register definition, register aliases, and register classes from a target-specific RegisterInfo.td input file.

3. Describe the instruction set of the target. Use "TableGen" to generate code for target-specific instructions from target-specific versions of TargetInstrFormats.td and TargetInstrInfo.td.

4. Describe the selection and conversion of the LLVM IR from a Directed Acyclic Graph (DAG) representation of instructions to native target-specific instructions. Use "TableGen" to generate code that matches patterns and selects instructions based on additional information in a target-specific version of TargetInstrInfo.td.

I have already read the document "TableGen Fundamentals" and write correspond .td files in each steps. However I don't know which TableGen options should I use in 2, 3 or 4 steps as above. Would anyone mind to tell me??

thanks a lot

yi-hong

From stoklund at 2pi.dk Wed Mar 9 16:44:10 2011
From: stoklund at 2pi.dk (Jakob Stoklund Olesen)
Date: Wed, 9 Mar 2011 14:44:10 -0800
Subject: [LLVMdev] Question about TableGen when adding LLVM Backend.
In-Reply-To:
References:
Message-ID:

On Mar 9, 2011, at 2:13 PM, Lu Mitnick wrote:

> Hello all,
>
> I have some question about usage of TableGen when adding a new LLVM Backend. There are three place to use TableGen in basic steps of document "Writing an LLVM Compiler
> Backend":
>
> 2. Describe the register set of the target. Use "TableGen" to generate code for register definition, register aliases, and register classes from a target-specific RegisterInfo.td input file.
>
> 3. Describe the instruction set of the target.
Use "TableGen" to generate code for target-specific instructions from target-specific versions of TargetInstrFormats.td and TargetInstrInfo.td.
>
> 4. Describe the selection and conversion of the LLVM IR from a Directed Acyclic Graph (DAG) representation of instructions to native target-specific instructions. Use "TableGen" to generate code that matches patterns and selects instructions based on additional information in a target-specific version of TargetInstrInfo.td.
>
> I have already read the document "TableGen Fundamentals" and write correspond .td files in each steps. However I don't know which TableGen options should I use in 2, 3 or 4 steps as above. Would anyone mind to tell me??

Look at Makefile.rules in the LLVM top-level directory. There is a bunch of targets executing $(TableGen).

You can also run 'make VERBOSE=1' to see the commands used to build the existing targets.

/jakob

From eli.friedman at gmail.com Wed Mar 9 17:53:28 2011
From: eli.friedman at gmail.com (Eli Friedman)
Date: Wed, 9 Mar 2011 18:53:28 -0500
Subject: [LLVMdev] how to use external function?
In-Reply-To:
References:
Message-ID:

On Wed, Mar 9, 2011 at 2:07 PM, Amir Mofakhar wrote:
> Eli Friedman gmail.com> writes:
>
>>
>> Try something like the following?
>>
>> llvm-link MyOwnFunction.bc m1.bc -o - | lli
>>
>> -Eli
>>
>
> same error again!

Strange... it works for me. Maybe try reading the docs for llvm-link?

$ llvm-as -o m1.bc
define i32 @main() {
entry:
  %tmp0 = call i32 @MyOwnFunction()
  ret i32 0
}

declare i32 @MyOwnFunction()
$ llvm-as -o MyOwnFunction.bc
;ModuleID = 'MyOwnFunction'
define i32 @MyOwnFunction() {
entry:
  ret i32 55
}
$ llvm-link m1.bc MyOwnFunction.bc -o - | lli
$

-Eli

From damien.llvm at gmail.com Wed Mar 9 20:32:43 2011
From: damien.llvm at gmail.com (Damien Vincent)
Date: Wed, 9 Mar 2011 18:32:43 -0800
Subject: [LLVMdev] Parsing dwarf debug info of an GAS assembly file
Message-ID:

I have a question not strictly related to LLVM:

I know there is a tool (libdwarf / dwarfdump) to dump/parse debug information of an object file, but do you know a tool that can parse dwarf sections of a ".s" GAS assembly file ?

Thank you,
Damien

From clchiou at gmail.com Wed Mar 9 21:03:50 2011
From: clchiou at gmail.com (Che-Liang Chiou)
Date: Thu, 10 Mar 2011 11:03:50 +0800
Subject: [LLVMdev] [PTX] Should we keep backward-compatibility of PTX?
Message-ID:

Hi Justin,

There are some backward incompatible features of PTX; for example, special registers are redefined as v4i32 (they were v4i16) in PTX 2.0. And CUDA 4.0 was rolled out last week. I heard that some instructions are deprecated.

I am not sure how stable (or unstable) PTX specification is. Do you have a rough assessment of its stability?

If PTX specification is still fast evolving, I would suggest we keep up with latest specification, and consider backward compatibility later when it is stabilized. What do you think?

Regards,
Che-Liang

From eric at boostpro.com Wed Mar 9 21:55:26 2011
From: eric at boostpro.com (Eric Niebler)
Date: Thu, 10 Mar 2011 10:55:26 +0700
Subject: [LLVMdev] host triple for Win64?
Message-ID: <4D784BAE.2050109@boostpro.com>

What host triple should I be using to specifically target x86_64 / Win64?
I notice there is no Win64 in the OSType enum, but there is Win32. And yet I know llvm can target Win64, right?

--
Eric Niebler
BoostPro Computing
http://www.boostpro.com

From king19880326 at gmail.com Wed Mar 9 22:15:58 2011
From: king19880326 at gmail.com (Lu Mitnick)
Date: Thu, 10 Mar 2011 12:15:58 +0800
Subject: [LLVMdev] Question about TableGen when adding LLVM Backend.
In-Reply-To:
References:
Message-ID:

Hello Jakob,

Is this means that TableGen execution is handled in Makefile. Porting programmer doesn't need to execute TableGen by hand?

thanks

2011/3/10 Jakob Stoklund Olesen
>
> On Mar 9, 2011, at 2:13 PM, Lu Mitnick wrote:
>
> > Hello all,
> >
> > I have some question about usage of TableGen when adding a new LLVM Backend. There are three place to use TableGen in basic steps of document "Writing an LLVM Compiler
> > Backend":
> >
> > 2. Describe the register set of the target. Use "TableGen" to generate code for register definition, register aliases, and register classes from a target-specific RegisterInfo.td input file.
> >
> > 3. Describe the instruction set of the target. Use "TableGen" to generate code for target-specific instructions from target-specific versions of TargetInstrFormats.td and TargetInstrInfo.td.
> >
> > 4. Describe the selection and conversion of the LLVM IR from a Directed Acyclic Graph (DAG) representation of instructions to native target-specific instructions. Use "TableGen" to generate code that matches patterns and selects instructions based on additional information in a target-specific version of TargetInstrInfo.td.
> >
> > I have already read the document "TableGen Fundamentals" and write correspond .td files in each steps. However I don't know which TableGen options should I use in 2, 3 or 4 steps as above. Would anyone mind to tell me??
>
> Look at Makefile.rules in the LLVM top-level directory. There is a bunch of
> targets executing $(TableGen).
>
> You can also run 'make VERBOSE=1' to see the commands used to build the
> existing targets.
>
> /jakob
>
>

From stoklund at 2pi.dk Wed Mar 9 22:35:28 2011
From: stoklund at 2pi.dk (Jakob Stoklund Olesen)
Date: Wed, 9 Mar 2011 20:35:28 -0800
Subject: [LLVMdev] Question about TableGen when adding LLVM Backend.
In-Reply-To:
References:
Message-ID:

On Mar 9, 2011, at 8:15 PM, Lu Mitnick wrote:

> Hello Jakob,
>
> Is this means that TableGen execution is handled in Makefile. Porting programmer doesn't need to execute TableGen by hand?

That's right. You are going to be editing your .td files a lot, so you want that integrated in the build system.

From geek4civic at gmail.com Wed Mar 9 23:07:16 2011
From: geek4civic at gmail.com (NAKAMURA Takumi)
Date: Thu, 10 Mar 2011 14:07:16 +0900
Subject: [LLVMdev] host triple for Win64?
In-Reply-To: <4D784BAE.2050109@boostpro.com>
References: <4D784BAE.2050109@boostpro.com>
Message-ID:

Eric,

"x86_64-(pc)-win32" on ToT. AFAIK no one cares vendor on windows.

We can know, to ask llvm::Triple, (OS == win32 && (arch == x86 || arch == x86_64)).

I don't understand why do you need to distinguish Windows x64 with OSType. (In contrast, I got rid of "-mingw64" and integrated to "-mingw32".) Please let me know your matters.

I suppose "Win32" would not present "on 32bit" any more, like a synonym "Windows NT API". :)

...Takumi

From kecheng at cecs.pdx.edu Wed Mar 9 23:26:30 2011
From: kecheng at cecs.pdx.edu (kecheng at cecs.pdx.edu)
Date: Wed, 09 Mar 2011 21:26:30 -0800
Subject: [LLVMdev] pass statistic
Message-ID: <20110309212630.44855csk64fe3dhc@webmail.cecs.pdx.edu>

Hi folks,

I wonder how to get the statistic of which pass has been "really" applied and which one is not. For instance, I try to apply 20 llvm passes on a single C source code.
But since the precondition of each pass may not be satisfied (try loop-unrolling to a source code without loop), some of these pass may not affect the final result. How to know which pass affect and which one is ignored? Does Llvm have this kind of statistic? Thanks.

Best,

Kecheng

From criswell at illinois.edu Wed Mar 9 23:31:29 2011
From: criswell at illinois.edu (John Criswell)
Date: Wed, 09 Mar 2011 23:31:29 -0600
Subject: [LLVMdev] pass statistic
In-Reply-To: <20110309212630.44855csk64fe3dhc@webmail.cecs.pdx.edu>
References: <20110309212630.44855csk64fe3dhc@webmail.cecs.pdx.edu>
Message-ID: <4D786231.10405@illinois.edu>

On 3/9/2011 11:26 PM, kecheng at cecs.pdx.edu wrote:
> Hi folks,
>
> I wonder how to get the statistic of which pass has been "really"
> applied and which one is not. For instance, I try to apply 20 llvm
> passes on a single C source code. But since the precondition of each
> pass may not be satisfied (try loop-unrolling to a source code without
> loop), some of these pass may not affect the final result. How to know
> which pass affect and which one is ignored? Does Llvm have this kind
> of statistic? Thanks.

One option is to use the -stats option with opt and hope that a transform keeps statistics on how many transforms it makes. However, this approach is fragile because an arbitrary LLVM pass may not record any statistics on what it changes.

Another approach is to use the -debug-pass=Details option in opt. I believe it will tell you when a pass has modified the code and when it has not. That said, some transform passes may tell the PassManager that they've modified the program when, in fact, they haven't simply because it's too much programming work to track, within the pass, whether it has modified anything.

A third option might be to write a pass that somehow records the current state of the bitcode and compares it to the state it saw when it last executed.
You could then run this pass in between every other pass to detect cases where the module does not change.

So, there are some ways to do it, but only the third option (the most time-consuming to do) looks fool-proof.

-- John T.

> Best,
>
> Kecheng
>
>

From anton at korobeynikov.info Thu Mar 10 01:53:43 2011
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Thu, 10 Mar 2011 10:53:43 +0300
Subject: [LLVMdev] How to make release branch available in git (topic changed)
In-Reply-To: <4D75A931.3030306@fim.uni-passau.de>
References: <2E128B2A-16F5-40D4-B4EC-ADD60A49E015@apple.com> <17885E39-D839-4185-BD04-9C585B1D3CCD@mac.com> <87mxlei5be.fsf@smith.obbligato.org> <4D75A931.3030306@fim.uni-passau.de>
Message-ID:

Hi Tobias,

> The following expression e.g.
>
> /^.*(?
> uses lookbehind to matches on:

Thanks. Clever trick, but...

Variable length lookbehind not implemented in regex m/^.*(?

Hi All,

With 2.9 starting to make its way out into the world, it is time to start poking at the release notes. I plan to make a pass through llvm-commits to cull some of the major changes into bullets, but am already behind and insanely busy with other stuff over the next week.

If you have commit access, I'd really appreciate it if you could take a pass through llvm/docs/ReleaseNotes.html to fill in notes about things that are new in the release (particularly for clang, mc, dragonegg and other subprojects) and important API changes for external clients. Feel free to directly commit to the release notes, I will fill in more details when I get bandwidth and will generally tidy it up and edit it, so don't worry about it all fitting together well at this point.

If you have an external project that works with LLVM 2.9, please send me a blurb offlist and I'll include it in the release notes.
We've been doing this over the last couple of releases and I think it adds a lot of value to show the different ways that LLVM gets used. Please send me a blurb along the lines of the examples from 2.8: http://llvm.org/releases/2.8/docs/ReleaseNotes.html#externalproj Note that I've zapped *all* of the existing blurbs from the 2.8 release notes, so if you want to be included in the 2.9 notes, please send me updated text. Thanks all, this is going to be a great release! -Chris From anton at korobeynikov.info Thu Mar 10 01:56:22 2011 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Thu, 10 Mar 2011 10:56:22 +0300 Subject: [LLVMdev] How to make release branch available in git (topic changed) In-Reply-To: <4D75A931.3030306@fim.uni-passau.de> References: <2E128B2A-16F5-40D4-B4EC-ADD60A49E015@apple.com> <17885E39-D839-4185-BD04-9C585B1D3CCD@mac.com> <87mxlei5be.fsf@smith.obbligato.org> <4D75A931.3030306@fim.uni-passau.de> Message-ID: > I would love to. However, as David pointed out, this is difficult with the > blocked svn access. I believe that --no-minimize-url might help: --no-minimize-url When tracking multiple directories (using --stdlayout, --branches, or --tags options), git svn will attempt to connect to the root (or highest allowed level) of the Subversion repository. This default allows better tracking of history if entire projects are moved within a repository, but may cause issues on repositories where read access restrictions are in place. Passing --no-minimize-url will allow git svn to accept URLs as-is without attempting to connect to a higher level directory. This option is off by default when only one URL/branch is tracked (it would do little good). 
-- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From nadav.rotem at intel.com Thu Mar 10 03:03:53 2011 From: nadav.rotem at intel.com (Rotem, Nadav) Date: Thu, 10 Mar 2011 11:03:53 +0200 Subject: [LLVMdev] Vector select/compare support in LLVM In-Reply-To: References: <6594DDFF12B03D4E89690887C2486994027129F2A2@hasmsx504.ger.corp.intel.com> Message-ID: <6594DDFF12B03D4E89690887C2486994027129FB10@hasmsx504.ger.corp.intel.com> Hi David, The MOVMSKPS instruction is cheap (2 cycles). Not to be confused with VMASKMOV, the AVX masked move, which is expensive. One of the arguments for packing masks is that it reduces vector-registers pressure. Auto-vectorizing compilers maintain multiple masks for different execution paths (for each loop nesting, etc). Saving masks in xmm registers may result in vector-register pressure which will cause spilling of these registers. I agree with you that GP registers are also a precious resource. I am not sure what is the best way to store masks. In my private branch, I added the [v4i1 .. v64i1] types. I also implemented a new type of target lowering: "PACK". This lowering packs vectors of i1s into integer registers. For example, the <4 x i1> type would get packed into the i8 type. I modified LegalizeTypes and LegalizeVectorTypes and added legalization for SETCC, XOR, OR, AND, and BUILD_VECTOR. I also changed the x86 lowering of SELECT to prevent lowering of selects with vector condition operand. Next, I am going to add new patterns for SETCC and SELECT which use i8/i16/i32/i64 as a condition value. I also plan to experiment with promoting <4 x i1> to <4 x i32>. At this point I can't really say what needs to be done. Implementing this kind of promotion also requires adding legalization support for strange vector types such as <4 x i65>. -Nadav -----Original Message----- From: David A. 
Greene [mailto:greened at obbligato.org]
Sent: Wednesday, March 09, 2011 21:59
To: Rotem, Nadav
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Vector select/compare support in LLVM

"Rotem, Nadav" writes:

> I can think of two ways to represent masks in x86: sparse and
> packed. In the sparse method, the masks are kept in <4 x 32bit>
> registers, which are mapped to xmm registers. This is the "native" way
> of using masks.

This argues for the sparse representation, I think.

> _Sparse_ After my discussion with Duncan, last week, I started working
> on the promotion of type <4 x i1> to <4 x i32>, and I ran into a
> problem. It looks like the codegen term "promote" is overloaded.

Heavily. :-/

> For scalars, the "promote" operation converts scalars to larger
> bit-width scalars. For vectors, the "promote" operation widens the
> vector to the next power of two. This is reasonable for types such as
> "<3 x float>". Maybe we need to add another legalization operation which
> will mean widening the vectors?

You mean widening the element type, correct? Yes, that's definitely a
useful concept.

> In any case, I estimated that implementing this per-element promotion
> would require major changes and decided that this is not the way to
> go.

What major changes? I think this will end up giving much better code in
the end. The pack/unpack operations could be very expensive.

There is another huge cost in using GPRs to hold masks. There will be
fewer GPRs to hold addresses, which is a precious resource. We should
avoid doing anything that uses more of that resource unnecessarily.

-Dave
---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
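
To make the two mask representations discussed in this exchange concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not LLVM code: the names pack_mask, unpack_mask and select are hypothetical, vector lanes are modeled as plain 32-bit integers, and pack_mask only mimics the sign-bit gathering that MOVMSKPS performs.

```python
# Assumption: a "sparse" mask is a list of 32-bit lanes, each either all ones
# (true) or zero (false), like a compare result held in an xmm register.
# A "packed" mask is one small integer with bit i set when lane i is true.

LANE_BITS = 32
ALL_ONES = (1 << LANE_BITS) - 1


def pack_mask(sparse):
    """Gather the top (sign) bit of every lane into one integer."""
    packed = 0
    for i, lane in enumerate(sparse):
        packed |= ((lane >> (LANE_BITS - 1)) & 1) << i
    return packed


def unpack_mask(packed, lanes=4):
    """Expand a packed mask back into all-ones / zero lanes."""
    return [ALL_ONES if (packed >> i) & 1 else 0 for i in range(lanes)]


def select(sparse, a, b):
    """Per-lane select: a[i] where the mask lane's sign bit is set, else b[i]."""
    sign = 1 << (LANE_BITS - 1)
    return [x if m & sign else y for m, x, y in zip(sparse, a, b)]
```

The packed form fits in a single general-purpose register, while the sparse form occupies a whole vector register but can feed a select directly. The unpack step is the extra work the packed form pays before a vector select, which is the trade-off being weighed above.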
From fabian.scheler at gmail.com Thu Mar 10 06:23:54 2011
From: fabian.scheler at gmail.com (Fabian Scheler)
Date: Thu, 10 Mar 2011 13:23:54 +0100
Subject: [LLVMdev] TargetData::getPreferredAlignment(const GlobalVariable *GV) is strange ...
In-Reply-To: <10F0D572-D33D-425D-9124-6DB392462C9E@nondot.org>
References: <4D77AD7E.8050008@free.fr> <10F0D572-D33D-425D-9124-6DB392462C9E@nondot.org>
Message-ID:

> Ah, I really vaguely remember this. IIRC, there was some fortran benchmark (that was running through f2c) which had a large array of doubles. On X86, double is only 4-byte aligned, and the huge array was getting put at a "mod 16=4" offset. This caused really really awful performance.
>
> A reasonable solution to this was to bump up the alignment of stuff proactively, but you don't want to do this for everything, because this ends up wasting lots of memory. GCC has a similar policy IIRC.
>
> Is there some problem that this is causing?

Well, LLVM generates assembly that is not accepted by the binutils for the TriCore processor I am using if the program contains large structs. If LLVM-targets had a "maxAlign"-property this could be tuned according to the needs of the specific target.

Ciao, Fabian

From nadav.rotem at intel.com Thu Mar 10 06:44:31 2011
From: nadav.rotem at intel.com (Rotem, Nadav)
Date: Thu, 10 Mar 2011 14:44:31 +0200
Subject: [LLVMdev] Vector select/compare support in LLVM
References: <6594DDFF12B03D4E89690887C2486994027129F2A2@hasmsx504.ger.corp.intel.com>
Message-ID: <6594DDFF12B03D4E89690887C248699402712F88A3@hasmsx504.ger.corp.intel.com>

After I implemented a new type of legalization (the packing of i1 vectors), I found that x86 does not have a way to load packed masks into SSE registers. So, I guess that legalizing of <4 x i1> to <4 x i32> is the way to go.

Cheers,
Nadav

-----Original Message-----
From: Rotem, Nadav
Sent: Thursday, March 10, 2011 11:04
To: 'David A.
Greene' Cc: llvmdev at cs.uiuc.edu Subject: RE: [LLVMdev] Vector select/compare support in LLVM Hi David, The MOVMSKPS instruction is cheap (2 cycles). Not to be confused with VMASKMOV, the AVX masked move, which is expensive. One of the arguments for packing masks is that it reduces vector-registers pressure. Auto-vectorizing compilers maintain multiple masks for different execution paths (for each loop nesting, etc). Saving masks in xmm registers may result in vector-register pressure which will cause spilling of these registers. I agree with you that GP registers are also a precious resource. I am not sure what is the best way to store masks. In my private branch, I added the [v4i1 .. v64i1] types. I also implemented a new type of target lowering: "PACK". This lowering packs vectors of i1s into integer registers. For example, the <4 x i1> type would get packed into the i8 type. I modified LegalizeTypes and LegalizeVectorTypes and added legalization for SETCC, XOR, OR, AND, and BUILD_VECTOR. I also changed the x86 lowering of SELECT to prevent lowering of selects with vector condition operand. Next, I am going to add new patterns for SETCC and SELECT which use i8/i16/i32/i64 as a condition value. I also plan to experiment with promoting <4 x i1> to <4 x i32>. At this point I can't really say what needs to be done. Implementing this kind of promotion also requires adding legalization support for strange vector types such as <4 x i65>. -Nadav -----Original Message----- From: David A. Greene [mailto:greened at obbligato.org] Sent: Wednesday, March 09, 2011 21:59 To: Rotem, Nadav Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Vector select/compare support in LLVM "Rotem, Nadav" writes: > I can think of two ways to represent masks in x86: sparse and > packed. In the sparse method, the masks are kept in <4 x 32bit> > registers, which are mapped to xmm registers. This is the ?native? way > of using masks. This argues for the sparse representation, I think. 
> _Sparse_ After my discussion with Duncan, last week, I started working
> on the promotion of type <4 x i1> to <4 x i32>, and I ran into a
> problem. It looks like the codegen term "promote" is overloaded.

Heavily. :-/

> For scalars, the "promote" operation converts scalars to larger
> bit-width scalars. For vectors, the "promote" operation widens the
> vector to the next power of two. This is reasonable for types such as
> "<3 x float>". Maybe we need to add another legalization operation which
> will mean widening the vectors?

You mean widening the element type, correct? Yes, that's definitely a
useful concept.

> In any case, I estimated that implementing this per-element promotion
> would require major changes and decided that this is not the way to
> go.

What major changes? I think this will end up giving much better code in
the end. The pack/unpack operations could be very expensive.

There is another huge cost in using GPRs to hold masks. There will be
fewer GPRs to hold addresses, which is a precious resource. We should
avoid doing anything that uses more of that resource unnecessarily.

-Dave
---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

From justin.holewinski at gmail.com Thu Mar 10 07:15:10 2011
From: justin.holewinski at gmail.com (Justin Holewinski)
Date: Thu, 10 Mar 2011 08:15:10 -0500
Subject: [LLVMdev] [PTX] Should we keep backward-compatibility of PTX?
In-Reply-To:
References:
Message-ID:

On Wed, Mar 9, 2011 at 10:03 PM, Che-Liang Chiou wrote:

> Hi Justin,
>
> There are some backward incompatible features of PTX; for example,
> special registers are redefined as v4i32 (they were v4i16) in PTX 2.0.
> And CUDA 4.0 was rolled out last week.
I heard that some instructions
> are deprecated.
>

I have not checked out CUDA 4.0 yet, but any deprecated instructions would need to be made so as part of a separate PTX version, either 2.2 or 3.0. My suggestion is to stay thorough with the intrinsics. Let's create i16 and i32 versions of both, and emit appropriate cvt instructions if necessary for the target PTX version. This should be easy enough using the existing PTXVersion field in PTXSubtarget.

>
> I am not sure how stable (or unstable) PTX specification is. Do you
> have a rough assessment of its stability?
>

From what I can tell, it is fairly stable. When things change, they seem to primarily be additions to the ISA.

>
> If PTX specification is still fast evolving, I would suggest we keep
> up with latest specification, and consider backward compatibility
> later when it is stabilized. What do you think?
>

I'm fine with that, as long as later PTX versions do not require later shader models. I want to maintain compatibility as far back as shader model 1.0 for some older hardware I want to test with. Besides, I think most of the functionality in newer PTX versions can be easily predicated with sub-target flags.

>
> Regards,
> Che-Liang
>

--
Thanks,
Justin Holewinski
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110310/03133d2c/attachment-0001.html

From grosser at fim.uni-passau.de Thu Mar 10 07:48:50 2011
From: grosser at fim.uni-passau.de (Tobias Grosser)
Date: Thu, 10 Mar 2011 08:48:50 -0500
Subject: [LLVMdev] How to make release branch available in git (topic changed)
In-Reply-To:
References: <2E128B2A-16F5-40D4-B4EC-ADD60A49E015@apple.com> <17885E39-D839-4185-BD04-9C585B1D3CCD@mac.com> <87mxlei5be.fsf@smith.obbligato.org> <4D75A931.3030306@fim.uni-passau.de>
Message-ID: <4D78D6C2.9070906@fim.uni-passau.de>

On 03/10/2011 02:56 AM, Anton Korobeynikov wrote:
>> I would love to.
However, as David pointed out, this is difficult with the >> blocked svn access. > I believe that --no-minimize-url might help: > > --no-minimize-url > When tracking multiple directories (using --stdlayout, --branches, or > --tags options), git svn will attempt to connect to the root (or > highest allowed level) of the Subversion repository. This default > allows better tracking of history if entire projects are moved within > a repository, but may cause issues on repositories where read access > restrictions are in place. Passing --no-minimize-url will allow git > svn to accept URLs as-is without attempting to connect to a higher > level directory. This option is off by default when only one > URL/branch is tracked (it would do little good). > Sorry I did not get it. git svn clone https://grosser at llvm.org/svn/llvm-project/llvm --branches=branches --no-minimize-url Initialized empty Git repository in /tmp/test3/llvm/.git/ Found possible branch point: https://grosser at llvm.org/svn/llvm-project/llvm/trunk => https://grosser at llvm.org/svn/llvm-project/llvm/branches/llvm, 2 Initializing parent: refs/remotes/llvm at 2 RA layer request failed: Server sent unexpected return value (403 Forbidden) in response to REPORT request for '/svn/llvm-project/!svn/vcc/default' at /usr/lib/git-core/git-svn line 5061 This still gives me the same error. This time its my turn to ask for a command line that works. ;-) Or any other solution that helps me testing this. 
;-) From grosser at fim.uni-passau.de Thu Mar 10 07:54:16 2011 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Thu, 10 Mar 2011 08:54:16 -0500 Subject: [LLVMdev] How to make release branch available in git (topic changed) In-Reply-To: References: <2E128B2A-16F5-40D4-B4EC-ADD60A49E015@apple.com> <17885E39-D839-4185-BD04-9C585B1D3CCD@mac.com> <87mxlei5be.fsf@smith.obbligato.org> <4D75A931.3030306@fim.uni-passau.de> Message-ID: <4D78D808.1000805@fim.uni-passau.de> On 03/10/2011 02:53 AM, Anton Korobeynikov wrote: > Hi Tobias, > >> The following expression e.g. >> >> /^.*(?> >> uses lookbehind to matches on: > Thanks. Clever trick, but... > > Variable length lookbehind not implemented in regex > m/^.*(? > :( I got a new one. Today even with working test case: $cat in.txt tags/SVA tags/eh-experimental tags/ggreif tags/non-call-eh tags/RELEASE_28 at 115869 branches/Apple branches/PowerPC-A branches/PowerPC-B trunk tags/RELEASE_28 tags/RELEASE_29 tags/RELEASE_27 $cat rev_match #!/usr/bin/perl -wn print if /.(? References: <6594DDFF12B03D4E89690887C2486994027129F2A2@hasmsx504.ger.corp.intel.com> <6594DDFF12B03D4E89690887C248699402712F88A3@hasmsx504.ger.corp.intel.com> Message-ID: <4D78E035.5000308@gmx.de> Hey, I am currently forced to create the BLENDVPS intrinsic as an external call (via Intrinsic::x86_sse41_blendvps) which has the following signature (from IntrinsicsX86.td): def int_x86_sse41_blendvps : GCCBuiltin<"__builtin_ia32_blendvps">, Intrinsic<[llvm_v4f32_ty],[llvm_v4f32_ty, llvm_v4f32_ty, llvm_v4f32_ty],[IntrNoMem]> Thus, it expects the mask (first operand if i recall correctly) to be a <4 x float>. It would be great to have this mirrored in the IR, meaning one should be able to create a SelectInst with 3 <4 x float> operands which would generate this intrinsic. Is there anything that speaks against this? I think I also recall something similar for ICmp/FCmp instructions... Best, Ralf P.S. 
I am not up-to-date on the latest status of "direct" support of vector instructions, the corresponding part of my system has been written over a year ago. On 3/10/11 1:44 PM, Rotem, Nadav wrote: > After I implemented a new type of legalization (the packing of i1 vectors), I found that x86 does not have a way to load packed masks into SSE registers. So, I guess that legalizing of<4 x i1> to<4 x i32> is the way to go. > > Cheers, > Nadav > > -----Original Message----- > From: Rotem, Nadav > Sent: Thursday, March 10, 2011 11:04 > To: 'David A. Greene' > Cc: llvmdev at cs.uiuc.edu > Subject: RE: [LLVMdev] Vector select/compare support in LLVM > > Hi David, > > The MOVMSKPS instruction is cheap (2 cycles). Not to be confused with VMASKMOV, the AVX masked move, which is expensive. > > One of the arguments for packing masks is that it reduces vector-registers pressure. Auto-vectorizing compilers maintain multiple masks for different execution paths (for each loop nesting, etc). Saving masks in xmm registers may result in vector-register pressure which will cause spilling of these registers. I agree with you that GP registers are also a precious resource. > I am not sure what is the best way to store masks. > > In my private branch, I added the [v4i1 .. v64i1] types. I also implemented a new type of target lowering: "PACK". This lowering packs vectors of i1s into integer registers. For example, the<4 x i1> type would get packed into the i8 type. I modified LegalizeTypes and LegalizeVectorTypes and added legalization for SETCC, XOR, OR, AND, and BUILD_VECTOR. I also changed the x86 lowering of SELECT to prevent lowering of selects with vector condition operand. Next, I am going to add new patterns for SETCC and SELECT which use i8/i16/i32/i64 as a condition value. > > I also plan to experiment with promoting<4 x i1> to<4 x i32>. At this point I can't really say what needs to be done. 
Implementing this kind of promotion also requires adding legalization support for strange vector types such as <4 x i65>.
>
> -Nadav
>
>
>
> -----Original Message-----
> From: David A. Greene [mailto:greened at obbligato.org]
> Sent: Wednesday, March 09, 2011 21:59
> To: Rotem, Nadav
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] Vector select/compare support in LLVM
>
> "Rotem, Nadav" writes:
>
>> I can think of two ways to represent masks in x86: sparse and
>> packed. In the sparse method, the masks are kept in <4 x 32bit>
>> registers, which are mapped to xmm registers. This is the "native" way
>> of using masks.
>
> This argues for the sparse representation, I think.
>
>> _Sparse_ After my discussion with Duncan, last week, I started working
>> on the promotion of type <4 x i1> to <4 x i32>, and I ran into a
>> problem. It looks like the codegen term "promote" is overloaded.
>
> Heavily. :-/
>
>> For scalars, the "promote" operation converts scalars to larger
>> bit-width scalars. For vectors, the "promote" operation widens the
>> vector to the next power of two. This is reasonable for types such as
>> "<3 x float>". Maybe we need to add another legalization operation which
>> will mean widening the vectors?
>
> You mean widening the element type, correct? Yes, that's definitely a
> useful concept.
>
>> In any case, I estimated that implementing this per-element promotion
>> would require major changes and decided that this is not the way to
>> go.
>
> What major changes? I think this will end up giving much better code in
> the end. The pack/unpack operations could be very expensive.
>
> There is another huge cost in using GPRs to hold masks. There will be
> fewer GPRs to hold addresses, which is a precious resource. We should
> avoid doing anything that uses more of that resource unnecessarily.
> > -Dave > --------------------------------------------------------------------- > Intel Israel (74) Limited > > This e-mail and any attachments may contain confidential material for > the sole use of the intended recipient(s). Any review or distribution > by others is strictly prohibited. If you are not the intended > recipient, please contact the sender and delete all copies. > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From anton at korobeynikov.info Thu Mar 10 09:28:55 2011 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Thu, 10 Mar 2011 18:28:55 +0300 Subject: [LLVMdev] How to make release branch available in git (topic changed) In-Reply-To: <4D78D808.1000805@fim.uni-passau.de> References: <2E128B2A-16F5-40D4-B4EC-ADD60A49E015@apple.com> <17885E39-D839-4185-BD04-9C585B1D3CCD@mac.com> <87mxlei5be.fsf@smith.obbligato.org> <4D75A931.3030306@fim.uni-passau.de> <4D78D808.1000805@fim.uni-passau.de> Message-ID: Hi Tobias, > I got a new one. Today even with working test case: Thanks, will check :) In the meantime, I added branches / tags to clang.git, but the list right now is edited by hands. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From anton at korobeynikov.info Thu Mar 10 09:31:49 2011 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Thu, 10 Mar 2011 18:31:49 +0300 Subject: [LLVMdev] How to make release branch available in git (topic changed) In-Reply-To: References: <2E128B2A-16F5-40D4-B4EC-ADD60A49E015@apple.com> <17885E39-D839-4185-BD04-9C585B1D3CCD@mac.com> <87mxlei5be.fsf@smith.obbligato.org> <4D75A931.3030306@fim.uni-passau.de> <4D78D808.1000805@fim.uni-passau.de> Message-ID: >> I got a new one. 
Today even with working test case: > Thanks, will check :) In the meantime, I added branches / tags to > clang.git, but the list right now is edited by hands. Just a heads up: this is experimental stuff and subject to be changed /removed :) -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From anton at korobeynikov.info Thu Mar 10 10:04:14 2011 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Thu, 10 Mar 2011 19:04:14 +0300 Subject: [LLVMdev] GIT mirrors Message-ID: Hello Everyone I'm going to rebuild GIT mirrors to provide more consistent tags / branches. Basically, the contents will be the same, but naming scheme will change. So, trunk will go to svn/trunk, branches/* will got to svn/branches/* and tags/* will go to svn/tags/* This might break some scripts, etc. So, if anyone has some objections wrt this - please let me know. PS: First clang.git will be converted. In any case, the conversion will be done offline and then the repos will be changed at once. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From sebastian.redl at getdesigned.at Thu Mar 10 10:37:32 2011 From: sebastian.redl at getdesigned.at (Sebastian Redl) Date: Thu, 10 Mar 2011 17:37:32 +0100 Subject: [LLVMdev] [cfe-dev] GIT mirrors In-Reply-To: References: Message-ID: On 10.03.2011, at 17:04, Anton Korobeynikov wrote: > Hello Everyone > > I'm going to rebuild GIT mirrors to provide more consistent tags / branches. > Basically, the contents will be the same, but naming scheme will change. > > So, trunk will go to svn/trunk, branches/* will got to svn/branches/* > and tags/* will go to svn/tags/* What does that mean for existing clones of the mirror? Sebastian From greened at obbligato.org Thu Mar 10 10:57:07 2011 From: greened at obbligato.org (David A. 
Greene) Date: Thu, 10 Mar 2011 10:57:07 -0600 Subject: [LLVMdev] Vector select/compare support in LLVM In-Reply-To: <6594DDFF12B03D4E89690887C2486994027129FB10@hasmsx504.ger.corp.intel.com> (Nadav Rotem's message of "Thu, 10 Mar 2011 11:03:53 +0200") References: <6594DDFF12B03D4E89690887C2486994027129F2A2@hasmsx504.ger.corp.intel.com> <6594DDFF12B03D4E89690887C2486994027129FB10@hasmsx504.ger.corp.intel.com> Message-ID: "Rotem, Nadav" writes: > One of the arguments for packing masks is that it reduces > vector-registers pressure. Auto-vectorizing compilers maintain > multiple masks for different execution paths (for each loop nesting, > etc). Saving masks in xmm registers may result in vector-register > pressure which will cause spilling of these registers. I agree with > you that GP registers are also a precious resource. GPRs are more precious than vector registers in my experience. Spilling a vector register isn't that painful. Spilling a GPR holding an address is disastrous. > In my private branch, I added the [v4i1 .. v64i1] types. I also > implemented a new type of target lowering: "PACK". This lowering packs Is PACK in the X86 namespace? It seems a pretty target-specific thing. > I also plan to experiment with promoting <4 x i1> to <4 x i32>. At > this point I can't really say what needs to be done. Implementing > this kind of promotion also requires adding legalization support for > strange vector types such as <4 x i65>. How often do we see something like that? Baby steps, baby steps... :) -Dave From greened at obbligato.org Thu Mar 10 10:59:22 2011 From: greened at obbligato.org (David A. 
Greene) Date: Thu, 10 Mar 2011 10:59:22 -0600 Subject: [LLVMdev] Vector select/compare support in LLVM In-Reply-To: <4D78E035.5000308@gmx.de> (Ralf Karrenberg's message of "Thu, 10 Mar 2011 15:29:09 +0100") References: <6594DDFF12B03D4E89690887C2486994027129F2A2@hasmsx504.ger.corp.intel.com> <6594DDFF12B03D4E89690887C248699402712F88A3@hasmsx504.ger.corp.intel.com> <4D78E035.5000308@gmx.de> Message-ID: Ralf Karrenberg writes: > Hey, > > I am currently forced to create the BLENDVPS intrinsic as an external > call (via Intrinsic::x86_sse41_blendvps) which has the following > signature (from IntrinsicsX86.td): > > def int_x86_sse41_blendvps : > GCCBuiltin<"__builtin_ia32_blendvps">, > Intrinsic<[llvm_v4f32_ty],[llvm_v4f32_ty, llvm_v4f32_ty, > llvm_v4f32_ty],[IntrNoMem]> > > Thus, it expects the mask (first operand if i recall correctly) to be a > <4 x float>. > It would be great to have this mirrored in the IR, meaning one should be > able to create a SelectInst with 3 <4 x float> operands which would > generate this intrinsic. > Is there anything that speaks against this? To me a v4i1 makes more sense as an IR mask type. The fact that on X86 the native mask type is v4i32 should be handled by the X86 codegen, I think. Another option is to rewrite the intrinsic to take a v4i1. Or more correctly, create a new intrinsic to live alongside the existing one, since we want the existing one for gcc compatibility. -Dave From greened at obbligato.org Thu Mar 10 11:00:25 2011 From: greened at obbligato.org (David A. Greene) Date: Thu, 10 Mar 2011 11:00:25 -0600 Subject: [LLVMdev] GIT mirrors In-Reply-To: (Anton Korobeynikov's message of "Thu, 10 Mar 2011 19:04:14 +0300") References: Message-ID: Anton Korobeynikov writes: > Hello Everyone > > I'm going to rebuild GIT mirrors to provide more consistent tags / branches. > Basically, the contents will be the same, but naming scheme will change. 
> > So, trunk will go to svn/trunk, branches/* will got to svn/branches/* > and tags/* will go to svn/tags/* Thanks Anton! I'm glad you are willing to work this through as we all learn how this stuff works. :) -Dave From dpatel at apple.com Thu Mar 10 11:20:45 2011 From: dpatel at apple.com (Devang Patel) Date: Thu, 10 Mar 2011 09:20:45 -0800 Subject: [LLVMdev] Parsing dwarf debug info of an GAS assembly file In-Reply-To: References: Message-ID: <6E7CF385-0401-404C-AEAC-8D4DC93FBE68@apple.com> On Mar 9, 2011, at 6:32 PM, Damien Vincent wrote: > > I have a question not strictly related to LLVM: > I know there is a tool (libdwarf / dwarfdump) to dump/parse debug information of an object file, > but do you know a tool that can parse dwarf sections of a ".s" GAS assembly file ? I do not know any tool other than the assembler itself. What are you trying to do ? - Devang From bob.wilson at apple.com Thu Mar 10 11:28:57 2011 From: bob.wilson at apple.com (Bob Wilson) Date: Thu, 10 Mar 2011 09:28:57 -0800 Subject: [LLVMdev] [cfe-dev] GIT mirrors In-Reply-To: References: Message-ID: <66B20541-9A42-4DC4-87F4-B67C33891226@apple.com> Is this going to rewrite the entire history of everything on trunk? If I have a local git repo that has been pulling from the existing mirror, what will be the implications of your change? On Mar 10, 2011, at 8:04 AM, Anton Korobeynikov wrote: > Hello Everyone > > I'm going to rebuild GIT mirrors to provide more consistent tags / branches. > Basically, the contents will be the same, but naming scheme will change. > > So, trunk will go to svn/trunk, branches/* will got to svn/branches/* > and tags/* will go to svn/tags/* > > This might break some scripts, etc. So, if anyone has some objections > wrt this - please let me know. > > PS: First clang.git will be converted. In any case, the conversion > will be done offline and then the repos will be changed at once. 
> > -- > With best regards, Anton Korobeynikov > Faculty of Mathematics and Mechanics, Saint Petersburg State University > _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev From stoklund at 2pi.dk Thu Mar 10 11:37:09 2011 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Thu, 10 Mar 2011 09:37:09 -0800 Subject: [LLVMdev] [cfe-dev] GIT mirrors In-Reply-To: <66B20541-9A42-4DC4-87F4-B67C33891226@apple.com> References: <66B20541-9A42-4DC4-87F4-B67C33891226@apple.com> Message-ID: <2E4A9A0A-8D5F-4C3F-A905-D890D4E3AA9A@2pi.dk> On Mar 10, 2011, at 9:28 AM, Bob Wilson wrote: > Is this going to rewrite the entire history of everything on trunk? If I have a local git repo that has been pulling from the existing mirror, what will be the implications of your change? The extra branches that Anton added to clang.git didn't rewrite anything. Here's hoping it won't be necessary for llvm.git either. /jakob > On Mar 10, 2011, at 8:04 AM, Anton Korobeynikov wrote: > >> Hello Everyone >> >> I'm going to rebuild GIT mirrors to provide more consistent tags / branches. >> Basically, the contents will be the same, but naming scheme will change. >> >> So, trunk will go to svn/trunk, branches/* will got to svn/branches/* >> and tags/* will go to svn/tags/* >> >> This might break some scripts, etc. So, if anyone has some objections >> wrt this - please let me know. >> >> PS: First clang.git will be converted. In any case, the conversion >> will be done offline and then the repos will be changed at once. 
>> >> -- >> With best regards, Anton Korobeynikov >> Faculty of Mathematics and Mechanics, Saint Petersburg State University >> _______________________________________________ >> cfe-dev mailing list >> cfe-dev at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From anton at korobeynikov.info Thu Mar 10 12:39:46 2011 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Thu, 10 Mar 2011 21:39:46 +0300 Subject: [LLVMdev] [cfe-dev] GIT mirrors In-Reply-To: <2E4A9A0A-8D5F-4C3F-A905-D890D4E3AA9A@2pi.dk> References: <66B20541-9A42-4DC4-87F4-B67C33891226@apple.com> <2E4A9A0A-8D5F-4C3F-A905-D890D4E3AA9A@2pi.dk> Message-ID: > The extra branches that Anton added to clang.git didn't rewrite anything. > Here's hoping it won't be necessary for llvm.git either. I was hoping that the rebuilt won't change the sha's and noone will notice anything (well, except those who already branched out of branches / tags). But it seems that after rebuild stuff was changed (I really dunno, why). So, we'll go other way. In any case, it's unwise now to use published branches / tags :) -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From drb at dneg.com Thu Mar 10 12:43:41 2011 From: drb at dneg.com (Dan Bailey) Date: Thu, 10 Mar 2011 18:43:41 +0000 Subject: [LLVMdev] Alternative to Adding New Intrinsics for Code-Generation? Message-ID: <4D791BDD.9000503@dneg.com> Hi, I've written an IR->IR translation as part of a high-level language I've designed. It replaces my own specific functions in LLVM passes with custom logic. As part of the process, it needs to perform constant propagation and inlining prior to doing any translation. 
Initially I just used simple function declarations to define these specific functions, but the verification pass requires that each function declaration also has a definition. Without wanting to create dummy function definitions, I found I had to introduce new intrinsics, which does what I want but is obviously not desired. How can I do this without using intrinsics? I looked for function attributes to see if I can flag a function as not requiring a definition, but there doesn't seem to be any. It would be useful to ignore a function as if it was an intrinsic during this prior stage of passes. Any suggestions would be welcome? Thanks, Dan From damien.llvm at gmail.com Thu Mar 10 12:44:39 2011 From: damien.llvm at gmail.com (Damien Vincent) Date: Thu, 10 Mar 2011 10:44:39 -0800 Subject: [LLVMdev] Parsing dwarf debug info of an GAS assembly file In-Reply-To: <6E7CF385-0401-404C-AEAC-8D4DC93FBE68@apple.com> References: <6E7CF385-0401-404C-AEAC-8D4DC93FBE68@apple.com> Message-ID: I am working with a different assembly format (in house assembly...). I added a target to LLVM but find it easier to keep a GAS assembly output from LLVM and then convert it to the in-house assembly format (with a standalone tool) Without debugging information, this conversion is pretty straightforward but now comes the time to convert some debugging informations. Thanks, Damien On Thu, Mar 10, 2011 at 9:20 AM, Devang Patel wrote: > > On Mar 9, 2011, at 6:32 PM, Damien Vincent wrote: > > > > > I have a question not strictly related to LLVM: > > I know there is a tool (libdwarf / dwarfdump) to dump/parse debug > information of an object file, > > but do you know a tool that can parse dwarf sections of a ".s" GAS > assembly file ? > > I do not know any tool other than the assembler itself. What are you trying > to do ? > - > Devang > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110310/89086e25/attachment.html From rjmccall at apple.com Thu Mar 10 12:51:41 2011 From: rjmccall at apple.com (John McCall) Date: Thu, 10 Mar 2011 10:51:41 -0800 Subject: [LLVMdev] Alternative to Adding New Intrinsics for Code-Generation? In-Reply-To: <4D791BDD.9000503@dneg.com> References: <4D791BDD.9000503@dneg.com> Message-ID: On Mar 10, 2011, at 10:43 AM, Dan Bailey wrote: > I've written an IR->IR translation as part of a high-level language I've > designed. It replaces my own specific functions in LLVM passes with > custom logic. > > As part of the process, it needs to perform constant propagation and > inlining prior to doing any translation. Initially I just used simple > function declarations to define these specific functions, but the > verification pass requires that each function declaration also has a > definition. Without wanting to create dummy function definitions, I > found I had to introduce new intrinsics, which does what I want but is > obviously not desired. > > How can I do this without using intrinsics? I looked for function > attributes to see if I can flag a function as not requiring a > definition, but there doesn't seem to be any. It would be useful to > ignore a function as if it was an intrinsic during this prior stage of > passes. Any suggestions would be welcome? The verifier certainly doesn't object in general to function declarations. I'm guessing that the problem is that you're giving your declarations internal linkage, which is indeed an error for normal functions. The solution is to not give these "intrinsics" internal linkage when you declare them. John. From drb at dneg.com Thu Mar 10 13:11:57 2011 From: drb at dneg.com (Dan Bailey) Date: Thu, 10 Mar 2011 19:11:57 +0000 Subject: [LLVMdev] Alternative to Adding New Intrinsics for Code-Generation? 
In-Reply-To: References: <4D791BDD.9000503@dneg.com> Message-ID: <4D79227D.1050203@dneg.com> An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110310/613db994/attachment.html From bob.wilson at apple.com Thu Mar 10 13:17:31 2011 From: bob.wilson at apple.com (Bob Wilson) Date: Thu, 10 Mar 2011 11:17:31 -0800 Subject: [LLVMdev] [cfe-dev] GIT mirrors In-Reply-To: References: <66B20541-9A42-4DC4-87F4-B67C33891226@apple.com> <2E4A9A0A-8D5F-4C3F-A905-D890D4E3AA9A@2pi.dk> Message-ID: <051C4414-6923-4562-BC2B-91B8E3180416@apple.com> On Mar 10, 2011, at 10:39 AM, Anton Korobeynikov wrote: >> The extra branches that Anton added to clang.git didn't rewrite anything. >> Here's hoping it won't be necessary for llvm.git either. > I was hoping that the rebuilt won't change the sha's and noone will > notice anything (well, except those who already branched out of > branches / tags). > But it seems that after rebuild stuff was changed (I really dunno, > why). So, we'll go other way. Just to be clear: we _really_ do not want all the sha's to change for trunk. > > In any case, it's unwise now to use published branches / tags :) > > -- > With best regards, Anton Korobeynikov > Faculty of Mathematics and Mechanics, Saint Petersburg State University From rjmccall at apple.com Thu Mar 10 13:50:14 2011 From: rjmccall at apple.com (John McCall) Date: Thu, 10 Mar 2011 11:50:14 -0800 Subject: [LLVMdev] Alternative to Adding New Intrinsics for Code-Generation? In-Reply-To: <4D79227D.1050203@dneg.com> References: <4D791BDD.9000503@dneg.com> <4D79227D.1050203@dneg.com> Message-ID: <740A4399-9628-4898-A10A-A0A72FE6E464@apple.com> On Mar 10, 2011, at 11:11 AM, Dan Bailey wrote: > John McCall wrote: >> >> On Mar 10, 2011, at 10:43 AM, Dan Bailey wrote: >> >>> I've written an IR->IR translation as part of a high-level language I've >>> designed. It replaces my own specific functions in LLVM passes with >>> custom logic. 
>>>
>>> As part of the process, it needs to perform constant propagation and
>>> inlining prior to doing any translation. Initially I just used simple
>>> function declarations to define these specific functions, but the
>>> verification pass requires that each function declaration also has a
>>> definition. Without wanting to create dummy function definitions, I
>>> found I had to introduce new intrinsics, which does what I want but is
>>> obviously not desired.
>>>
>>> How can I do this without using intrinsics? I looked for function
>>> attributes to see if I can flag a function as not requiring a
>>> definition, but there doesn't seem to be any. It would be useful to
>>> ignore a function as if it was an intrinsic during this prior stage of
>>> passes. Any suggestions would be welcome?
>>>
>>
>> The verifier certainly doesn't object in general to function declarations.
>> I'm guessing that the problem is that you're giving your declarations
>> internal linkage, which is indeed an error for normal functions. The
>> solution is to not give these "intrinsics" internal linkage when you
>> declare them.
>>
>> John.
>>
>
> That makes sense, but it's not working for me. All the functions are defined as ExternalLinkage, this is (a simplified version of) the pre-optimised ir:
>
> declare i32 @_function(i32, i32, i32)
>
> define i32 @Test() {
> entry:
>   %value = call i32 @_function(i32 0, i32 0, i32 0)
> }
>
> Which generates this error in the verifying stage:
>
> Referencing function in another module!
>   %value = call i32 @_function(i32 0, i32 0, i32 0)

That's asserting that @Test and @_function are in different llvm::Modules. That's a memory well-formedness constraint, not an IR constraint. That should honestly be getting checked for intrinsic functions, too.

John.
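For readers following the thread, here is a toy model of the check John describes. It is a sketch under assumptions, not the LLVM API: `ToyModule`, `ToyFunction`, `declare`, and `verifyCall` are hypothetical names invented for illustration. It only captures the idea that a call may reference a body-less declaration, but the declaration object has to be owned by the same module as the caller.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins for llvm::Module / llvm::Function; NOT the real API.
struct ToyModule;

struct ToyFunction {
    std::string name;
    const ToyModule *parent;  // the module that owns this function object
    bool isDeclaration;       // true: no body, like `declare` in the IR
};

struct ToyModule {
    std::string id;
    std::map<std::string, std::unique_ptr<ToyFunction>> functions;

    // Return the function with this name, creating a body-less declaration
    // owned by THIS module if it does not exist yet.
    ToyFunction *declare(const std::string &name) {
        std::unique_ptr<ToyFunction> &slot = functions[name];
        if (!slot)
            slot.reset(new ToyFunction{name, this, true});
        return slot.get();
    }
};

// Models the rule from the thread: caller and callee must live in the same
// module. Whether the callee has a body is irrelevant to this check.
inline bool verifyCall(const ToyFunction &caller, const ToyFunction &callee) {
    return caller.parent == callee.parent;
}
```

In this model, declaring `_function` through the module that holds `Test` makes `verifyCall` succeed, while a declaration created in some other module object fails it even though the name matches.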
From anton at korobeynikov.info Thu Mar 10 13:53:32 2011 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Thu, 10 Mar 2011 22:53:32 +0300 Subject: [LLVMdev] [cfe-dev] GIT mirrors In-Reply-To: <051C4414-6923-4562-BC2B-91B8E3180416@apple.com> References: <66B20541-9A42-4DC4-87F4-B67C33891226@apple.com> <2E4A9A0A-8D5F-4C3F-A905-D890D4E3AA9A@2pi.dk> <051C4414-6923-4562-BC2B-91B8E3180416@apple.com> Message-ID: > Just to be clear: we _really_ do not want all the sha's to change for trunk. Yes. That's why I said there will be other way :) In any case - please try clang.git once again. It should contain new branch/tag layout. If there will be some problems - let me know and I'll revert to prev. repository. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From johnso87 at crhc.illinois.edu Thu Mar 10 13:31:31 2011 From: johnso87 at crhc.illinois.edu (Matt Johnson) Date: Thu, 10 Mar 2011 13:31:31 -0600 Subject: [LLVMdev] compiler-rt: Infinite loop/stack overflow in __modsi3() Message-ID: <4D792713.7090104@crhc.illinois.edu> Hi All, The default implementation of __modsi3() (signed integer modulus) in compiler-rt/lib/modsi3.c is defined recursively. 
Thankfully, LLVM is smart enough to do tail call elimination on the recursion, so I got an infinite loop rather than a stack overflow :)

Here's the patch, patterned after the correct implementation in umodsi3.c:

diff --git a/lib/compiler-rt/lib/modsi3.c b/lib/compiler-rt/lib/modsi3.c
index 388418a..3759ce0 100644
--- a/lib/compiler-rt/lib/modsi3.c
+++ b/lib/compiler-rt/lib/modsi3.c
@@ -16,8 +16,10 @@
 
 /* Returns: a % b */
 
+su_int __divsi3(si_int a, si_int b);
+
 si_int
 __modsi3(si_int a, si_int b)
 {
-    return a - (a / b) * b;
+    return a - __divsi3(a, b) * b;
 }

Best,
Matt

From jan_sjodin at yahoo.com Thu Mar 10 15:06:27 2011
From: jan_sjodin at yahoo.com (Jan Sjodin)
Date: Thu, 10 Mar 2011 13:06:27 -0800 (PST)
Subject: [LLVMdev] Detrimental optimization for reducing relocations.
Message-ID: <719316.52005.qm@web55608.mail.re4.yahoo.com>

I was looking into the AsmPrinter and the method EmitSectionOffset which contains this code:

--------------------------------------------------------------------------------
  // If the section in question will end up with an address of 0 anyway, we can
  // just emit an absolute reference to save a relocation.
  if (Section.isBaseAddressKnownZero()) {
    OutStreamer.EmitSymbolValue(Label, 4, 0/*AddrSpace*/);
    return;
  }

  // Otherwise, emit it as a label difference from the start of the section.
  EmitLabelDifference(Label, SectionLabel, 4);
}
--------------------------------------------------------------------------------

isBaseAddressKnownZero() only returns true for some MCSectionELF sections (always false for MachO and COFF). However, emitting a symbol value seems to always cause a relocation entry, whereas a label difference does not. I compiled the factorial program from the demo page with debug info and dumped the relocation entries, then I commented out the top block so that EmitLabelDifference was always called.

Original:

objdump -r directsymbol.o

directsymbol.o:     file format elf64-x86-64

RELOCATION RECORDS FOR [.text]:
OFFSET           TYPE              VALUE
000000000000005a R_X86_64_PC32     atoi-0x0000000000000004
0000000000000066 R_X86_64_32       .rodata.str1.1
000000000000006f R_X86_64_PC32     printf-0x0000000000000004

RELOCATION RECORDS FOR [.debug_frame]:
OFFSET           TYPE              VALUE
0000000000000018 R_X86_64_32       .debug_frame
000000000000001c R_X86_64_64       .text
0000000000000040 R_X86_64_32       .debug_frame
0000000000000044 R_X86_64_64       .text+0x0000000000000040

RELOCATION RECORDS FOR [.debug_info]:
OFFSET           TYPE              VALUE
0000000000000006 R_X86_64_32       .debug_abbrev
0000000000000097 R_X86_64_64       .text
000000000000009f R_X86_64_64       .text+0x0000000000000034
00000000000000c9 R_X86_64_64       .text+0x0000000000000040
00000000000000d1 R_X86_64_64       .text+0x000000000000007c

RELOCATION RECORDS FOR [.debug_line]:
OFFSET           TYPE              VALUE
000000000000002f R_X86_64_64       .text

RELOCATION RECORDS FOR [.debug_pubnames]:
OFFSET           TYPE              VALUE
0000000000000006 R_X86_64_32       .debug_info

RELOCATION RECORDS FOR [.debug_pubtypes]:
OFFSET           TYPE              VALUE
0000000000000006 R_X86_64_32       .debug_info

Then with the reduced code, without the optimization:

objdump -r labeldiff.o

labeldiff.o:     file format elf64-x86-64

RELOCATION RECORDS FOR [.text]:
OFFSET           TYPE              VALUE
000000000000005a R_X86_64_PC32     atoi-0x0000000000000004
0000000000000066 R_X86_64_32       .rodata.str1.1
000000000000006f R_X86_64_PC32     printf-0x0000000000000004

RELOCATION RECORDS FOR [.debug_frame]:
OFFSET           TYPE              VALUE
000000000000001c R_X86_64_64       .text
0000000000000044 R_X86_64_64       .text+0x0000000000000040

RELOCATION RECORDS FOR [.debug_info]:
OFFSET           TYPE              VALUE
0000000000000097 R_X86_64_64       .text
000000000000009f R_X86_64_64       .text+0x0000000000000034
00000000000000c9 R_X86_64_64       .text+0x0000000000000040
00000000000000d1 R_X86_64_64       .text+0x000000000000007c

RELOCATION RECORDS FOR [.debug_line]:
OFFSET           TYPE              VALUE
000000000000002f R_X86_64_64       .text

So, clearly the optimization is making things worse.
Would it be okay to delete this code and eliminate the isBaseAddressKnownZero? I would like to get rid of it.

- Jan

From rafael.espindola at gmail.com Thu Mar 10 15:22:32 2011
From: rafael.espindola at gmail.com (Rafael Ávila de Espíndola)
Date: Thu, 10 Mar 2011 16:22:32 -0500
Subject: [LLVMdev] Detrimental optimization for reducing relocations.
In-Reply-To: <719316.52005.qm@web55608.mail.re4.yahoo.com>
References: <719316.52005.qm@web55608.mail.re4.yahoo.com>
Message-ID: <4D794118.3010703@gmail.com>

> So, clearly the optimization is making things worse. Would it be okay to delete
> this code and eliminate the isBaseAddressKnownZero? I would like to get rid of
> it.

I think it is OK. I can see ld/gdb expecting a relocation, but if that is the case we should just have a flag saying it is needed.

If you are really motivated to check it, run the gdb testsuite with your patch, but on ELF our debug info is still too big to be usable.

> - Jan

Cheers,
Rafael

From debio264 at gmail.com Thu Mar 10 15:32:48 2011
From: debio264 at gmail.com (Andrew Wiley)
Date: Thu, 10 Mar 2011 15:32:48 -0600
Subject: [LLVMdev] Building VMKit
Message-ID:

I tried to build VMKit on an ARM device today (a Sheevaplug - armv5te) (native, not cross compiled), and got this error:

llvm[3]: Building LLVM assembly with /home/debio/build/vmkit-build/vmkit/lib/Mvm/Runtime/LLVMAssembly.ll /home/debio/build/vmkit-build/vmkit/lib/Mvm/Runtime/LLVMAssembly64.ll
ExpandIntegerResult #0: 0x16fbf88: i64,ch = AtomicCmpSwap 0x16e8d84, 0x16fbf00, 0x16fc3c8, 0x16fc1a8 [ORD=4] [ID=0]

Do not know how to expand the result of this operator!
UNREACHABLE executed at LegalizeIntegerTypes.cpp:982!
Stack dump:
0. Program arguments: /home/debio/build/vmkit-build/vmkit/../llvm//Debug+Asserts/bin/llc -o LLVMAssembly.s
1. Running pass 'Function Pass Manager' on module ''.
2. Running pass 'ARM Instruction Selection' on function '@llvm_atomic_cmp_swap_i64'
/bin/sh: line 1: 16944 Done /home/debio/build/vmkit-build/vmkit/../llvm//Debug+Asserts/bin/llvm-as -f LLVMAssembly.gen.ll -o -
     16945 Aborted | /home/debio/build/vmkit-build/vmkit/../llvm//Debug+Asserts/bin/llc -o LLVMAssembly.s
make[3]: *** [LLVMAssembly.s] Error 134
make[3]: Leaving directory `/home/debio/build/vmkit-build/vmkit/lib/Mvm/Runtime'
make[2]: *** [all] Error 1
make[2]: Leaving directory `/home/debio/build/vmkit-build/vmkit/lib/Mvm'
make[1]: *** [Mvm/.makeall] Error 2
make[1]: Leaving directory `/home/debio/build/vmkit-build/vmkit/lib'
make: *** [all] Error 1

I was following the instructions from http://vmkit.llvm.org/get_started.html although I didn't build my own classpath, as I have a packaged version installed and I was waiting to see whether configure would pick it up automatically. From what I can tell, that doesn't seem relevant to this error.
Is this something that just isn't supported on my platform, is the trunk build currently broken somehow, or am I just doing it wrong?

Thanks,
Andrew Wiley

From nicolas.geoffray at gmail.com Thu Mar 10 15:48:28 2011
From: nicolas.geoffray at gmail.com (nicolas geoffray)
Date: Thu, 10 Mar 2011 22:48:28 +0100
Subject: [LLVMdev] Building VMKit
In-Reply-To:
References:
Message-ID:

Hi Andrew,

Note that I never tried compiling vmkit on ARM. From the error message you get, it looks to me that the LLVMAssembly64.ll is wrongly being compiled. You should change the configure script to not include it in the list of files to compile (or you could also just remove the code in the file).

Let me know if it helps!
Nicolas On Thu, Mar 10, 2011 at 10:32 PM, Andrew Wiley wrote: > I tried to build VMKit on an ARM device today (a Sheevaplug - armv5te) > (native, not cross compiled), and got this error: > > llvm[3]: Building LLVM assembly with > /home/debio/build/vmkit-build/vmkit/lib/Mvm/Runtime/LLVMAssembly.ll > /home/debio/build/vmkit-build/vmkit/lib/Mvm/Runtime/LLVMAssembly64.ll > ExpandIntegerResult #0: 0x16fbf88: i64,ch = AtomicCmpSwap 0x16e8d84, > 0x16fbf00, 0x16fc3c8, 0x16fc1a8 [ORD=4] [ID=0] > > Do not know how to expand the result of this operator! > UNREACHABLE executed at LegalizeIntegerTypes.cpp:982! > Stack dump: > 0. Program arguments: > /home/debio/build/vmkit-build/vmkit/../llvm//Debug+Asserts/bin/llc -o > LLVMAssembly.s > 1. Running pass 'Function Pass Manager' on module ''. > 2. Running pass 'ARM Instruction Selection' on function > '@llvm_atomic_cmp_swap_i64' > /bin/sh: line 1: 16944 Done > /home/debio/build/vmkit-build/vmkit/../llvm//Debug+Asserts/bin/llvm-as > -f LLVMAssembly.gen.ll -o - > 16945 Aborted | > /home/debio/build/vmkit-build/vmkit/../llvm//Debug+Asserts/bin/llc -o > LLVMAssembly.s > make[3]: *** [LLVMAssembly.s] Error 134 > make[3]: Leaving directory > `/home/debio/build/vmkit-build/vmkit/lib/Mvm/Runtime' > make[2]: *** [all] Error 1 > make[2]: Leaving directory `/home/debio/build/vmkit-build/vmkit/lib/Mvm' > make[1]: *** [Mvm/.makeall] Error 2 > make[1]: Leaving directory `/home/debio/build/vmkit-build/vmkit/lib' > make: *** [all] Error 1 > > > I was following the instructions from > http://vmkit.llvm.org/get_started.html although I didn't build my own > classpath, as I have a packaged version installed and I was waiting to > see whether configure would pick it up automatically. From what I can > tell, that doesn't seem relevant to this error. > Is this something that just isn't supported on my platform, is the > trunk build currently broken somehow, or am I just doing it wrong? 
> > Thanks, > Andrew Wiley > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110310/d53d8afd/attachment.html From clattner at apple.com Thu Mar 10 16:16:00 2011 From: clattner at apple.com (Chris Lattner) Date: Thu, 10 Mar 2011 14:16:00 -0800 Subject: [LLVMdev] compiler-rt: Infinite loop/stack overflow in __modsi3() In-Reply-To: <4D792713.7090104@crhc.illinois.edu> References: <4D792713.7090104@crhc.illinois.edu> Message-ID: On Mar 10, 2011, at 11:31 AM, Matt Johnson wrote: > Hi All, > The default implementation of __modsi3() (signed integer modulus) > in compiler-rt/lib/modsi3.c is defined recursively. Thankfully, LLVM is > smart enough to do tail call elimination on the recursion, so I got an > infinite loop rather than a stack overflow :) Looks good, applied in r127429, thanks! 
-Chris > > Here's the patch, patterned after the correct implementation in > umodsi3.c: > > diff --git a/lib/compiler-rt/lib/modsi3.c b/lib/compiler-rt/lib/modsi3.c > index 388418a..3759ce0 100644 > --- a/lib/compiler-rt/lib/modsi3.c > +++ b/lib/compiler-rt/lib/modsi3.c > @@ -16,8 +16,10 @@ > > /* Returns: a % b */ > > +su_int __divsi3(si_int a, si_int b); > + > si_int > __modsi3(si_int a, si_int b) > { > - return a - (a / b) * b; > + return a - __divsi3(a, b) * b; > } > > > Best, > Matt > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From lg at larrygritz.com Thu Mar 10 12:59:13 2011 From: lg at larrygritz.com (Larry Gritz) Date: Thu, 10 Mar 2011 10:59:13 -0800 Subject: [LLVMdev] Linux clang toolchain targets Message-ID: <94C12D06-D881-41DF-95A5-CC58FE4FEC6B@larrygritz.com> Is it too late to include "x86_64-unknown-linux" as a supported toolchain name in clang? 
We tend to need to hand-patch clang source to find include files properly on our systems; the patch looks something like this (just posting as a guideline, I'm not sure if this is the best approach or not, but it seems to work for us):

Index: lib/Frontend/InitHeaderSearch.cpp
===================================================================
--- lib/Frontend/InitHeaderSearch.cpp (revision 115972)
+++ lib/Frontend/InitHeaderSearch.cpp (working copy)
@@ -661,6 +661,19 @@
                                 "x86_64-redhat-linux", "32", "", triple);
     AddGnuCPlusPlusIncludePaths("/usr/include/c++/4.4.2",
                                 "i686-redhat-linux","", "", triple);
+// FIXME
+    AddGnuCPlusPlusIncludePaths("/usr/include/c++/4.4.2",
+                                "x86_64-unknown-linux", "32", "", triple);
+    AddGnuCPlusPlusIncludePaths("/usr/include/c++/4.4.3",
+                                "x86_64-unknown-linux", "32", "", triple);
+    AddGnuCPlusPlusIncludePaths("/usr/include/c++/4.4.4",
+                                "x86_64-unknown-linux", "32", "", triple);
+    AddGnuCPlusPlusIncludePaths("/usr/include/c++/4.4.5",
+                                "x86_64-unknown-linux", "32", "", triple);
+    AddGnuCPlusPlusIncludePaths("/usr/include/c++/4.4.6",
+                                "x86_64-unknown-linux", "32", "", triple);
+// END FIXME
+
     // Fedora 11
     AddGnuCPlusPlusIncludePaths("/usr/include/c++/4.4.1",
                                 "x86_64-redhat-linux", "32", "", triple);

--
Larry Gritz
lg at larrygritz.com

From jan_sjodin at yahoo.com Thu Mar 10 17:43:21 2011
From: jan_sjodin at yahoo.com (Jan Sjodin)
Date: Thu, 10 Mar 2011 15:43:21 -0800 (PST)
Subject: [LLVMdev] Detrimental optimization for reducing relocations.
In-Reply-To: <4D794118.3010703@gmail.com>
References: <719316.52005.qm@web55608.mail.re4.yahoo.com> <4D794118.3010703@gmail.com>
Message-ID: <730562.67234.qm@web55607.mail.re4.yahoo.com>

----- Original Message ----
> From: Rafael Ávila de Espíndola
> To: llvmdev at cs.uiuc.edu
> Sent: Thu, March 10, 2011 4:22:32 PM
> Subject: Re: [LLVMdev] Detrimental optimization for reducing relocations.
>
> > So, clearly the optimization is making things worse.
Would it be okay to delete
> > this code and eliminate the isBaseAddressKnownZero? I would like to get rid of
> > it.
>
> I think it is OK. I can see ld/gdb expecting a relocation, but if that
> is the case we should just have a flag saying it is needed.
>
> If you are really motivated to check it, run the gdb testsuite with your
> patch, but on ELF our debug info is still too big to be usable.

Will the testsuite work on ELF? The patch does not make any functional change for the other formats. I know that gdb is okay with the example, but that doesn't say very much.

- Jan

From rafael.espindola at gmail.com Thu Mar 10 20:02:27 2011
From: rafael.espindola at gmail.com (Rafael Ávila de Espíndola)
Date: Thu, 10 Mar 2011 21:02:27 -0500
Subject: [LLVMdev] Detrimental optimization for reducing relocations.
In-Reply-To: <730562.67234.qm@web55607.mail.re4.yahoo.com>
References: <719316.52005.qm@web55608.mail.re4.yahoo.com> <4D794118.3010703@gmail.com> <730562.67234.qm@web55607.mail.re4.yahoo.com>
Message-ID: <4D7982B3.2050309@gmail.com>

> Will the testsuite work on ELF? The patch does not make any functional change
> for the other formats. I know that gdb is okay with the example, but that
> doesn't say very much.

The patch is probably OK then. The gdb testsuite works with clang on ELF. There used to be a lot of silly failures like it not expecting clang warnings, but I think most of the current ones are real.

> - Jan
>

Cheers,
Rafael

From fiterman at gmail.com Fri Mar 11 01:14:34 2011
From: fiterman at gmail.com (Yuli Fiterman)
Date: Fri, 11 Mar 2011 02:14:34 -0500
Subject: [LLVMdev] LLVM vs GCC binary performance
Message-ID:

Dear LLVM Team,

As a developer I'm very excited and interested in the LLVM project. Though my knowledge of the details is cursory my general understanding is that the SSA code that LLVM front ends produce is supposed to allow for optimizations that are unfeasible in GCC.
I also expect most important optimizations from GCC would have been incorporated into LLVM by now since GCC code is open for everyone to see. Therefore I'm surprised to see that in most benchmarks LLVM produces binaries are 10-15% slower than their GCC counterparts. Would you mind explaining the main reasons for why this is the case? Also, what remains to be done for LLVM to surpass GCC in terms of binary performance?

Thanks,
Yuli
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/6bb3a72b/attachment.html

From drb at dneg.com Fri Mar 11 03:43:25 2011
From: drb at dneg.com (Dan Bailey)
Date: Fri, 11 Mar 2011 09:43:25 +0000
Subject: [LLVMdev] Alternative to Adding New Intrinsics for Code-Generation?
In-Reply-To: <740A4399-9628-4898-A10A-A0A72FE6E464@apple.com>
References: <4D791BDD.9000503@dneg.com> <4D79227D.1050203@dneg.com> <740A4399-9628-4898-A10A-A0A72FE6E464@apple.com>
Message-ID: <4D79EEBD.1000405@dneg.com>

An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/a46b7b18/attachment.html

From judison at gmail.com Fri Mar 11 04:48:28 2011
From: judison at gmail.com (Judison)
Date: Fri, 11 Mar 2011 07:48:28 -0300
Subject: [LLVMdev] Unnamed temporaries
Message-ID:

Hi,

I hope this is the right place to ask it, sorry if I'm wrong...

My compiler is generating this code:

(line numbers included) (Please ignore the extra br label %b0 and the whole b0)

...
54 define i32 @std_lang__rest() {
55 entry:
56     %ret = alloca i32 ; int*
57     %0 = icmp eq i32 4, 5 ; boolean
58     br i1 %0, label %b0_t, label %b0_f
59 b0_t:
60     %1 = add i32 5, 2 ; int
61     store i32 %1, i32* %ret
62     br label %return
63     br label %b0
64 b0_f:
65     store i32 5, i32* %ret
66     br label %return
67     br label %b0
68 b0:
69     store i32 0, i32* %ret
70     br label %return
71 return:
72     %2 = load i32* %ret ; int
73     ret i32 %2
74 }
...

llvm-as std_lang.ll
llvm-as: std_lang.ll:72:5: error: instruction expected to be numbered '%4'
    %2 = load i32* %ret ; int
    ^

Why %4 ??? what I did wrong?

--
Judison
judison at gmail.com

"O ignorante que procura se instruir é como um sábio; o sábio que fala sem discernimento se assemelha a um ignorante."
Imam Ali (as)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/c97db470/attachment.html

From fvbommel at gmail.com Fri Mar 11 05:11:45 2011
From: fvbommel at gmail.com (Frits van Bommel)
Date: Fri, 11 Mar 2011 12:11:45 +0100
Subject: [LLVMdev] Unnamed temporaries
In-Reply-To:
References:
Message-ID:

On Fri, Mar 11, 2011 at 11:48 AM, Judison wrote:
> I hope this is the right place to ask it, sorry if I'm wrong...

It's the right place, though the IRC channel would have been good too.

> My compiler is generating this code:
>
> (line numbers included) (Please ignore the extra br label %b0 and the whole
> b0)

Sorry, but the extra branches can't be ignored since they are exactly your problem.

> 54 define i32 @std_lang__rest() {
> 55 entry:
> 56     %ret = alloca i32 ; int*
>
> 57     %0 = icmp eq i32 4, 5 ; boolean
>
> 58     br i1 %0, label %b0_t, label %b0_f
>
> 59 b0_t:
> 60     %1 = add i32 5, 2 ; int
>
> 61     store i32 %1, i32* %ret
> 62     br label %return

This "br label %return" ended block %b0_t and automatically started a new one. Since you didn't provide a label, it's named %2.

>
> 63     br label %b0
> 64 b0_f:
> 65     store i32 5, i32* %ret
>
> 66     br label %return

And here block %b0_f ends, and block %3 begins.

> 67     br label %b0
>
> 68 b0:
> 69     store i32 0, i32* %ret
>
> 70     br label %return
> 71 return:
> 72     %2 = load i32* %ret ; int

Leading to %4 being the next anonymous value here, not %2.

>
> 73     ret i32 %2
> 74 }
> ...
>
> llvm-as std_lang.ll
> llvm-as: std_lang.ll:72:5: error: instruction expected to be numbered '%4'
>     %2 = load i32* %ret ; int
>     ^
>
> Why %4 ???
what I did wrong?

You didn't realize there were anonymous blocks in your code, I'm guessing.

From judison at gmail.com Fri Mar 11 05:19:25 2011
From: judison at gmail.com (Judison)
Date: Fri, 11 Mar 2011 08:19:25 -0300
Subject: [LLVMdev] Unnamed temporaries
In-Reply-To:
References:
Message-ID:

Thank you so much, removing them solved the problem.

I did not know there was such a thing as anonymous blocks. I thought llvm was going to ignore anything after a terminator instruction (br, ret, etc.).

I'll make my code generator "block aware" :P

thank you again!!! :P

On Fri, Mar 11, 2011 at 8:11 AM, Frits van Bommel wrote:

> On Fri, Mar 11, 2011 at 11:48 AM, Judison wrote:
> > I hope this is the right place to ask it, sorry if I'm wrong...
>
> It's the right place, though the IRC channel would have been good too.
>
> > My compiler is generating this code:
> >
> > (line numbers included) (Please ignore the extra br label %b0 and the whole
> > b0)
>
> Sorry, but the extra branches can't be ignored since they are exactly
> your problem.
>
> > 54 define i32 @std_lang__rest() {
> > 55 entry:
> > 56     %ret = alloca i32 ; int*
> >
> > 57     %0 = icmp eq i32 4, 5 ; boolean
> >
> > 58     br i1 %0, label %b0_t, label %b0_f
> >
> > 59 b0_t:
> > 60     %1 = add i32 5, 2 ; int
> >
> > 61     store i32 %1, i32* %ret
> > 62     br label %return
>
>
> This "br label %return" ended block %b0_t and automatically started a
> new one. Since you didn't provide a label, it's named %2.
>
> >
> > 63     br label %b0
> > 64 b0_f:
> > 65     store i32 5, i32* %ret
> >
> > 66     br label %return
>
> And here block %b0_f ends, and block %3 begins.
>
> > 67     br label %b0
> >
> > 68 b0:
> > 69     store i32 0, i32* %ret
> >
> > 70     br label %return
> > 71 return:
> > 72     %2 = load i32* %ret ; int
>
> Leading to %4 being the next anonymous value here, not %2.
>
> >
> > 73     ret i32 %2
> > 74 }
> > ...
> >
> > llvm-as std_lang.ll
> > llvm-as: std_lang.ll:72:5: error: instruction expected to be numbered '%4'
> >     %2 = load i32* %ret ; int
> >     ^
> >
> > Why %4 ??? what I did wrong?
>
> You didn't realize there were anonymous blocks in your code, I'm guessing.
>

--
Judison
judison at gmail.com

"O ignorante que procura se instruir é como um sábio; o sábio que fala sem discernimento se assemelha a um ignorante."
Imam Ali (as)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/4f6e6bdf/attachment-0001.html

From lgratian at gmail.com Fri Mar 11 05:53:51 2011
From: lgratian at gmail.com (Gratian Lup)
Date: Fri, 11 Mar 2011 13:53:51 +0200
Subject: [LLVMdev] Call profiling and function placement in object file
Message-ID:

Hi!

I'm interested in profile-guided optimizations and was looking at the functionality LLVM provides. I have two questions:

- can the current (optimal) edge profiling be used to determine the number of times a function calls another one? I need to know not only how many times a function was called, but also by whom. Or to say it in a different way, can a profile edge be formed from blocks originating from different functions? I need this information so that I can cluster the functions based on the call frequency and the relationship between them.

- is there a way to force the linker to place the functions in a specific order in the object file? I mean something like associating a number with each function, and the linker placing the functions ordered by these numbers. The assigned number would be the ID of the cluster in which the function is found. If this is not possible, the only idea left would be to place the functions in different sections, but that would interfere with directives written by the user in the source files.

Thanks in advance,
Gratian

PS: It's probably clear that I want to implement a function placement optimization like the one described by Pettis & Hansen.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/8510c646/attachment.html

From jan_sjodin at yahoo.com Fri Mar 11 06:31:53 2011
From: jan_sjodin at yahoo.com (Jan Sjodin)
Date: Fri, 11 Mar 2011 04:31:53 -0800 (PST)
Subject: [LLVMdev] Detrimental optimization for reducing relocations.
In-Reply-To: <4D7982B3.2050309@gmail.com>
References: <719316.52005.qm@web55608.mail.re4.yahoo.com> <4D794118.3010703@gmail.com> <730562.67234.qm@web55607.mail.re4.yahoo.com> <4D7982B3.2050309@gmail.com>
Message-ID: <132314.67035.qm@web55604.mail.re4.yahoo.com>

Ok, I will try and run the testsuite and spend some time checking things before I post a patch for review.

Thanks,
Jan

----- Original Message ----
> From: Rafael Ávila de Espíndola
> To: Jan Sjodin
> Cc: llvmdev at cs.uiuc.edu
> Sent: Thu, March 10, 2011 9:02:27 PM
> Subject: Re: [LLVMdev] Detrimental optimization for reducing relocations.
>
> > Will the testsuite work on ELF? The patch does not make any functional change
> > for the other formats. I know that gdb is okay with the example, but that
> > doesn't say very much.
>
> The patch is probably OK then. The gdb testsuite works with clang on ELF.
> There used to be a lot of silly failures like it not expecting clang warnings,
> but I think most of the current ones are real.
>
> > - Jan
> >
>
> Cheers,
> Rafael
>

From jnspaulsson at hotmail.com Fri Mar 11 08:32:38 2011
From: jnspaulsson at hotmail.com (Jonas Paulsson)
Date: Fri, 11 Mar 2011 15:32:38 +0100
Subject: [LLVMdev] make
Message-ID:

Hi,

is it possible to reduce link time by excluding unused target backends?

I would like to type

tools/llc make -target=... , and just build it for one backend.

/Jonas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/c2179915/attachment.html From ofv at wanadoo.es Fri Mar 11 08:44:07 2011 From: ofv at wanadoo.es (=?utf-8?Q?=C3=93scar_Fuentes?=) Date: Fri, 11 Mar 2011 15:44:07 +0100 Subject: [LLVMdev] make In-Reply-To: (Jonas Paulsson's message of "Fri, 11 Mar 2011 15:32:38 +0100") References: Message-ID: <8762rpu37s.fsf@wanadoo.es> Jonas Paulsson writes: > is it possible to reduce link time by excluding unused target backends? > > I would like to type > > tools/llc make -target=... , and just build it for one backend. If you build with configure && make, use the configure option --enable-targets. If you build with cmake && make, pass -DLLVM_TARGETS_TO_BUILD=YOURBACKEND to cmake. See http://www.llvm.org/docs/CMake.html for more info on building with cmake. From baldrick at free.fr Fri Mar 11 08:53:07 2011 From: baldrick at free.fr (Duncan Sands) Date: Fri, 11 Mar 2011 15:53:07 +0100 Subject: [LLVMdev] LLVM vs GCC binary performance In-Reply-To: References: Message-ID: <4D7A3753.5020301@free.fr> Hi Yuli, > As a developer I'm very excited and interested in the LLVM project. Though my > knowledge of the details is cursory my general understanding is that the SSA > code that LLVM front ends produce is supposed to allow for optimizations that > are unfeasible in GCC. not so, GCC also uses SSA form. I'm not aware of any optimization that LLVM can do that GCC couldn't do if it tried. I also expect most important optimizations from GCC would > have been incorporated into LLVM by now since GCC code is open for everyone to > see. This is not the case. You make it sound like reimplementing optimizations is a five minute job while that is very far from true! Not to mention that GCC is a moving target: it is being worked on too and getting nice improvements all the time. For example it has auto-vectorization support while LLVM does not. 
Also, don't forget that LLVM is not simply GCC written in C++: it makes a lot of different design choices to GCC, and has a bunch of use cases that GCC does not, for example, the ability to JIT code. The LLVM developers may feel that the way GCC solved some problem is not the best way for LLVM to solve it, and even if they think GCC's approach to some problem is great it nonetheless might be hard to do things the same in LLVM due to the different design. Therefore I'm surprised to see that in most benchmarks LLVM produces > binaries are 10-15% slower than their GCC counterparts. While in my experience this used to be pretty systematically true on x86 linux, nowadays it is much more hit and miss: I see some programs running faster when compiled with LLVM, and others running faster when compiled with GCC. On the whole I would say that on my machine GCC usually results in faster programs. Would you mind > explaining the main reasons for why this is the case? On the whole GCC produces excellent code. Many fine engineers have worked hard on it for many years, and it shows. Doing better than GCC is difficult. Also, what remains to be > done for LLVM to surpass GCC in terms of binary performance? There's no magic bullet. The things to improve that would give you the most bang for your buck are probably the code generator and auto-vectorization. Increasing the number of developers would be helpful. Ciao, Duncan. From jnspaulsson at hotmail.com Fri Mar 11 10:02:47 2011 From: jnspaulsson at hotmail.com (Jonas Paulsson) Date: Fri, 11 Mar 2011 17:02:47 +0100 Subject: [LLVMdev] make In-Reply-To: <8762rpu37s.fsf@wanadoo.es> References: , <8762rpu37s.fsf@wanadoo.es> Message-ID: thanks! Can I run configure once again with no problems? I have reconfigured after adding a new target, per http://wiki.llvm.org/HowTo:_Create_and_register_a_new_back_end_%28a_new_hardware_target%29, which includes AutoRegen.sh. 
/Jonas > From: ofv at wanadoo.es > To: jnspaulsson at hotmail.com > CC: llvmdev at cs.uiuc.edu > Subject: Re: make > Date: Fri, 11 Mar 2011 15:44:07 +0100 > > Jonas Paulsson writes: > > > is it possible to reduce link time by excluding unused target backends? > > > > I would like to type > > > > tools/llc make -target=... , and just build it for one backend. > > If you build with configure && make, use the configure option > --enable-targets. If you build with cmake && make, pass > -DLLVM_TARGETS_TO_BUILD=YOURBACKEND to cmake. > > See http://www.llvm.org/docs/CMake.html for more info on building with > cmake. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/97eed59b/attachment.html From rengolin at systemcall.org Fri Mar 11 11:42:09 2011 From: rengolin at systemcall.org (Renato Golin) Date: Fri, 11 Mar 2011 17:42:09 +0000 Subject: [LLVMdev] LLVM vs GCC binary performance In-Reply-To: <4D7A3753.5020301@free.fr> References: <4D7A3753.5020301@free.fr> Message-ID: On 11 March 2011 14:53, Duncan Sands wrote: > There's no magic bullet. The things to improve that would give you the most > bang for your buck are probably the code generator and auto-vectorization. > Increasing the number of developers would be helpful. I'm not a GCC expert, but their auto-vectorization is not that great. It may be simple to do basic loop transformations and some stupid vectorization, but having a really good vectoriser is a lot of work. I personally think that the biggest difference is the number of people that have contributed over the years on very specific optimizations. There are as many corner cases as there are particles in the universe (maybe more), and implementing each one of them requires time and people willing. LLVM has the latter, but lacks the former, for now.
Spending a full year on a vectoriser prototype might bring less value than the same year optimizing micro-benchmarks against GCC... Not that I don't think we should have a vectoriser, Poly is going to be great! But until it's not (and it's going to take some time), we better focus on some magic, as GCC did over the decades. My tuppence, --renato From jan_sjodin at yahoo.com Fri Mar 11 12:23:36 2011 From: jan_sjodin at yahoo.com (Jan Sjodin) Date: Fri, 11 Mar 2011 10:23:36 -0800 (PST) Subject: [LLVMdev] Detrimental optimization for reducing relocations. In-Reply-To: <4D794118.3010703@gmail.com> References: <719316.52005.qm@web55608.mail.re4.yahoo.com> <4D794118.3010703@gmail.com> Message-ID: <861777.91949.qm@web55606.mail.re4.yahoo.com> > From: Rafael Ávila de Espíndola > To: llvmdev at cs.uiuc.edu > Sent: Thu, March 10, 2011 4:22:32 PM > Subject: Re: [LLVMdev] Detrimental optimization for reducing relocations. > > > So, clearly the optimization is making things worse. Would it be okay to delete > > this code and eliminate the isBaseAddressKnownZero? I would like to get rid of > > it. > > I think it is OK. I can see ld/gdb expecting a relocation, but if that > is the case we should just have a flag saying it is needed. > > If you are really motivated to check it, run the gdb testsuite with your > patch, but on ELF our debug info is still too big to be usable. > > > - Jan > > Cheers, > Rafael I ran the gdb tests with and without the patch and there was no difference. I attached the patch. Thanks, Jan -------------- next part -------------- A non-text attachment was scrubbed...
Name: 0047_sectionbasezerodelete.patch Type: application/octet-stream Size: 2311 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/a1a69878/attachment.obj From grosser at fim.uni-passau.de Fri Mar 11 12:46:54 2011 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Fri, 11 Mar 2011 13:46:54 -0500 Subject: [LLVMdev] LLVM vs GCC binary performance In-Reply-To: References: <4D7A3753.5020301@free.fr> Message-ID: <4D7A6E1E.7010704@fim.uni-passau.de> On 03/11/2011 12:42 PM, Renato Golin wrote: > On 11 March 2011 14:53, Duncan Sands wrote: >> There's no magic bullet. The things to improve that would give you the most >> bang for your buck are probably the code generator and auto-vectorization. >> Increasing the number of developers would be helpful. > > I'm not a GCC expert, but their auto-vectorization is not that great. > It may be simple to do basic loop transformations and some stupid > vectorization, but having a really good vectoriser is a lot of work. > > I personally think that the biggest difference is the number of people > that have contributed over the years on very specific optimizations. > There are as many corner cases as there are particles in the universe > (maybe more), and implementing each one of them requires time and > people willing. LLVM has the latter, but lacks the former, for now. > > Spending a full year on a vectoriser prototype might bring less value > than the same year optimizing micro-benchmarks against GCC... > > Not that I don't think we should have a vectoriser, Poly is going to > be great! Hi, in case you are referring to PoLLy*, thanks for this nice comment. We can already do some basic vectorization and are currently working on increased coverage and enhanced robustness. I have already seen some nice speedups on some micro kernels, but need to get more confidence before I present them. I will also talk PoLLy on IMPACT/CGO 2011** , in case someone is around. 
> But until it's not (and it's going to take some time), we > better focus on some magic, as GCC did over the decades. Yes. Also for a vectorizer to be efficient you need to have a lot of magic and canonicalization done beforehand, to enable it to do a decent job. LLVM is actually pretty good in this respect. Cheers Tobi * Like Polly the parrot ** impact2011.inrialpes.fr From rafael.espindola at gmail.com Fri Mar 11 13:04:47 2011 From: rafael.espindola at gmail.com (Rafael Avila de Espindola) Date: Fri, 11 Mar 2011 14:04:47 -0500 Subject: [LLVMdev] Detrimental optimization for reducing relocations. In-Reply-To: <861777.91949.qm@web55606.mail.re4.yahoo.com> References: <719316.52005.qm@web55608.mail.re4.yahoo.com> <4D794118.3010703@gmail.com> <861777.91949.qm@web55606.mail.re4.yahoo.com> Message-ID: <4D7A724F.3030506@gmail.com> On 11-03-11 01:23 PM, Jan Sjodin wrote: > s with and without the patch and there was no difference. I > attached the patch. LGTM! > Thanks, Cheers, Rafael From Micah.Villmow at amd.com Fri Mar 11 13:11:08 2011 From: Micah.Villmow at amd.com (Villmow, Micah) Date: Fri, 11 Mar 2011 13:11:08 -0600 Subject: [LLVMdev] Accessing an empty machine function before instruction selection? Message-ID: I'm trying to access the MachineFunctionInfo structure from a pre-ISel pass. In order to do this I have to get access to the MachineFunction and then call getInfo(). Currently in my pass I request it via: void AMDILBarrierDetect::getAnalysisUsage(AnalysisUsage &AU) const { AU.addRequired(); FunctionPass::getAnalysisUsage(AU); } However, I am getting an assert: assert(NormalCtor && "Cannot call createPass on PassInfo without default ctor!"); First question, is this possible? If so, how do I get NormalCtor to not be NULL? Second question, if I want to pass information from before Instruction selection to after instruction selection, is this the preferred way? If not how? Thanks, Micah -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/953aeca4/attachment.html From rengolin at systemcall.org Fri Mar 11 14:12:54 2011 From: rengolin at systemcall.org (Renato Golin) Date: Fri, 11 Mar 2011 20:12:54 +0000 Subject: [LLVMdev] LLVM vs GCC binary performance In-Reply-To: <4D7A6E1E.7010704@fim.uni-passau.de> References: <4D7A3753.5020301@free.fr> <4D7A6E1E.7010704@fim.uni-passau.de> Message-ID: On 11 March 2011 18:46, Tobias Grosser wrote: > in case you are referring to PoLLy*, thanks for this nice comment. Hi Tobias, This is not the first time you correct me, sorry, but yes, I was talking about PoLLy. ;) cheers, --renato From Micah.Villmow at amd.com Fri Mar 11 14:13:25 2011 From: Micah.Villmow at amd.com (Villmow, Micah) Date: Fri, 11 Mar 2011 14:13:25 -0600 Subject: [LLVMdev] Accessing an empty machine function before instruction selection? In-Reply-To: References: Message-ID: To answer my own question. I added this to getAnalysisUsage and it seems to bypass the assert. AU.setPreservesAll(); Anyone have an idea on why this fixes the problem? Thanks, Micah From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Villmow, Micah Sent: Friday, March 11, 2011 11:11 AM To: llvmdev at cs.uiuc.edu Subject: [LLVMdev] Accessing an empty machine function before instruction selection? I'm trying to access the MachineFunctionInfo structure from a pre-ISel pass. In order to do this I have to get access to the MachineFunction and then call getInfo(). Currently in my pass I request it via: void AMDILBarrierDetect::getAnalysisUsage(AnalysisUsage &AU) const { AU.addRequired(); FunctionPass::getAnalysisUsage(AU); } However, I am getting an assert: assert(NormalCtor && "Cannot call createPass on PassInfo without default ctor!"); First question, is this possible? If so, how do I get NormalCtor to not be NULL? 
Second question, if I want to pass information from before Instruction selection to after instruction selection, is this the preferred way? If not how? Thanks, Micah -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/9c6f8200/attachment.html From ahmedcharles at gmail.com Fri Mar 11 20:40:19 2011 From: ahmedcharles at gmail.com (Ahmed Charles) Date: Fri, 11 Mar 2011 18:40:19 -0800 Subject: [LLVMdev] MSVC compiling issue In-Reply-To: <1455E269-7531-43CE-BDF4-E0BDCBBAD369@2pi.dk> References: <4D75F1C5.6040900@tu-dresden.de> <4D77574E.6010608@tu-dresden.de> <1455E269-7531-43CE-BDF4-E0BDCBBAD369@2pi.dk> Message-ID: MSVC 9 asserts that the comparison function is symmetric, because of what was unclear wording in the standard (there are posts explaining this in the boost archive for those interested). The workaround is to supply the other overload of the comparator. Sorry Jacob, didn't reply all last time. On Wed, Mar 9, 2011 at 7:20 AM, Jakob Stoklund Olesen wrote: > On Mar 9, 2011, at 2:32 AM, Olaf Krzikalla wrote: > >> Hi @llvm, >> >> Am 08.03.2011 20:14, schrieb Jakob Stoklund Olesen: >>> Is that extra method getting called? What happens if you stick assert(0) in there? >> That won't work either (that is, the assert fires). In debug mode the MSVC lib tries to test the ordering of the sequence. And it uses the yielded predicate for this (which in this particular case is a very bad idea). > > I see. I guess that makes sense if it is written assuming symmetric types. > >> >>> I hoped the symmetric methods would be enough to trick MSVC into compiling it. >> Does that mean, that gcc actually only needs >> >> bool operator()(const LiveRange&A, SlotIndex B) ? > > Actually, the other way around. > >> According to C++(2003) 25.0.0.8 the answer is "yes", however that section talks about BinaryPredicate and not Compare.
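[Editorial sketch, not part of the original messages.] The comparator issue discussed in this thread can be illustrated with a small standalone example. It is only a sketch under stated assumptions: Range and RangeKeyLess are hypothetical stand-ins for the LiveRange/SlotIndex code, not LLVM's actual types. std::upper_bound itself only calls the comparator as comp(key, element); the other two overloads are there for library debug modes that additionally call it with the arguments swapped or on two elements to check ordering.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a live range: a half-open interval [start, end).
struct Range {
  int start;
  int end;
};

// Comparator between a plain key and a Range, ordered by Range::start.
struct RangeKeyLess {
  // The form std::upper_bound actually needs: comp(key, element).
  bool operator()(int key, const Range &r) const { return key < r.start; }
  // The swapped form, which a checking library may also call.
  bool operator()(const Range &r, int key) const { return r.start < key; }
  // Element-vs-element form, used when a debug library verifies that the
  // sequence is sorted with respect to the comparator.
  bool operator()(const Range &a, const Range &b) const {
    return a.start < b.start;
  }
};

// Returns the index of the first range whose start is greater than key,
// or ranges.size() if there is none. ranges must be sorted by start.
std::size_t firstStartingAfter(const std::vector<Range> &ranges, int key) {
  return static_cast<std::size_t>(
      std::upper_bound(ranges.begin(), ranges.end(), key, RangeKeyLess()) -
      ranges.begin());
}
```

Whether a given standard library needs one, two, or all three overloads depends on its checking mode, so supplying all three is the conservative choice in a sketch like this.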
The standard is rather unclear at this point and I'm going over to comp.std.c++ to ask. > > Howard Hinnant was kind enough to clarify this a while back. > > http://lists.cs.uiuc.edu/pipermail/cfe-dev/2010-August/010379.html >> > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > -- Ahmed Charles http://www.ahmedcharles.com From rjmccall at apple.com Fri Mar 11 20:54:30 2011 From: rjmccall at apple.com (John McCall) Date: Fri, 11 Mar 2011 18:54:30 -0800 Subject: [LLVMdev] [patch] Change llvm_unreachable to use __builtin_unreachable() in -asserts Message-ID: <0C328BE2-E0BD-44D1-A280-1CAF3E04FE3A@apple.com> This patch implements the current consensus of PR8973: http://llvm.org/bugs/show_bug.cgi?id=8973. The macro llvm_unreachable is used in LLVM to indicate that a particular place in a function is not supposed to be reachable during execution. Like an assert macro, it takes a string argument. In +Asserts builds, this string argument, together with some location information, is passed to a function which prints the information and calls abort(). In -Asserts builds, this string argument is dropped (to minimize code size impact), and instead a bunch of zero arguments are passed to the same function. The problem is that that's still not very good for code size, as it leaves a somewhat bulky function call in the emitted code. It also doesn't give the compiler any opportunity to optimize based on our assertion that the code is unreachable. A much better alternative is to use an intrinsic, provided by Clang and GCC 4.5, called __builtin_unreachable; it has the semantics of being undefined behavior if reached, much like LLVM's own "unreachable" instruction, which incidentally is what Clang generates for it.
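[Editorial sketch, not part of the original message.] The scheme this message describes can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the contents of unreachable.patch.txt: SKETCH_UNREACHABLE and sketch_unreachable_internal are hypothetical names, and the real llvm_unreachable hands its arguments to LLVM's own helper. With assertions enabled the sketch reports and aborts; without them it uses __builtin_unreachable() where Clang or GCC 4.5 or later is detected, and abort() elsewhere.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical reporting helper: prints the message and location, then aborts.
[[noreturn]] inline void sketch_unreachable_internal(const char *msg,
                                                     const char *file,
                                                     unsigned line) {
  std::fprintf(stderr, "UNREACHABLE executed at %s:%u: %s\n", file, line, msg);
  std::abort();
}

#ifndef NDEBUG
// Assertions enabled: keep the message and the source location.
#define SKETCH_UNREACHABLE(msg) \
  sketch_unreachable_internal(msg, __FILE__, __LINE__)
#elif defined(__clang__) || \
    (defined(__GNUC__) &&   \
     (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5)))
// Assertions disabled, builtin available: reaching this is undefined behavior,
// which lets the optimizer discard the path entirely.
#define SKETCH_UNREACHABLE(msg) __builtin_unreachable()
#else
// Assertions disabled, no builtin: fall back to a plain abort().
#define SKETCH_UNREACHABLE(msg) std::abort()
#endif

enum Opcode { Add, Sub };

// Every enumerator is handled, so the code after the switch should never run.
// The marker also keeps the compiler from warning about a missing return.
int apply(Opcode op, int a, int b) {
  switch (op) {
  case Add:
    return a + b;
  case Sub:
    return a - b;
  }
  SKETCH_UNREACHABLE("unknown opcode");
}
```

The caveat from the message applies to the sketch as well: in a build without assertions, actually reaching the marker is undefined behavior on compilers with the builtin, so it should only guard paths that are genuinely impossible.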
This patch keeps the old behavior of llvm_unreachable in +Asserts (!defined(NDEBUG)) builds, but changes the behavior in -Asserts builds to call __builtin_unreachable() (in GCC 4.5 and Clang) or abort() (in everything else). This is effectively a change in the practical semantics of llvm_unreachable: if the call is actually reachable, then you will get some really crazy behavior in -Asserts builds. If you've been using this macro in places that can logically be reached (e.g., after you've tested for all the instructions you've actually implemented in your backend), then you've been violating the spirit of this macro, as communicated by its name, and you should change your code to handle unexpected patterns more responsibly. John. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/08ec01f2/attachment.html -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: unreachable.patch.txt Url: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/08ec01f2/attachment.txt -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/08ec01f2/attachment-0001.html From rjmccall at apple.com Fri Mar 11 20:55:42 2011 From: rjmccall at apple.com (John McCall) Date: Fri, 11 Mar 2011 18:55:42 -0800 Subject: [LLVMdev] [patch] Change llvm_unreachable to use __builtin_unreachable() in -asserts Message-ID: This patch implements the current consensus of PR8973: http://llvm.org/bugs/show_bug.cgi?id=8973. The macro llvm_unreachable is used in LLVM to indicate that a particular place in a function is not supposed to be reachable during execution. Like an assert macro, it takes a string argument. In +Asserts builds, this string argument, together with some location information, is passed to a function which prints the information and calls abort().
In -Asserts builds, this string argument is dropped (to minimize code size impact), and instead a bunch of zero arguments are passed to the same function. The problem is that that's still not very good for code size, as it leaves a somewhat bulky function call in the emitted code. It also doesn't give the compiler any opportunity to optimize based on our assertion that the code is unreachable. A much better alternative is to use an intrinsic, provided by Clang and GCC 4.5, called __builtin_unreachable; it has the semantics of being undefined behavior if reached, much like LLVM's own "unreachable" instruction, which incidentally is what Clang generates for it. This patch keeps the old behavior of llvm_unreachable in +Asserts (!defined(NDEBUG)) builds, but changes the behavior in -Asserts builds to call __builtin_unreachable() (in GCC 4.5 and Clang) or abort() (in everything else). This is effectively a change in the practical semantics of llvm_unreachable: if the call is actually reachable, then you will get some really crazy behavior in -Asserts builds. If you've been using this macro in places that can logically be reached (e.g., after you've tested for all the instructions you've actually implemented in your backend), then you've been violating the spirit of this macro, as communicated by its name, and you should change your code to handle unexpected patterns more responsibly. John. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/48dff4c4/attachment.html -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: unreachable.patch.txt Url: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/48dff4c4/attachment.txt -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/48dff4c4/attachment-0001.html From stoklund at 2pi.dk Fri Mar 11 21:13:36 2011 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Fri, 11 Mar 2011 19:13:36 -0800 Subject: [LLVMdev] MSVC compiling issue In-Reply-To: References: <4D75F1C5.6040900@tu-dresden.de> <4D77574E.6010608@tu-dresden.de> <1455E269-7531-43CE-BDF4-E0BDCBBAD369@2pi.dk> Message-ID: On Mar 11, 2011, at 6:40 PM, Ahmed Charles wrote: > MSVC 9 asserts that the comparision function is symmetric, because of > what was unclear wording in the standard (there are posts explaining > this in the boost archive for those interested). The workaround is to > supply the other overload of the comparator. That's what I did initially, but apparently the debug library also verifies that the array is ordered, so three versions were required. I finally gave up and stole Howard's upper_bound(), see r127522. It is much less code to write your own algorithms instead of using the STL. Tell your kids! Interestingly, X86FloatingPoint.cpp has an asymmetric lower_bound that hasn't caused problems. It uses operator<(), though. /jakob From grosbach at apple.com Fri Mar 11 21:45:11 2011 From: grosbach at apple.com (Jim Grosbach) Date: Fri, 11 Mar 2011 19:45:11 -0800 Subject: [LLVMdev] [patch] Change llvm_unreachable to use __builtin_unreachable() in -asserts In-Reply-To: References: Message-ID: Such an awesome change it was worth saying twice! :) Sounds great, and I completely agree it's a nice enhancement to what we can effectively express to help the compiler optimize more effectively. Thanks for doing this. -Jim On Mar 11, 2011, at 6:55 PM, John McCall wrote: > This patch implements the current consensus of PR8973: > http://llvm.org/bugs/show_bug.cgi?id=8973. > > The macro llvm_unreachable is used in LLVM to indicate that > a particular place in a function is not supposed to be reachable > during execution. Like an assert > macro, it takes a string > argument.
In +Asserts builds, this string argument, together with > some location information, is passed to a function which prints > the information and calls abort(). In -Asserts builds, this string > argument is dropped (to minimize code size impact), and > instead a bunch of zero arguments are passed to the same > function. > > The problem is that that's still not very good for code size, as it > leaves a somewhat bulky function call in the emitted code. It > also doesn't let give the compiler any opportunity to optimize > based on our assertion that the code is unreachable. A much > better alternative is to use an intrinsic, provided by Clang and > GCC 4.5, called __builtin_unreachable; it has the semantics > of being undefined behavior if reached, much like LLVM's own > "unreachable" instruction, which incidentally is what Clang > generates for it. > > This patch keeps the old behavior of llvm_unreachable in > +Asserts (!defined(NDEBUG)) builds, but changes the behavior > in -Asserts builds to call __builtin_unreachable() (in GCC 4.5 > and Clang) or abort() (in everything else). > > This is effectively a change in the practical semantics of > llvm_unreachable: if the call is actually reachable, then you > will get some really crazy behavior in -Asserts builds. If you've > been using this macro in places that can logically be reached > (e.g., after you've tested for all the instructions you've actually > implemented in your backend), then you've been violating the > spirit of this macro, as communicated by its name, and you > should change your code to handle unexpected patterns > more responsibly. > > John. > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110311/d52b46fb/attachment.html From xinfinity_a at yahoo.com Sat Mar 12 04:15:07 2011 From: xinfinity_a at yahoo.com (Jimborean Alexandra) Date: Sat, 12 Mar 2011 02:15:07 -0800 (PST) Subject: [LLVMdev] problems when the llvm::ExtractCodeRegion updates the Phi nodes in successors ... is there a method to eliminate phi nodes ? Message-ID: <47549.81024.qm@web130220.mail.mud.yahoo.com> Hi, I use the llvm::ExtractCodeRegion to extract each loop into a separate function, but I have a problem when I run this on the SPEC CPU 2006 on the 401.bzip2 benchmark. First I use clang -O3 to generate optimized llvm code and then I extract some loops from the module built from blocksort.c source file. The problem is that the PHI nodes contained in the successors of the codeRepl include two or more incoming edges for different blocks contained in the loop. Therefore, when the loop is extracted in the new function, these incoming edges are updated to have an entry from the codeRepl block instead of the the original blocks. But in case there are incoming edges from more blocks belonging to the loop, this generates an invalid Phi node which contains multiple entries for the codeRepl block. I use LLVM 2.8 and in the file CodeExtractor.cpp lines 730 - 745 when the Phi nodes in the successors are updated, there is a test to check that the Phi node does not contain an entry from the same BasicBlock from the loop. But there is no test to check if two different blocks of the loop reach the same phi node. (If I correctly understood this part of code... ) Did I obtain an invalid loop or this kind of loops are not eligible for llvm::ExtractCodeRegion ? Is there any other method to eliminate the PHI nodes except the reg2mem pass? I do not want to pollute the code with so many additional load instructions. Is it possible to dissolve the PHI nodes without reg2mem? Thank you. 
Alexandra -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110312/e9cb4c3b/attachment.html From baldrick at free.fr Sat Mar 12 04:17:27 2011 From: baldrick at free.fr (Duncan Sands) Date: Sat, 12 Mar 2011 11:17:27 +0100 Subject: [LLVMdev] [patch] Change llvm_unreachable to use __builtin_unreachable() in -asserts In-Reply-To: References: Message-ID: <4D7B4837.20709@free.fr> Hi John, > This patch implements the current consensus of PR8973: > http://llvm.org/bugs/show_bug.cgi?id=8973. > > The macro llvm_unreachable is used in LLVM to indicate that > a particular place in a function is not supposed to be reachable > during execution. Like an assert macro, it takes a string > argument. In +Asserts builds, this string argument, together with > some location information, is passed to a function which prints > the information and calls abort(). In -Asserts builds, this string > argument is dropped (to minimize code size impact), and > instead a bunch of zero arguments are passed to the same > function. I have to ask: what is the point of llvm_unreachable? Why not just use assert? Ciao, Duncan. From sebastian.redl at getdesigned.at Sat Mar 12 04:47:24 2011 From: sebastian.redl at getdesigned.at (Sebastian Redl) Date: Sat, 12 Mar 2011 11:47:24 +0100 Subject: [LLVMdev] [patch] Change llvm_unreachable to use __builtin_unreachable() in -asserts In-Reply-To: <4D7B4837.20709@free.fr> References: <4D7B4837.20709@free.fr> Message-ID: <6200AED3-60B1-4BE8-A30A-E1BE1A5CB883@getdesigned.at> On 12.03.2011, at 11:17, Duncan Sands wrote: > Hi John, > >> This patch implements the current consensus of PR8973: >> http://llvm.org/bugs/show_bug.cgi?id=8973. >> >> The macro llvm_unreachable is used in LLVM to indicate that >> a particular place in a function is not supposed to be reachable >> during execution. Like an assert macro, it takes a string >> argument. 
In +Asserts builds, this string argument, together with >> some location information, is passed to a function which prints >> the information and calls abort(). In -Asserts builds, this string >> argument is dropped (to minimize code size impact), and >> instead a bunch of zero arguments are passed to the same >> function. > > I have to ask: what is the point of llvm_unreachable? Why not just > use assert? assert completely disappears in release builds, often leading to compiler warnings when the compiler thinks a control path doesn't return a value. Sebastian From baldrick at free.fr Sat Mar 12 05:01:58 2011 From: baldrick at free.fr (Duncan Sands) Date: Sat, 12 Mar 2011 12:01:58 +0100 Subject: [LLVMdev] [patch] Change llvm_unreachable to use __builtin_unreachable() in -asserts In-Reply-To: <6200AED3-60B1-4BE8-A30A-E1BE1A5CB883@getdesigned.at> References: <4D7B4837.20709@free.fr> <6200AED3-60B1-4BE8-A30A-E1BE1A5CB883@getdesigned.at> Message-ID: <4D7B52A6.50904@free.fr> Hi Sebastian, >>> This patch implements the current consensus of PR8973: >>> http://llvm.org/bugs/show_bug.cgi?id=8973. >>> >>> The macro llvm_unreachable is used in LLVM to indicate that >>> a particular place in a function is not supposed to be reachable >>> during execution. Like an assert macro, it takes a string >>> argument. In +Asserts builds, this string argument, together with >>> some location information, is passed to a function which prints >>> the information and calls abort(). In -Asserts builds, this string >>> argument is dropped (to minimize code size impact), and >>> instead a bunch of zero arguments are passed to the same >>> function. >> >> I have to ask: what is the point of llvm_unreachable? Why not just >> use assert? > > assert completely disappears in release builds, often leading to compiler warnings when the compiler thinks a control path doesn't return a value. 
if the point is to have a better assert, why not introduce llvm_assert and do a bulk replace of all asserts with it? By the way, GCC does this: #ifdef ENABLE_RUNTIME_CHECKING #define gcc_assert(EXPR) ((void)(!(EXPR) ? abort (), 0 : 0)) #else /* Include EXPR, so that unused variable warnings do not occur. */ #define gcc_assert(EXPR) ((void)(0 && (EXPR))) #endif /* Use gcc_unreachable() to mark unreachable locations (like an unreachable default case of a switch. Do not use gcc_assert(0). */ #define gcc_unreachable() (abort ()) Ciao, Duncan. From geek4civic at gmail.com Sat Mar 12 05:05:42 2011 From: geek4civic at gmail.com (NAKAMURA Takumi) Date: Sat, 12 Mar 2011 20:05:42 +0900 Subject: [LLVMdev] [patch] Change llvm_unreachable to use __builtin_unreachable() in -asserts In-Reply-To: <6200AED3-60B1-4BE8-A30A-E1BE1A5CB883@getdesigned.at> References: <4D7B4837.20709@free.fr> <6200AED3-60B1-4BE8-A30A-E1BE1A5CB883@getdesigned.at> Message-ID: On Sat, Mar 12, 2011 at 7:47 PM, Sebastian Redl wrote: >> I have to ask: what is the point of llvm_unreachable? Why not just >> use assert? > > assert completely disappears in release builds, often leading to compiler warnings when the compiler thinks a control path doesn't return a value. Shall clang hook and override system's assert() somehow? ...Takumi(fine) From rjmccall at apple.com Sat Mar 12 12:29:16 2011 From: rjmccall at apple.com (John McCall) Date: Sat, 12 Mar 2011 10:29:16 -0800 Subject: [LLVMdev] [patch] Change llvm_unreachable to use __builtin_unreachable() in -asserts In-Reply-To: <4D7B52A6.50904@free.fr> References: <4D7B4837.20709@free.fr> <6200AED3-60B1-4BE8-A30A-E1BE1A5CB883@getdesigned.at> <4D7B52A6.50904@free.fr> Message-ID: <98C708D2-0BA6-4555-AEE9-76E32B03B017@apple.com> On Mar 12, 2011, at 3:01 AM, Duncan Sands wrote: > if the point is to have a better assert, why not introduce llvm_assert and > do a bulk replace of all asserts with it?

The point is not to have a better assert; it's to have a better unreachable marker. There are lots of reasons to use a dedicated unreachable marker instead of assert(0):
  - the intent is much more obvious in the code;
  - assertions disappear entirely in -Asserts builds, which is obviously desirable, but we don't necessarily want that for unreachable markers because of the point Sebastian raised; and
  - we can't optimize based on generic assert conditions (*) because it's extremely common to assert on relatively expensive conditions that the programmer certainly doesn't want to see evaluated in -Asserts builds. By contrast, optimizing based on unreachability is quite simple.

(*) I'm aware of research into doing this, but it requires fairly sophisticated compiler support so that you don't start running those expensive conditions all the time.

> By the way, GCC does this:

Note how GCC also has different macros for normal assertion checks vs. marking unreachable code, presumably for exactly the reasons above. We're following that same model, except our unreachable macro:
  - takes an explanatory string, which we honor in +Asserts builds, and
  - can be optimized in -Asserts builds.

John.

From andreas.faerber at web.de Sun Mar 13 05:41:01 2011
From: andreas.faerber at web.de (Andreas Färber)
Date: Sun, 13 Mar 2011 11:41:01 +0100
Subject: [LLVMdev] backend question
In-Reply-To:
References:
Message-ID: <2BB64B34-BF8F-4C55-8A6E-8C732CA61FC3@web.de>

On 08.03.2011 at 19:59, Ken Dyck wrote:

> If you are interested, I can send you a patch of the changes that I
> made to the 2.8 release for a backend that targets a 24-bit
> word-addressable DSP, but it is quite rough and it includes changes in
> which you probably aren't interested (support for non-power-of-2
> integer sizes and some other bug fixes).

I would be interested in non-power-of-two support.
I started adding the very basics for i24 in my STM8 repo but there are still a couple of places that do getIntegerWidth() * 2 calculations for expansion.

Do you have a public repo and/or plans to merge those features into trunk?

Andreas

From andreas.faerber at web.de Sun Mar 13 05:52:20 2011
From: andreas.faerber at web.de (Andreas Färber)
Date: Sun, 13 Mar 2011 11:52:20 +0100
Subject: [LLVMdev] Question about TableGen when adding LLVM Backend.
In-Reply-To:
References:
Message-ID: <45F7535E-F930-4451-81BD-72F3176D552E@web.de>

On 10.03.2011 at 05:35, Jakob Stoklund Olesen wrote:

> On Mar 9, 2011, at 8:15 PM, Lu Mitnick wrote:
>
>> Hello Jakob,
>>
>> Is this means that TableGen execution is handled in Makefile.
>> Porting programmer doesn't need to execute TableGen by hand?
>
> That's right.
>
> You are going to be editing your .td files a lot, so you want that
> integrated in the build system.

In practice that'll mean adding the correct directives to CMakeLists.txt, not Makefile, right? That's what the targets I looked at did.
Or is that optional when you just invoke "make"?

Andreas

From christoph at sicherha.de Sun Mar 13 06:33:48 2011
From: christoph at sicherha.de (Christoph Erhardt)
Date: Sun, 13 Mar 2011 12:33:48 +0100
Subject: [LLVMdev] Question about TableGen when adding LLVM Backend.
In-Reply-To: <45F7535E-F930-4451-81BD-72F3176D552E@web.de>
References: <45F7535E-F930-4451-81BD-72F3176D552E@web.de>
Message-ID: <4D7CAB9C.40706@sicherha.de>

Hi Andreas,

> In practice that'll mean adding the correct directives to
> CMakeLists.txt, not Makefile, right? That's what the targets I looked
> at did.

LLVM can be built using either CMake or the GNU Autotools. Your backend ought to provide support for both build systems, so you should create a CMakeLists.txt as well as a Makefile in your sub-directory.
The Makefile appears to be rather trivial because there's some automagic going on - nevertheless, you will need one.
:-)

> Or is that optional when you just invoke "make"?

In that case you're using the Autotools, so your Makefile is going to be utilized.

Best regards,
Christoph

From andreas.faerber at web.de Sun Mar 13 07:18:47 2011
From: andreas.faerber at web.de (Andreas Färber)
Date: Sun, 13 Mar 2011 13:18:47 +0100
Subject: [LLVMdev] Question about TableGen when adding LLVM Backend.
In-Reply-To: <4D7CAB9C.40706@sicherha.de>
References: <45F7535E-F930-4451-81BD-72F3176D552E@web.de> <4D7CAB9C.40706@sicherha.de>
Message-ID: <78AA7235-97B3-4752-B700-9609BCD6E680@web.de>

Hi Christoph,

On 13.03.2011 at 12:33, Christoph Erhardt wrote:

>> In practice that'll mean adding the correct directives to
>> CMakeLists.txt, not Makefile, right? That's what the targets I looked
>> at did.
> LLVM can be built using either CMake or the GNU Autotools. Your backend
> ought to provide support for both build systems, so you should create a
> CMakeLists.txt as well as a Makefile in your sub-directory.
> The Makefile appears to be rather trivial because there's some automagic
> going on - nevertheless, you will need one. :-)

Now I see: My confusion came from seeing only the customized file names in BUILT_SOURCES. But Makefile.rules:1681 actually processes those files by *Gen*.inc.tmp name patterns.

Thanks for explaining,
Andreas

>> Or is that optional when you just invoke "make"?
> In that case you're using the Autotools, so your Makefile is going to be
> utilized.
>
> Best regards,
> Christoph

From kecheng at cecs.pdx.edu Sun Mar 13 13:40:23 2011
From: kecheng at cecs.pdx.edu (kecheng at cecs.pdx.edu)
Date: Sun, 13 Mar 2011 11:40:23 -0700
Subject: [LLVMdev] pass statistic
In-Reply-To: <4D786231.10405@illinois.edu>
References: <20110309212630.44855csk64fe3dhc@webmail.cecs.pdx.edu> <4D786231.10405@illinois.edu>
Message-ID: <20110313114023.533545dn2x3dfedc@webmail.cecs.pdx.edu>

Hi,

It looks like the third way would be the most accurate, and won't generate a false alarm.
I think I can do a simple experiment first. I want to dump the bitcode before and after applying each pass, and I don't want to insert the dumping code into every LLVM pass. Is there a good place to do this once for all passes? Thanks.

Best,

Kecheng

Quoting John Criswell:

> On 3/9/2011 11:26 PM, kecheng at cecs.pdx.edu wrote:
>> Hi folks,
>>
>> I wonder how to get the statistic of which pass has been "really"
>> applied and which one is not. For instance, I try to apply 20 llvm
>> passes on a single C source code. But since the precondition of each
>> pass may not be satisfied (try loop-unrolling to a source code without
>> loop), some of these pass may not affect the final result. How to know
>> which pass affect and which one is ignored? Does Llvm have this kind
>> of statistic? Thanks.
>
> One option is to use the -stats option with opt and hope that a
> transform keeps statistics on how many transforms it makes.
> However, this approach is fragile because an arbitrary LLVM pass may
> not record any statistics on what it changes.
>
> Another approach is to use the -debug-pass=details option in opt. I
> believe it will tell you when a pass has modified the code and when
> it has not. That said, some transform passes may tell the
> PassManager that they've modified the program when, in fact, they
> haven't simply because it's too much programming work to track,
> within the pass, whether it has modified anything.
>
> A third option might be to write a pass that somehow records the
> current state of the bitcode and compares it to the state it saw
> when it last executed. You could then run this pass in between
> every other pass to detect cases where the module does not change.
>
> So, there are some ways to do it, but only the third option (the
> most time-consuming to do) looks fool-proof.
>
> -- John T.
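
A minimal sketch of the third option quoted above (snapshot the module, run a pass, compare), written as plain C++ so that it is self-contained. It assumes the module can be rendered to text; ModuleText, NamedPass and runAndReportChanges are hypothetical names, not LLVM APIs, and a std::string stands in for the real bitcode.

```cpp
#include <functional>
#include <string>
#include <vector>

// Stand-in for a module: only its textual form (an assumption made for
// illustration, this is not an LLVM type).
using ModuleText = std::string;

// A hypothetical pass: a name plus a function that may rewrite the module.
struct NamedPass {
  std::string name;
  std::function<void(ModuleText &)> run;
};

// Runs the passes in order and returns the names of those that actually
// changed the module. The full text is compared instead of a hash, so a
// hash collision can never hide a change.
std::vector<std::string>
runAndReportChanges(ModuleText &module, const std::vector<NamedPass> &passes) {
  std::vector<std::string> changed;
  for (const NamedPass &pass : passes) {
    const ModuleText before = module; // snapshot taken before the pass runs
    pass.run(module);
    if (module != before)
      changed.push_back(pass.name);
  }
  return changed;
}
```

Doing the comparison in the driver loop, rather than inside each pass, keeps the bookkeeping in one place, which is the "once for all passes" property asked about above.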
>
>> Best,
>>
>> Kecheng
>
>

From Arnaud.AllardDeGrandMaison at dibcom.com Sun Mar 13 16:01:52 2011
From: Arnaud.AllardDeGrandMaison at dibcom.com (Arnaud Allard de Grandmaison)
Date: Sun, 13 Mar 2011 22:01:52 +0100
Subject: [LLVMdev] IndVarSimplify too aggressive ?
Message-ID: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1EF36C7@FRPAR1CL009.coe.adi.dibcom.com>

Hi all,

The IndVarSimplify pass seems to be too aggressive when it enlarges the induction variable type; this can pessimize the generated code when the new induction variable size is not natively supported by the target. This is probably not an issue for x86_64, which supports natively all types, but it is a real one for several embedded targets, with very few native types.

I attached a patch to address this issue; if TargetData is available, the patch attempts to keep the induction variable to a native type when going through the induction variable users.

Also attached my test-case in C, as well as the resulting assembly output, with and without the patch applied, for arm and x86_32 targets. You will note the loop instructions count can be reduced by 30% in several cases.

The patch could probably be made smarter: I am welcoming all suggestions.

Best Regards,
--
Arnaud de Grandmaison

-------------- next part --------------
A non-text attachment was scrubbed...
Name: IndVarSimplify-nativeType.patch
Type: application/octet-stream
Size: 2091 bytes
Desc: IndVarSimplify-nativeType.patch
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110313/71c8b9cc/attachment.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.c
Type: text/x-c
Size: 256 bytes
Desc: test.c
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110313/71c8b9cc/attachment.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.s.patch.arm
Type: application/octet-stream
Size: 898 bytes
Desc: test.s.patch.arm
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110313/71c8b9cc/attachment-0001.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.s.patch.x86_32
Type: application/octet-stream
Size: 2075 bytes
Desc: test.s.patch.x86_32
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110313/71c8b9cc/attachment-0002.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.s.wo_patch.arm
Type: application/octet-stream
Size: 1102 bytes
Desc: test.s.wo_patch.arm
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110313/71c8b9cc/attachment-0003.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.s.wo_patch.x86_32
Type: application/octet-stream
Size: 2304 bytes
Desc: test.s.wo_patch.x86_32
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110313/71c8b9cc/attachment-0004.obj

From stoklund at 2pi.dk Sun Mar 13 16:11:41 2011
From: stoklund at 2pi.dk (Jakob Stoklund Olesen)
Date: Sun, 13 Mar 2011 14:11:41 -0700
Subject: [LLVMdev] Question about TableGen when adding LLVM Backend.
In-Reply-To: <45F7535E-F930-4451-81BD-72F3176D552E@web.de>
References: <45F7535E-F930-4451-81BD-72F3176D552E@web.de>
Message-ID:

On Mar 13, 2011, at 3:52 AM, Andreas Färber wrote:

> On 10.03.2011 at 05:35, Jakob Stoklund Olesen wrote:
>> On Mar 9, 2011, at 8:15 PM, Lu Mitnick wrote:
>>> Is this means that TableGen execution is handled in Makefile. Porting programmer doesn't need to execute TableGen by hand?
>> You are going to be editing your .td files a lot, so you want that integrated in the build system.
>
> In practice that'll mean adding the correct directives to CMakeLists.txt, not Makefile, right? That's what the targets I looked at did.
> Or is that optional when you just invoke "make"?

Those are part of two unrelated build systems - cmake and autoconf. You need to maintain both for an in-tree target. Otherwise you only need build files for the build system you use.

/jakob

From eli.friedman at gmail.com Sun Mar 13 17:08:18 2011
From: eli.friedman at gmail.com (Eli Friedman)
Date: Sun, 13 Mar 2011 18:08:18 -0400
Subject: [LLVMdev] IndVarSimplify too aggressive ?
In-Reply-To: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1EF36C7@FRPAR1CL009.coe.adi.dibcom.com>
References: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1EF36C7@FRPAR1CL009.coe.adi.dibcom.com>
Message-ID:

On Sun, Mar 13, 2011 at 5:01 PM, Arnaud Allard de Grandmaison wrote:
> Hi all,
>
> The IndVarSimplify pass seems to be too aggressive when it enlarge the induction variable type ; this can pessimize the generated code when the new induction variable size is not natively supported by the target. This is probably not an issue for x86_64, which supports natively all types, but it is a real one for several embedded targets, with very few native types.
>
> I attached a patch to address this issue; if TargetData is available, the patch attempts to keep the induction variable to a native type when going thru the induction variable users.
>
> Also attached my test-case in C, as well as the resulting assembly output, with and without the patch applied, for arm and x86_32 targets. You will note the loop instructions count can be reduced by 30% in several cases.
>
> The patch could probably be made smarter : I am welcoming all suggestions.

It's worth pointing out that LoopStrengthReduce is doing essentially the same transformation.
The only reason the generated code is improved at all with your change is that ISel has a longstanding issue where it can't conclude that the upper half of zext i32 %x to i64 is zero if the zext is in a different block from the user of the zext.

-Eli

From nadav.rotem at intel.com Sun Mar 13 23:42:36 2011
From: nadav.rotem at intel.com (Rotem, Nadav)
Date: Mon, 14 Mar 2011 06:42:36 +0200
Subject: [LLVMdev] IndVarSimplify too aggressive ?
In-Reply-To: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1EF36C7@FRPAR1CL009.coe.adi.dibcom.com>
References: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1EF36C7@FRPAR1CL009.coe.adi.dibcom.com>
Message-ID: <6594DDFF12B03D4E89690887C248699402712F91FC@hasmsx504.ger.corp.intel.com>

Arnaud,

I also noticed that IndVarSimplify increases variable size, and in some cases pessimizes the program. I just wanted to add that I have seen cases where i64 types were converted to i65 types, for which there is no native support. In the case of i65 multiplication, for some platforms there is not even a library call to perform a 128-bit multiplication. So, I welcome your change and I will test your patch locally.

Nadav

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Arnaud Allard de Grandmaison
Sent: Sunday, March 13, 2011 23:02
To: llvmdev at cs.uiuc.edu
Subject: [LLVMdev] IndVarSimplify too aggressive ?

Hi all,

The IndVarSimplify pass seems to be too aggressive when it enlarge the induction variable type ; this can pessimize the generated code when the new induction variable size is not natively supported by the target. This is probably not an issue for x86_64, which supports natively all types, but it is a real one for several embedded targets, with very few native types.

I attached a patch to address this issue; if TargetData is available, the patch attempts to keep the induction variable to a native type when going thru the induction variable users.

Also attached my test-case in C, as well as the resulting assembly output, with and without the patch applied, for arm and x86_32 targets. You will note the loop instructions count can be reduced by 30% in several cases.

The patch could probably be made smarter : I am welcoming all suggestions.

Best Regards,
--
Arnaud de Grandmaison
---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

From abhirupju at gmail.com Mon Mar 14 02:56:40 2011
From: abhirupju at gmail.com (Abhirup Ghosh)
Date: Mon, 14 Mar 2011 13:26:40 +0530
Subject: [LLVMdev] set line number debug info
Message-ID:

Hi,

I am new to LLVM infrastructure. Recently I am trying to set the debug info for an instruction. The main aim is to set the source line number of an instruction. Can anyone please show how to do that?

I think that setMetadata method in Instruction class is to be used. But how do I create MDNode* consisting of desired source line number. Source line number can be extracted from the instruction-debug-info using getMetadata method and then using getLineNumber method from DILocation class. Unfortunately DILocation class does not have any function like setLinenumber.

Abhirup Ghosh
M. Tech
Department of Computer Science & Engg.
IIT, Bombay
email - abhirupju at gmail.com , abhirup at cse.iitb.ac.in
Contact - 9920735181
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110314/c15a824c/attachment.html

From fvbommel at gmail.com Mon Mar 14 03:15:23 2011
From: fvbommel at gmail.com (Frits van Bommel)
Date: Mon, 14 Mar 2011 09:15:23 +0100
Subject: [LLVMdev] set line number debug info
In-Reply-To:
References:
Message-ID:

On Mon, Mar 14, 2011 at 8:56 AM, Abhirup Ghosh wrote:
> Hi,
> I am new to LLVM infrastructure. Recently I am trying to set the debug
> info for an instruction. The main aim is to set the source line number of an
> instruction. Can anyone please show how to do that?
> I think that setMetadata method in Instruction class is to be used.
> But how do I create MDNode* consisting of desired source line number. Source
> line number can be extracted from the instruction-debug-info using
> getMetadata method and then using getLineNumber method from DILocation
> class. Unfortunately DILocation class does not have any function like
> setLinenumber.

You may want to check out Instruction::setDebugLoc(). llvm/Support/DebugLoc.h and llvm/Analysis/DIBuilder.h will likely be helpful as well.

From blackfin.kang at gmail.com Mon Mar 14 04:09:03 2011
From: blackfin.kang at gmail.com (Michael.Kang)
Date: Mon, 14 Mar 2011 17:09:03 +0800
Subject: [LLVMdev] How to load a data from the address of unsiged long type
Message-ID:

Now I have an address that is represented as an unsigned long, in the following format:
Value* addr = CONST(0xc0008000)

But I do not know how to read the data from the above addr variable. I tried the following three kinds of code:
1. Code:
Value* addr = CONST(0xc0008000);
Value* data = new LoadInst(addr, "", false, bb);
Error:
Segmentation fault
2. Code (use BitCastInst to convert):
Value* addr = CONST(0xc0008000);
a = new BitCastInst(a, PointerType::get(XgetType(Int32Ty), 0), "", bb);
Value* data = new LoadInst(addr, "", false, bb);
Error:
Bitcast requires types of same width
3.
Code:
Type const *intptr_type = cpu->dyncom_engine->exec_engine->getTargetData()->getIntPtrType(_CTX());
Value* ptr = new IntToPtrInst(a, intptr_type, "", bb);
Value* data = new LoadInst(ptr, "", false, bb);
Error:
Segmentation fault


Can anyone give me some hints for my case?

Thanks
MK

--
www.skyeye.org

From baldrick at free.fr Mon Mar 14 05:57:15 2011
From: baldrick at free.fr (Duncan Sands)
Date: Mon, 14 Mar 2011 11:57:15 +0100
Subject: [LLVMdev] How to load a data from the address of unsiged long type
In-Reply-To:
References:
Message-ID: <4D7DF48B.9070508@free.fr>

Hi Michael,

> Now I have an address that present in a unsigned long address like the
> following format:
> Value* addr = CONST(0xc0008000)

what is the type of addr? For that matter, what is CONST - what does it do?

Ciao, Duncan.

>
> But I do not know how to read the data from the above addr varaible. I
> tried the following three kind of code:
> 1. Code:
> Value* addr = CONST(0xc0008000);
> Value* data = new LoadInst(addr, "", false, bb);
> Error:
> Segmentation fault
> 2. Code( use BotVastInst to vern):
> Value* addr = CONST(0xc0008000);
> a = new BitCastInst(a, PointerType::get(XgetType(Int32Ty), 0), "", bb);
> Value* data = new LoadInst(addr, "", false, bb);
> Error:
> Bitcast requires types of same width
> 3. Code:
> Type const *intptr_type =
> cpu->dyncom_engine->exec_engine->getTargetData()->getIntPtrType(_CTX());
> Value* ptr = new IntToPtrInst(a, intptr_type, "", bb);
> Value* data = new LoadInst(ptr, "", false, bb);
> Error:
> Segmentation fault
>
>
> Any person can give me some hints for my case?
>
> Thanks
> MK
>

From peterl95124 at sbcglobal.net Mon Mar 14 09:42:51 2011
From: peterl95124 at sbcglobal.net (Peter Lawrence)
Date: Mon, 14 Mar 2011 07:42:51 -0700
Subject: [LLVMdev] HUGE_VALF in OSX
Message-ID: <18A26681-CDE6-4474-90F2-06D3FD116DEF@sbcglobal.net>

all,
this is all probably very old news and probably fixed in a later OSX, but.....

on my 10.4.11 machine, math.h has (IMHO this is a bug)

#define HUGE_VALF 1e50f

this compiles when I build LLVM using "configure", but not with "cmake", probably to do with different C-standard-ness and pedantic-ness switches....

I would recommend LLVM Support not import the HUGE_VAL, HUGE_VALF feature, and instead use its own definition. I redefined HUGE_VALF in Support/DataTypes.h {.in,.cmake} to be 3.4e38f and got LLVM to build on 10.4.11 that way.

sincerely,
Peter Lawrence.

From nadav.rotem at intel.com Mon Mar 14 09:56:02 2011
From: nadav.rotem at intel.com (Rotem, Nadav)
Date: Mon, 14 Mar 2011 16:56:02 +0200
Subject: [LLVMdev] Vector select/compare support in LLVM
In-Reply-To:
References: <6594DDFF12B03D4E89690887C2486994027129F2A2@hasmsx504.ger.corp.intel.com> <6594DDFF12B03D4E89690887C2486994027129FB10@hasmsx504.ger.corp.intel.com>
Message-ID: <6594DDFF12B03D4E89690887C248699402712F9692@hasmsx504.ger.corp.intel.com>

David,

The problem with the sparse representation is that it is word-width dependent. For 32-bit data-types, the mask is the 32nd bit, while for 64-bit types the mask is the 64th bit.

How would you legalize the mask for the following code ?

%mask = cmp nge <4 x float> %A, %B ; <4 x i1>
%val = select <4 x i1> %mask, <4 x double> %X, %Y ; <4 x double>

Moreover, in some cases the generator of the mask and the consumer of the mask are in different basic blocks. The legalizer works on one basic block at a time. This makes it impossible for the legalizer to find the 'native' representation.

I wrote down some of the comments which were made in this email thread:

http://wiki.llvm.org/Vector_select

Cheers,
Nadav

-----Original Message-----
From: David A. Greene [mailto:greened at obbligato.org]
Sent: Thursday, March 10, 2011 18:57
To: Rotem, Nadav
Cc: David A. Greene; llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Vector select/compare support in LLVM

"Rotem, Nadav" writes:

> One of the arguments for packing masks is that it reduces
> vector-registers pressure.
> Auto-vectorizing compilers maintain
> multiple masks for different execution paths (for each loop nesting,
> etc). Saving masks in xmm registers may result in vector-register
> pressure which will cause spilling of these registers. I agree with
> you that GP registers are also a precious resource.

GPRs are more precious than vector registers in my experience. Spilling a vector register isn't that painful. Spilling a GPR holding an address is disastrous.

> In my private branch, I added the [v4i1 .. v64i1] types. I also
> implemented a new type of target lowering: "PACK". This lowering packs

Is PACK in the X86 namespace? It seems a pretty target-specific thing.

> I also plan to experiment with promoting <4 x i1> to <4 x i32>. At
> this point I can't really say what needs to be done. Implementing
> this kind of promotion also requires adding legalization support for
> strange vector types such as <4 x i65>.

How often do we see something like that? Baby steps, baby steps... :)

-Dave

From xerxes at zafena.se Mon Mar 14 10:48:06 2011
From: xerxes at zafena.se (Xerxes Rånby)
Date: Mon, 14 Mar 2011 16:48:06 +0100
Subject: [LLVMdev] LLVM 2.9 RC1 Pre-release Tarballs
In-Reply-To: <3EF399DF-E592-470B-98C4-5051A6828EC4@apple.com>
References: <3EF399DF-E592-470B-98C4-5051A6828EC4@apple.com>
Message-ID: <4D7E38B6.4020701@zafena.se>

On 2011-03-09 02:51, Bill Wendling wrote:
> There are LLVM 2.9 RC1 pre-release tarballs source available.
> You can find them here:
>

llvm 2.9rc1 test on Dualcore ARM running Ubuntu Natty

gunzip llvm-2.9rc1.src.tar.gz
tar zxvf llvm-2.9rc1.src.tar
cd llvm-2.9-build
../llvm-2.9rc1/configure --enable-shared
time make clean ;time make CXXFLAGS=-marm CFLAGS=-marm

xranby at panda:/media/dh0/llvm-2.9-build$ time make check

Failing Tests (64):
    LLVM :: CodeGen/CellSPU/2009-01-01-BrCond.ll
    LLVM :: CodeGen/CellSPU/call_indirect.ll
    LLVM :: CodeGen/CellSPU/extract_elt.ll
    LLVM :: CodeGen/CellSPU/fcmp64.ll
    LLVM :: CodeGen/CellSPU/fneg-fabs.ll
    LLVM :: CodeGen/CellSPU/i64ops.ll
    LLVM :: CodeGen/CellSPU/immed64.ll
    LLVM :: CodeGen/CellSPU/private.ll
    LLVM :: CodeGen/CellSPU/rotate_ops.ll
    LLVM :: CodeGen/CellSPU/sext128.ll
    LLVM :: CodeGen/CellSPU/shuffles.ll
    LLVM :: CodeGen/CellSPU/stores.ll
    LLVM :: CodeGen/CellSPU/struct_1.ll
    LLVM :: CodeGen/CellSPU/trunc.ll
    LLVM :: CodeGen/CellSPU/v2f32.ll
    LLVM :: CodeGen/CellSPU/v2i32.ll
    LLVM :: CodeGen/CellSPU/vec_const.ll
    LLVM :: CodeGen/CellSPU/vecinsert.ll
    LLVM :: CodeGen/MSP430/2009-05-10-CyclicDAG.ll
    LLVM :: CodeGen/MSP430/2009-05-17-Rot.ll
    LLVM :: CodeGen/MSP430/2009-05-17-Shift.ll
    LLVM :: CodeGen/MSP430/2009-05-19-DoubleSplit.ll
    LLVM :: CodeGen/MSP430/2009-08-25-DynamicStackAlloc.ll
    LLVM :: CodeGen/MSP430/2009-09-18-AbsoluteAddr.ll
    LLVM :: CodeGen/MSP430/2009-10-10-OrImpDef.ll
    LLVM :: CodeGen/MSP430/2009-11-08-InvalidResNo.ll
    LLVM :: CodeGen/MSP430/2009-11-20-NewNode.ll
    LLVM :: CodeGen/MSP430/2009-12-21-FrameAddr.ll
    LLVM :: CodeGen/MSP430/AddrMode-bis-rx.ll
    LLVM :: CodeGen/MSP430/AddrMode-bis-xr.ll
    LLVM :: CodeGen/MSP430/AddrMode-mov-rx.ll
    LLVM :: CodeGen/MSP430/AddrMode-mov-xr.ll
    LLVM :: CodeGen/MSP430/Inst16mi.ll
    LLVM :: CodeGen/MSP430/Inst16mm.ll
    LLVM :: CodeGen/MSP430/Inst16mr.ll
    LLVM :: CodeGen/MSP430/Inst16rm.ll
    LLVM :: CodeGen/MSP430/Inst8mi.ll
    LLVM :: CodeGen/MSP430/Inst8mm.ll
    LLVM :: CodeGen/MSP430/Inst8mr.ll
    LLVM :: CodeGen/MSP430/Inst8rm.ll
    LLVM :: CodeGen/MSP430/bit.ll
    LLVM :: CodeGen/MSP430/indirectbr.ll
    LLVM :: CodeGen/MSP430/indirectbr2.ll
    LLVM :: CodeGen/MSP430/inline-asm.ll
    LLVM :: CodeGen/MSP430/mult-alt-generic-msp430.ll
    LLVM :: CodeGen/Mips/2008-07-03-SRet.ll
    LLVM :: CodeGen/Mips/2008-07-05-ByVal.ll
    LLVM :: CodeGen/Mips/2008-07-15-SmallSection.ll
    LLVM :: CodeGen/Mips/2008-08-03-ReturnDouble.ll
    LLVM :: CodeGen/Mips/2008-10-13-LegalizerBug.ll
    LLVM :: CodeGen/Mips/2008-11-10-xint_to_fp.ll
    LLVM :: CodeGen/Mips/2010-07-20-Switch.ll
    LLVM :: CodeGen/Mips/blockaddr.ll
    LLVM :: CodeGen/Mips/cmov.ll
    LLVM :: CodeGen/Mips/divrem.ll
    LLVM :: CodeGen/Mips/private.ll
    LLVM :: CodeGen/SPARC/2007-07-05-LiveIntervalAssert.ll
    LLVM :: CodeGen/SPARC/2009-08-28-PIC.ll
    LLVM :: CodeGen/SPARC/2011-01-11-CC.ll
    LLVM :: CodeGen/SPARC/2011-01-19-DelaySlot.ll
    LLVM :: CodeGen/SPARC/2011-01-22-SRet.ll
    LLVM :: CodeGen/SPARC/mult-alt-generic-sparc.ll
    LLVM :: CodeGen/Thumb/select.ll
    LLVM :: CodeGen/X86/fold-pcmpeqd-0.ll

  Expected Passes    : 5191
  Expected Failures  : 50
  Unsupported Tests  : 543
  Unexpected Failures: 64
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: llvm 2.9rc1 test on Dualcore ARM running Ubuntu Natty.txt
Url: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110314/0a481d69/attachment-0001.txt

From j.wilhelmy at arcor.de Mon Mar 14 10:56:59 2011
From: j.wilhelmy at arcor.de (Jochen Wilhelmy)
Date: Mon, 14 Mar 2011 16:56:59 +0100
Subject: [LLVMdev] how to build a StructType incrementally
Message-ID: <4D7E3ACB.6090704@arcor.de>

Hi!

StructType has no method for adding elements, but I'd like to build a type incrementally, i.e. it starts empty and a global of the type exists. Then load instructions are added to a basic block and for each load instruction an element to the type has to be added. Is there a way to do this?

For example I could first use UndefValues instead of load and then do a replaceAllUsesWith when the type is finally known. Does this work or is there a better solution?

-Jochen

From justin.holewinski at gmail.com Mon Mar 14 11:38:41 2011
From: justin.holewinski at gmail.com (Justin Holewinski)
Date: Mon, 14 Mar 2011 12:38:41 -0400
Subject: [LLVMdev] [cfe-dev] GIT mirrors
In-Reply-To:
References: <66B20541-9A42-4DC4-87F4-B67C33891226@apple.com> <2E4A9A0A-8D5F-4C3F-A905-D890D4E3AA9A@2pi.dk> <051C4414-6923-4562-BC2B-91B8E3180416@apple.com>
Message-ID:

On Thu, Mar 10, 2011 at 2:53 PM, Anton Korobeynikov wrote:

> > Just to be clear: we _really_ do not want all the sha's to change for trunk.
> Yes. That's why I said there will be other way :)
>
> In any case - please try clang.git once again. It should contain new
> branch/tag layout. If there will be some problems - let me know and
> I'll revert to prev. repository.
>

Perhaps my git-fu is just exceptionally weak, but I'm having difficulties figuring out how to use the new llvm.git repository. This morning, I did a 'git fetch' and noticed it pulled new branches for the SVN releases and tags. So far so good. Then, I ran 'git svn rebase -l' but no commits were merged. Running a full 'git svn rebase' pulled several SVN revisions and appeared to commit them to my local repository, but username information appears to have been lost. Instead of the usual "Justin Holewinski ", I see "jholewinski ".

Thinking the new repository layout is just not compatible with the old layout, I performed a fresh clone and did the usual:

$ git config --add remote.origin.fetch '+refs/remotes/git-svn:refs/remotes/git-svn'
$ git fetch
$ git svn init https://llvm.org/svn/llvm-project/llvm/trunk
$ git svn rebase -l

This seemed to work (luckily there were no upstream commits during this time), until I finished a commit and tried a 'git svn dcommit'. It pushed my commit into the SVN repository, then proceeded to pull what seemed like 100+ different SVN revisions and merged them into my local tree. Running 'git log' after the commit showed my commit at the top, and then started with commits from 2001.
I was concerned until I verified the problem was with my local repository and not with the upstream SVN repository!

It seems like with the current setup, I need to do a fresh git clone every time a new commit is made upstream. Otherwise, either 'git fetch' or 'git svn rebase -l' messes something up. Clearly I'm doing something wrong. What am I missing here? Should the new layout affect the git workflow? Could you post a detailed overview of working with LLVM trunk through the git-svn bridge on llvm.org? I thought maybe running from the svn/trunk branch would help, but that branch seems to contain history only up to 2001.

>
> --
> With best regards, Anton Korobeynikov
> Faculty of Mathematics and Mechanics, Saint Petersburg State University
>

--

Thanks,

Justin Holewinski
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110314/23ca68c0/attachment.html

From baldrick at free.fr Mon Mar 14 11:45:13 2011
From: baldrick at free.fr (Duncan Sands)
Date: Mon, 14 Mar 2011 17:45:13 +0100
Subject: [LLVMdev] Vector select/compare support in LLVM
In-Reply-To: <6594DDFF12B03D4E89690887C248699402712F9692@hasmsx504.ger.corp.intel.com>
References: <6594DDFF12B03D4E89690887C2486994027129F2A2@hasmsx504.ger.corp.intel.com> <6594DDFF12B03D4E89690887C2486994027129FB10@hasmsx504.ger.corp.intel.com> <6594DDFF12B03D4E89690887C248699402712F9692@hasmsx504.ger.corp.intel.com>
Message-ID: <4D7E4619.6050908@free.fr>

Hi Nadav,

> The problem with the sparse representation is that it is word-width dependent. For 32-bit data-types, the mask is the 32nd bit, while fore 64bit types the mask is the 64th bit.
>
> How would you legalize the mask for the following code ?
> > %mask = cmp nge<4 x float> %A, %B ;<4 x i1> > %val = select<4 x i1>% mask,<4 x double> %X, %Y ;<4 x double> I would expect this to become %mask = cmp nge<4 x float> %A, %B with result type <4 x i32> %mask_lo = extract elements 0, 1 from %mask, result type <2 x i64> %mask_hi = extract elements 2, 3 from %mask, result type <2 x i64> %val_lo = select <2 x i64> %mask_lo, <2 x double> %X_lo, %Y_lo %val_hi = select <2 x i64> %mask_hi, <2 x double> %X_hi, %Y_hi > > Moreover, in some cases the generator of the mask and the consumer of the mask are in different basic blocks. The legalizer works on one basic block at a time. This makes it impossible for the legalizer to find the 'native' representation. I don't understand what you are saying here. Ciao, Duncan. > > I wrote down some of the comments which were made in this email thread: > > http://wiki.llvm.org/Vector_select > > > Cheers, > Nadav > > > -----Original Message----- > From: David A. Greene [mailto:greened at obbligato.org] > Sent: Thursday, March 10, 2011 18:57 > To: Rotem, Nadav > Cc: David A. Greene; llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] Vector select/compare support in LLVM > > "Rotem, Nadav" writes: > >> One of the arguments for packing masks is that it reduces >> vector-registers pressure. Auto-vectorizing compilers maintain >> multiple masks for different execution paths (for each loop nesting, >> etc). Saving masks in xmm registers may result in vector-register >> pressure which will cause spilling of these registers. I agree with >> you that GP registers are also a precious resource. > > GPRs are more precious than vector registers in my experience. Spilling > a vector register isn't that painful. Spilling a GPR holding an address > is disastrous. > >> In my private branch, I added the [v4i1 .. v64i1] types. I also >> implemented a new type of target lowering: "PACK". This lowering packs > > Is PACK in the X86 namespace? It seems a pretty target-specific thing. 
> >> I also plan to experiment with promoting<4 x i1> to<4 x i32>. At >> this point I can't really say what needs to be done. Implementing >> this kind of promotion also requires adding legalization support for >> strange vector types such as<4 x i65>. > > How often do we see something like that? Baby steps, baby steps... :) > > -Dave > --------------------------------------------------------------------- > Intel Israel (74) Limited > > This e-mail and any attachments may contain confidential material for > the sole use of the intended recipient(s). Any review or distribution > by others is strictly prohibited. If you are not the intended > recipient, please contact the sender and delete all copies. > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From xerxes at zafena.se Mon Mar 14 11:49:29 2011 From: xerxes at zafena.se (=?ISO-8859-1?Q?Xerxes_R=E5nby?=) Date: Mon, 14 Mar 2011 17:49:29 +0100 Subject: [LLVMdev] LLVM 2.9 release notes + "Projects using LLVM" + API changes - IcedTea docs section. In-Reply-To: References: Message-ID: <4D7E4719.5070805@zafena.se> On 2011-03-10 08:54, Chris Lattner wrote: > Hi All, > > With 2.9 starting to make its way out into the world, it is time to start poking at the release notes. I plan to make a pass through llvm-commits to cull some of the major changes into bullets, but am already behind and insanely busy with other stuff over the next week. > > If you have commit access, I'd really appreciate it if you could take a pass through llvm/docs/ReleaseNotes.html to fill in notes about things that are new in the release (particularly for clang, mc, dragonegg and other subprojects) and important API changes for external clients. 
Feel free to directly commit to the release notes, I will fill in more details when I get bandwidth and will generally tidy it up and edit it, so don't worry about it all fitting together well at this point. > > If you have an external project that works with LLVM 2.9, please send me a blurb offlist and I'll include it in the release notes. We've been doing this over the last couple of releases and I think it adds a lot of value to show the different ways that LLVM gets used. Please send me a blurb along the lines of the examples from 2.8: http://llvm.org/releases/2.8/docs/ReleaseNotes.html#externalproj > > Note that I've zapped *all* of the existing blurbs from the 2.8 release notes, so if you want to be included in the 2.9 notes, please send me updated text. > > Thanks all, this is going to be a great release! > > -Chris > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev Hi Chris! Here are an IcedTea ReleaseNotes.html#externalproj Docs section for LLVM 2.9 Cheers Xerxes

IcedTea provides a harness to build OpenJDK using only free software build tools and to provide replacements for the not-yet free parts of OpenJDK. One of the extensions that IcedTea provides is a new JIT compiler named Shark which uses LLVM to provide native code generation without introducing processor-dependent code.

OpenJDK 7 b112, IcedTea6 1.9 and IcedTea7 1.13 and later have been tested and are known to work with LLVM 2.9 (and continue to work with older LLVM releases >= 2.6 as well).

From anton at korobeynikov.info Mon Mar 14 12:12:13 2011 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Mon, 14 Mar 2011 20:12:13 +0300 Subject: [LLVMdev] [cfe-dev] GIT mirrors In-Reply-To: References: <66B20541-9A42-4DC4-87F4-B67C33891226@apple.com> <2E4A9A0A-8D5F-4C3F-A905-D890D4E3AA9A@2pi.dk> <051C4414-6923-4562-BC2B-91B8E3180416@apple.com> Message-ID: Hello Justin, > What am I missing here? Should the new layout affect the git workflow? No. Stuff was just added. Trunk is still available as "master" as before. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From anton at korobeynikov.info Mon Mar 14 12:14:05 2011 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Mon, 14 Mar 2011 20:14:05 +0300 Subject: [LLVMdev] LLVM 2.9 RC1 Pre-release Tarballs In-Reply-To: <4D7E38B6.4020701@zafena.se> References: <3EF399DF-E592-470B-98C4-5051A6828EC4@apple.com> <4D7E38B6.4020701@zafena.se> Message-ID: Hello Xerxes, > llvm 2.9rc1 test on Dualcore ARM running Ubuntu Natty Which gcc was used for the compilation? Can you try an -O0 build and see whether that changes the results? -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From Arnaud.AllardDeGrandMaison at dibcom.com Mon Mar 14 13:27:27 2011 From: Arnaud.AllardDeGrandMaison at dibcom.com (Arnaud Allard de Grandmaison) Date: Mon, 14 Mar 2011 19:27:27 +0100 Subject: [LLVMdev] IndVarSimplify too aggressive ? In-Reply-To: References: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1EF36C7@FRPAR1CL009.coe.adi.dibcom.com> Message-ID: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1C04ED4@FRPAR1CL009.coe.adi.dibcom.com> Thanks Eli, After digging through the mail archives & bugzilla, it seems that properly fixing this issue would require a major change in the SelectionDAG code, so that it operates on a per-function basis instead of per basic block. This, however, does not seem to be the only issue.
The following C code does not produce an efficient assembly sequence either. extern void f(unsigned long long v); void test2() { for (unsigned i=0; i<512; i++) f(i); } The resulting .ll out of clang looks reasonable (with and without the patch), but the arm assembly output looks ugly, though marginally better with my patch: the induction variable should be counting up, and it could be zero extended before the call to f. This again points to ISel, but to a different area, as everything is taking place in the same BB. Is this a known issue? I could not find a bug report matching this. -- Arnaud de Grandmaison -----Original Message----- From: Eli Friedman [mailto:eli.friedman at gmail.com] Sent: Sunday, March 13, 2011 11:08 PM To: Arnaud Allard de Grandmaison Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] IndVarSimplify too aggressive ? On Sun, Mar 13, 2011 at 5:01 PM, Arnaud Allard de Grandmaison wrote: > Hi all, > > The IndVarSimplify pass seems to be too aggressive when it enlarges the induction variable type; this can pessimize the generated code when the new induction variable size is not natively supported by the target. This is probably not an issue for x86_64, which natively supports all types, but it is a real one for several embedded targets with very few native types. > > I attached a patch to address this issue; if TargetData is available, the patch attempts to keep the induction variable to a native type when going through the induction variable users. > > Also attached is my test-case in C, as well as the resulting assembly output, with and without the patch applied, for arm and x86_32 targets. You will note the loop instruction count can be reduced by 30% in several cases. > > The patch could probably be made smarter: I welcome all suggestions. It's worth pointing out that LoopStrengthReduce is doing essentially the same transformation.
The only reason the generated code is improved at all with your change is that ISel has a longstanding issue where it can't conclude that the upper half of zext i32 %x to i64 is zero if the zext is in a different block from the user of the zext. -Eli -------------- next part -------------- A non-text attachment was scrubbed... Name: test2.s.wo_patch.arm Type: application/octet-stream Size: 447 bytes Desc: test2.s.wo_patch.arm Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110314/3e93ccc4/attachment.obj -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test2.c Url: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110314/3e93ccc4/attachment.c -------------- next part -------------- A non-text attachment was scrubbed... Name: test2.ll.w_patch.arm Type: application/octet-stream Size: 744 bytes Desc: test2.ll.w_patch.arm Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110314/3e93ccc4/attachment-0001.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: test2.ll.wo_patch.arm Type: application/octet-stream Size: 676 bytes Desc: test2.ll.wo_patch.arm Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110314/3e93ccc4/attachment-0002.obj -------------- next part -------------- A non-text attachment was scrubbed... 
Name: test2.s.w_patch.arm Type: application/octet-stream Size: 440 bytes Desc: test2.s.w_patch.arm Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110314/3e93ccc4/attachment-0003.obj From justin.holewinski at gmail.com Mon Mar 14 14:07:50 2011 From: justin.holewinski at gmail.com (Justin Holewinski) Date: Mon, 14 Mar 2011 15:07:50 -0400 Subject: [LLVMdev] [cfe-dev] GIT mirrors In-Reply-To: References: <66B20541-9A42-4DC4-87F4-B67C33891226@apple.com> <2E4A9A0A-8D5F-4C3F-A905-D890D4E3AA9A@2pi.dk> <051C4414-6923-4562-BC2B-91B8E3180416@apple.com> Message-ID: On Mon, Mar 14, 2011 at 1:12 PM, Anton Korobeynikov wrote: > Hello Justin, > > > What am I missing here? Should the new layout affect the git workflow? > No. Stuff was just added. Trunk is still available as "master" as before. > Alright, figured it out. Thanks to Tobias for the git insight! The previous way I was using the git-svn bridge was to pull from refs/remotes/git-svn (learned from an older post on the list). However, git-svn does not appear to be a valid, up-to-date ref anymore. It only pulls history through March 10, which I assume is the around the time the layout change occurred. The fix is to have git-svn read from refs/remotes/origin/master instead of relying on refs/remotes/git-svn. Now it works great! So, instead of: git config --add remote.origin.fetch '+refs/remotes/git-svn:refs/remotes/git-svn' I needed: git config svn-remote.svn.fetch ":refs/remotes/origin/master" Hopefully this will be helpful to others. > -- > With best regards, Anton Korobeynikov > Faculty of Mathematics and Mechanics, Saint Petersburg State University > -- Thanks, Justin Holewinski -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110314/c86f2c8d/attachment.html From kd at kendyck.com Mon Mar 14 14:21:13 2011 From: kd at kendyck.com (Ken Dyck) Date: Mon, 14 Mar 2011 15:21:13 -0400 Subject: [LLVMdev] backend question In-Reply-To: <2BB64B34-BF8F-4C55-8A6E-8C732CA61FC3@web.de> References: <2BB64B34-BF8F-4C55-8A6E-8C732CA61FC3@web.de> Message-ID: On Sun, Mar 13, 2011 at 6:41 AM, Andreas Färber wrote: > On 08.03.2011 at 19:59, Ken Dyck wrote: > >> If you are interested, I can send you a patch of the changes that I >> made to the 2.8 release for a backend that targets a 24-bit >> word-addressable DSP, but it is quite rough and it includes changes in >> which you probably aren't interested (support for non-power-of-2 >> integer sizes and some other bug fixes). > > I would be interested in non-power-of-two support. I started adding the very > basics for i24 in my STM8 repo but there are still a couple places that do > getIntegerWidth() * 2 calculations for expansion. > > Do you have a public repo and/or plans to merge those features into trunk? Sorry, no public repository. I hope to someday merge the changes to the trunk, but I can't promise any definite timeline or that they will retain any semblance of their current form. Until then, you can find the patch (zipped) attached. -Ken -------------- next part -------------- A non-text attachment was scrubbed...
Name: non-po2.word-addressable.llvm-clang.patch.zip Type: application/zip Size: 46445 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110314/f9503998/attachment.zip From clattner at apple.com Mon Mar 14 15:00:20 2011 From: clattner at apple.com (Chris Lattner) Date: Mon, 14 Mar 2011 13:00:20 -0700 Subject: [LLVMdev] [patch] Change llvm_unreachable to use __builtin_unreachable() in -asserts In-Reply-To: <0C328BE2-E0BD-44D1-A280-1CAF3E04FE3A@apple.com> References: <0C328BE2-E0BD-44D1-A280-1CAF3E04FE3A@apple.com> Message-ID: <46E82785-3F04-46A0-882C-22303DBAB29A@apple.com> On Mar 11, 2011, at 6:54 PM, John McCall wrote: > This patch implements the current consensus of PR8973: > http://llvm.org/bugs/show_bug.cgi?id=8973. > > The macro llvm_unreachable is used in LLVM to indicate that > a particular place in a function is not supposed to be reachable > during execution. Like an assert macro, it takes a string > argument. In +Asserts builds, this string argument, together with > some location information, is passed to a function which prints > the information and calls abort(). In -Asserts builds, this string > argument is dropped (to minimize code size impact), and > instead a bunch of zero arguments are passed to the same > function. Hi John, The patch looks great to me! -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110314/c88c95b9/attachment.html From rjmccall at apple.com Mon Mar 14 15:05:38 2011 From: rjmccall at apple.com (John McCall) Date: Mon, 14 Mar 2011 13:05:38 -0700 Subject: [LLVMdev] [patch] Change llvm_unreachable to use __builtin_unreachable() in -asserts In-Reply-To: References: Message-ID: <99DA9976-6F3D-4DC6-A146-52DF86ADB3CF@apple.com> On Mar 11, 2011, at 7:45 PM, Jim Grosbach wrote: > Such an awesome change it was worth saying twice! 
:) > > Sounds great, and I completely agree it's a nice enhancement to what we can effectively express to help the compiler optimize more effectively. Thanks for doing this. Thanks! Consensus seems to be not opposed, so here we go. :) John. From douglasdocouto at gmail.com Mon Mar 14 16:55:42 2011 From: douglasdocouto at gmail.com (Douglas do Couto Teixeira) Date: Mon, 14 Mar 2011 18:55:42 -0300 Subject: [LLVMdev] How to integrate an analysis into LVI? Message-ID: Hi guys, I have an analysis that is able to answer questions like this: given an integer variable, what is the interval of values that this variable can assume during the program's execution? I want to integrate this analysis into LLVM and it seems LVI (Lazy Value Info) is the best place to do this kind of stuff. Can someone give some hints about what I have to do to integrate my analysis into LVI? Best regards, Douglas -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110314/6d569a53/attachment.html From lxfind at gmail.com Mon Mar 14 17:19:36 2011 From: lxfind at gmail.com (Xun Li) Date: Mon, 14 Mar 2011 15:19:36 -0700 Subject: [LLVMdev] Questions about linking with math library using llvm Message-ID: Hi, I have been trying to figure this out for a long time and really need some help. I am compiling C programs which uses some math functions (such as pow, ceil) into SPARC ISA, using llvm-gcc. I imagine below is the right process: llvm-gcc -c -emit-llvm *.c llvm-ld -lm *.bc -o test llc -march=sparc test.bc -o test.s However when I look into test.s, I realized that those math functions are not linked in. This can be also seen if I use "llc test.bc -o test.s" to generate the assembly for the local machine and run "gcc test.s", it will prompt the error saying that there are undefined reference to math functions. So the "-lm" option in llvm-ld really did not link the math functions. 
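A hedged sketch of one way around the unresolved math symbols described above, written as a dry run that only prints each step. As far as I understand it, libm is a native library rather than bitcode, so there is nothing for llvm-ld to merge into test.bc; the -lm has to be given to whatever does the final native link. The file name foo.c, the output name test.native and the NATIVE_CC variable are assumptions made for illustration; for -march=sparc the last step would need a SPARC cross gcc, whose name depends on the local setup.

```shell
#!/bin/sh
# Dry-run sketch: every step is printed, not executed.
run() { echo "+ $*"; }

SRC=foo.c        # hypothetical source file that calls pow() and ceil()
NATIVE_CC=gcc    # assumption: for SPARC, substitute a SPARC cross gcc

run llvm-gcc -c -emit-llvm "$SRC" -o foo.bc    # C -> LLVM bitcode
run llvm-ld foo.bc -o test                     # link the bitcode; writes test.bc
run llc test.bc -o test.s                      # bitcode -> native assembly
run "$NATIVE_CC" test.s -lm -o test.native     # assemble and link; libm is resolved here
```

Removing the run wrapper turns the sketch into real commands; the only change relative to the sequence in the message is that -lm moves from the llvm-ld step to the final gcc step.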
How could I link those math functions into the compiled assemblies? Thanks. -- Xun Li Computer Architecture Lab Department of Computer Science University of California, Santa Barbara -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110314/11e1fc23/attachment-0001.html From rengolin at systemcall.org Mon Mar 14 17:43:50 2011 From: rengolin at systemcall.org (Renato Golin) Date: Mon, 14 Mar 2011 22:43:50 +0000 Subject: [LLVMdev] Warning in LLVM Message-ID: When compiling LLVM on my Intel(R) Core(TM)2 Duo CPU P7450 running Ubuntu (gcc 4.4.5), I get this warning: /home/rengolin/workspace/llvm/rw/build/Release+Asserts/lib/libLLVMARMAsmParser.a(ARMAsmParser.o): In function `(anonymous namespace)::ARMAsmParser::ParseRegisterList(llvm::SmallVectorImpl&)': ARMAsmParser.cpp:(.text+0x4a05): warning: memset used with constant zero length parameter; this could be due to transposed parameters Couldn't see anything special at 0x4a05, even because there was no such address (it's not even aligned?). Also, nothing got my attention on the source code, any ideas? -- cheers, --renato http://systemcall.org/ Reclaim your digital rights, eliminate DRM, learn more at http://www.defectivebydesign.org/what_is_drm From judison at gmail.com Mon Mar 14 19:52:54 2011 From: judison at gmail.com (Judison) Date: Mon, 14 Mar 2011 21:52:54 -0300 Subject: [LLVMdev] Kinda noob questions Message-ID: Hi all, I have 2 simple questions... First, just for you to know, I'm making a compiler (kinda obvious) and the language definition is mine, so there is no much constraint for me, it's written in Java (it generates a .ll file as output) and it's just for fun by now, maybe a serious project after I learn all the stuff needed... and sorry bad english 1) I want to know if when I do this kind of thing: entry: %x = alloca i32 %y = alloca i32 ... normal instructions... other_block: %z = alloca i32 .... br .... 
label %other_block the stack "grows" when alloca is "called", or its the same as putting "%z = alloca" in the beggining? or... if alloca is in a loop, the stack grows every time it iterates?? 2) does the LLVM optimizes this: ; this code (arg -> var) is very similar to what gcc generates define void @foo(i32 %arg0) { entry: %var0 = alloca i32 store i32 %arg0, i32* var0 ... %x = load i32* %var0 .... ; never more stores to %var0 or pass it to any function } to something like: define void @foo(i32 arg0) { entry: ; no alloca .... %x = %arg0 .... } does it??? I'm asking because I can check in my compiler if some arg is only readed or if its used as a var before generate the IR, an then only create (alloca) a %var if the arg is used for read-write :P (my language is "safe" (no pointers)) If LLVM optimizes it, I'll let it... if not, I'll do this optimization Thanks in advance :P -- Judison judison at gmail.com "A wise man first thinks and then speaks and a fool speaks first and then thinks." Imam Ali (as) -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110314/1906fed8/attachment.html From criswell at illinois.edu Mon Mar 14 19:56:58 2011 From: criswell at illinois.edu (John Criswell) Date: Mon, 14 Mar 2011 19:56:58 -0500 Subject: [LLVMdev] Kinda noob questions In-Reply-To: References: Message-ID: <4D7EB95A.4060901@illinois.edu> On 3/14/11 7:52 PM, Judison wrote: > Hi all, > > I have 2 simple questions... > > First, just for you to know, I'm making a compiler (kinda obvious) and > the language definition is mine, so there is no much constraint for > me, it's written in Java (it generates a .ll file as output) and it's > just for fun by now, maybe a serious project after I learn all the > stuff needed... > > and sorry bad english > > 1) I want to know if when I do this kind of thing: > > entry: > %x = alloca i32 > %y = alloca i32 > ... normal instructions... 
> other_block: > %z = alloca i32 > .... > br .... label %other_block > > the stack "grows" when alloca is "called", or its the same as putting > "%z = alloca" in the beggining? > or... if alloca is in a loop, the stack grows every time it iterates?? Yes, you can have allocas inside of loops, and they will grow the size of the stack frame each time they are executed dynamically. > > > > 2) does the LLVM optimizes this: > > ; this code (arg -> var) is very similar to what gcc generates > define void @foo(i32 %arg0) { > entry: > %var0 = alloca i32 > store i32 %arg0, i32* var0 > ... > %x = load i32* %var0 > .... > ; never more stores to %var0 or pass it to any function > } > > to something like: > > define void @foo(i32 arg0) { > entry: > ; no alloca > .... > %x = %arg0 > .... > } > > does it??? I suspect that mem2reg plus a few additional optimizations will do this for you. -- John T. > > I'm asking because I can check in my compiler if some arg is only > readed or if its used as a var before generate the IR, an then only > create (alloca) a %var if the arg is used for read-write :P (my > language is "safe" (no pointers)) > If LLVM optimizes it, I'll let it... if not, I'll do this optimization > > Thanks in advance :P > > -- > Judison > judison at gmail.com > > "A wise man first thinks and then speaks and a fool speaks first and > then thinks." Imam Ali (as) -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110314/9f8aabda/attachment.html From rjmccall at apple.com Mon Mar 14 20:23:11 2011 From: rjmccall at apple.com (John McCall) Date: Mon, 14 Mar 2011 18:23:11 -0700 Subject: [LLVMdev] Kinda noob questions In-Reply-To: <4D7EB95A.4060901@illinois.edu> References: <4D7EB95A.4060901@illinois.edu> Message-ID: <03C0EFE1-9904-4743-B90F-8456AE60F85F@apple.com> On Mar 14, 2011, at 5:56 PM, John Criswell wrote: > On 3/14/11 7:52 PM, Judison wrote: >> >> 2) does the LLVM optimizes this: >> >> ; this code (arg -> var) is very similar to what gcc generates >> define void @foo(i32 %arg0) { >> entry: >> %var0 = alloca i32 >> store i32 %arg0, i32* var0 >> ... >> %x = load i32* %var0 >> .... >> ; never more stores to %var0 or pass it to any function >> } >> >> to something like: >> >> define void @foo(i32 arg0) { >> entry: >> ; no alloca >> .... >> %x = %arg0 >> .... >> } >> >> does it??? > > I suspect that mem2reg plus a few additional optimizations will do this for you. mem2reg is sufficient. There's also no limit on the number of loads and stores you can do without disturbing the optimization; just don't do anything too opaque with the address. John. From blackfin.kang at gmail.com Mon Mar 14 20:36:47 2011 From: blackfin.kang at gmail.com (Michael.Kang) Date: Tue, 15 Mar 2011 09:36:47 +0800 Subject: [LLVMdev] How to load a data from the address of unsiged long type In-Reply-To: <4D7DF48B.9070508@free.fr> References: <4D7DF48B.9070508@free.fr> Message-ID: On Mon, Mar 14, 2011 at 6:57 PM, Duncan Sands wrote: > Hi Michael, > >> Now I have an address that present in a unsigned long address like the >> following format: >> Value* addr = CONST(0xc0008000) > > what is the type of addr? ?For that matter, what is CONST - what does it do? > > Ciao, Duncan. 
I forgot to explain the CONST macro. It is defined as follows: ConstantInt::get(getIntegerType(s), v) It is used to turn a C int value into an LLVM integer constant. Thanks MK > >> >> But I do not know how to read the data from the above addr variable. I >> tried the following three kinds of code: >> 1. Code: >> Value* addr = CONST(0xc0008000); >> Value* data = new LoadInst(addr, "", false, bb); >> Error: >> Segmentation fault >> 2. Code (use BitCastInst to convert): >> Value* addr = CONST(0xc0008000); >> a = new BitCastInst(a, PointerType::get(XgetType(Int32Ty), 0), "", bb); >> Value* data = new LoadInst(addr, "", false, bb); >> Error: >> Bitcast requires types of same width >> 3. Code: >> Type const *intptr_type = >> cpu->dyncom_engine->exec_engine->getTargetData()->getIntPtrType(_CTX()); >> Value* ptr = new IntToPtrInst(a, intptr_type, "", bb); >> Value* data = new LoadInst(ptr, "", false, bb); >> Error: >> Segmentation fault >> >> >> Can anyone give me some hints for my case? >> >> Thanks >> MK >> > -- www.skyeye.org From judison at gmail.com Mon Mar 14 20:44:31 2011 From: judison at gmail.com (Judison) Date: Mon, 14 Mar 2011 22:44:31 -0300 Subject: [LLVMdev] Kinda noob questions In-Reply-To: <03C0EFE1-9904-4743-B90F-8456AE60F85F@apple.com> References: <4D7EB95A.4060901@illinois.edu> <03C0EFE1-9904-4743-B90F-8456AE60F85F@apple.com> Message-ID: Thank you John and John :P these optimizations (mem2reg and the "few additional" ones) I have to enable then or something like this?? (I compile the .ll to .o with the sequence llvm-as, llc, as) Yet about dynamic stack allocation, what is (generally) better? to pre alloc everything at start, or let it be?
Imagine this pseudo-code while (x) { int b = 0; ... } using alloca where b is declared, it will grow the stack dynamically, but it will shrink it at } or in llvm's words in the end of a block where b is not accessible anymore? Can I force its deallocation?? Thanks again On Mon, Mar 14, 2011 at 10:23 PM, John McCall wrote: > On Mar 14, 2011, at 5:56 PM, John Criswell wrote: > > On 3/14/11 7:52 PM, Judison wrote: > >> > >> 2) does the LLVM optimizes this: > >> > >> ; this code (arg -> var) is very similar to what gcc generates > >> define void @foo(i32 %arg0) { > >> entry: > >> %var0 = alloca i32 > >> store i32 %arg0, i32* var0 > >> ... > >> %x = load i32* %var0 > >> .... > >> ; never more stores to %var0 or pass it to any function > >> } > >> > >> to something like: > >> > >> define void @foo(i32 arg0) { > >> entry: > >> ; no alloca > >> .... > >> %x = %arg0 > >> .... > >> } > >> > >> does it??? > > > > I suspect that mem2reg plus a few additional optimizations will do this > for you. > > mem2reg is sufficient. There's also no limit on the number of loads > and stores you can do without disturbing the optimization; just > don't do anything too opaque with the address. > > John. -- Judison judison at gmail.com "The ignorant man who seeks to learn is like a wise man; the wise man who speaks without discernment resembles an ignorant man." Imam Ali (as) -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110314/8497cd56/attachment.html From rjmccall at apple.com Mon Mar 14 20:49:12 2011 From: rjmccall at apple.com (John McCall) Date: Mon, 14 Mar 2011 18:49:12 -0700 Subject: [LLVMdev] Kinda noob questions In-Reply-To: References: <4D7EB95A.4060901@illinois.edu> <03C0EFE1-9904-4743-B90F-8456AE60F85F@apple.com> Message-ID: <34BBB62B-7265-4322-BD3B-FCA9AE2E4759@apple.com> On Mar 14, 2011, at 6:44 PM, Judison wrote: > Thank you John and John :P > > these optimizations (mem2reg and the "few additional" ones) I have to enable then or something like this?? (I compile the .ll to .o with the sequence llvm-as, llc, as) Look into 'opt'. > Yet about dynamic stack allocation, what is (generally) better? to pre alloc everything at start, or let it be? > > Imagine this pseudo-code > > while (x) { > int b = 0; > ... > } > > using alloca where b is declared, it will grow the stack dynamically, but it will shrink it at } or in llvm's words in the end of a block where b is not accessible anymore? Can I force its deallocation?? Unless you're actually reliant on dynamic allocation (e.g. you don't know the size statically, or you're intending to repeatedly allocate memory), you should always allocate in the entry block. LLVM is reasonably good at re-using stack slots when it can. John. From sarevokcc at gmail.com Tue Mar 15 04:34:22 2011 From: sarevokcc at gmail.com (Dongrui She) Date: Tue, 15 Mar 2011 10:34:22 +0100 Subject: [LLVMdev] How to choose targets to build in CMake? Message-ID: Hi all, I understand from the document that I should be able to configure llvm with only specific targets by passing -DLLVM_TARGETS_TO_BUILD=some-backend to cmake. However, I get the following errors whenever I don't include X86 in the target list: CMake Error at cmake/modules/LLVMConfig.cmake:127 (message): Library `X86' not found in list of llvm libraries.
Call Stack (most recent call first): cmake/modules/LLVMConfig.cmake:47 (explicit_map_components_to_libraries) cmake/modules/LLVMConfig.cmake:40 (explicit_llvm_config) cmake/modules/AddLLVM.cmake:82 (llvm_config) cmake/modules/AddLLVM.cmake:114 (add_llvm_executable) examples/Kaleidoscope/Chapter4/CMakeLists.txt:3 (add_llvm_example) -- Configuring incomplete, errors occurred! When I try the autotools, I can get what I want by running configure with --enable-targets=some-backend. Any ideas? I use cmake 2.8.0 on Fedora 10 i686 and I checked out llvm from the svn repository. -- Regards, Dongrui From geek4civic at gmail.com Tue Mar 15 05:00:20 2011 From: geek4civic at gmail.com (NAKAMURA Takumi) Date: Tue, 15 Mar 2011 19:00:20 +0900 Subject: [LLVMdev] How to choose targets to build in CMake? In-Reply-To: References: Message-ID: On Tue, Mar 15, 2011 at 6:34 PM, Dongrui She wrote: > However, I get the following errors whenever I don't include X86 in the > target list: On top of trunk, it seems cmake requires at least the "native" target (in your case, X86). (I reconfirmed with -DLLVM_TARGETS_TO_BUILD=ARM on x86.) As a workaround, you always have to include "X86".
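A minimal sketch of that workaround, written as a dry run that prints the configure command instead of running it. It assumes the usual CMake convention that LLVM_TARGETS_TO_BUILD takes a semicolon-separated list; the variable names and the ../llvm source path are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: the configure command is printed, not executed.
HOST_TARGET=X86     # backend matching the build machine
WANTED=ARM          # backend actually of interest
LLVM_SRC=../llvm    # hypothetical path to the LLVM source tree

# The list is semicolon-separated, so it must stay quoted when the
# command is really run (a bare ';' would end the shell command).
TARGETS="$HOST_TARGET;$WANTED"
echo "cmake -DLLVM_TARGETS_TO_BUILD=\"$TARGETS\" $LLVM_SRC"
```

The host backend comes first only for readability; the point is simply that it is present in the list alongside the backend being worked on.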
...Takumi From xerxes at zafena.se Tue Mar 15 05:50:03 2011 From: xerxes at zafena.se (Xerxes Rånby) Date: Tue, 15 Mar 2011 11:50:03 +0100 Subject: [LLVMdev] Building VMKit - ARM AtomicCmpSwap In-Reply-To: References: Message-ID: <4D7F445B.7080305@zafena.se> On 2011-03-10 22:32, Andrew Wiley wrote: > I tried to build VMKit on an ARM device today (a Sheevaplug - armv5te) > (native, not cross compiled), and got this error: > > llvm[3]: Building LLVM assembly with > /home/debio/build/vmkit-build/vmkit/lib/Mvm/Runtime/LLVMAssembly.ll > /home/debio/build/vmkit-build/vmkit/lib/Mvm/Runtime/LLVMAssembly64.ll > ExpandIntegerResult #0: 0x16fbf88: i64,ch = AtomicCmpSwap 0x16e8d84, > 0x16fbf00, 0x16fc3c8, 0x16fc1a8 [ORD=4] [ID=0] > > Do not know how to expand the result of this operator! > UNREACHABLE executed at LegalizeIntegerTypes.cpp:982! > Stack dump: > 0. Program arguments: > /home/debio/build/vmkit-build/vmkit/../llvm//Debug+Asserts/bin/llc -o > LLVMAssembly.s > 1. Running pass 'Function Pass Manager' on module ''. > 2. Running pass 'ARM Instruction Selection' on function > '@llvm_atomic_cmp_swap_i64' > Hi, this JIT bug is a long-standing unimplemented issue on ARM. http://llvm.org/bugs/show_bug.cgi?id=3877 > > I was following the instructions from > http://vmkit.llvm.org/get_started.html although I didn't build my own > classpath, as I have a packaged version installed and I was waiting to > see whether configure would pick it up automatically. From what I can > tell, that doesn't seem relevant to this error. > Is this something that just isn't supported on my platform, is the > trunk build currently broken somehow, or am I just doing it wrong? > > Thanks, > Andrew Wiley To fix this you have to implement the missing atomic intrinsics on ARM. Gary Benson implemented the missing intrinsics on PPC in 2008, and you might use this old conversation thread as a guide to implementing them on ARM.
http://markmail.org/message/73owc5nrvsbmrhes#query:+page:1+mid:73owc5nrvsbmrhes+state:results

Before ARMv6 there is only the SWP instruction.
On ARMv6 and later you can use the LDREX and STREX instructions.
http://www.doulos.com/knowhow/arm/Hints_and_Tips/Implementing_Semaphores/

Cheers
Xerxes

From xerxes at zafena.se Tue Mar 15 06:45:14 2011
From: xerxes at zafena.se (Xerxes Rånby)
Date: Tue, 15 Mar 2011 12:45:14 +0100
Subject: [LLVMdev] LLVM 2.9 RC1 Pre-release Tarballs
In-Reply-To:
References: <3EF399DF-E592-470B-98C4-5051A6828EC4@apple.com> <4D7E38B6.4020701@zafena.se>
Message-ID: <4D7F514A.70206@zafena.se>

On 2011-03-14 18:14, Anton Korobeynikov wrote:
> Hello Xerxes,
>
>> llvm 2.9rc1 test on Dualcore ARM running Ubuntu Natty
> What is the gcc used for the compilation? Can you try to do the -O0
> build and see whether this changed the stuff?
>

xranby at panda:/media/dh0/llvm-2.9-build-O0$ gcc --version
gcc (Ubuntu/Linaro 4.5.2-5ubuntu1) 4.5.2
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

-O0 did change quite a lot, see below:

xranby at panda:/media/dh0/llvm-2.9-build-O0$ time make check
llvm[0]: Running test suite
make[1]: Entering directory `/media/dh0/llvm-2.9-build-O0/test'
Making a new site.exp file...
Making LLVM 'lit.site.cfg' file...
Making LLVM unittest 'lit.site.cfg' file...
( ulimit -t 600 ; ulimit -d 512000 ; ulimit -m 512000 ; ulimit -v 1024000 ; \
/media/dh0/llvm-2.9rc1/utils/lit/lit.py -s -v .
) FAIL: LLVM :: CodeGen/Thumb/select.ll (1595 of 5848) ******************** TEST 'LLVM :: CodeGen/Thumb/select.ll' FAILED ******************** Script: -- /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/llc < /media/dh0/llvm-2.9rc1/test/CodeGen/Thumb/select.ll -march=thumb | grep beq | /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/count 1 /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/llc < /media/dh0/llvm-2.9rc1/test/CodeGen/Thumb/select.ll -march=thumb | grep bgt | /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/count 1 /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/llc < /media/dh0/llvm-2.9rc1/test/CodeGen/Thumb/select.ll -march=thumb | grep blt | /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/count 3 /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/llc < /media/dh0/llvm-2.9rc1/test/CodeGen/Thumb/select.ll -march=thumb | grep ble | /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/count 1 /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/llc < /media/dh0/llvm-2.9rc1/test/CodeGen/Thumb/select.ll -march=thumb | grep bls | /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/count 1 /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/llc < /media/dh0/llvm-2.9rc1/test/CodeGen/Thumb/select.ll -march=thumb | grep bhi | /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/count 1 /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/llc < /media/dh0/llvm-2.9rc1/test/CodeGen/Thumb/select.ll -mtriple=thumb-apple-darwin | grep __ltdf2 -- Exit Code: 1 Command Output (stderr): -- Expected 3 lines, got 1. 
-- ******************** FAIL: LLVM :: CodeGen/X86/fold-pcmpeqd-0.ll (2396 of 5848) ******************** TEST 'LLVM :: CodeGen/X86/fold-pcmpeqd-0.ll' FAILED ******************** Script: -- /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/llc < /media/dh0/llvm-2.9rc1/test/CodeGen/X86/fold-pcmpeqd-0.ll -mtriple=i386-apple-darwin | grep pcmpeqd | /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/count 1 /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/llc < /media/dh0/llvm-2.9rc1/test/CodeGen/X86/fold-pcmpeqd-0.ll -mtriple=x86_64-apple-darwin | grep pcmpeqd | /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/count 1 -- Exit Code: 1 Command Output (stderr): -- llc: /media/dh0/llvm-2.9rc1/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:3149: void llvm::SelectionDAGBuilder::visitTargetIntrinsic(const llvm::CallInst&, unsigned int): Assertion `TLI.isTypeLegal(Op.getValueType()) && "Intrinsic uses a non-legal type?"' failed. Stack dump: 0. Program arguments: /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/llc -mtriple=i386-apple-darwin 1. Running pass 'Function Pass Manager' on module ''. 2. Running pass 'X86 DAG->DAG Instruction Selection' on function '@program_1' Expected 1 lines, got 0. 
-- ******************** Testing Time: 1090.71s ******************** Failing Tests (2): LLVM :: CodeGen/Thumb/select.ll LLVM :: CodeGen/X86/fold-pcmpeqd-0.ll Expected Passes : 5253 Expected Failures : 50 Unsupported Tests : 543 Unexpected Failures: 2 make[1]: *** [check-local-lit] Error 1 make[1]: Leaving directory `/media/dh0/llvm-2.9-build-O0/test' make: *** [check] Error 2 real 18m18.916s user 21m6.875s sys 7m49.664s From anton at korobeynikov.info Tue Mar 15 07:12:16 2011 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Tue, 15 Mar 2011 15:12:16 +0300 Subject: [LLVMdev] LLVM 2.9 RC1 Pre-release Tarballs In-Reply-To: <4D7F514A.70206@zafena.se> References: <3EF399DF-E592-470B-98C4-5051A6828EC4@apple.com> <4D7E38B6.4020701@zafena.se> <4D7F514A.70206@zafena.se> Message-ID: Hello Xerxes, > xranby at panda:/media/dh0/llvm-2.9-build-O0$ gcc --version > gcc (Ubuntu/Linaro 4.5.2-5ubuntu1) 4.5.2 Ok. Looks like we'll need to mark this gcc as "known bad" for ARM. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From baldrick at free.fr Tue Mar 15 07:39:05 2011 From: baldrick at free.fr (Duncan Sands) Date: Tue, 15 Mar 2011 13:39:05 +0100 Subject: [LLVMdev] Warning in LLVM In-Reply-To: References: Message-ID: <4D7F5DE9.6080803@free.fr> Hi Renato, > When compiling LLVM on my Intel(R) Core(TM)2 Duo CPU P7450 running > Ubuntu (gcc 4.4.5), I get this warning: > > /home/rengolin/workspace/llvm/rw/build/Release+Asserts/lib/libLLVMARMAsmParser.a(ARMAsmParser.o): > In function `(anonymous > namespace)::ARMAsmParser::ParseRegisterList(llvm::SmallVectorImpl&)': > ARMAsmParser.cpp:(.text+0x4a05): warning: memset used with constant > zero length parameter; this could be due to transposed parameters > > Couldn't see anything special at 0x4a05, even because there was no > such address (it's not even aligned?). Also, nothing got my attention > on the source code, any ideas? 
A bunch of warnings of this kind were fixed recently by tweaking DenseMap.
Are you at top-of-tree?

Ciao, Duncan.

From rengolin at systemcall.org Tue Mar 15 08:10:43 2011
From: rengolin at systemcall.org (Renato Golin)
Date: Tue, 15 Mar 2011 13:10:43 +0000
Subject: [LLVMdev] Warning in LLVM
In-Reply-To: <4D7F5DE9.6080803@free.fr>
References: <4D7F5DE9.6080803@free.fr>
Message-ID:

On 15 March 2011 12:39, Duncan Sands wrote:
> a bunch of warnings of this kind were fixed recently by tweaking DenseMap.
> Are you at top-of-tree?

I was when I sent this email; I was doing the final update & check-all for my commit yesterday.

cheers,
--renato

http://systemcall.org/

Reclaim your digital rights, eliminate DRM, learn more at
http://www.defectivebydesign.org/what_is_drm

From brian-mokrzycki at uiowa.edu Tue Mar 15 09:02:46 2011
From: brian-mokrzycki at uiowa.edu (Brian Mokrzycki)
Date: Tue, 15 Mar 2011 10:02:46 -0400
Subject: [LLVMdev] Noob Backend Orientation
Message-ID: <50C24CD0-2A21-4EE6-BC61-F17C7EC22AED@uiowa.edu>

Hi all,

I'm new to the project and will probably ask quite a few obvious questions, so please bear with me.

I'm trying to get my bearings straight as to the general path forward for producing a backend for a custom DSP processor. Currently, I have a port of binutils and a basic simulator (based on SID). I want to make it easier for software engineers to produce code for this architecture so I'm moving on to the next phase by attempting a C compiler, so here I am.

First, I would rather not maintain two different descriptions of the assembly/ISA specification, one in binutils/SID (CGEN) and another in LLVM. I have found some posts about the LLVM MC project, that it can essentially produce an inline or standalone assembler, but is the MC project ready for prime-time? Should I invest my time going this route? It's hard to get a sense of what direction the project is moving in.
Second, related to the previous, if I were to only maintain the assembly/ISA description in LLVM, can LLVM produce a simulator to execute target specific machine code binaries? I realize that you can, in theory, translate the ll assembly to the host machines processor architecture as a pseudo-simulation, but this doesn't give you hardware specific information about your real target machine, such as what is stored in register r6 at breakpoint 10. Lastly, the DSP processor is a VLIW/EPIC machine. I assume that this fact is going to produce a few hiccups when creating the LLVM backend, just like it did when I was doing the binutils port. Any words of wisdom would be very welcome on this front. Thanks, -Brian From m.vijay at nus.edu.sg Tue Mar 15 09:48:37 2011 From: m.vijay at nus.edu.sg (Vijayaraghavan Murali) Date: Tue, 15 Mar 2011 22:48:37 +0800 Subject: [LLVMdev] LLVM Register allocation Message-ID: <4D7F7C45.4030509@nus.edu.sg> Hello, I'm relatively a newcomer to this forum and LLVM. I wish to do the following: 1) play with LLVM's register allocation without any other optimizations performed, such as inlining. This is because I'm trying to observe the effects of our path-sensitive tool on register allocation but other optimizations could influence the results. In other words, I would like to perform register allocation with -O0, if that's possible. 2) view the results of register allocation. That is, the mapping of variables to physical registers. I'm comfortable with reading dwarf information. For eg, using gcc I would do: gcc -c -g hello.c ; dwarfdump hello.o Kindly guide me in performing the above steps. If they are not possible, is there any workaround? Thank you! ----- Vijayaraghavan Murali http://www.comp.nus.edu.sg/~mvijayar From ofv at wanadoo.es Tue Mar 15 10:02:15 2011 From: ofv at wanadoo.es (=?utf-8?Q?=C3=93scar_Fuentes?=) Date: Tue, 15 Mar 2011 16:02:15 +0100 Subject: [LLVMdev] How to choose targets to build in CMake? 
In-Reply-To: (NAKAMURA Takumi's message of "Tue, 15 Mar 2011 19:00:20 +0900") References: Message-ID: <87oc5cpgug.fsf@wanadoo.es> NAKAMURA Takumi writes: > On Tue, Mar 15, 2011 at 6:34 PM, Dongrui She wrote: >> However, I get the following errors whenever I don't include X86 in the >> target list: > > on top of trunk, it seems cmake requires at least "native" (for yours, > X86) target. > (I reconfirmed with -DLLVM_TARGETS_TO_BUILD=ARM on x86) > For workaround, you have to include "X86" always. Committed a fix on r127679. I configured with -DLLVM_TARGETS_TO_BUILD=Sparc and LLVM and Clang builds fine but check-all shows test failures: Expected Passes : 5077 Expected Failures : 30 Unsupported Tests : 3390 Unexpected Failures: 291 "JIT" is mentioned a lot on the list of failed tests. Of course Sparc has not JIT, but then I expect that those tests should be disabled. From grosbach at apple.com Tue Mar 15 11:33:25 2011 From: grosbach at apple.com (Jim Grosbach) Date: Tue, 15 Mar 2011 09:33:25 -0700 Subject: [LLVMdev] Noob Backend Orientation In-Reply-To: <50C24CD0-2A21-4EE6-BC61-F17C7EC22AED@uiowa.edu> References: <50C24CD0-2A21-4EE6-BC61-F17C7EC22AED@uiowa.edu> Message-ID: <01088AD7-50C5-4AED-BB22-59C043C946EA@apple.com> On Mar 15, 2011, at 7:02 AM, Brian Mokrzycki wrote: > Hi all, > > I'm new to the project and will probably ask quite a few obvious questions, so please, bare with me. > > I'm trying to get my bearings straight as to the general path forward for producing a backend for a custom DSP processor. Currently, I have a port of binutils and a basic simulator (based on SID). I want to make it easier for software engineers to produce code for this architecture so I'm moving onto the next phase by attempting a C compiler, so here I am. > > First, I would rather not maintain two different descriptions of the assembly/ISA specification, one in binutils/SID (CGEN) and another in LLVM. 
I have found some posts about the LLVM MC project, that it can essentially produce an inline or standalone assembler, but is the MC project ready for prime-time? Should I invest my time going this route? It's hard to get a sense of what direction the project is moving in.

Hi Brian,

MC is used as the integrated assembler for the X86 backend (enabled by default in clang) and ARM (disabled by default while asm parsing is finished up for inline assembly). So yes, it's ready for prime-time and well worth investing in, but it's also still relatively young, so don't be too surprised if there are a few growing pains along the way.

I've done a lot of work on MC-based things, as have others on this list, so if you have questions about how to organize a target description for MC and/or run into troublesome constructs, post questions here and folks will be glad to help.

> Second, related to the previous, if I were to only maintain the assembly/ISA description in LLVM, can LLVM produce a simulator to execute target specific machine code binaries? I realize that you can, in theory, translate the ll assembly to the host machines processor architecture as a pseudo-simulation, but this doesn't give you hardware specific information about your real target machine, such as what is stored in register r6 at breakpoint 10.

Not currently, no. That's outside the scope of what LLVM's target description is designed to handle.

> Lastly, the DSP processor is a VLIW/EPIC machine. I assume that this fact is going to produce a few hiccups when creating the LLVM backend, just like it did when I was doing the binutils port. Any words of wisdom would be very welcome on this front.

This part I'm less familiar with, but I believe some others here have done work on similar things.
-Jim From sjosef at cs.utah.edu Tue Mar 15 13:27:39 2011 From: sjosef at cs.utah.edu (Josef Spjut) Date: Tue, 15 Mar 2011 12:27:39 -0600 Subject: [LLVMdev] mblaze backend: unreachable executed Message-ID: <70C22F1A-AE38-4C9D-9AA9-242A90842113@cs.utah.edu> Hello, I am working on a backend for a custom ISA that is somewhat similar to the MicroBlaze ISA so I've decided to use that as a starting point. I am trying to compile a custom ray tracer (lots of floating point) and the llvm-g++ frontend generates an fneg instruction which is not supported by the MBlaze backend in the 2.8 release. I added code to emit an fneg assembly instruction and now I'm getting "UNREACHABLE executed!" when trying to compile the code. The only change I made was to add the following two lines next to the square root lines in the code (FPU and sqrt are enabled in llc): def FSQRT : ArithF2<0x16, 0x300, "fsqrt ", IIAlu>; def FNEG : ArithF2<0x16, 0x300, "fneg ", IIAlu>; // added for fneg and def : Pat<(fsqrt FGR32:$V), (FSQRT FGR32:$V)>; def : Pat<(fneg FGR32:$V), (FNEG FGR32:$V)>; // added for fneg Does anyone know what common causes of "UNREACHABLE executed!" messages are and what this message in particular means? The full error message is the following: UNREACHABLE executed! 
0 llc 0x0000000100936ae2 PrintStackTrace(void*) + 34 1 llc 0x0000000100937603 SignalHandler(int) + 531 2 libSystem.B.dylib 0x00007fff82adf67a _sigtramp + 26 3 libSystem.B.dylib 000000000000000000 _sigtramp + 2102528416 4 llc 0x0000000100936aa6 abort + 22 5 llc 0x000000010091551d llvm::llvm_unreachable_internal(char const*, char const*, unsigned int) + 381 6 llc 0x000000010058100e llvm::CCState::AnalyzeCallResult(llvm::SmallVectorImpl const&, bool (*)(unsigned int, llvm::EVT, llvm::EVT, llvm::CCValAssign::LocInfo, llvm::ISD::ArgFlagsTy, llvm::CCState&)) + 158 7 llc 0x000000010007319c llvm::TraxTargetLowering::LowerCallResult(llvm::SDValue, llvm::SDValue, llvm::CallingConv::ID, bool, llvm::SmallVectorImpl const&, llvm::DebugLoc, llvm::SelectionDAG&, llvm::SmallVectorImpl&) const + 172 8 llc 0x0000000100075353 llvm::TraxTargetLowering::LowerCall(llvm::SDValue, llvm::SDValue, llvm::CallingConv::ID, bool, bool&, llvm::SmallVectorImpl const&, llvm::SmallVectorImpl const&, llvm::SmallVectorImpl const&, llvm::DebugLoc, llvm::SelectionDAG&, llvm::SmallVectorImpl&) const + 4179 9 llc 0x00000001004c3dcd llvm::TargetLowering::LowerCallTo(llvm::SDValue, llvm::Type const*, bool, bool, bool, bool, unsigned int, llvm::CallingConv::ID, bool, bool, llvm::SDValue, std::vector >&, llvm::SelectionDAG&, llvm::DebugLoc) const + 4269 10 llc 0x00000001004d1c0b llvm::SelectionDAGBuilder::LowerCallTo(llvm::ImmutableCallSite, llvm::SDValue, bool, llvm::MachineBasicBlock*) + 2363 11 llc 0x00000001004e6dc9 llvm::SelectionDAGBuilder::visitCall(llvm::CallInst const&) + 185 12 llc 0x00000001004c8828 llvm::SelectionDAGBuilder::visit(unsigned int, llvm::User const&) + 600 13 llc 0x00000001004fdef3 llvm::SelectionDAGBuilder::visit(llvm::Instruction const&) + 51 14 llc 0x000000010050a558 llvm::SelectionDAGISel::SelectBasicBlock(llvm::ilist_iterator, llvm::ilist_iterator, bool&) + 56 15 llc 0x000000010050abe2 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) + 1506 16 llc 
0x000000010050b4a8 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) + 392 17 llc 0x00000001005d4cbd llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 125 18 llc 0x00000001008809b0 llvm::FPPassManager::runOnFunction(llvm::Function&) + 656 19 llc 0x0000000100880a6b llvm::FPPassManager::runOnModule(llvm::Module&) + 139 20 llc 0x0000000100882122 llvm::MPPassManager::runOnModule(llvm::Module&) + 562 21 llc 0x0000000100882423 llvm::PassManagerImpl::run(llvm::Module&) + 243 22 llc 0x00000001008824bd llvm::PassManager::run(llvm::Module&) + 13 23 llc 0x0000000100023c8a main + 3754 24 llc 0x0000000100022848 start + 52 Stack dump: 0. Program arguments: ./Release/bin/llc rt-rot.bc -o rt-rot.s -march=trax 1. Running pass 'Function Pass Manager' on module 'rt-rot.bc'. 2. Running pass 'Trax DAG->DAG Pattern Instruction Selection' on function '@main' Note that Trax is the name of my custom back end and the only changes other than the ones above are changing the names to Trax from MBlaze. Thanks for any help. Josef Spjut From Matthieu.Moy at grenoble-inp.fr Tue Mar 15 13:39:44 2011 From: Matthieu.Moy at grenoble-inp.fr (Matthieu Moy) Date: Tue, 15 Mar 2011 19:39:44 +0100 Subject: [LLVMdev] [PATCH] Fix weak/linkonce linkage in execution engine Message-ID: Hi, I've had problem with a program using LLVM that tried to dynamic_cast objects created in the JIT execution engine, from the native part of the program (for the curious, the program is PinaVM http://gitorious.org/pinavm/pages/Home). I've narrowed down the issue to the linkage of weak_odr and linkonce_odr symbols, used for the vtables, and that _must_ be unique for dynamic_cast to work. Attached are two patches: the first adds a (failing) testcase, the second fixes the issue. I'm not familiar with patch submission on this list, let me know if there's a better way to submit patches. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 0001-add-testcase-for-weak_odr-and-linkonce_odr-in-JIT.patch Type: text/x-diff Size: 2683 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110315/f74220f3/attachment.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: 0002-JIT-fix-linkage-of-weak-and-linkonce-symbols.patch Type: text/x-diff Size: 2880 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110315/f74220f3/attachment-0001.bin -------------- next part -------------- -- Matthieu Moy http://www-verimag.imag.fr/~moy/ From anton at korobeynikov.info Tue Mar 15 14:27:40 2011 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Tue, 15 Mar 2011 22:27:40 +0300 Subject: [LLVMdev] mblaze backend: unreachable executed In-Reply-To: <70C22F1A-AE38-4C9D-9AA9-242A90842113@cs.utah.edu> References: <70C22F1A-AE38-4C9D-9AA9-242A90842113@cs.utah.edu> Message-ID: Hello > Does anyone know what common causes of "UNREACHABLE executed!" messages are and what this message in particular means? The full error message is the following: Looks like you have not implemented the calling convention bits for your backend. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From andreas.faerber at web.de Tue Mar 15 14:50:39 2011 From: andreas.faerber at web.de (=?ISO-8859-1?Q?Andreas_F=E4rber?=) Date: Tue, 15 Mar 2011 20:50:39 +0100 Subject: [LLVMdev] mblaze backend: unreachable executed In-Reply-To: <70C22F1A-AE38-4C9D-9AA9-242A90842113@cs.utah.edu> References: <70C22F1A-AE38-4C9D-9AA9-242A90842113@cs.utah.edu> Message-ID: <731F5342-63BD-42EF-9E22-9C5695ED373A@web.de> Hello, Am 15.03.2011 um 19:27 schrieb Josef Spjut: > Does anyone know what common causes of "UNREACHABLE executed!" > messages are and what this message in particular means? The full > error message is the following: > > UNREACHABLE executed! 
> 0 llc 0x0000000100936ae2 PrintStackTrace(void*) + 34 > 1 llc 0x0000000100937603 SignalHandler(int) + 531 > 2 libSystem.B.dylib 0x00007fff82adf67a _sigtramp + 26 > 3 libSystem.B.dylib 000000000000000000 _sigtramp + 2102528416 > 4 llc 0x0000000100936aa6 abort + 22 > 5 llc 0x000000010091551d > llvm::llvm_unreachable_internal(char const*, char const*, unsigned > int) + 381 > 6 llc 0x000000010058100e > llvm > ::CCState > ::AnalyzeCallResult(llvm::SmallVectorImpl > const&, bool (*)(unsigned int, llvm::EVT, llvm::EVT, > llvm::CCValAssign::LocInfo, llvm::ISD::ArgFlagsTy, llvm::CCState&)) > + 158 > 7 llc 0x000000010007319c > llvm::TraxTargetLowering::LowerCallResult(llvm::SDValue, > llvm::SDValue, llvm::CallingConv::ID, bool, > llvm::SmallVectorImpl const&, llvm::DebugLoc, > llvm::SelectionDAG&, llvm::SmallVectorImpl&) const + > 172 > 8 llc 0x0000000100075353 > llvm::TraxTargetLowering::LowerCall(llvm::SDValue, llvm::SDValue, > llvm::CallingConv::ID, bool, bool&, > llvm::SmallVectorImpl const&, > llvm::SmallVectorImpl const&, > llvm::SmallVectorImpl const&, llvm::DebugLoc, > llvm::SelectionDAG&, llvm::SmallVectorImpl&) const + > 4179 > 9 llc 0x00000001004c3dcd > llvm::TargetLowering::LowerCallTo(llvm::SDValue, llvm::Type const*, > bool, bool, bool, bool, unsigned int, llvm::CallingConv::ID, bool, > bool, llvm::SDValue, std::vector std::allocator >&, > llvm::SelectionDAG&, llvm::DebugLoc) const + 4269 > 10 llc 0x00000001004d1c0b > llvm::SelectionDAGBuilder::LowerCallTo(llvm::ImmutableCallSite, > llvm::SDValue, bool, llvm::MachineBasicBlock*) + 2363 > 11 llc 0x00000001004e6dc9 > llvm::SelectionDAGBuilder::visitCall(llvm::CallInst const&) + 185 > 12 llc 0x00000001004c8828 > llvm::SelectionDAGBuilder::visit(unsigned int, llvm::User const&) + > 600 > 13 llc 0x00000001004fdef3 > llvm::SelectionDAGBuilder::visit(llvm::Instruction const&) + 51 > 14 llc 0x000000010050a558 > llvm > ::SelectionDAGISel > ::SelectBasicBlock(llvm::ilist_iterator, > llvm::ilist_iterator, bool&) + 56 > 
15 llc 0x000000010050abe2 > llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) > + 1506 > 16 llc 0x000000010050b4a8 > llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) > + 392 > 17 llc 0x00000001005d4cbd > llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 125 > 18 llc 0x00000001008809b0 > llvm::FPPassManager::runOnFunction(llvm::Function&) + 656 > 19 llc 0x0000000100880a6b > llvm::FPPassManager::runOnModule(llvm::Module&) + 139 > 20 llc 0x0000000100882122 > llvm::MPPassManager::runOnModule(llvm::Module&) + 562 > 21 llc 0x0000000100882423 > llvm::PassManagerImpl::run(llvm::Module&) + 243 > 22 llc 0x00000001008824bd > llvm::PassManager::run(llvm::Module&) + 13 > 23 llc 0x0000000100023c8a main + 3754 > 24 llc 0x0000000100022848 start + 52 > Stack dump: > 0. Program arguments: ./Release/bin/llc rt-rot.bc -o rt-rot.s - > march=trax > 1. Running pass 'Function Pass Manager' on module 'rt-rot.bc'. > 2. Running pass 'Trax DAG->DAG Pattern Instruction Selection' on > function '@main' Running the Debug+Asserts version instead of the Release should give you the file and line of the "unreachable" code. Andreas From sjosef at cs.utah.edu Tue Mar 15 18:15:14 2011 From: sjosef at cs.utah.edu (Josef Spjut) Date: Tue, 15 Mar 2011 17:15:14 -0600 Subject: [LLVMdev] mblaze backend: unreachable executed In-Reply-To: <731F5342-63BD-42EF-9E22-9C5695ED373A@web.de> References: <70C22F1A-AE38-4C9D-9AA9-242A90842113@cs.utah.edu> <731F5342-63BD-42EF-9E22-9C5695ED373A@web.de> Message-ID: On Mar 15, 2011, at 1:50 PM, Andreas F?rber wrote: > Hello, > > Running the Debug+Asserts version instead of the Release should give you the file and line of the "unreachable" code. > > Andreas Thanks for the suggestion. I recompiled with Debug+Asserts and it shows that the unreachable is in CallingConvLower.cpp:162. Here is the error message: Call result #2 has unhandled type f32UNREACHABLE executed at CallingConvLower.cpp:162! 
I don't think my backend is modified enough from the MBlaze backend that is in the release to be causing this error. I am, however, looking through the various files of the backend to try to find where the calling convention might be causing problems with f32 data types.

Josef

From jfonseca at vmware.com Tue Mar 15 18:15:50 2011
From: jfonseca at vmware.com (jfonseca at vmware.com)
Date: Tue, 15 Mar 2011 23:15:50 +0000
Subject: [LLVMdev] Prevent unbounded memory consumption of long lived JIT processes
Message-ID: <1300230955-24833-1-git-send-email-jfonseca@vmware.com>

This series of patches addresses several issues that cause memory usage to grow indefinitely in a long-lived process.

These are not conventional leaks -- the memory would be freed when the LLVM context and/or JIT engine is destroyed -- but for as long as they are not destroyed, memory usage is effectively unbounded.

The issues were found using valgrind with the '--show-reachable=yes' option:

1. Compile a bunch of functions with the JIT once; delete the result; and exit without destroying the LLVM context or the JIT engine. (valgrind will report a bunch of unfreed LLVM objects.)
2. Do as in 1, but compile and delete the functions twice.
3. Ditto, three times.
4. Etc.

Flawless code should not cause memory usage to increase when compiling the same code again -- i.e. valgrind's log for every run should show the very same unfreed objects, regardless of the number of times a given piece of code was compiled -- but that was not the case.

The attached patches cover most of the causes of new objects being allocated.

It should be possible to automate such a test, but I didn't get that far.
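
As a rough illustration of the kind of growth this series targets, here is a minimal sketch, assuming a toy linear-probing set rather than the real DenseMap, StringMap or SmallPtrSet code. Erased slots are marked as tombstones so that probe chains stay intact; when too few truly empty buckets remain, the table has to be rebuilt. The names `ToySet`, `Policy`, `rebuild` and `bucketsAfterChurn` are hypothetical and exist only for this example. The sketch contrasts rebuilding at double the size with rebuilding at the current size when tombstones, not live entries, are what fill the table.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sentinel bucket values; real keys must be non-negative.
constexpr int kEmpty = -1;
constexpr int kTombstone = -2;

// Toy open-addressing set with linear probing. Illustrative only: this is
// NOT llvm::DenseMap, and every name in this sketch is hypothetical.
class ToySet {
public:
  enum class Policy { AlwaysDouble, RehashInPlace };

  explicit ToySet(Policy P) : Pol(P), Buckets(8, kEmpty) {}

  std::size_t bucketCount() const { return Buckets.size(); }

  bool contains(int Key) const {
    std::size_t I = home(Key);
    // Terminates because insert() always leaves at least one empty bucket.
    while (Buckets[I] != kEmpty) {
      if (Buckets[I] == Key) return true;
      I = (I + 1) % Buckets.size();
    }
    return false;
  }

  bool insert(int Key) {
    assert(Key >= 0 && "negative values are reserved as sentinels");
    if (contains(Key)) return false;
    ++NumEntries;  // counts the key that is about to be stored
    std::size_t N = Buckets.size();
    if (NumEntries * 4 >= N * 3) {
      rebuild(N * 2);  // genuinely more than 3/4 full: grow
    } else if (N - (NumEntries + NumTombstones) < N / 8) {
      // Fewer than 1/8 of the buckets are empty, mostly due to tombstones.
      rebuild(Pol == Policy::AlwaysDouble ? N * 2 : N);
    }
    std::size_t I = home(Key);
    while (Buckets[I] != kEmpty && Buckets[I] != kTombstone)
      I = (I + 1) % Buckets.size();
    if (Buckets[I] == kTombstone) --NumTombstones;
    Buckets[I] = Key;
    return true;
  }

  bool erase(int Key) {
    std::size_t I = home(Key);
    while (Buckets[I] != kEmpty) {
      if (Buckets[I] == Key) {
        Buckets[I] = kTombstone;  // keep probe chains intact
        --NumEntries;
        ++NumTombstones;
        return true;
      }
      I = (I + 1) % Buckets.size();
    }
    return false;
  }

private:
  std::size_t home(int Key) const {
    return static_cast<std::size_t>(Key) % Buckets.size();
  }

  // Copy the live keys into a fresh table of NewSize buckets, which drops
  // all tombstones.
  void rebuild(std::size_t NewSize) {
    std::vector<int> Old(NewSize, kEmpty);
    Old.swap(Buckets);  // Buckets is now the fresh table
    NumTombstones = 0;
    for (int V : Old) {
      if (V < 0) continue;  // skip empty buckets and tombstones
      std::size_t I = home(V);
      while (Buckets[I] != kEmpty) I = (I + 1) % Buckets.size();
      Buckets[I] = V;
    }
  }

  Policy Pol;
  std::vector<int> Buckets;
  std::size_t NumEntries = 0;
  std::size_t NumTombstones = 0;
};

// Insert and immediately erase Rounds distinct keys, so at most one key is
// ever live, then report how many buckets the table ended up with.
std::size_t bucketsAfterChurn(ToySet::Policy P, int Rounds) {
  ToySet S(P);
  for (int K = 0; K < Rounds; ++K) {
    S.insert(K);
    S.erase(K);
  }
  return S.bucketCount();
}
```

Calling `bucketsAfterChurn` with each policy for the same number of rounds shows the difference: under this sketch the in-place policy keeps the initial bucket count because no more than one key is ever live, while the doubling policy keeps growing even though the set is almost empty.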

From jfonseca at vmware.com Tue Mar 15 18:15:51 2011
From: jfonseca at vmware.com (jfonseca at vmware.com)
Date: Tue, 15 Mar 2011 23:15:51 +0000
Subject: [LLVMdev] [PATCH 1/5] Prevent infinite growth of the DenseMap.
In-Reply-To: <1300230955-24833-1-git-send-email-jfonseca@vmware.com>
References: <1300230955-24833-1-git-send-email-jfonseca@vmware.com>
Message-ID: <1300230955-24833-2-git-send-email-jfonseca@vmware.com>

From: José Fonseca

When the hash function uses object pointers, all free entries eventually
become tombstones, as they are used at least once, regardless of the size.

DenseMap cannot function with zero empty keys, so it doubles in size to get
rid of the tombstones.

However, DenseMap never shrinks automatically unless it is cleared, so the
net result is that certain tables grow infinitely.

The solution is to make a fresh copy of the table without tombstones instead
of doubling the size, by simply calling grow with the current size.
---
 include/llvm/ADT/DenseMap.h |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/llvm/ADT/DenseMap.h b/include/llvm/ADT/DenseMap.h
index 9d2b11d..71dcc25 100644
--- a/include/llvm/ADT/DenseMap.h
+++ b/include/llvm/ADT/DenseMap.h
@@ -289,11 +289,14 @@ private:
     // table completely filled with tombstones, no lookup would ever succeed,
     // causing infinite loops in lookup.
     ++NumEntries;
-    if (NumEntries*4 >= NumBuckets*3 ||
-        NumBuckets-(NumEntries+NumTombstones) < NumBuckets/8) {
+    if (NumEntries*4 >= NumBuckets*3) {
       this->grow(NumBuckets * 2);
       LookupBucketFor(Key, TheBucket);
     }
+    if (NumBuckets-(NumEntries+NumTombstones) < NumBuckets/8) {
+      this->grow(NumBuckets);
+      LookupBucketFor(Key, TheBucket);
+    }

     // If we are writing over a tombstone, remember this.
     if (!KeyInfoT::isEqual(TheBucket->first, getEmptyKey()))
--
1.7.4.1

From jfonseca at vmware.com Tue Mar 15 18:15:52 2011
From: jfonseca at vmware.com (jfonseca at vmware.com)
Date: Tue, 15 Mar 2011 23:15:52 +0000
Subject: [LLVMdev] [PATCH 2/5] Prevent infinite growth of SmallMap instances.
In-Reply-To: <1300230955-24833-1-git-send-email-jfonseca@vmware.com>
References: <1300230955-24833-1-git-send-email-jfonseca@vmware.com>
Message-ID: <1300230955-24833-3-git-send-email-jfonseca@vmware.com>

From: José Fonseca

Rehash but don't grow when full of tombstones.
---
 include/llvm/ADT/StringMap.h |   16 ++--------------
 lib/Support/StringMap.cpp    |   14 +++++++++++++-
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/include/llvm/ADT/StringMap.h b/include/llvm/ADT/StringMap.h
index bad0e6f..f3d6b9f 100644
--- a/include/llvm/ADT/StringMap.h
+++ b/include/llvm/ADT/StringMap.h
@@ -81,16 +81,6 @@ protected:
   StringMapImpl(unsigned InitSize, unsigned ItemSize);
   void RehashTable();

-  /// ShouldRehash - Return true if the table should be rehashed after a new
-  /// element was recently inserted.
-  bool ShouldRehash() const {
-    // If the hash table is now more than 3/4 full, or if fewer than 1/8 of
-    // the buckets are empty (meaning that many are filled with tombstones),
-    // grow the table.
-    return NumItems*4 > NumBuckets*3 ||
-           NumBuckets-(NumItems+NumTombstones) < NumBuckets/8;
-  }
-
   /// LookupBucketFor - Look up the bucket that the specified string should end
   /// up in. If it already exists as a key in the map, the Item pointer for the
   /// specified bucket will be non-null. Otherwise, it will be null. In either
@@ -340,8 +330,7 @@ public:
     Bucket.Item = KeyValue;
     ++NumItems;

-    if (ShouldRehash())
-      RehashTable();
+    RehashTable();
     return true;
   }

@@ -383,8 +372,7 @@ public:
     // filled in by LookupBucketFor.
     Bucket.Item = NewItem;

-    if (ShouldRehash())
-      RehashTable();
+    RehashTable();
     return *NewItem;
   }

diff --git a/lib/Support/StringMap.cpp b/lib/Support/StringMap.cpp
index 90ec299..f193aa4 100644
--- a/lib/Support/StringMap.cpp
+++ b/lib/Support/StringMap.cpp
@@ -177,7 +177,19 @@ StringMapEntryBase *StringMapImpl::RemoveKey(StringRef Key) {
 /// RehashTable - Grow the table, redistributing values into the buckets with
 /// the appropriate mod-of-hashtable-size.
 void StringMapImpl::RehashTable() {
-  unsigned NewSize = NumBuckets*2;
+  unsigned NewSize;
+
+  // If the hash table is now more than 3/4 full, or if fewer than 1/8 of
+  // the buckets are empty (meaning that many are filled with tombstones),
+  // grow/rehash the table.
+  if (NumItems*4 > NumBuckets*3) {
+    NewSize = NumBuckets*2;
+  } else if (NumBuckets-(NumItems+NumTombstones) < NumBuckets/8) {
+    NewSize = NumBuckets;
+  } else {
+    return;
+  }
+
   // Allocate one extra bucket which will always be non-empty. This allows the
   // iterators to stop at end.
   ItemBucket *NewTableArray =(ItemBucket*)calloc(NewSize+1, sizeof(ItemBucket));
--
1.7.4.1

From jfonseca at vmware.com Tue Mar 15 18:15:53 2011
From: jfonseca at vmware.com (jfonseca at vmware.com)
Date: Tue, 15 Mar 2011 23:15:53 +0000
Subject: [LLVMdev] [PATCH 3/5] Prevent infinite growth of SmallPtrSet instances.
In-Reply-To: <1300230955-24833-1-git-send-email-jfonseca@vmware.com>
References: <1300230955-24833-1-git-send-email-jfonseca@vmware.com>
Message-ID: <1300230955-24833-4-git-send-email-jfonseca@vmware.com>

From: José Fonseca

Rehash but don't grow when full of tombstones.
---
 include/llvm/ADT/SmallPtrSet.h |    2 +-
 lib/Support/SmallPtrSet.cpp    |   15 +++++++++------
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/include/llvm/ADT/SmallPtrSet.h b/include/llvm/ADT/SmallPtrSet.h
index ff32ba8..9992858 100644
--- a/include/llvm/ADT/SmallPtrSet.h
+++ b/include/llvm/ADT/SmallPtrSet.h
@@ -133,7 +133,7 @@ private:
   void shrink_and_clear();

   /// Grow - Allocate a larger backing store for the buckets and move it over.
-  void Grow();
+  void Grow(unsigned NewSize);

   void operator=(const SmallPtrSetImpl &RHS); // DO NOT IMPLEMENT.
 protected:
diff --git a/lib/Support/SmallPtrSet.cpp b/lib/Support/SmallPtrSet.cpp
index 504e649..997ce0b 100644
--- a/lib/Support/SmallPtrSet.cpp
+++ b/lib/Support/SmallPtrSet.cpp
@@ -52,10 +52,14 @@ bool SmallPtrSetImpl::insert_imp(const void * Ptr) {
     // Otherwise, hit the big set case, which will call grow.
   }

-  // If more than 3/4 of the array is full, grow.
-  if (NumElements*4 >= CurArraySize*3 ||
-      CurArraySize-(NumElements+NumTombstones) < CurArraySize/8)
-    Grow();
+  if (NumElements*4 >= CurArraySize*3) {
+    // If more than 3/4 of the array is full, grow.
+    Grow(CurArraySize < 64 ? 128 : CurArraySize*2);
+  } else if (CurArraySize-(NumElements+NumTombstones) < CurArraySize/8) {
+    // If fewer of 1/8 of the array is empty (meaning that many are filled with
+    // tombstones), rehash.
+    Grow(CurArraySize);
+  }

   // Okay, we know we have space. Find a hash bucket.
   const void **Bucket = const_cast<const void**>(FindBucketFor(Ptr));
@@ -125,10 +129,9 @@ const void * const *SmallPtrSetImpl::FindBucketFor(const void *Ptr) const {

 /// Grow - Allocate a larger backing store for the buckets and move it over.
 ///
-void SmallPtrSetImpl::Grow() {
+void SmallPtrSetImpl::Grow(unsigned NewSize) {
   // Allocate at twice as many buckets, but at least 128.
   unsigned OldSize = CurArraySize;
-  unsigned NewSize = OldSize < 64 ? 128 : OldSize*2;

   const void **OldBuckets = CurArray;
   bool WasSmall = isSmall();
--
1.7.4.1

From jfonseca at vmware.com Tue Mar 15 18:15:54 2011
From: jfonseca at vmware.com (jfonseca at vmware.com)
Date: Tue, 15 Mar 2011 23:15:54 +0000
Subject: [LLVMdev] [PATCH 4/5] Reset StringMap's NumTombstones on clears and rehashes.
In-Reply-To: <1300230955-24833-1-git-send-email-jfonseca@vmware.com>
References: <1300230955-24833-1-git-send-email-jfonseca@vmware.com>
Message-ID: <1300230955-24833-5-git-send-email-jfonseca@vmware.com>

From: José Fonseca

StringMap was not properly updating NumTombstones after a clear or rehash.

This was not fatal until now because the table was growing faster than
NumTombstones could, but with the previous change of preventing infinite
growth of the table the invariant (NumItems + NumTombstones <= NumBuckets)
stopped being observed, causing infinite loops in certain situations.
---
 include/llvm/ADT/StringMap.h |    3 +++
 lib/Support/StringMap.cpp    |    3 +++
 2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/include/llvm/ADT/StringMap.h b/include/llvm/ADT/StringMap.h
index f3d6b9f..907c72d 100644
--- a/include/llvm/ADT/StringMap.h
+++ b/include/llvm/ADT/StringMap.h
@@ -329,6 +329,7 @@ public:
       --NumTombstones;
     Bucket.Item = KeyValue;
     ++NumItems;
+    assert(NumItems + NumTombstones <= NumBuckets);

     RehashTable();
     return true;
@@ -348,6 +349,7 @@ public:
     }

     NumItems = 0;
+    NumTombstones = 0;
   }

   /// GetOrCreateValue - Look up the specified key in the table. If a value
@@ -367,6 +369,7 @@ public:
     if (Bucket.Item == getTombstoneVal())
       --NumTombstones;
     ++NumItems;
+    assert(NumItems + NumTombstones <= NumBuckets);

     // Fill in the bucket for the hash table. The FullHashValue was already
     // filled in by LookupBucketFor.
diff --git a/lib/Support/StringMap.cpp b/lib/Support/StringMap.cpp index f193aa4..a1ac512 100644 --- a/lib/Support/StringMap.cpp +++ b/lib/Support/StringMap.cpp @@ -169,6 +169,8 @@ StringMapEntryBase *StringMapImpl::RemoveKey(StringRef Key) { TheTable[Bucket].Item = getTombstoneVal(); --NumItems; ++NumTombstones; + assert(NumItems + NumTombstones <= NumBuckets); + return Result; } @@ -224,4 +226,5 @@ void StringMapImpl::RehashTable() { TheTable = NewTableArray; NumBuckets = NewSize; + NumTombstones = 0; } -- 1.7.4.1 From jfonseca at vmware.com Tue Mar 15 18:15:55 2011 From: jfonseca at vmware.com (jfonseca at vmware.com) Date: Tue, 15 Mar 2011 23:15:55 +0000 Subject: [LLVMdev] [PATCH 5/5] Don't add the same analysis implementation pair twice. In-Reply-To: <1300230955-24833-1-git-send-email-jfonseca@vmware.com> References: <1300230955-24833-1-git-send-email-jfonseca@vmware.com> Message-ID: <1300230955-24833-6-git-send-email-jfonseca@vmware.com> From: Jos? Fonseca Prevent infinite growth of the list. --- include/llvm/PassAnalysisSupport.h | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/include/llvm/PassAnalysisSupport.h b/include/llvm/PassAnalysisSupport.h index a3342d5..fede121 100644 --- a/include/llvm/PassAnalysisSupport.h +++ b/include/llvm/PassAnalysisSupport.h @@ -142,6 +142,8 @@ public: Pass *findImplPass(Pass *P, AnalysisID PI, Function &F); void addAnalysisImplsPair(AnalysisID PI, Pass *P) { + if (findImplPass(PI) == P) + return; std::pair pir = std::make_pair(PI,P); AnalysisImpls.push_back(pir); } -- 1.7.4.1 From jfonseca at vmware.com Tue Mar 15 18:20:31 2011 From: jfonseca at vmware.com (=?ISO-8859-1?Q?Jos=E9?= Fonseca) Date: Tue, 15 Mar 2011 23:20:31 +0000 Subject: [LLVMdev] [PATCH 1/5] Prevent infinite growth of the DenseMap. 
In-Reply-To: <1300230955-24833-2-git-send-email-jfonseca@vmware.com>
References: <1300230955-24833-1-git-send-email-jfonseca@vmware.com> <1300230955-24833-2-git-send-email-jfonseca@vmware.com>
Message-ID: <1300231231.27102.262.camel@jfonseca-laptop.eng.vmware.com>

git send-email was supposed to send a summary for the patch series, but somehow it didn't make it. Here it is:

This series of patches addresses several issues causing memory usage to grow indefinitely in a long-lived process.

These are not conventional leaks -- memory will be freed when the LLVM context and/or JIT engine is destroyed -- but for as long as they aren't, memory usage is effectively unbounded.

The issues were found using valgrind with the '--show-reachable=yes' option:
1. Compile a bunch of functions with JIT once; delete the result; and exit
   without destroying LLVM context nor JIT engine. (valgrind will report a
   bunch of unfreed LLVM objects)
2. Do as 1, but compile and delete the functions twice.
3. Ditto three times.
4. Etc.

Flawless code should not cause the memory usage to increase when compiling the same code repeatedly -- i.e., valgrind's log for every run should show the very same unfreed objects, regardless of the number of times a given piece of code was compiled -- but that was not the case. The attached patches cover most of the causes for new objects being allocated.

It should be possible to automate such a test, but I didn't get that far.

Jose

From evan.cheng at apple.com Tue Mar 15 18:40:35 2011
From: evan.cheng at apple.com (Evan Cheng)
Date: Tue, 15 Mar 2011 16:40:35 -0700
Subject: [LLVMdev] IndVarSimplify too aggressive ?
In-Reply-To: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1C04ED4@FRPAR1CL009.coe.adi.dibcom.com>
References: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1EF36C7@FRPAR1CL009.coe.adi.dibcom.com> <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1C04ED4@FRPAR1CL009.coe.adi.dibcom.com>
Message-ID:

Andy is working on gutting indvarsimplify.
Evan On Mar 14, 2011, at 11:27 AM, Arnaud Allard de Grandmaison wrote: > Thanks Eli, > > After digging thru mail archives & bugzilla, it seems fixing properly this issue would require a major change in the selectionDAG code --- to have it operate on a per function basis instead of per basic-block. > > This however, does not seem to be the only issue. The following C code does not produce an efficicient assembly sequence either. > > extern void f(unsigned long long v); > > void test2() > { > for (unsigned i=0; i<512; i++) > f(i); > } > > The resulting .ll out of clang looks reasonnable (with and without the patch), but the arm assembly output looks ugly, though marginally better with my patch : the induction variable should be counting up, and it could be zero extended before the call to f. This again points to Isel, but to a different area, as everything is taking place in the same BB. > > Is this some known issue ? I could not find a bug report matching this. > > -- > Arnaud de Grandmaison > > -----Original Message----- > From: Eli Friedman [mailto:eli.friedman at gmail.com] > Sent: Sunday, March 13, 2011 11:08 PM > To: Arnaud Allard de Grandmaison > Cc: llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] IndVarSimplify too aggressive ? > > On Sun, Mar 13, 2011 at 5:01 PM, Arnaud Allard de Grandmaison > wrote: >> Hi all, >> >> The IndVarSimplify pass seems to be too aggressive when it enlarge the induction variable type ; this can pessimize the generated code when the new induction variable size is not natively supported by the target. This is probably not an issue for x86_64, which supports natively all types, but it is a real one for several embedded targets, with very few native types. >> >> I attached a patch to address this issue; if TargetData is available, the patch attempts to keep the induction variable to a native type when going thru the induction variable users. 
>>
>> Also attached my test-case in C, as well as the resulting assembly output, with and without the patch applied, for arm and x86_32 targets. You will note the loop instructions count can be reduced by 30% in several cases.
>>
>> The patch could probably be made smarter : I am welcoming all suggestions.
>
> It's worth pointing out that LoopStrengthReduce is doing essentially
> the same transformation. The only reason the generated code is
> improved at all with your change is that ISel has a longstanding issue
> where it can't conclude that the upper half of zext i32 %x to i64 is
> zero if the zext is in a different block from the user of the zext.
>
> -Eli

From anton at korobeynikov.info Tue Mar 15 18:48:25 2011
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Wed, 16 Mar 2011 02:48:25 +0300
Subject: [LLVMdev] mblaze backend: unreachable executed
In-Reply-To:
References: <70C22F1A-AE38-4C9D-9AA9-242A90842113@cs.utah.edu> <731F5342-63BD-42EF-9E22-9C5695ED373A@web.de>
Message-ID:

> I don't think my backend is modified enough from the MBlaze backend that is
> in the release to be causing this error. I am however looking through the
> various files of the backend to try to find where the calling convention
> might be causing problems with f32 data types.

From the backtrace it seems like you haven't defined how to return f32 stuff out of the function.
--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From sjosef at cs.utah.edu Tue Mar 15 18:53:49 2011
From: sjosef at cs.utah.edu (Josef Spjut)
Date: Tue, 15 Mar 2011 17:53:49 -0600
Subject: [LLVMdev] mblaze backend: unreachable executed
In-Reply-To:
References: <70C22F1A-AE38-4C9D-9AA9-242A90842113@cs.utah.edu> <731F5342-63BD-42EF-9E22-9C5695ED373A@web.de>
Message-ID:

>> I don't think my backend is modified enough from the MBlaze backend that is
>> in the release to be causing this error. I am however looking through the
>> various files of the backend to try to find where the calling convention
>> might be causing problems with f32 data types.
> From the backtrace it seems like you haven't defined how to return f32
> stuff out of the function.

Could it be that the microblaze backend only has 2 return registers and we've written a function call that wants 3? If that could be the problem, I'd guess it is highly likely, because a lot of our code is 3-element vectors being passed around.

Josef

From anton at korobeynikov.info Tue Mar 15 19:07:18 2011
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Wed, 16 Mar 2011 03:07:18 +0300
Subject: [LLVMdev] mblaze backend: unreachable executed
In-Reply-To:
References: <70C22F1A-AE38-4C9D-9AA9-242A90842113@cs.utah.edu> <731F5342-63BD-42EF-9E22-9C5695ED373A@web.de>
Message-ID:

> Could it be that the microblaze backend only has 2 return registers and we've written a function call that wants 3? If that could be the problem, I'd guess it is highly likely, because a lot of our code is 3-element vectors being passed around.

Maybe

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From atrick at apple.com Tue Mar 15 19:15:47 2011
From: atrick at apple.com (Andrew Trick)
Date: Tue, 15 Mar 2011 17:15:47 -0700
Subject: [LLVMdev] IndVarSimplify too aggressive ?
In-Reply-To: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1EF36C7@FRPAR1CL009.coe.adi.dibcom.com> References: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1EF36C7@FRPAR1CL009.coe.adi.dibcom.com> Message-ID: <605CD19F-8A9C-4DA6-8381-1C6C8175908B@apple.com> On Mar 13, 2011, at 2:01 PM, Arnaud Allard de Grandmaison wrote: > Hi all, > > The IndVarSimplify pass seems to be too aggressive when it enlarge the induction variable type ; this can pessimize the generated code when the new induction variable size is not natively supported by the target. This is probably not an issue for x86_64, which supports natively all types, but it is a real one for several embedded targets, with very few native types. > > I attached a patch to address this issue; if TargetData is available, the patch attempts to keep the induction variable to a native type when going thru the induction variable users. > > Also attached my test-case in C, as well as the resulting assembly output, with and without the patch applied, for arm and x86_32 targets. You will note the loop instructions count can be reduced by 30% in several cases. > > The patch could probably be made smarter : I am welcoming all suggestions. > > Best Regards, > -- > Arnaud de Grandmaison Arnaud, I've been investigating whether it's safe to apply your patch. I still need to understand why our generated code is slower in some cases. I noticed a particularly bad regression in MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow that I documented here: http://llvm.org/bugs/show_bug.cgi?id=9490 We would like to avoid generating canonical induction variables in IndVarSimplify. Once that work is complete, your patch should no longer be needed. Although in the meantime, it would be nice to understand why promoting IVs to wider types is sometimes required for codegen. 
-Andy From wendling at apple.com Tue Mar 15 20:23:20 2011 From: wendling at apple.com (Bill Wendling) Date: Tue, 15 Mar 2011 18:23:20 -0700 Subject: [LLVMdev] LLVM 2.9 RC1 Pre-release Tarballs In-Reply-To: <4D7F514A.70206@zafena.se> References: <3EF399DF-E592-470B-98C4-5051A6828EC4@apple.com> <4D7E38B6.4020701@zafena.se> <4D7F514A.70206@zafena.se> Message-ID: <6C0E136C-2E94-455A-B8A0-E2C7346990C7@apple.com> On Mar 15, 2011, at 4:45 AM, Xerxes R?nby wrote: > On 2011-03-14 18:14, Anton Korobeynikov wrote: >> Hello Xerxes, >> >>> llvm 2.9rc1 test on Dualcore ARM running Ubuntu Natty >> What is the gcc used for the compilation? Can you try to do the -O0 >> build and see whether this changed the stuff? >> > > xranby at panda:/media/dh0/llvm-2.9-build-O0$ gcc --version > gcc (Ubuntu/Linaro 4.5.2-5ubuntu1) 4.5.2 > Copyright (C) 2010 Free Software Foundation, Inc. > This is free software; see the source for copying conditions. There is NO > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > > > -O0 did change quite a lot see below: > > xranby at panda:/media/dh0/llvm-2.9-build-O0$ time make check > llvm[0]: Running test suite > make[1]: Entering directory `/media/dh0/llvm-2.9-build-O0/test' > Making a new site.exp file... > Making LLVM 'lit.site.cfg' file... > Making LLVM unittest 'lit.site.cfg' file... > ( ulimit -t 600 ; ulimit -d 512000 ; ulimit -m 512000 ; ulimit -v 1024000 ; \ > /media/dh0/llvm-2.9rc1/utils/lit/lit.py -s -v . 
) > ******************** > FAIL: LLVM :: CodeGen/X86/fold-pcmpeqd-0.ll (2396 of 5848) > ******************** TEST 'LLVM :: CodeGen/X86/fold-pcmpeqd-0.ll' FAILED ******************** > Script: > -- > /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/llc < /media/dh0/llvm-2.9rc1/test/CodeGen/X86/fold-pcmpeqd-0.ll -mtriple=i386-apple-darwin | grep pcmpeqd | /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/count 1 > /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/llc < /media/dh0/llvm-2.9rc1/test/CodeGen/X86/fold-pcmpeqd-0.ll -mtriple=x86_64-apple-darwin | grep pcmpeqd | /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/count 1 > -- > Exit Code: 1 > Command Output (stderr): > -- > llc: /media/dh0/llvm-2.9rc1/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:3149: void llvm::SelectionDAGBuilder::visitTargetIntrinsic(const llvm::CallInst&, unsigned int): Assertion `TLI.isTypeLegal(Op.getValueType()) && "Intrinsic uses a non-legal type?"' failed. > Stack dump: > 0. Program arguments: /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/llc -mtriple=i386-apple-darwin > 1. Running pass 'Function Pass Manager' on module ''. > 2. Running pass 'X86 DAG->DAG Instruction Selection' on function '@program_1' > Expected 1 lines, got 0. > -- > Hmm...This might be missing a flag? But Eric was the last to touch it (because of his ILP stuff). Eric, do you have any comments? 
-bw From echristo at apple.com Tue Mar 15 20:30:47 2011 From: echristo at apple.com (Eric Christopher) Date: Tue, 15 Mar 2011 18:30:47 -0700 Subject: [LLVMdev] LLVM 2.9 RC1 Pre-release Tarballs In-Reply-To: <6C0E136C-2E94-455A-B8A0-E2C7346990C7@apple.com> References: <3EF399DF-E592-470B-98C4-5051A6828EC4@apple.com> <4D7E38B6.4020701@zafena.se> <4D7F514A.70206@zafena.se> <6C0E136C-2E94-455A-B8A0-E2C7346990C7@apple.com> Message-ID: <1AD1991A-937E-4383-908C-82CB542436A4@apple.com> On Mar 15, 2011, at 6:23 PM, Bill Wendling wrote: > On Mar 15, 2011, at 4:45 AM, Xerxes R?nby wrote: > >> On 2011-03-14 18:14, Anton Korobeynikov wrote: >>> Hello Xerxes, >>> >>>> llvm 2.9rc1 test on Dualcore ARM running Ubuntu Natty >>> What is the gcc used for the compilation? Can you try to do the -O0 >>> build and see whether this changed the stuff? >>> >> >> xranby at panda:/media/dh0/llvm-2.9-build-O0$ gcc --version >> gcc (Ubuntu/Linaro 4.5.2-5ubuntu1) 4.5.2 >> Copyright (C) 2010 Free Software Foundation, Inc. >> This is free software; see the source for copying conditions. There is NO >> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. >> >> >> -O0 did change quite a lot see below: >> >> xranby at panda:/media/dh0/llvm-2.9-build-O0$ time make check >> llvm[0]: Running test suite >> make[1]: Entering directory `/media/dh0/llvm-2.9-build-O0/test' >> Making a new site.exp file... >> Making LLVM 'lit.site.cfg' file... >> Making LLVM unittest 'lit.site.cfg' file... >> ( ulimit -t 600 ; ulimit -d 512000 ; ulimit -m 512000 ; ulimit -v 1024000 ; \ >> /media/dh0/llvm-2.9rc1/utils/lit/lit.py -s -v . 
) >> ******************** >> FAIL: LLVM :: CodeGen/X86/fold-pcmpeqd-0.ll (2396 of 5848) >> ******************** TEST 'LLVM :: CodeGen/X86/fold-pcmpeqd-0.ll' FAILED ******************** >> Script: >> -- >> /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/llc < /media/dh0/llvm-2.9rc1/test/CodeGen/X86/fold-pcmpeqd-0.ll -mtriple=i386-apple-darwin | grep pcmpeqd | /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/count 1 >> /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/llc < /media/dh0/llvm-2.9rc1/test/CodeGen/X86/fold-pcmpeqd-0.ll -mtriple=x86_64-apple-darwin | grep pcmpeqd | /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/count 1 >> -- >> Exit Code: 1 >> Command Output (stderr): >> -- >> llc: /media/dh0/llvm-2.9rc1/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:3149: void llvm::SelectionDAGBuilder::visitTargetIntrinsic(const llvm::CallInst&, unsigned int): Assertion `TLI.isTypeLegal(Op.getValueType()) && "Intrinsic uses a non-legal type?"' failed. >> Stack dump: >> 0. Program arguments: /media/dh0/llvm-2.9-build-O0/Release+Asserts/bin/llc -mtriple=i386-apple-darwin >> 1. Running pass 'Function Pass Manager' on module ''. >> 2. Running pass 'X86 DAG->DAG Instruction Selection' on function '@program_1' >> Expected 1 lines, got 0. >> -- >> > Hmm...This might be missing a flag? But Eric was the last to touch it (because of his ILP stuff). Eric, do you have any comments? Should be fixed I think on the branch. -eric From geek4civic at gmail.com Tue Mar 15 20:39:41 2011 From: geek4civic at gmail.com (NAKAMURA Takumi) Date: Wed, 16 Mar 2011 10:39:41 +0900 Subject: [LLVMdev] Prevent unbounded memory consuption of long lived JIT processes In-Reply-To: <1300230955-24833-1-git-send-email-jfonseca@vmware.com> References: <1300230955-24833-1-git-send-email-jfonseca@vmware.com> Message-ID: Good morning Jose, Thank you to send patches. - Please send patches to llvm-commits. - Please make patches with "--attach". You may add "format.attach" to git config. 
I have not looked at your patches yet, but I pushed them to github;
https://github.com/chapuni/LLVM/compare/ed4edf9e...jfonseca%2F20110316
(Excuse me, I could not input the accent.)

...Takumi

On Wed, Mar 16, 2011 at 8:15 AM, wrote:
> This series of patches addresses several issues causing memory usage to grow
> indefinitely in a long-lived process.
>
> These are not conventional leaks -- memory would have been freed when the LLVM
> context and/or JIT engine is destroyed -- but for as long as they aren't,
> memory usage is effectively unbounded.
>
> The issues were found using valgrind with the '--show-reachable=yes' option:
> 1. Compile a bunch of functions with JIT once; delete the result; and exit
>    without destroying LLVM context nor JIT engine. (valgrind will report a
>    bunch of unfreed LLVM objects)
> 2. Do as 1, but compile and delete the functions twice
> 3. Ditto three times.
> 4. Etc.
>
> Flawless code should not cause the memory usage to increase when compiling the
> same code -- i.e., valgrind's log for every run should show the very same unfreed
> objects, regardless of the number of times a given piece of code was compiled -- but
> that was not the case. The attached patches cover most of the causes for new
> objects being allocated.
>
> It should be possible to automate such a test, but I didn't get that far.

From geek4civic at gmail.com Tue Mar 15 21:25:33 2011
From: geek4civic at gmail.com (NAKAMURA Takumi)
Date: Wed, 16 Mar 2011 11:25:33 +0900
Subject: [LLVMdev] [release_29] Good status of ppc-redhat-linux on Fedora 12 PS3
Message-ID:

Good morning.

LLVM and clang can be built successfully on Fedora 12 PS3.
On RC1, only one test failed.
test/CodeGen/X86/fold-pcmpeqd-0.ll
On release_29 branch, all llvm tests can pass.
(I don't mention clang tests :p ) ...Takumi Fedora release 12 (Constantine) Linux speedking.localdomain 2.6.32.23-170.fc12.ppc64 #1 SMP Mon Sep 27 17:09:35 UTC 2010 ppc64 ppc64 ppc64 GNU/Linux llvm config.status 2.9svn configured by ../../llvm/configure, generated by GNU Autoconf 2.60, with options "'-C' '--build=ppc-redhat-linux' '--enable-targets=all' '--enable-optimized' 'build_alias=ppc-redhat-linux' '--with-optimize-option=-O3 -Werror' '--prefix=/home/chapuni/BUILD/llvm-ppc-static/install/stage1'" Failing Tests (39): Clang :: CXX/expr/expr.unary/expr.unary.noexcept/cg.cpp Clang :: CodeGen/bitfield-promote.c Clang :: CodeGenCXX/debug-info-byval.cpp Clang :: CodeGenCXX/debug-info-namespace.cpp Clang :: CodeGenCXX/vtable-debug-info.cpp Clang :: Driver/hello.c Clang :: Index/c-index-api-loadTU-test.m Clang :: Index/c-index-getCursor-test.m Clang :: Index/c-index-pch.c Clang :: PCH/chain-cxx.cpp Clang :: PCH/chain-remap-types.m Clang :: PCH/chain-selectors.m Clang :: PCH/check-deserializations.cpp Clang :: PCH/cuda-kernel-call.cu Clang :: PCH/cxx-friends.cpp Clang :: PCH/cxx-namespaces.cpp Clang :: PCH/cxx-static_assert.cpp Clang :: PCH/cxx-templates.cpp Clang :: PCH/cxx-traits.cpp Clang :: PCH/cxx-typeid.cpp Clang :: PCH/cxx-using.cpp Clang :: PCH/cxx-variadic-templates.cpp Clang :: PCH/cxx_exprs.cpp Clang :: PCH/exprs.c Clang :: PCH/headersearch.cpp Clang :: PCH/missing-file.cpp Clang :: PCH/namespaces.cpp Clang :: PCH/objc_import.m Clang :: PCH/objc_methods.m Clang :: PCH/objc_property.m Clang :: PCH/objcxx-ivar-class.mm Clang :: PCH/pragma-diag-section.cpp Clang :: PCH/reinclude.cpp Clang :: PCH/struct.c Clang :: PCH/typo.m Clang :: PCH/va_arg.cpp Clang :: Sema/stdcall-fastcall.c Clang :: Sema/x86-builtin-palignr.c Clang :: SemaCXX/attr-regparm.cpp Expected Passes : 8116 Expected Failures : 68 Unsupported Tests : 542 Unexpected Failures: 39 From stoklund at 2pi.dk Tue Mar 15 22:29:43 2011 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Tue, 15 Mar 2011 
20:29:43 -0700 Subject: [LLVMdev] Prevent unbounded memory consuption of long lived JIT processes In-Reply-To: <1300230955-24833-1-git-send-email-jfonseca@vmware.com> References: <1300230955-24833-1-git-send-email-jfonseca@vmware.com> Message-ID: <6F4108ED-091B-4390-80C3-A13359F4AD79@2pi.dk> On Mar 15, 2011, at 4:15 PM, jfonseca at vmware.com wrote: > This series of patches address several issues causing memory usage to grow > indefinetely on a long lived process. Thanks for working on this. Did you measure the performance impact of these changes? /jakob From kd at kendyck.com Wed Mar 16 07:58:07 2011 From: kd at kendyck.com (Ken Dyck) Date: Wed, 16 Mar 2011 08:58:07 -0400 Subject: [LLVMdev] Calls to functions with signext/zeroext return values Message-ID: In SelectionDAGBuilder::visitRet(), there is this bit of code: // FIXME: C calling convention requires the return type to be promoted // to at least 32-bit. But this is not necessary for non-C calling // conventions. The frontend should mark functions whose return values // require promoting with signext or zeroext attributes. if (ExtendKind != ISD::ANY_EXTEND && VT.isInteger()) { EVT MinVT = TLI.getRegisterType(*DAG.getContext(), MVT::i32); if (VT.bitsLT(MinVT)) VT = MinVT; } There have been a few discussions about this snippet on llvmdev in the past[1][2][3], and there seems to be a general consensus that the responsibility for promoting to the 'int' type should be transfered to the front end and the signext/zeroext attributes eliminated. But that's not what I'm interested in discussing here. What I'd like to ask about is calls to functions that have a signext/zeroext attribute on their return value. As far as I can tell, there isn't any corresponding promotion of the return value to i32 in visitCall(). Should there be? 
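To make the asymmetry concrete, here is a minimal sketch, not LLVM code. The names `ExtendKind`, `calleeReturnBits`, `callerExpectedBits`, and `returnWidthsAgree` are hypothetical, and the 32-bit minimum is an assumption that only holds on targets where i32 is a legal register type. It models the callee side with the visitRet() rule quoted above and the caller side with no promotion at all, which is how the call path appears to behave.

```cpp
#include <cassert>

// Hypothetical stand-in for the ISD extension kinds relevant here.
enum class ExtendKind { Any, Sign, Zero };

// Width (in bits) the callee actually returns under the visitRet() rule:
// integers narrower than MinBits are widened only when the return value
// carries a signext/zeroext attribute.
inline unsigned calleeReturnBits(unsigned DeclaredBits, ExtendKind Ext,
                                 unsigned MinBits = 32) {
  if (Ext != ExtendKind::Any && DeclaredBits < MinBits)
    return MinBits;
  return DeclaredBits;
}

// Width the caller reads back if the call site does no matching promotion
// (assumed behaviour of the call path in this sketch).
inline unsigned callerExpectedBits(unsigned DeclaredBits) {
  return DeclaredBits;
}

// True when both sides agree on the width of the returned value.
inline bool returnWidthsAgree(unsigned DeclaredBits, ExtendKind Ext) {
  return calleeReturnBits(DeclaredBits, Ext) == callerExpectedBits(DeclaredBits);
}
```

Under these assumptions, an i16 return marked signext or zeroext is the only case where the two sides disagree; i32 and wider, or i16 without an extension attribute, match.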
I ran into problems in a DSP back end that I'm working on where the return conventions for i16 and i32 are slightly different (they are both returned in the same accumulator register, but at different offsets within the accumulator). The callee promoted the return value to i32, but the caller was expecting it to be returned as an i16. So I made some changes to SelectionDAGBuilder (see attached patch) to truncate return values back to their declared sizes and that seemed to fix the problems for my DSP backend. But these changes broke some regression tests in other back ends. Specifically, LLVM :: CodeGen/MSP430/2009-11-05-8BitLibcalls.ll LLVM :: CodeGen/MSP430/indirectbr.ll LLVM :: CodeGen/X86/h-registers-3.ll The failures in the MSP430 tests are particularly troubling because they are assertion failures in LegalizeDAG because i32 is not a legal type. The X86 failure seems less serious. Based on my limited knowledge of X86 assembly, it looks like the back end is generating code that works but that is different from what the test expects. So my questions: 1. Should visitCall() promote the return value to i32 as is done in visitRet()? 2. If so, any suggestions on how to fix my patch or the MSP430 back end so it won't crash the MSP430 tests? 3. If not, what options do I have for handling the convention mismatch in my back end? The only one that I can see is ensuring that i32 and i16 return values use compatible conventions. -Ken [1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2008-February/012840.html [2] http://lists.cs.uiuc.edu/pipermail/llvmdev/2008-May/014449.html [3] http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-February/020078.html -------------- next part -------------- A non-text attachment was scrubbed... 
Name: call-extended-return.patch Type: text/x-patch Size: 2681 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110316/ef97fbfc/attachment-0001.bin From jfonseca at vmware.com Wed Mar 16 08:19:02 2011 From: jfonseca at vmware.com (=?ISO-8859-1?Q?Jos=E9?= Fonseca) Date: Wed, 16 Mar 2011 13:19:02 +0000 Subject: [LLVMdev] Prevent unbounded memory consuption of long lived JIT processes In-Reply-To: <6F4108ED-091B-4390-80C3-A13359F4AD79@2pi.dk> References: <1300230955-24833-1-git-send-email-jfonseca@vmware.com> <6F4108ED-091B-4390-80C3-A13359F4AD79@2pi.dk> Message-ID: <1300281542.27102.307.camel@jfonseca-laptop.eng.vmware.com> On Tue, 2011-03-15 at 20:29 -0700, Jakob Stoklund Olesen wrote: > On Mar 15, 2011, at 4:15 PM, jfonseca at vmware.com wrote: > > > This series of patches address several issues causing memory usage to grow > > indefinetely on a long lived process. > > Thanks for working on this. > > Did you measure the performance impact of these changes? I tracked performance with this change with X86 JIT and there was no measurable difference, but the performance was governed more by the quality of the compiled code, and not so much the compilation time. If you can point me to a good compilation time benchmark I can get some figures. I'd expect either no measurable impact in compilation time, or a slight improvement due to smaller memory footprint: - for patches 1-3 (prevent infinite growth of several hash maps data types) should above all reduce memory usage; there might be some cases (e.g., frequent updates with a small bounded number of elements) where it may trade off an exponentially growing table size (i.e., memory) for more rehashes (i.e., cpu), but that should be a win on nowadays processors. 
- patch 4 (Reset StringMap's NumTombstones on clears and rehashes) should improve performance - patch 5 refers to a function that doesn't get called frequently Jose From joearms at gmail.com Wed Mar 16 05:14:02 2011 From: joearms at gmail.com (Joe armstrong) Date: Wed, 16 Mar 2011 10:14:02 +0000 (UTC) Subject: [LLVMdev] Bug in opt Message-ID: I have a problem. I'm writing a C compiler in my favorite programming language (don't ask :-) I have made a .s file, which can be correctly assembled and run with lli. But when I optimize it I get no errors from the optimizer, but the resultant file is incorrect. Here's what happens: llvm-as test2_gen.s %% no errors test2_gen.s.bc is produced lli test2_gen.s.bc n=887459712 %% no errors opt -std-compile-opts -S test2_gen.s.bc > test2_opt.s.bc %% no errors %% But now the generated file cannon be disassembled or run lli test2_opt.s.bc lli: error loading program 'test2_opt.s.bc': Bitcode stream should be a multiple of 4 bytes in length llvm-dis test2_opt.s.bc llvm-dis: Bitcode stream should be a multiple of 4 bytes in length The generated .s file is as follows: ; ----- start ; Compiled by the amazing Ericsson C->LLVM compiler ; Hand crafted in Erlang ; ModuleID = 'test2.c' target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32: 64-v64:64:64-v128:128:128-a0:0:64-f80:32:32-n8:16:32" target triple = "i386-pc-linux-gnu" ; Sock it to me baby ;; globals declare i32 @printf(i8* , ...) 
@main.str1 = private constant [6x i8] c"n=%i\0A\00" ;; code define i32 @main() nounwind { ;;return register %tmp_1 = alloca i32 ,align 4 %i = alloca i32 ,align 4 %max = alloca i32 ,align 4 %n = alloca i32 ,align 4 %tmp_2 = add i32 0,0 store i32 %tmp_2 ,i32* %i %tmp_3 = add i32 0,100000000 store i32 %tmp_3 ,i32* %max %tmp_4 = add i32 0,0 store i32 %tmp_4 ,i32* %n br label %initfor_1 initfor_1: %tmp_5 = add i32 0,0 store i32 %tmp_5 ,i32* %i br label %testfor_3 updatefor_2: %tmp_6 = load i32* %i %tmp_7 = add i32 0,1 %tmp_8 = add i32 %tmp_6 ,%tmp_7 store i32 %tmp_8 ,i32* %i br label %testfor_3 testfor_3: %tmp_9 = load i32* %i %tmp_10 = load i32* %max %tmp_11 = icmp slt i32 %tmp_9 ,%tmp_10 br i1 %tmp_11 ,label %bodyfor_4,label %endfor_5 bodyfor_4: %tmp_12 = load i32* %n %tmp_13 = load i32* %i %tmp_14 = add i32 %tmp_12 ,%tmp_13 store i32 %tmp_14 ,i32* %n br label %updatefor_2 endfor_5: %tmp_15 = getelementptr [6 x i8]* @main.str1, i32 0, i32 0 %tmp_16 = load i32* %n %tmp_17 = call i32 (i8* , ...)* @printf(i8* %tmp_15 , i32 %tmp_16 ) %tmp_18 = add i32 0,0 ret i32 %tmp_18 } The C code was as follows: int printf(const char * format, ...); int main() { int i=0, max=100000000,n=0; for(i = 0; i < max; i = i + 1){ n = n + i; } printf("n=%i\n", n); return(0); } /Joe From jfonseca at vmware.com Wed Mar 16 08:23:34 2011 From: jfonseca at vmware.com (=?ISO-8859-1?Q?Jos=E9?= Fonseca) Date: Wed, 16 Mar 2011 13:23:34 +0000 Subject: [LLVMdev] Prevent unbounded memory consuption of long lived JIT processes In-Reply-To: References: <1300230955-24833-1-git-send-email-jfonseca@vmware.com> Message-ID: <1300281814.27102.311.camel@jfonseca-laptop.eng.vmware.com> On Tue, 2011-03-15 at 18:39 -0700, NAKAMURA Takumi wrote: > Good morning Jose, > > Thank you to send patches. > > - Please send patches to llvm-commits. > - Please make patches with "--attach". You may add "format.attach" > to git config. Will do, thanks. 
> I have not seen yours yet, but I pushed yours to github; > https://github.com/chapuni/LLVM/compare/ed4edf9e...jfonseca%2F20110316 > (Excuse me I could not input accent) > > ...Takumi Nice. I haven't used github yet, but I'll try using it going forward. Jose > > On Wed, Mar 16, 2011 at 8:15 AM, wrote: > > This series of patches address several issues causing memory usage to grow > > indefinetely on a long lived process. > > > > These are not convenional leaks -- memory would have been freed when the LLVM > > context or/and JIT engine is destroyed -- but for as long as they aren't the > > memory is usage effectively ubounded. > > > > The issues were found using valgrind with '--show-reachable=yes' option: > > 1. Compile a bunch of functions with JIT once; delete the result; and exit > > without destroying LLVM context nor JIT engine. (valgrind will report a > > bunch of unfreed LLVM objects) > > 2. Do as 1, but compile and delete the functions twice > > 3. Ditto three times. > > 4. Etc. > > > > Flawless code should not cause the memory usage to increase when compiling the > > same -- ie valgrind's log for every run should show the very same unfreed > > objects, regardless of the number of times a given code was compilation, but > > that was not the case. The attached patches cover most of the causes for new > > objects being allocated. > > > > It should be possible to automate such test, but I didn't get that far. From baldrick at free.fr Wed Mar 16 08:59:08 2011 From: baldrick at free.fr (Duncan Sands) Date: Wed, 16 Mar 2011 14:59:08 +0100 Subject: [LLVMdev] Bug in opt In-Reply-To: References: Message-ID: <4D80C22C.3050308@free.fr> Hi Joe, > I have made a .s file, which can be correctly assembled > and run with lli. But when I optimize it I get no errors > from the optimizer, but the resultant file is incorrect. 
>
> Here's what happens:
>
> llvm-as test2_gen.s %% no errors test2_gen.s.bc is produced

there's actually no need to assemble this to bitcode: you can pass
test2_gen.s directly to opt. At least you can in recent versions of
LLVM.

> opt -std-compile-opts -S test2_gen.s.bc> test2_opt.s.bc

By using -S you ask opt to produce human readable IR rather than
bitcode, so you should really output to test2_opt.s.

>
> %% no errors
> %% But now the generated file cannot be disassembled or run
>
> lli test2_opt.s.bc
> lli: error loading program 'test2_opt.s.bc': Bitcode stream should be a multiple
> of 4 bytes in length

This means that it doesn't contain bitcode. And indeed it doesn't, it
contains human readable IR due to your using -S above.

> llvm-dis test2_opt.s.bc
> llvm-dis: Bitcode stream should be a multiple of 4 bytes in length

Same problem.

That said, in latest LLVM lli accepts human readable IR as well as bitcode,
so I'm guessing that you are using an older version that does not have this
feature. Of course I may also have misdiagnosed the problem :)

Ciao, Duncan.

From 6yearold at gmail.com Wed Mar 16 08:59:14 2011
From: 6yearold at gmail.com (arrowdodger)
Date: Wed, 16 Mar 2011 16:59:14 +0300
Subject: [LLVMdev] Bug in opt
In-Reply-To:
References:
Message-ID:

On Wed, Mar 16, 2011 at 1:14 PM, Joe armstrong wrote:

> opt -std-compile-opts -S test2_gen.s.bc > test2_opt.s.bc
>

You have produced .ll, not .bc. This is due to the -S flag.

From anton at korobeynikov.info Wed Mar 16 09:05:11 2011
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Wed, 16 Mar 2011 17:05:11 +0300
Subject: [LLVMdev] Bug in opt
In-Reply-To:
References:
Message-ID:

Hello

> opt -std-compile-opts -S test2_gen.s.bc > test2_opt.s.bc

I believe -S will yield the text output.
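
One way to confirm which of the two formats a file really contains, independent of its suffix, is to look at its first bytes: raw LLVM bitcode starts with the two ASCII characters 'BC', while textual IR is ordinary text. What follows is only a minimal sketch under stated assumptions: a POSIX shell with dd available, and helper names (is_bitcode, describe) that are made up here and are not part of the LLVM tools. It does not try to recognize the bitcode wrapper format.

```shell
#!/bin/sh
# Sketch: tell raw LLVM bitcode from textual IR by the leading magic bytes.
# is_bitcode and describe are hypothetical helper names for illustration.

is_bitcode() {
    # Raw bitcode files begin with the two ASCII characters 'B' 'C'.
    # dd reads exactly the first two bytes of the file given as $1.
    [ "$(dd if="$1" bs=1 count=2 2>/dev/null)" = "BC" ]
}

describe() {
    # Print a one-line verdict for the file given as $1.
    if is_bitcode "$1"; then
        echo "$1: bitcode"
    else
        echo "$1: not raw bitcode (probably textual IR)"
    fi
}
```

Calling describe on the test2_opt.s.bc file from this thread would be expected to report that it is not raw bitcode, since it was written with -S.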
-- 
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From stoklund at 2pi.dk Wed Mar 16 10:39:53 2011
From: stoklund at 2pi.dk (Jakob Stoklund Olesen)
Date: Wed, 16 Mar 2011 08:39:53 -0700
Subject: [LLVMdev] Prevent unbounded memory consuption of long lived JIT processes
In-Reply-To: <1300281542.27102.307.camel@jfonseca-laptop.eng.vmware.com>
References: <1300230955-24833-1-git-send-email-jfonseca@vmware.com> <6F4108ED-091B-4390-80C3-A13359F4AD79@2pi.dk> <1300281542.27102.307.camel@jfonseca-laptop.eng.vmware.com>
Message-ID: <2A47DA52-B4E2-49BC-9F07-5B4F06B910E2@2pi.dk>

On Mar 16, 2011, at 6:19 AM, José Fonseca wrote:

> On Tue, 2011-03-15 at 20:29 -0700, Jakob Stoklund Olesen wrote:
>> On Mar 15, 2011, at 4:15 PM, jfonseca at vmware.com wrote:
>>
>>> This series of patches address several issues causing memory usage to grow
>>> indefinetely on a long lived process.
>>
>> Thanks for working on this.
>>
>> Did you measure the performance impact of these changes?
>
> I tracked performance with this change with X86 JIT and there was no
> measurable difference, but the performance was governed more by the
> quality of the compiled code, and not so much the compilation time.
>
> If you can point me to a good compilation time benchmark I can get some
> figures.

I normally use 403.gcc, but if you don't have SPEC sources, these tests in the nightly test suite take a while to compile:

MultiSource/Applications/ClamAV
MultiSource/Applications/JM/ldecod
MultiSource/Applications/JM/lencod
MultiSource/Applications/SPASS
MultiSource/Applications/kimwitu++/kc
MultiSource/Applications/sqlite3/sqlite3

If you run 'make TEST=nightly', both llc and opt compile times are interesting. The runtime of opt is cryptically reported in the GCCAS column.
/jakob

From joearms at gmail.com Wed Mar 16 11:05:21 2011
From: joearms at gmail.com (Joe Armstrong)
Date: Wed, 16 Mar 2011 17:05:21 +0100
Subject: [LLVMdev] Bug in opt
In-Reply-To: <4D80C22C.3050308@free.fr>
References: <4D80C22C.3050308@free.fr>
Message-ID:

On Wed, Mar 16, 2011 at 2:59 PM, Duncan Sands wrote:
> Hi Joe,
>
>> I have made a .s file, which can be correctly assembled
>> and run with lli. But when I optimize it I get no errors
>> from the optimizer, but the resultant file is incorrect.
>>
>> Here's what happens:
>>
>> llvm-as test2_gen.s %% no errors test2_gen.s.bc is produced
>
> there's actually no need to assemble this to bitcode: you can pass
> test2_gen.s directly to opt. At least you can in recent versions of
> LLVM.
>
>> opt -std-compile-opts -S test2_gen.s.bc> test2_opt.s.bc
>
> By using -S you ask opt to produce human readable IR rather than
> bitcode, so you should really output to test2_opt.s.
>
>>
>> %% no errors
>> %% But now the generated file cannot be disassembled or run
>>
>> lli test2_opt.s.bc
>> lli: error loading program 'test2_opt.s.bc': Bitcode stream should be a multiple
>> of 4 bytes in length
>
> This means that it doesn't contain bitcode. And indeed it doesn't, it
> contains human readable IR due to your using -S above.

Silly me, I didn't think to look - you're right.

This is very cool - my C compiler spits out lousy code, but
after "opt'ing" the result more or less results in what an
optimising C compiler would have spit out.

Which means that language interoperability becomes really easy - just
parse and de-sugar the input (from any language) and you're away.

Thanks for your help

/Joe

>
>> llvm-dis test2_opt.s.bc
>> llvm-dis: Bitcode stream should be a multiple of 4 bytes in length
>
> Same problem.
>
> That said, in latest LLVM lli accepts human readable IR as well as bitcode,
> so I'm guessing that you are using an older version that does not have this
> feature.
> Of course I may also have misdiagnosed the problem :)
>
> Ciao, Duncan.
>

From zwarich at apple.com Wed Mar 16 11:31:44 2011
From: zwarich at apple.com (Cameron Zwarich)
Date: Wed, 16 Mar 2011 09:31:44 -0700
Subject: [LLVMdev] Calls to functions with signext/zeroext return values
In-Reply-To:
References:
Message-ID:

On Mar 16, 2011, at 5:58 AM, Ken Dyck wrote:

> In SelectionDAGBuilder::visitRet(), there is this bit of code:
>
>     // FIXME: C calling convention requires the return type to be promoted
>     // to at least 32-bit. But this is not necessary for non-C calling
>     // conventions. The frontend should mark functions whose return values
>     // require promoting with signext or zeroext attributes.
>     if (ExtendKind != ISD::ANY_EXTEND && VT.isInteger()) {
>       EVT MinVT = TLI.getRegisterType(*DAG.getContext(), MVT::i32);
>       if (VT.bitsLT(MinVT))
>         VT = MinVT;
>     }
>
> There have been a few discussions about this snippet on llvmdev in the
> past[1][2][3], and there seems to be a general consensus that the
> responsibility for promoting to the 'int' type should be transferred to
> the front end and the signext/zeroext attributes eliminated. But
> that's not what I'm interested in discussing here.
>
> What I'd like to ask about is calls to functions that have a
> signext/zeroext attribute on their return value. As far as I can tell,
> there isn't any corresponding promotion of the return value to i32 in
> visitCall(). Should there be?

Promoting the return value is unsafe for bool returns on x86-64, which in the latest revision of the ABI only guarantees that the top 7 bits of the 8-bit register are 0.
Cameron From zwarich at apple.com Wed Mar 16 11:35:16 2011 From: zwarich at apple.com (Cameron Zwarich) Date: Wed, 16 Mar 2011 09:35:16 -0700 Subject: [LLVMdev] Calls to functions with signext/zeroext return values In-Reply-To: References: Message-ID: <5B302BC2-B209-40CA-A05D-4C1FFC11F349@apple.com> On Mar 16, 2011, at 9:31 AM, Cameron Zwarich wrote: > Promoting the return value is unsafe for bool returns on x86-64, which in the latest revision of the ABI only guarantees that the top 7 bits of the 8-bit register are 0. My comment is a bit off, because the question of what type to make the return value is somewhat orthogonal to the question of which zext assert we should add. Cameron From stoklund at 2pi.dk Wed Mar 16 11:37:41 2011 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Wed, 16 Mar 2011 09:37:41 -0700 Subject: [LLVMdev] Bug in opt In-Reply-To: References: Message-ID: On Mar 16, 2011, at 3:14 AM, Joe armstrong wrote: > I have made a .s file, which can be correctly assembled > and run with lli. But when I optimize it I get no errors > from the optimizer, but the resultant file is incorrect. > > Here's what happens: > > llvm-as test2_gen.s %% no errors test2_gen.s.bc is produced > > lli test2_gen.s.bc > n=887459712 %% no errors > > opt -std-compile-opts -S test2_gen.s.bc > test2_opt.s.bc We normally reserve the .s suffix for native assembly files and use .ll for LLVM IR assembly. The .bc suffix implies binary bitcode. llvm-as: .ll -> .bc llvm-dis: .bc -> .ll llc: .ll/.bc -> .s I don't think the tools require these suffixes, but it helps avoid confusion. /jakob From Arnaud.AllardDeGrandMaison at dibcom.com Wed Mar 16 12:23:31 2011 From: Arnaud.AllardDeGrandMaison at dibcom.com (Arnaud Allard de Grandmaison) Date: Wed, 16 Mar 2011 18:23:31 +0100 Subject: [LLVMdev] IndVarSimplify too aggressive ? 
In-Reply-To: <605CD19F-8A9C-4DA6-8381-1C6C8175908B@apple.com> References: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1EF36C7@FRPAR1CL009.coe.adi.dibcom.com> <605CD19F-8A9C-4DA6-8381-1C6C8175908B@apple.com> Message-ID: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1F3B64E@FRPAR1CL009.coe.adi.dibcom.com> Hi Andy, Thanks for looking into this. I have tried today to make a reduced testcase from the value function, but as I do not have any arm hardware available to measure the real cycle count, it can be quite errorprone, especially with all those loops. Maybe I should give a try at qemu. Best regards, -- Arnaud de Grandmaison -----Original Message----- From: Andrew Trick [mailto:atrick at apple.com] Sent: Wednesday, March 16, 2011 1:16 AM To: Arnaud Allard de Grandmaison Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] IndVarSimplify too aggressive ? On Mar 13, 2011, at 2:01 PM, Arnaud Allard de Grandmaison wrote: > Hi all, > > The IndVarSimplify pass seems to be too aggressive when it enlarge the induction variable type ; this can pessimize the generated code when the new induction variable size is not natively supported by the target. This is probably not an issue for x86_64, which supports natively all types, but it is a real one for several embedded targets, with very few native types. > > I attached a patch to address this issue; if TargetData is available, the patch attempts to keep the induction variable to a native type when going thru the induction variable users. > > Also attached my test-case in C, as well as the resulting assembly output, with and without the patch applied, for arm and x86_32 targets. You will note the loop instructions count can be reduced by 30% in several cases. > > The patch could probably be made smarter : I am welcoming all suggestions. > > Best Regards, > -- > Arnaud de Grandmaison Arnaud, I've been investigating whether it's safe to apply your patch. I still need to understand why our generated code is slower in some cases. 
I noticed a particularly bad regression in MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow that I documented here:
http://llvm.org/bugs/show_bug.cgi?id=9490

We would like to avoid generating canonical induction variables in IndVarSimplify. Once that work is complete, your patch should no longer be needed. Although in the meantime, it would be nice to understand why promoting IVs to wider types is sometimes required for codegen.

-Andy

From jgu222 at gmail.com Wed Mar 16 13:06:42 2011
From: jgu222 at gmail.com (Junjie Gu)
Date: Wed, 16 Mar 2011 11:06:42 -0700
Subject: [LLVMdev] linkage type
Message-ID:

What is the difference between WeakAnyLinkage and ExternalWeakLinkage?
They are defined in GlobalValue.h.

Thanks

Junjie

From code at klickverbot.at Wed Mar 16 13:43:56 2011
From: code at klickverbot.at (David Nadlinger)
Date: Wed, 16 Mar 2011 19:43:56 +0100
Subject: [LLVMdev] linkage type
In-Reply-To:
References:
Message-ID: <4D8104EC.3050201@klickverbot.at>

There is a description of all the possible linkage types at
http://llvm.org/docs/LangRef.html#linkage - does this answer your question?

(Basically, an extern_weak (i.e. ExternalWeakLinkage) symbol becomes
null instead of being an undefined reference)

David

On 3/16/11 7:06 PM, Junjie Gu wrote:
> What is the difference between WeakAnyLinkage and ExternalWeakLinkage?
> They are defined in GlobalValue.h.
Thanks > > Junjie > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From kd at kendyck.com Wed Mar 16 13:43:08 2011 From: kd at kendyck.com (Ken Dyck) Date: Wed, 16 Mar 2011 14:43:08 -0400 Subject: [LLVMdev] Calls to functions with signext/zeroext return values In-Reply-To: <5B302BC2-B209-40CA-A05D-4C1FFC11F349@apple.com> References: <5B302BC2-B209-40CA-A05D-4C1FFC11F349@apple.com> Message-ID: On Wed, Mar 16, 2011 at 12:35 PM, Cameron Zwarich wrote: > On Mar 16, 2011, at 9:31 AM, Cameron Zwarich wrote: > >> Promoting the return value is unsafe for bool returns on x86-64, which >> in the latest revision of the ABI only guarantees that the top 7 bits of >> the 8-bit register are 0. > > My comment is a bit off, because the question of what type to make > the return value is somewhat orthogonal to the question of which zext > assert we should add. I'm not sure I follow. Won't a zeroext attribute on a bool return value ensure that it will be zero-extended to 32 bits by the callee? Or does the X86 backend consider such functions unlowerable (via TargetLowering::CanLowerReturn()) and thereby bypass the extension to 32 bits in SelectionDAGBuilder::visitRet() making a promotion in the caller unnecessary? -Ken From dag at cray.com Wed Mar 16 15:44:21 2011 From: dag at cray.com (David Greene) Date: Wed, 16 Mar 2011 15:44:21 -0500 Subject: [LLVMdev] Long-Term ISel Design Message-ID: All, As I've done more integrating of AVX work upstream and more tuning here, I've run across several things which are clunky in the current isel design. A couple examples I can remember offhand: 1. We have special target-specific operators for certain shuffles in X86, such as X86unpckl. I don't completely understand why but Bruno indicated it was to address inefficiecies. One of those is the need to check masks multiple times (once at legalize and again at isel). 2. 
Sometimes DAGs are legal in some contexts but not others and it is a
   pain to deal with. A good example is VBROADCAST, where a <0,0,0,0>
   shuffle is natively supported if the source vector is in memory.
   Otherwise it's not legal and manual lowering is required. In this
   case the legality check is doing the DAG match by hand, replicating
   what TableGen-produced code already does.

These two examples are related: we're duplicating functionality manually
that's already available automatically.

As I've been thinking about this, it strikes me that we could get rid of
the target-specific operators and a lot of other manual checks if we
just had another isel phase. Let's say we structured things this way:

                 legalize
                    |
                    V
     manual lowering (X86ISelLowering)
                    |
                    V
       manual isel (X86ISelDAGToDAG)
                    |
                    V
 table-driven isel (.td files/X86GenDAGISel)
                    |
                    V
   manual isel (some to-be-designed piece)

The idea is that we keep the existing manual pieces where they are to
clean things up for TableGen-based isel and/or handle special cases.
Maybe we consider getting rid of some in the future but that's a
separate question.

The way things are now, if table-driven isel fails the codegen aborts.
In the above scheme we get one last chance to do manual lowering before
we give up. This helps the shuffle mask case by turning this:

                        legalize
                           |
                           V
       check shuffle mask legality (X86ISelLowering)
                           |
                           V
 check shuffle mask legality (table-driven isel predicates)

To this:

                        legalize
                           |
                           V
         X86ISelLowering (no mask legality checks)
                           |
                           V
 check shuffle mask legality (table-driven isel predicates)
                           |
                           V
            lower remaining shuffles manually

The advantage is that in the final stage we already know the shuffle
mask isn't implementable manually so there's no need to check for
legality. We simply need to implement whatever X86ISelLowering would
have done in those cases previously.

This also helps example 2. In the memory-operand case we will match to a
VBROADCASTSS/D.
If we don't match we'll fall through to manual lowering
and we'll implement the reg-reg broadcast via some other combination of
shuffles. So we more gracefully handle situations where sometimes
things are legal and sometimes they aren't depending on the context.

Perhaps I'm repeating something that's already been discussed.
Thoughts?

                          -Dave

From rengolin at systemcall.org Wed Mar 16 16:01:12 2011
From: rengolin at systemcall.org (Renato Golin)
Date: Wed, 16 Mar 2011 21:01:12 +0000
Subject: [LLVMdev] Warning in Clang
Message-ID:

/home/rengolin/workspace/llvm/rw/src/tools/clang/lib/CodeGen/CGExprConstant.cpp:
In member function 'llvm::Constant*::ConstExprEmitter::VisitCastExpr(clang::CastExpr*)':
/home/rengolin/workspace/llvm/rw/src/tools/clang/lib/CodeGen/CGExprConstant.cpp:621:
warning: control reaches end of non-void function

A switch without a default label or a return at the end.

-- 
cheers,
--renato

http://systemcall.org/

Reclaim your digital rights, eliminate DRM, learn more at
http://www.defectivebydesign.org/what_is_drm

From wendling at apple.com Wed Mar 16 16:19:46 2011
From: wendling at apple.com (Bill Wendling)
Date: Wed, 16 Mar 2011 14:19:46 -0700
Subject: [LLVMdev] [release_29] Good status of ppc-redhat-linux on Fedora 12 PS3
In-Reply-To:
References:
Message-ID: <08BBD0A2-B080-4FDA-AB59-5D6996DE05AD@apple.com>

On Mar 15, 2011, at 7:25 PM, NAKAMURA Takumi wrote:

> Good morning.
>
Hi Nakamura,

> LLVM and clang can be built successfully on Fedora 12 PS3.
>
Hooray! :-)

> On RC1, only one test failed.
> test/CodeGen/X86/fold-pcmpeqd-0.ll
>
Eric commented that this should be fixed on the release branch right now.

> On release_29 branch, all llvm tests can pass.

Woo!
:-) > (I don't mention clang tests :p ) > >.> > ...Takumi > > > Fedora release 12 (Constantine) > Linux speedking.localdomain 2.6.32.23-170.fc12.ppc64 #1 SMP Mon Sep 27 > 17:09:35 UTC 2010 ppc64 ppc64 ppc64 GNU/Linux > > llvm config.status 2.9svn > configured by ../../llvm/configure, generated by GNU Autoconf 2.60, > with options "'-C' '--build=ppc-redhat-linux' '--enable-targets=all' > '--enable-optimized' 'build_alias=ppc-redhat-linux' > '--with-optimize-option=-O3 -Werror' > '--prefix=/home/chapuni/BUILD/llvm-ppc-static/install/stage1'" > > Failing Tests (39): Hrm. Could you work with the Clang guys to prioritize these failures? If they're "real", we need to get them fixed soon. Thank you for the testing! -bw From zwarich at apple.com Wed Mar 16 17:08:51 2011 From: zwarich at apple.com (Cameron Zwarich) Date: Wed, 16 Mar 2011 15:08:51 -0700 Subject: [LLVMdev] Calls to functions with signext/zeroext return values In-Reply-To: References: <5B302BC2-B209-40CA-A05D-4C1FFC11F349@apple.com> Message-ID: On Mar 16, 2011, at 11:43 AM, Ken Dyck wrote: > On Wed, Mar 16, 2011 at 12:35 PM, Cameron Zwarich wrote: >> On Mar 16, 2011, at 9:31 AM, Cameron Zwarich wrote: >> >>> Promoting the return value is unsafe for bool returns on x86-64, which >>> in the latest revision of the ABI only guarantees that the top 7 bits of >>> the 8-bit register are 0. >> >> My comment is a bit off, because the question of what type to make >> the return value is somewhat orthogonal to the question of which zext >> assert we should add. > > I'm not sure I follow. Won't a zeroext attribute on a bool return > value ensure that it will be zero-extended to 32 bits by the callee? > Or does the X86 backend consider such functions unlowerable (via > TargetLowering::CanLowerReturn()) and thereby bypass the extension to > 32 bits in SelectionDAGBuilder::visitRet() making a promotion in the > caller unnecessary? 
The X86 backend currently zero-extends them to 32 bits, but according to the ABI it need only zero-extend them to 8 bits. I'm going to change this in a few minutes to expose some additional optimization opportunities. Cameron From geek4civic at gmail.com Wed Mar 16 17:34:36 2011 From: geek4civic at gmail.com (NAKAMURA Takumi) Date: Thu, 17 Mar 2011 07:34:36 +0900 Subject: [LLVMdev] [release_29] Good status of ppc-redhat-linux on Fedora 12 PS3 In-Reply-To: <08BBD0A2-B080-4FDA-AB59-5D6996DE05AD@apple.com> References: <08BBD0A2-B080-4FDA-AB59-5D6996DE05AD@apple.com> Message-ID: Good morning, Bill! On Thu, Mar 17, 2011 at 6:19 AM, Bill Wendling wrote: >> On RC1, only one test failed. >> test/CodeGen/X86/fold-pcmpeqd-0.ll > Eric commented that this should be fixed on the release branch right now. Of course I knew! It was the reason why I checked also on release_29. >> Failing Tests (39): > Hrm. Could you work with the Clang guys to prioritize these failures? If they're "real", we need to get them fixed soon. I have given up to investigate. As you know, PS3,Cell BE PPU, is too slow. And I heard clang on ppc might be inmature. I will post clang test log, if I can, ... unless I got power cut! Good day! ...Takumi From greened at obbligato.org Wed Mar 16 18:13:47 2011 From: greened at obbligato.org (David A. Greene) Date: Wed, 16 Mar 2011 18:13:47 -0500 Subject: [LLVMdev] Long-Term ISel Design In-Reply-To: (David Greene's message of "Wed, 16 Mar 2011 15:44:21 -0500") References: Message-ID: David Greene writes: > The advantage is that in the final stage we already know the shuffle > mask isn't implementable manually so there's no need to check for > legality. 
s/manually/natively/ -Dave From ashay.rane at asu.edu Wed Mar 16 20:00:19 2011 From: ashay.rane at asu.edu (Ashay Rane) Date: Wed, 16 Mar 2011 20:00:19 -0500 Subject: [LLVMdev] Operating on contents of virtual registers Message-ID: Hello, I was facing some difficulty in implementing a transform and I was wondering if I could get some help please. The transform needs to operate on the operands of certain instructions. For example, given an instruction, say "%10 = load i32* %9, align 4", I have to record the value of %9 and process it. Of course, this is only possible at runtime and so I am instrumenting the code such that a particular function is invoked just before the instruction of interest is executed. The problem that I am facing is in getting the value of the operands. As I understand, the operands could either be program variables (e.g. in "%6 = load i32* %old, align 4") or one of the virtual registers (as in the first load instruction). For both cases, is it possible to extract the value/address of the operand (%9 or %old)? If yes, what should be the best function type to use so that I can pass this value as an argument to my function? For now, I am only concerned with load and store instructions and so (I suppose) I have to deal with pointers only. An easy way that I can think of is to directly insert the LLVM IR (e.g. call void @my_function(%old)) but because I am using Module::getOrInsertFunction(), I have to have a function type. So alternatively, is there a way to insert direct LLVM instructions (without going through the type hierarchy)? Thanks, Ashay --- Ashay Rane Research Associate The University of Texas at Austin http://www.public.asu.edu/~asrane/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110316/2265bf14/attachment.html From clattner at apple.com Wed Mar 16 22:54:14 2011 From: clattner at apple.com (Chris Lattner) Date: Wed, 16 Mar 2011 20:54:14 -0700 Subject: [LLVMdev] Long-Term ISel Design In-Reply-To: References: Message-ID: On Mar 16, 2011, at 1:44 PM, David Greene wrote: > All, > > As I've done more integrating of AVX work upstream and more tuning here, > I've run across several things which are clunky in the current isel > design. A couple examples I can remember offhand: > > 1. We have special target-specific operators for certain shuffles in X86, > such as X86unpckl. I don't completely understand why but Bruno > indicated it was to address inefficiecies. One of those is the need > to check masks multiple times (once at legalize and again at isel). It also eliminates a lot of fragility. Before doing this, X86 legalize would have to be very careful to specifically form shuffles that it knew isel would turn into (e.g.) unpck operations. Now instead of forming specific carefully constructed shuffle masks (not making sure other code doesn't violate them) it can just directly form the X86ISD node. > 2. Sometimes DAGs are legal in some contexts but not others and it is a > pain to deal with. A good example is VBROADCAST, where a <0,0,0,0> > shuffle is natively supported if the source vector is in memory. > Otherwise it's not legal and manual lowering is required. In this > case the legality check is doing the DAG match by hand, replicating > what TableGen-produced code already does. Yes, this isn't good. Instead, the shuffle should be legalized to something that takes a pointer (memory operand). That means that X86 isel would form the *fully legal* X86ISD node, and nothing would be able to break it and it could never fail to match. > These two examples are related: we're duplicating functionality manually > that's already available automatically. Not sure what you mean by this. 
> As I've been thinking about this, it strikes me that we could get rid of
> the target-specific operators and a lot of other manual checks if we
> just had another isel phase. Let's say we structured things this way:
>
>                  legalize
>                     |
>                     V
>      manual lowering (X86ISelLowering)
>                     |
>                     V
>        manual isel (X86ISelDAGToDAG)
>                     |
>                     V
>  table-driven isel (.td files/X86GenDAGISel)
>                     |
>                     V
>    manual isel (some to-be-designed piece)
>
> The idea is that we keep the existing manual pieces where they are to
> clean things up for TableGen-based isel and/or handle special cases.
> Maybe we consider getting rid of some in the future but that's a
> separate question.

I'm not sure what you mean here. Are you suggesting that these be completely separate passes over the dag? Why do "manual isel" and "table driven isel" as separate passes? If they are interlaced, then how is this different than what we already have?

> The way things are now, if table-driven isel fails the codegen aborts.
> In the above scheme we get one last chance to do manual lowering before
> we give up. This helps the shuffle mask case by turning this:
>
>                         legalize
>                            |
>                            V
>        check shuffle mask legality (X86ISelLowering)
>                            |
>                            V
>  check shuffle mask legality (table-driven isel predicates)

You're saying that we do this on a node-by-node basis? The reason that codegen aborts on unselectable operations is that they are invalid and should not be formed. Your example of vbroadcast is a great one: the X86ISD node for it *should not take a vector register input*. If it does, then the X86ISD node is incorrectly defined.
-Chris From eli.friedman at gmail.com Wed Mar 16 23:03:23 2011 From: eli.friedman at gmail.com (Eli Friedman) Date: Wed, 16 Mar 2011 21:03:23 -0700 Subject: [LLVMdev] Operating on contents of virtual registers In-Reply-To: References: Message-ID: On Wed, Mar 16, 2011 at 6:00 PM, Ashay Rane wrote: > Hello, > I was facing some difficulty in implementing a transform and I was wondering > if I could get some help please. > The transform needs to operate on the operands of certain instructions. For > example, given an instruction, say "%10 = load i32* %9, align 4", I have to > record the value of %9 and process it. Of course, this is only possible at > runtime and so I am instrumenting the code such that a particular function > is invoked just before the instruction of interest is executed. > The problem that I am facing is in getting the value of the operands. As I > understand, the operands could either be program variables (e.g. in "%6 = > load i32* %old, align 4") or one of the virtual registers (as in the first > load instruction). For both cases, is it possible to extract the > value/address of the operand (%9 or %old)? Those are both the same case; names for instructions only exist for the sake of readability. > If yes, what should be the best > function type to use so that I can pass this value as an argument to my > function? For now, I am only concerned with load and store instructions and > so (I suppose) I have to deal with pointers only. If you only need pointers, just use i8* as the type of the argument, and bitcast the value to i8*. > An easy way that I can think of is to directly insert the LLVM IR (e.g.?call > void @my_function(%old)) but because I am using > Module::getOrInsertFunction(), I have to have a function type. So > alternatively, is there a way to insert direct LLVM instructions (without > going through the type hierarchy)? Your approach of inserting a call is fine; making your own instruction or intrinsic wouldn't make things any simpler. 
There aren't any shortcuts here. -Eli From ashay.rane at asu.edu Wed Mar 16 23:33:34 2011 From: ashay.rane at asu.edu (Ashay Rane) Date: Wed, 16 Mar 2011 23:33:34 -0500 Subject: [LLVMdev] Operating on contents of virtual registers In-Reply-To: References: Message-ID: Hi Eli, Thanks for the reply. The problem is that getOperand() returns an llvm::Instruction (that refers to the definition of the operand). What I am trying to find out is how to get the value of the operand. When you refer to bitcasting to i8*, do you mean casting the return value from getOperand() itself? Ashay On Wed, Mar 16, 2011 at 11:03 PM, Eli Friedman wrote: > On Wed, Mar 16, 2011 at 6:00 PM, Ashay Rane wrote: > > Hello, > > I was facing some difficulty in implementing a transform and I was > wondering > > if I could get some help please. > > The transform needs to operate on the operands of certain instructions. > For > > example, given an instruction, say "%10 = load i32* %9, align 4", I have > to > > record the value of %9 and process it. Of course, this is only possible > at > > runtime and so I am instrumenting the code such that a particular > function > > is invoked just before the instruction of interest is executed. > > The problem that I am facing is in getting the value of the operands. As > I > > understand, the operands could either be program variables (e.g. in "%6 = > > load i32* %old, align 4") or one of the virtual registers (as in the > first > > load instruction). For both cases, is it possible to extract the > > value/address of the operand (%9 or %old)? > > Those are both the same case; names for instructions only exist for > the sake of readability. > > > If yes, what should be the best > > function type to use so that I can pass this value as an argument to my > > function? For now, I am only concerned with load and store instructions > and > > so (I suppose) I have to deal with pointers only. 
> > If you only need pointers, just use i8* as the type of the argument, > and bitcast the value to i8*. > > > An easy way that I can think of is to directly insert the LLVM IR > (e.g. call > > void @my_function(%old)) but because I am using > > Module::getOrInsertFunction(), I have to have a function type. So > > alternatively, is there a way to insert direct LLVM instructions (without > > going through the type hierarchy)? > > Your approach of inserting a call is fine; making your own instruction > or intrinsic wouldn't make things any simpler. There aren't any > shortcuts here. > > -Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110316/c0e591aa/attachment.html From eli.friedman at gmail.com Wed Mar 16 23:40:06 2011 From: eli.friedman at gmail.com (Eli Friedman) Date: Wed, 16 Mar 2011 21:40:06 -0700 Subject: [LLVMdev] Operating on contents of virtual registers In-Reply-To: References: Message-ID: On Wed, Mar 16, 2011 at 9:33 PM, Ashay Rane wrote: > Hi Eli, > Thanks for the reply. The problem is that getOperand() returns an > llvm::Instruction (that refers to the definition of the operand). What I am > trying to find out is how to get the value of the operand. When you refer to > bitcasting to i8*, do you mean casting the return value from getOperand() > itself? > Ashay I mean calling IRBuilder::CreateBitCast with the return value from getOperand(). -Eli > On Wed, Mar 16, 2011 at 11:03 PM, Eli Friedman > wrote: >> >> On Wed, Mar 16, 2011 at 6:00 PM, Ashay Rane wrote: >> > Hello, >> > I was facing some difficulty in implementing a transform and I was >> > wondering >> > if I could get some help please. >> > The transform needs to operate on the operands of certain instructions. >> > For >> > example, given an instruction, say "%10 = load i32* %9, align 4", I have >> > to >> > record the value of %9 and process it. 
Of course, this is only possible
>> > at
>> > runtime and so I am instrumenting the code such that a particular
>> > function
>> > is invoked just before the instruction of interest is executed.
>> > The problem that I am facing is in getting the value of the operands. As
>> > I
>> > understand, the operands could either be program variables (e.g. in "%6
>> > =
>> > load i32* %old, align 4") or one of the virtual registers (as in the
>> > first
>> > load instruction). For both cases, is it possible to extract the
>> > value/address of the operand (%9 or %old)?
>>
>> Those are both the same case; names for instructions only exist for
>> the sake of readability.
>>
>> > If yes, what should be the best
>> > function type to use so that I can pass this value as an argument to my
>> > function? For now, I am only concerned with load and store instructions
>> > and
>> > so (I suppose) I have to deal with pointers only.
>>
>> If you only need pointers, just use i8* as the type of the argument,
>> and bitcast the value to i8*.
>>
>> > An easy way that I can think of is to directly insert the LLVM IR
>> > (e.g. call
>> > void @my_function(%old)) but because I am using
>> > Module::getOrInsertFunction(), I have to have a function type. So
>> > alternatively, is there a way to insert direct LLVM instructions
>> > (without
>> > going through the type hierarchy)?
>>
>> Your approach of inserting a call is fine; making your own instruction
>> or intrinsic wouldn't make things any simpler. There aren't any
>> shortcuts here.
>>
>> -Eli

From eli.friedman at gmail.com Thu Mar 17 00:53:18 2011
From: eli.friedman at gmail.com (Eli Friedman)
Date: Wed, 16 Mar 2011 22:53:18 -0700
Subject: [LLVMdev] Operating on contents of virtual registers
In-Reply-To:
References:
Message-ID:

On Wed, Mar 16, 2011 at 10:44 PM, Ashay Rane wrote:
> Thanks Eli, that worked!
> For the record, I had to make one change though.
I had to bitcast to i32*
> (instead of i8*) otherwise I was seeing errors like:
> Instruction referencing instruction not embedded in a basic block!
>   %retval = alloca i32
>   = bitcast i32* %retval to i8*
> and sometimes:
> Instruction does not dominate all uses!
> That makes me curious to ask, what does the choice of i32 or i8 depend on?
> Is it the architecture?

A bitcast from i32* to i32* gets optimized out. :) I think you're missing a call to IRBuilder::SetInsertPoint.

-Eli

> Ashay
>
> On Wed, Mar 16, 2011 at 11:40 PM, Eli Friedman
> wrote:
>>
>> On Wed, Mar 16, 2011 at 9:33 PM, Ashay Rane wrote:
>> > Hi Eli,
>> > Thanks for the reply. The problem is that getOperand() returns an
>> > llvm::Instruction (that refers to the definition of the operand). What I
>> > am
>> > trying to find out is how to get the value of the operand. When you
>> > refer to
>> > bitcasting to i8*, do you mean casting the return value from
>> > getOperand()
>> > itself?
>> > Ashay
>>
>> I mean calling IRBuilder::CreateBitCast with the return value from
>> getOperand().
>>
>> -Eli
>>
>> > On Wed, Mar 16, 2011 at 11:03 PM, Eli Friedman
>> > wrote:
>> >>
>> >> On Wed, Mar 16, 2011 at 6:00 PM, Ashay Rane wrote:
>> >> > Hello,
>> >> > I was facing some difficulty in implementing a transform and I was
>> >> > wondering
>> >> > if I could get some help please.
>> >> > The transform needs to operate on the operands of certain
>> >> > instructions.
>> >> > For
>> >> > example, given an instruction, say "%10 = load i32* %9, align 4", I
>> >> > have
>> >> > to
>> >> > record the value of %9 and process it. Of course, this is only
>> >> > possible
>> >> > at
>> >> > runtime and so I am instrumenting the code such that a particular
>> >> > function
>> >> > is invoked just before the instruction of interest is executed.
>> >> > The problem that I am facing is in getting the value of the operands.
>> >> > As
>> >> > I
>> >> > understand, the operands could either be program variables (e.g.
in
>> >> > "%6
>> >> > =
>> >> > load i32* %old, align 4") or one of the virtual registers (as in the
>> >> > first
>> >> > load instruction). For both cases, is it possible to extract the
>> >> > value/address of the operand (%9 or %old)?
>> >>
>> >> Those are both the same case; names for instructions only exist for
>> >> the sake of readability.
>> >>
>> >> > If yes, what should be the best
>> >> > function type to use so that I can pass this value as an argument to
>> >> > my
>> >> > function? For now, I am only concerned with load and store
>> >> > instructions
>> >> > and
>> >> > so (I suppose) I have to deal with pointers only.
>> >>
>> >> If you only need pointers, just use i8* as the type of the argument,
>> >> and bitcast the value to i8*.
>> >>
>> >> > An easy way that I can think of is to directly insert the LLVM IR
>> >> > (e.g. call
>> >> > void @my_function(%old)) but because I am using
>> >> > Module::getOrInsertFunction(), I have to have a function type. So
>> >> > alternatively, is there a way to insert direct LLVM instructions
>> >> > (without
>> >> > going through the type hierarchy)?
>> >>
>> >> Your approach of inserting a call is fine; making your own instruction
>> >> or intrinsic wouldn't make things any simpler. There aren't any
>> >> shortcuts here.
>> >>
>> >> -Eli
>
>

From viridia at gmail.com Thu Mar 17 01:40:57 2011
From: viridia at gmail.com (Talin)
Date: Wed, 16 Mar 2011 23:40:57 -0700
Subject: [LLVMdev] Writing unit tests for DWARF?
Message-ID:

One problem that has been vexing me of late: It seems that whenever I run into a problem that requires debugging one of my programs in gdb, before I can do that I have to fix my frontend's broken generation of debugging info.
The code that generates debugging information is quite fragile - you have to generate metadata for each of your files, classes, and functions, and do so without error, because if you do make a mistake, the only way you'll find out is because gdb refuses to debug your program. And as I work on the code, occasionally bugs creep in, either from my side or occasionally from the LLVM side. The problem is, that I don't always check if the debug information is valid, so several weeks can go by and I don't notice something broke. What is needed is some way to write a unit test for DWARF information, so that if I broke something I would notice it immediately and could either fix it or roll back. Unfortunately, the various DIDescriptor.Verify() methods are nowhere near strict enough - you can create completely nonsensical DIEs that still pass through Verify(). And even if the Verify() methods were 100% reliable, they only test whether the LLVM metadata is valid - they don't test whether the actual DWARF embedded in the final executable is correct. I suppose you could do something with dwarfdump -ka, although it would be better to have something that worked on all platforms. Even dwarfdump itself has different option syntax on Linux vs. OS X. And I don't think it's possible right now to generate code that passes through dwarfdump with zero error messages, or at least, I've never been able to figure out how to do it. I was thinking that since lldb needs to know how to interpret all this stuff anyway, perhaps there could be a way to use the same code to validate the debug information for an executable. I know lldb doesn't run on every platform yet, but I suspect that the parts of lldb which decode DWARF are fairly generic. -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110316/a7d25b91/attachment.html From ashay.rane at tacc.utexas.edu Thu Mar 17 00:44:36 2011 From: ashay.rane at tacc.utexas.edu (Ashay Rane) Date: Thu, 17 Mar 2011 00:44:36 -0500 Subject: [LLVMdev] Operating on contents of virtual registers In-Reply-To: References: Message-ID: Thanks Eli, that worked! For the record, I had to make one change though. I had to bitcast to i32* (instead of i8*) otherwise I was seeing errors like: Instruction referencing instruction not embedded in a basic block! %retval = alloca i32 = bitcast i32* %retval to i8* and sometimes: Instruction does not dominate all uses! That makes me curious to ask, what does the choice of i32 or i8 depend on? Is it the architecture? Ashay On Wed, Mar 16, 2011 at 11:40 PM, Eli Friedman wrote: > On Wed, Mar 16, 2011 at 9:33 PM, Ashay Rane wrote: > > Hi Eli, > > Thanks for the reply. The problem is that getOperand() returns an > > llvm::Instruction (that refers to the definition of the operand). What I > am > > trying to find out is how to get the value of the operand. When you refer > to > > bitcasting to i8*, do you mean casting the return value from getOperand() > > itself? > > Ashay > > I mean calling IRBuilder::CreateBitCast with the return value from > getOperand(). > > -Eli > > > On Wed, Mar 16, 2011 at 11:03 PM, Eli Friedman > > wrote: > >> > >> On Wed, Mar 16, 2011 at 6:00 PM, Ashay Rane wrote: > >> > Hello, > >> > I was facing some difficulty in implementing a transform and I was > >> > wondering > >> > if I could get some help please. > >> > The transform needs to operate on the operands of certain > instructions. > >> > For > >> > example, given an instruction, say "%10 = load i32* %9, align 4", I > have > >> > to > >> > record the value of %9 and process it. 
Of course, this is only > possible > >> > at > >> > runtime and so I am instrumenting the code such that a particular > >> > function > >> > is invoked just before the instruction of interest is executed. > >> > The problem that I am facing is in getting the value of the operands. > As > >> > I > >> > understand, the operands could either be program variables (e.g. in > "%6 > >> > = > >> > load i32* %old, align 4") or one of the virtual registers (as in the > >> > first > >> > load instruction). For both cases, is it possible to extract the > >> > value/address of the operand (%9 or %old)? > >> > >> Those are both the same case; names for instructions only exist for > >> the sake of readability. > >> > >> > If yes, what should be the best > >> > function type to use so that I can pass this value as an argument to > my > >> > function? For now, I am only concerned with load and store > instructions > >> > and > >> > so (I suppose) I have to deal with pointers only. > >> > >> If you only need pointers, just use i8* as the type of the argument, > >> and bitcast the value to i8*. > >> > >> > An easy way that I can think of is to directly insert the LLVM IR > >> > (e.g. call > >> > void @my_function(%old)) but because I am using > >> > Module::getOrInsertFunction(), I have to have a function type. So > >> > alternatively, is there a way to insert direct LLVM instructions > >> > (without > >> > going through the type hierarchy)? > >> > >> Your approach of inserting a call is fine; making your own instruction > >> or intrinsic wouldn't make things any simpler. There aren't any > >> shortcuts here. > >> > >> -Eli > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110317/09f25deb/attachment.html From rengolin at systemcall.org Thu Mar 17 05:25:48 2011 From: rengolin at systemcall.org (Renato Golin) Date: Thu, 17 Mar 2011 10:25:48 +0000 Subject: [LLVMdev] Writing unit tests for DWARF? 
In-Reply-To:
References:
Message-ID:

On 17 March 2011 06:40, Talin wrote:
> The code that generates debugging information is quite fragile - you have to
> generate metadata for each of your files, classes, and functions, and do so
> without error, because if you do make a mistake, the only way you'll find
> out is because gdb refuses to debug your program. And as I work on the code,
> occasionally bugs creep in, either from my side or occasionally from the
> LLVM side. The problem is, that I don't always check if the debug
> information is valid, so several weeks can go by and I don't notice
> something broke.

Strongly agree.

> What is needed is some way to write a unit test for DWARF information, so
> that if I broke something I would notice it immediately and could either fix
> it or roll back. Unfortunately, the various DIDescriptor.Verify() methods
> are nowhere near strict enough - you can create completely nonsensical DIEs
> that still pass through Verify(). And even if the Verify() methods were 100%
> reliable, they only test whether the LLVM metadata is valid - they don't
> test whether the actual DWARF embedded in the final executable is correct.

Strongly agree. But I go further... I could help with the verification process (since it's much better to fail verification than to fail gdb testsuite), but I don't know the design decisions being taken for debug information/metadata, and they change too frequently to dig the code to learn. There is no API documentation and the interface (IR metadata) docs are old and inaccurate.

I'd say, in order of importance, the three things that need to be done ASAP are:

1. Stick to one representation and document it (like LangRef), so other people could help
2. Enhance Validate() methods to be extremely strict (like Module's), so it fails straight away
3.
Create tests (unit and regression) and run them during check-all, so we don't regress

The tests are last because it's much easier to catch an assertion than a silent codegen error.

After the initial period, we iterate those three steps (and not less!) again and again, until debug information is good.

I see the importance of changing the IR (as I've requested quite a few times) but I understand that it's better for everyone to have a stable IR. Every new version can have a few changes, not necessarily backward compatible, but those also need to be documented beforehand (mailing list, blog, release notes).

If we follow the three steps above in an iterative way, during every release, we can achieve stability AND feature completeness. But (IMHO), stability comes first.

cheers,
--renato

From dpatel at apple.com Thu Mar 17 08:41:10 2011
From: dpatel at apple.com (Devang Patel)
Date: Thu, 17 Mar 2011 06:41:10 -0700
Subject: [LLVMdev] Writing unit tests for DWARF?
In-Reply-To:
References:
Message-ID: <517A3103-873F-4482-B44B-FB993F2104C5@apple.com>

Talin,

If there is a magic wand, I would be interested to know!

DIDescriptor.Verify() is not suitable for your needs. It checks the structure of encoded debug info after the optimizer has modified the IR. Its main goal is to inform the Dwarf writer, at the end of code gen, which IR constructs it should ignore.

If you want to test code gen you have to link compiled code and run it regularly. That's what the various build bots for llvm do. Same way, if you want to validate generated debug info you have to go through the debugger.

That said, there is a new unit test harness available. All it needs is more unit tests...
http://llvm.org/docs/TestingGuide.html#quickdebuginfotests - Devang On Mar 16, 2011, at 11:40 PM, Talin wrote: > One problem that has been vexing me of late: It seems that whenever I run into a problem that requires debugging one of my programs in gdb, before I can do that I have to fix my frontend's broken generation of debugging info. > > The code that generates debugging information is quite fragile - you have to generate metadata for each of your files, classes, and functions, and do so without error, because if you do make a mistake, the only way you'll find out is because gdb refuses to debug your program. And as I work on the code, occasionally bugs creep in, either from my side or occasionally from the LLVM side. The problem is, that I don't always check if the debug information is valid, so several weeks can go by and I don't notice something broke. > > What is needed is some way to write a unit test for DWARF information, so that if I broke something I would notice it immediately and could either fix it or roll back. Unfortunately, the various DIDescriptor.Verify() methods are nowhere near strict enough - you can create completely nonsensical DIEs that still pass through Verify(). And even if the Verify() methods were 100% reliable, they only test whether the LLVM metadata is valid - they don't test whether the actual DWARF embedded in the final executable is correct. > > I suppose you could do something with dwarfdump -ka, although it would be better to have something that worked on all platforms. Even dwarfdump itself has different option syntax on Linux vs. OS X. And I don't think it's possible right now to generate code that passes through dwarfdump with zero error messages, or at least, I've never been able to figure out how to do it. > > I was thinking that since lldb needs to know how to interpret all this stuff anyway, perhaps there could be a way to use the same code to validate the debug information for an executable. 
I know lldb doesn't run on every platform yet, but I suspect that the parts of lldb which decode DWARF are fairly generic.
>
> --
> -- Talin

From kd at kendyck.com Thu Mar 17 08:42:55 2011
From: kd at kendyck.com (Ken Dyck)
Date: Thu, 17 Mar 2011 09:42:55 -0400
Subject: [LLVMdev] Calls to functions with signext/zeroext return values
In-Reply-To:
References:
Message-ID:

On Wed, Mar 16, 2011 at 8:58 AM, Ken Dyck wrote:
> I ran into problems in a DSP back end that I'm working on where the
> return conventions for i16 and i32 are slightly different (they are
> both returned in the same accumulator register, but at different
> offsets within the accumulator). The callee promoted the return value
> to i32, but the caller was expecting it to be returned as an i16.
>
> So I made some changes to SelectionDAGBuilder (see attached patch) to
> truncate return values back to their declared sizes and that seemed to
> fix the problems for my DSP backend.

That patch has been rendered obsolete by some recent changes by Cameron. Attached is an updated one.

> So my questions:
>
> ...
> 2. If so, any suggestions on how to fix my patch or the MSP430 back
> end so it won't crash the MSP430 tests?

With Cameron's addition of a getTypeForExtendedInteger() hook in TargetLowering, I've been able to work around the failures in the MSP430 back end by overriding the extension size to i16 (see the attached patch). This seems like it would be the appropriate size for the architecture -- since the current default of i32 isn't a legal type -- but I really have no idea what effects it has on the runtime library or whether it conforms to an official ABI (does one exist?).

Anton, do you have any comments?

-Ken
-------------- next part --------------
A non-text attachment was scrubbed...
Name: call-extended-return.r2.patch
Type: text/x-patch
Size: 2655 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110317/8aab78c2/attachment-0002.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: msp430-i16-extend.patch
Type: text/x-patch
Size: 1237 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110317/8aab78c2/attachment-0003.bin

From dpatel at apple.com Thu Mar 17 08:48:41 2011
From: dpatel at apple.com (Devang Patel)
Date: Thu, 17 Mar 2011 06:48:41 -0700
Subject: [LLVMdev] Writing unit tests for DWARF?
In-Reply-To:
References:
Message-ID:

Renato,

On Mar 17, 2011, at 3:25 AM, Renato Golin wrote:

> could help with the verification process (since it's much better to
> fail verification than to fail gdb testuite), but I don't know the
> design decisions being taken for debug information/metadata, and they
> change too frequently to dig the code to learn.

I think you are mistaken here. I maintain and support debug info for two front ends (llvm-gcc and clang). Go ahead and check the svn archives for the last year and see how many times I had to update the llvm-gcc FE.

> There is no API
> documentation and the interface (IR metadata) docs are old and
> inaccurate.
>
> I'd say, in order of importance, the three things that need to be done ASAP are:
>
> 1. Stick to one representation and document it (like LangRef), so
> other people could help

In the last 5 or so llvm releases, the encoded debug info representation in llvm IR has changed only once (using metadata, instead of global variables). All other changes are incremental *and* backward compatible.

Regarding documentation, it is on my list. However, your argument has the same disconnect as someone who looks at LangRef and says "I do not know what exactly the FE has to generate to produce a working program." Well, what you need is a How To Write a Front End document.

> 2.
Enhance Validate() methods to be extremely strict (like Module's), > so it fails straight away See my response regarding Verify(). > 3. Create tests (unit and regression) and run them during check-all, > so we don't regress I have already mentioned debuginfo-tests at least once to you earlier. > > The tests are last because it's much easier to catch an assertion than > a silent codegen error. > - Devang From jan_sjodin at yahoo.com Thu Mar 17 09:12:52 2011 From: jan_sjodin at yahoo.com (Jan Sjodin) Date: Thu, 17 Mar 2011 07:12:52 -0700 (PDT) Subject: [LLVMdev] Writing unit tests for DWARF? In-Reply-To: <517A3103-873F-4482-B44B-FB993F2104C5@apple.com> References: <517A3103-873F-4482-B44B-FB993F2104C5@apple.com> Message-ID: <843762.38553.qm@web55603.mail.re4.yahoo.com> Could dwarfdump --verify be used to check the debug info? - Jan ________________________________ From: Devang Patel To: Talin Cc: LLVM Developers Mailing List Sent: Thu, March 17, 2011 9:41:10 AM Subject: Re: [LLVMdev] Writing unit tests for DWARF? Talin, If there is a magic wand, I would be interested to know! DIDescriptor.Verify() is not suitable for you needs. It checks structure of encoded debug info after optimizer has modified the IR. Its main goal is inform Dwarf writer, at the end of code gen, which IR construct it should ignore. If you want to test code gen you have to link compiled code and run it regularly. That's what various build bots for llvm does. Same way, if you want to validate generated debug info you have to go through the debugger. That said, there is a new unit test harness available. All it needs is more unit tests... http://llvm.org/docs/TestingGuide.html#quickdebuginfotests - Devang On Mar 16, 2011, at 11:40 PM, Talin wrote: > One problem that has been vexing me of late: It seems that whenever I run into >a problem that requires debugging one of my programs in gdb, before I can do >that I have to fix my frontend's broken generation of debugging info. 
> > The code that generates debugging information is quite fragile - you have to >generate metadata for each of your files, classes, and functions, and do so >without error, because if you do make a mistake, the only way you'll find out is >because gdb refuses to debug your program. And as I work on the code, >occasionally bugs creep in, either from my side or occasionally from the LLVM >side. The problem is, that I don't always check if the debug information is >valid, so several weeks can go by and I don't notice something broke. > > What is needed is some way to write a unit test for DWARF information, so that >if I broke something I would notice it immediately and could either fix it or >roll back. Unfortunately, the various DIDescriptor.Verify() methods are nowhere >near strict enough - you can create completely nonsensical DIEs that still pass >through Verify(). And even if the Verify() methods were 100% reliable, they only >test whether the LLVM metadata is valid - they don't test whether the actual >DWARF embedded in the final executable is correct. > > I suppose you could do something with dwarfdump -ka, although it would be >better to have something that worked on all platforms. Even dwarfdump itself has >different option syntax on Linux vs. OS X. And I don't think it's possible right >now to generate code that passes through dwarfdump with zero error messages, or >at least, I've never been able to figure out how to do it. > > I was thinking that since lldb needs to know how to interpret all this stuff >anyway, perhaps there could be a way to use the same code to validate the debug >information for an executable. I know lldb doesn't run on every platform yet, >but I suspect that the parts of lldb which decode DWARF are fairly generic. 
> > -- > -- Talin > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev _______________________________________________ LLVM Developers mailing list LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110317/fa19316b/attachment.html From rengolin at systemcall.org Thu Mar 17 09:29:04 2011 From: rengolin at systemcall.org (Renato Golin) Date: Thu, 17 Mar 2011 14:29:04 +0000 Subject: [LLVMdev] Writing unit tests for DWARF? In-Reply-To: References: Message-ID: On 17 March 2011 13:48, Devang Patel wrote: > I think you are mistaken here. I maintain and support debug info for two front ends (llvm-gcc and clang). Go ahead and check svn archives for last one year and see how many times I had to update llvm-gcc FE. Hi Devang, First, I'm not attacking anyone. I said before and will say again: the work you've done is great. I know how complex it is to build something stable and keep it that way, and my comments were about how hard it was for me to help you in that matter. Take my last patch on Dwarf. I've run the tests, added my own, tested on a Mac and still, we found a problem only after the commit. I'm not saying that things like this won't happen, but it was really hard for me to test it and make sure the patch would actually work. > In last 5 or so llvm releases, encoded debug info representation in llvm IR has changed only once (using metadata, instead of global variables). All other changes are incremental *and* backward compatible. Not entirely true. The metadata style is the same, but the mechanism used to build it (DIBuilder) was changed (instead of DIFactory) without warning in Clang. 
That, per se, wouldn't be a problem, if the metadata generated by both of them were identical, which they were not.

As I said before, in December we merged the LLVM tree and it broke our debug generation. It took me until February to be able to have time to fix it, but when I did, lots of arguments were different. Some had their "file" nulled, some integer arguments became boolean (or vice-versa), and some new arguments appeared out of the blue.

However, migrating to DIBuilder took me only a couple of days and everything went back to normal again. So, while the infrastructure actually worked in the end, it took me by surprise and I had to guess what was going on to fix it properly.

> Regarding documentation, it is on my list. However, your argument has same disconnect as someone who looks at LangRef and says I do not know what exactly FE has to generate to produce a working program. Well, what you need is a How To Write a Front End document.

The debug documentation is not up-to-date. Metadata generated by DIBuilder doesn't look like what's in the docs. And the document describes some types and declarations, but it doesn't explain the relationship between them and also doesn't describe all of them.

So, different from LangRef, the debug doc is not a spec. It's just a document. I tried to write how to write a front-end for Dwarf (wiki), but as I didn't have enough knowledge on how to really use it, I couldn't go too far.

>> 2. Enhance Validate() methods to be extremely strict (like Module's),
>> so it fails straight away
>
> See my response regarding Verify().

So, IR has a natural verification process, while you're building it. Lots of assertions will prevent you from building rubbish, and that makes up for the lack of information in LangRef. After building IR, the validation process will catch most of what was left over and only a few bugs slip through to the codegen process, which also has loads of assertions.
So, the number of bugs that get through to execution time is as low as possible.

But it's way harder to verify Metadata, because of its inherent variant nature. I get that and am NOT asking for a magic wand (though, if you have it... ;). And Dwarf also doesn't help, because there is a lot you can do with Dwarf that is legal but won't amount to anything in a debugger.

What I'm proposing is a simple rule-set, enforced by a validation pass, that will reject dubious metadata. We could start as an optional pass, being very restrictive and failing most known code and unit tests. With time, we can extend and add corner cases to this validation until we're comfortable and turn it on by default.

I personally think that it's much easier to relax strict asserts than to rely on gdb for testing.

cheers,
--renato

From dpatel at apple.com Thu Mar 17 09:42:25 2011
From: dpatel at apple.com (Devang Patel)
Date: Thu, 17 Mar 2011 07:42:25 -0700
Subject: [LLVMdev] Writing unit tests for DWARF?
In-Reply-To: <843762.38553.qm@web55603.mail.re4.yahoo.com>
References: <517A3103-873F-4482-B44B-FB993F2104C5@apple.com> <843762.38553.qm@web55603.mail.re4.yahoo.com>
Message-ID: <934630AD-64D0-41DE-B06D-FDEA686BF0C2@apple.com>

On Mar 17, 2011, at 7:12 AM, Jan Sjodin wrote:

> Could dwarfdump --verify be used to check the debug info?

Yes, it could be used to validate the DWARF structure of debug info. It does not check whether the information communicated through dwarf is correct or not. E.g. if dwarf info says a variable is at frame pointer + x offset then you need a debugger to verify that; dwarfdump won't help you.

- Devang

>
> - Jan
>
> From: Devang Patel
> To: Talin
> Cc: LLVM Developers Mailing List
> Sent: Thu, March 17, 2011 9:41:10 AM
> Subject: Re: [LLVMdev] Writing unit tests for DWARF?
>
> Talin,
>
> If there is a magic wand, I would be interested to know!
>
> DIDescriptor.Verify() is not suitable for your needs. It checks structure of encoded debug info after optimizer has modified the IR.
Its main goal is inform Dwarf writer, at the end of code gen, which IR construct it should ignore. > > If you want to test code gen you have to link compiled code and run it regularly. That's what various build bots for llvm does. Same way, if you want to validate generated debug info you have to go through the debugger. > > That said, there is a new unit test harness available. All it needs is more unit tests... > > http://llvm.org/docs/TestingGuide.html#quickdebuginfotests > > - > Devang > > On Mar 16, 2011, at 11:40 PM, Talin wrote: > > > One problem that has been vexing me of late: It seems that whenever I run into a problem that requires debugging one of my programs in gdb, before I can do that I have to fix my frontend's broken generation of debugging info. > > > > The code that generates debugging information is quite fragile - you have to generate metadata for each of your files, classes, and functions, and do so without error, because if you do make a mistake, the only way you'll find out is because gdb refuses to debug your program. And as I work on the code, occasionally bugs creep in, either from my side or occasionally from the LLVM side. The problem is, that I don't always check if the debug information is valid, so several weeks can go by and I don't notice something broke. > > > > What is needed is some way to write a unit test for DWARF information, so that if I broke something I would notice it immediately and could either fix it or roll back. Unfortunately, the various DIDescriptor.Verify() methods are nowhere near strict enough - you can create completely nonsensical DIEs that still pass through Verify(). And even if the Verify() methods were 100% reliable, they only test whether the LLVM metadata is valid - they don't test whether the actual DWARF embedded in the final executable is correct. > > > > I suppose you could do something with dwarfdump -ka, although it would be better to have something that worked on all platforms. 
> > Even dwarfdump itself has different option syntax on Linux vs. OS X. And I don't think it's possible right now to generate code that passes through dwarfdump with zero error messages, or at least, I've never been able to figure out how to do it.
> >
> > I was thinking that since lldb needs to know how to interpret all this stuff anyway, perhaps there could be a way to use the same code to validate the debug information for an executable. I know lldb doesn't run on every platform yet, but I suspect that the parts of lldb which decode DWARF are fairly generic.
> >
> > --
> > -- Talin

From dpatel at apple.com Thu Mar 17 09:56:59 2011
From: dpatel at apple.com (Devang Patel)
Date: Thu, 17 Mar 2011 07:56:59 -0700
Subject: [LLVMdev] Writing unit tests for DWARF?
In-Reply-To:
References:
Message-ID: <7251472C-A7E8-4991-876C-FB61580151A2@apple.com>

On Mar 17, 2011, at 7:29 AM, Renato Golin wrote:

> On 17 March 2011 13:48, Devang Patel wrote:
>> I think you are mistaken here. I maintain and support debug info for two front ends (llvm-gcc and clang). Go ahead and check svn archives for last one year and see how many times I had to update llvm-gcc FE.
>
> Hi Devang,
>
> First, I'm not attacking anyone.

I understand. But you're missing the point of my comment :) If the IR used to encode debug info were changing rapidly, as you say, then I'd be forced to frequently modify the llvm-gcc FE.
However, I have not modified the llvm-gcc FE in the last year or so, so I'd say the encoded IR has been stable. In the last 6+ months, the llvm-gcc build bot running the gdb testsuite has been consistently reporting the same number of passes and fails (if you ignore inherent gdb testsuite stability issues).

> I said before and will say again: the
> work you've done is great. I know how complex it is to build something
> stable and keep it that way, and my comments were about how hard it
> was for me to help you in that matter.
>
> Take my last patch on Dwarf. I've run the tests, added my own, tested
> on a Mac and still, we found a problem only after the commit. I'm not
> saying that things like this won't happen, but it was really hard for
> me to test it and make sure the patch would actually work.

In other words, someone is changing target independent code generation and expects llvm regression tests to catch all bugs. If that were true, we wouldn't need any build bots linking, executing and running llvm generated code.

>
>> In last 5 or so llvm releases, encoded debug info representation in llvm IR has changed only once (using metadata, instead of global variables). All other changes are incremental *and* backward compatible.
>
> Not entirely true. The metadata style is the same, but the mechanism
> used to build it (DIBuilder) was changed (instead of DIFactory)
> without warning in Clang. That, per se, wouldn't be a problem, if the
> metadata generated by both of them were identical, which they were
> not.

Again, you're mistaken. llvm-gcc and dragon-egg still use DIFactory and debug info quality has remained the same. This says the IR used to encode debug info has not been impacted by DIBuilder vs. DIFactory.

Note, DIBuilder etc. are utilities used to produce IR, not the interface defined by the IR. In other words, replacement of the OldIRBuilder interface with NewIRBuilder has nothing to do with the stability of the llvm IR documented by LangRef.html.
> What I'm proposing is a simple rule-set, enforced by a validation
> pass, that will reject dubious metadata. We could start as an optional
> pass, being very restrictive and failing most known code and unit
> tests. With time, we can extend and add corner cases to this
> validation until we're comfortable and turn it on by default. I
> personally think that it's much easier to relax strict asserts than to
> rely on gdb for testing.

dwarfdump --verify will do this.

-
Devang

From rengolin at systemcall.org Thu Mar 17 10:44:38 2011
From: rengolin at systemcall.org (Renato Golin)
Date: Thu, 17 Mar 2011 15:44:38 +0000
Subject: [LLVMdev] Writing unit tests for DWARF?
In-Reply-To: <7251472C-A7E8-4991-876C-FB61580151A2@apple.com>
References: <7251472C-A7E8-4991-876C-FB61580151A2@apple.com>
Message-ID:

On 17 March 2011 14:56, Devang Patel wrote:
> In other words, someone is changing target independent code generation and expects llvm regression tests to catch all bugs. If that's true, we don't need any build bots linking and executing and running llvm generated code.

Ok, that was a bad example... ;)

> Again, you're mistaken. llvm-gcc and dragon-egg still uses DIFactory and debug info quality has remained same. This says the IR used to encode debug has not been impacted by DIBuilder vs. DIFactory.

I see, so that comes back to my original point. I couldn't build a complete debug infrastructure with DIFactory because I was lost on many implementation details of the order and types of metadata information in each IR statement. That's probably the reason why, in my case (and probably Talin's), it all blew up.

> Note, DIBuilder etc.. are utilities used to produce IR, not the interface defined by the IR. In other words, replacement of OldIRBuilder interface with NewIRBuilder has nothing to do with stability of llvm IR documented by LangRef.html.

Yes, I know. I'm more concerned with the 'what' and not the 'how'.
For me, up-to-date documentation on what's strictly needed to produce legal Dwarf, with a clear, short explanation for each field and how they relate to each other (as this is more important for debug info than for instructions), is of a higher priority than a full-blown validation system.

> dwarfdump --verify will do this.

Is this being used in LLVM tests? This is an idea.

I had a look at your debug tests in clang and they're similar to what I do here.

The problem with debug tests is that they don't depend only on the compiler, but also on the debugger for each host/target platform combination. Though, dwarfdump could help us grep out the basic stuff without the need to resort to a debugger to check for Dwarf structure, just correct locations and line information.

I'm using LIT to also check Dwarf structure, but I have to say that my success is limited. While I could get far by creating variables on metadata lines and checking they point to the right types, every time one tiny thing changes, I have to refactor most of the tests.

I did the same with Dwarf, checking for addresses of types and later seeing if the variable refers to it, checking if the location points to debug_loc or is just an expression, etc. But debug information is far too volatile to make that approach reasonable in the long run... :(

cheers,
--renato

From dag at cray.com Thu Mar 17 11:32:01 2011
From: dag at cray.com (David A. Greene)
Date: Thu, 17 Mar 2011 11:32:01 -0500
Subject: [LLVMdev] Long-Term ISel Design
In-Reply-To: (Chris Lattner's message of "Wed, 16 Mar 2011 22:54:14 -0500")
References:
Message-ID:

Chris Lattner writes:

>> 1. We have special target-specific operators for certain shuffles in X86,
>> such as X86unpckl.

> It also eliminates a lot of fragility. Before doing this, X86
> legalize would have to be very careful to specifically form shuffles
> that it knew isel would turn into (e.g.) unpck operations.
> Now instead of forming specific carefully constructed shuffle masks (not
> making sure other code doesn't violate them) it can just directly form
> the X86ISD node.

Right. What I've presented would reverse this. Rather than making Legalize have to know about what table-driven isel can and cannot do, have table-driven isel run first, see what it can do and then leave the rest for manual selection.

We would still keep the existing pre-table-driven-isel passes so we'd still have a chance to do some cleanup before the main table-driven isel.

Obviously a lot of details have to be worked out.

>> 2. Sometimes DAGs are legal in some contexts but not others and it is a
>> pain to deal with. A good example is VBROADCAST, where a <0,0,0,0>
>> shuffle is natively supported if the source vector is in memory.
>> Otherwise it's not legal and manual lowering is required. In this
>> case the legality check is doing the DAG match by hand, replicating
>> what TableGen-produced code already does.

> Yes, this isn't good. Instead, the shuffle should be legalized to
> something that takes a pointer (memory operand). That means that X86
> isel would form the *fully legal* X86ISD node, and nothing would be
> able to break it and it could never fail to match.

Well, it doesn't _have_to_ form an X86ISD node. I don't do that now. But it's fragile in the sense that no one else should mess with that piece of the DAG.

But the real point is that in forming the X86ISD node currently, I'm doing exactly what the tblgen-generated code already does. If the shuffle doesn't take a memory operand, then I have to lower it to something else. Where I do that (before or after table-driven isel) doesn't matter. I do the same work either way. But by doing it after, I avoid writing duplicate DAG matching code in the case where the operand is in memory.

>> These two examples are related: we're duplicating functionality manually
>> that's already available automatically.
>
> Not sure what you mean by this.
I mean that in legalize/lowering we're massaging the DAG to get it into a state where table-driven isel can match it. There is a lot of code like this:

  if (shuffle_is_MOVL)
    do_nothing_and_return

It's duplicating exactly the checks that the table-driven isel does later. In the VBROADCASTSS/D case, it's doing an entire DAG match to check whether it's implementable with VBROADCASTSS/D. Why not just let table-driven isel run first and take care of these checks just once? If something doesn't match, we then know it needs manual lowering and selection.

>> legalize
>>    |
>>    V
>> manual lowering (X86ISelLowering)
>>    |
>>    V
>> manual isel (X86ISelDAGToDAG)
>>    |
>>    V
>> table-driven isel (.td files/X86GenDAGISel)
>>    |
>>    V
>> manual isel (some to-be-design piece)
>
> I'm not sure what you mean here. Are you suggesting that these be
> completely separate passes over the dag? Why do "manual isel" and
> "table driven isel" as separate passes? If they are interlaced, then
> how is this different than what we already have?

No, not as separate passes. Right now we have code like this in X86ISelDAGToDAG:

  X86DAGToDAGISel::Select(SDNode *Node) {
    ...
    switch (Opcode) {
      ...do a bunch of manual selection...
    }

    // If we get here we didn't select manually.
    SelectCode(); // Select via table-driven isel, abort if no match.
  }

What I'm proposing is that we do this:

  X86DAGToDAGISel::Select(SDNode *Node) {
    ...
    switch (Opcode) {
      ...do a bunch of manual selection, less than before...
    }

    // If we get here we didn't select manually.
    result = SelectCode(); // Select via table-driven isel.

    if (result_is_good()) {
      return;
    }

    switch (Opcode) {
      ...do a bunch of manual selection, some that used to be above and
         in legalize/lowering...
    }

    cannot_select_abort();
  }

> You're saying that we do this on a node-by-node basis?

You mean on an SDNode basis? I don't think so.
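The ordering proposed in the Select() pseudocode above can be illustrated with a small, self-contained C++ toy. This is only a sketch under stated assumptions: `Node`, `Opcode`, `SelectedBy`, `manualEarlyMatches`, `tableMatches`, `manualLateMatches` and `selectNode` are all hypothetical names invented for illustration and are not LLVM APIs. The only point it shows is that a table-driven matcher which reports "no match" instead of aborting lets a late manual stage pick up the leftovers.

```cpp
#include <cassert>

// Hypothetical opcodes, invented for this sketch (not LLVM's ISD/X86ISD).
enum class Opcode {
  Add,
  BroadcastFromMemory,   // pattern the tables are assumed to cover
  BroadcastFromRegister, // assumed to need manual lowering
  NeedsEarlyFixup,       // assumed to need massaging before the tables
  Unsupported
};

// Which stage claimed the node; None means "cannot select".
enum class SelectedBy { None, ManualEarly, Table, ManualLate };

struct Node {
  Opcode opcode;
};

// Stage 1: the reduced manual selection that still runs first.
bool manualEarlyMatches(const Node &n) {
  return n.opcode == Opcode::NeedsEarlyFixup;
}

// Stage 2: stand-in for the generated matcher. It returns false on
// failure rather than aborting, which is the key change in the proposal.
bool tableMatches(const Node &n) {
  return n.opcode == Opcode::Add || n.opcode == Opcode::BroadcastFromMemory;
}

// Stage 3: manual selection for whatever the tables could not match.
bool manualLateMatches(const Node &n) {
  return n.opcode == Opcode::BroadcastFromRegister;
}

// Tries the three stages in order; the caller would abort on None.
SelectedBy selectNode(const Node &n) {
  if (manualEarlyMatches(n))
    return SelectedBy::ManualEarly;
  if (tableMatches(n))
    return SelectedBy::Table;
  if (manualLateMatches(n))
    return SelectedBy::ManualLate;
  return SelectedBy::None;
}
```

In this toy, a broadcast whose source is in memory is claimed by the table stage without any hand-written legality check, while the register form falls through to the late manual stage, which mirrors the VBROADCAST example in the message.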
As I said, details have to be worked out but I imagine we'd send the selection DAG to the tblgen-generated code as we do now and any "leftover" bits would have to be processed afterward by the manual selector.

Now, there is a phase ordering issue in that some of the legalize/lowering code massages the tree so the tblgen stuff can make a "good" match. We probably still want to do that in some cases so we keep that code where it is. What I'm aiming at getting rid of is all of the code that does:

  if (this_is_already_matchable()) {
    return;
  }

> The reason that codegen aborts on unselectable operations is that they
> are invalid and should not be formed. Your example of vbroadcast is a
> great one: the X86ISD node for it *should not take a vector register
> input*. If it does, then the X86ISD node is incorrectly defined.

What I'm saying is that there would be no X86ISD node. If the pattern is there, tblgen-produced code will match it. If not, we have to lower it manually and would just do what we'd have done in legalize/lowering before, the difference being we'd now do it after table-driven isel rather than before.

-Dave

From mclow.lists at gmail.com Thu Mar 17 11:47:43 2011
From: mclow.lists at gmail.com (Marshall Clow)
Date: Thu, 17 Mar 2011 09:47:43 -0700
Subject: [LLVMdev] make: *** No rule to make target `/Makefile', needed by `Makefile'. Stop.
Message-ID: <0EC2F7B7-45A8-4FB4-B2E9-F375F7BC48CC@gmail.com>

I have two different machines with LLVM source trees on them. One builds fine, the other gives the error above. This behavior started on Tuesday.

Both machines have the sources on a non-boot disk. The source trees are identical, and in both cases, I have a separate object directory.

I have removed the object directories, and reran configure:

  mkdir llvm-build
  cd llvm-build
  ../llvm/configure

Still get the same behavior (success on one, failure on the other).

$ make -v
> GNU Make 3.81
> Copyright (C) 2006 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.
> There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
> PARTICULAR PURPOSE.
>
> This program built for i386-apple-darwin10.0

The only differences that I can think of are:
1) The unsuccessful machine has recently had Xcode 4 installed.
2) The successful machine has the sources on a volume with a case-sensitive file system.

Any ideas?

-- Marshall

Marshall Clow Idio Software

A.D. 1517: Martin Luther nails his 95 Theses to the church door and is promptly moderated down to (-1, Flamebait).
-- Yu Suzuki

From Anton.Lokhmotov at arm.com Thu Mar 17 12:17:27 2011
From: Anton.Lokhmotov at arm.com (Anton Lokhmotov)
Date: Thu, 17 Mar 2011 17:17:27 -0000
Subject: [LLVMdev] [PATCH] OpenCL half support
References: <000201cbd383$c35adec0$4a109c40$%Lokhmotov@arm.com> <8AF16F0D-A73A-455B-9141-C30C234299D8@apple.com>
Message-ID: <000201cbe4c7$310977c0$931c6740$@Lokhmotov@arm.com>

Hi Chris,

So what do you think about this proposal? If you agree, it would be good to include the patch into the 2.9 release (to avoid breaking compatibility later).

Best regards,
Anton.

> -----Original Message-----
> From: Anton Lokhmotov [mailto:Anton.Lokhmotov at arm.com]
> Sent: 24 February 2011 10:19
> To: 'Chris Lattner'
> Cc: llvmdev at cs.uiuc.edu; cfe-dev at cs.uiuc.edu
> Subject: RE: [LLVMdev] [PATCH] OpenCL half support
>
> Hi Chris,
>
> > Does the spec force evaluation to happen in half mode, or does it
> > specify that there is a promotion to float (or some other type), an
> > operation, then truncation back to half?
>
> The last paragraph in section 9.6 says: "NOTE: Implementations may
> perform floating-point operations on half scalar or vector data types
> by converting the half values to single precision floating-point values
> and performing the operation in single precision floating-point. In
> this case, the implementation will use the half scalar or vector data
> type as a storage only format."
>
> That is, an implementation may perform operations on half scalar and
> vector values either using half-precision operations (if supported
> natively) or using single-precision operations (always supported
> natively). In either case, it's desirable to represent half operations
> in the IR, and let the backend make the decision.
>
> Cheers,
> Anton.

From anton at korobeynikov.info Thu Mar 17 12:56:24 2011
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Thu, 17 Mar 2011 20:56:24 +0300
Subject: [LLVMdev] [PATCH] OpenCL half support
In-Reply-To: <-3007323827937840415@unknownmsgid>
References: <000201cbd383$c35adec0$4a109c40$%Lokhmotov@arm.com> <8AF16F0D-A73A-455B-9141-C30C234299D8@apple.com> <-3007323827937840415@unknownmsgid>
Message-ID:

Hi Anton,

> So what do you think about this proposal? If you agree, it would be good to
> include the patch into the 2.9 release (to avoid breaking compatibility
> later).

Regardless of the review, it's too late for 2.9; the release was already branched.

PS: my 2 cents: do not forget to handle the existing half fp <-> float conversion intrinsics.

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From dpatel at apple.com Thu Mar 17 13:45:01 2011
From: dpatel at apple.com (Devang Patel)
Date: Thu, 17 Mar 2011 11:45:01 -0700
Subject: [LLVMdev] Writing unit tests for DWARF?
In-Reply-To:
References: <7251472C-A7E8-4991-876C-FB61580151A2@apple.com>
Message-ID:

On Mar 17, 2011, at 8:44 AM, Renato Golin wrote:

>> dwarfdump --verify will do this.
>
> Is this being used in LLVM tests? This is an idea.

It is not used in llvm/test tests.

> I had a look at your debug tests in clang and they're similar to what I do here.
>
> The problem with debug tests is that it doesn't depend only on the
> compiler, but on the debugger for each host/target platform
> combinations.
> Though, dwarfdump could help us grep out the basic stuff
> without the need to resort to a debugger to check for Dwarf structure,
> just correct locations and line information.

Yes, it'd be good to have a setup to build SingleSource and MultiSource tests with debug info and run dwarfdump --verify on them.

-
Devang

From rengolin at systemcall.org Thu Mar 17 16:00:24 2011
From: rengolin at systemcall.org (Renato Golin)
Date: Thu, 17 Mar 2011 21:00:24 +0000
Subject: [LLVMdev] Writing unit tests for DWARF?
In-Reply-To:
References: <7251472C-A7E8-4991-876C-FB61580151A2@apple.com>
Message-ID:

On 17 March 2011 18:45, Devang Patel wrote:
> Yes, It'd be good to have a setup to build SingleSource and MultiSource tests with debug info and run dwarfdump --verify on them.

I tried dwarfdump on a few examples I had, and comparison with codesourcery's gcc is impossible; the resulting Dwarf is very different.

For instance, GCC declares the types at the beginning of the tree while LLVM only does so when needed (metadata-style). The relocation sections in GCC are huge and they also use debug_loc in many more cases than LLVM, for instance extern functions, global variables and the cases I mentioned in my example before. Of course, Dwarf produced by Armcc is also different (though closer to what GCC does, for obvious reasons).

One way we could do this, slowly and painfully, but surely, is to generate Dwarf, use the debugger to make sure that Dwarf actually produces what GDB is expecting (you probably have many cases already) and take a snapshot of that Dwarf. Once we understand how that Dwarf works and what the required tags are, we create a dwarfdump test that will FileCheck on those.

This is more or less how I'm doing my local IR/Dwarf/GDB tests. It takes a while, but it has saved me from some regressions already...
;)

--
cheers,
--renato

http://systemcall.org/

Reclaim your digital rights, eliminate DRM, learn more at http://www.defectivebydesign.org/what_is_drm

From bwp at bwp.dk Thu Mar 17 16:52:14 2011
From: bwp at bwp.dk (Bjarke Walling)
Date: Thu, 17 Mar 2011 22:52:14 +0100
Subject: [LLVMdev] Small improvements to llvm demo page (Bug 1440)
Message-ID:

Hi,

Some time ago I posted a patch against Bug 1440 about adding a compiler and target option to the llvm demo page. I didn't get any response when fixing the last problems in the patch. Will anyone take a look at it?

The patch adds a compiler option with the choice of clang and llvm-gcc. The source languages C/C++/Obj-C/Obj-C++ are available from clang and C/C++/Fortran are available from llvm-gcc.

An auto-generated target option is provided. The list is read from `llc -version` and split into two groups: Stable targets and Experimental targets. Maybe "stable" is too strong; e.g. I read that the C Backend is not that well supported anymore. Special targets (always listed first) are LLVM assembly (provided by llvm-dis) and LLVM C++ API code.

Last but not least, the shown version number is taken from `llvm-config --version`.

Any comments appreciated.

Thanks,
Bjarke Walling

From dpatel at apple.com Thu Mar 17 16:54:20 2011
From: dpatel at apple.com (Devang Patel)
Date: Thu, 17 Mar 2011 14:54:20 -0700
Subject: [LLVMdev] Writing unit tests for DWARF?
In-Reply-To:
References: <7251472C-A7E8-4991-876C-FB61580151A2@apple.com>
Message-ID: <96F375C7-9B91-4184-AB61-D1AB0A489202@apple.com>

Renato,

On Mar 17, 2011, at 2:00 PM, Renato Golin wrote:

> On 17 March 2011 18:45, Devang Patel wrote:
>> Yes, It'd be good to have a setup to build SingleSource and MultiSource tests with debug info and run dwarfdump --verify on them.
>
> I tried some dwarfdump on a few examples I had and the comparison with
> codesourcery's gcc is impossible, the resulting Dwarf is very
> different.

I did not mean comparing dwarfdump output. It is never going to work.
Sorry for the confusion. I meant letting dwarfdump verify the structure of the dwarf info.

> For instance, GCC declares the types at the beginning of the tree
> while LLVM only does when needed (metadata-style). The relocation
> sections in GCC are huge and they also use debug_loc in many more
> cases than LLVM, for instance extern functions, global variables and
> the cases I mentioned in my example before. Of course, Dwarf produced
> by Armcc is also different (though, closer to what GCC does, for
> obvious reasons).

You'll find that dwarf produced by llvm-gcc and clang is also different (even from the days when clang used DIFactory). And guess what: the ordering of the DIEs (Debug Info Entries) that clang generates is likely to change again in the near future!

> One way we could do this, slowly and painfully, but surely, is to
> generate Dwarf, use the debugger to make sure that Dwarf actually
> produces what GDB is expecting (you probably have many cases already)
> and take a snapshot of that Dwarf. Once we understand how that Dwarf
> works and what are the required tags, we create a dwarfdump test that
> will FileCheck on those.

Instead, isn't it easier and more straightforward to do a FileCheck on debugger output in the first place?

-
Devang

From eli.friedman at gmail.com Thu Mar 17 19:27:59 2011
From: eli.friedman at gmail.com (Eli Friedman)
Date: Thu, 17 Mar 2011 17:27:59 -0700
Subject: [LLVMdev] [cfe-dev] [release_29] Good status of ppc-redhat-linux on Fedora 12 PS3
In-Reply-To: <08BBD0A2-B080-4FDA-AB59-5D6996DE05AD@apple.com>
References: <08BBD0A2-B080-4FDA-AB59-5D6996DE05AD@apple.com>
Message-ID:

On Wed, Mar 16, 2011 at 2:19 PM, Bill Wendling wrote:
> On Mar 15, 2011, at 7:25 PM, NAKAMURA Takumi wrote:
>
>> Good morning.
>>
> Hi Nakamura,
>
>> LLVM and clang can be built successfully on Fedora 12 PS3.
>>
> Hooray! :-)
>
>> On RC1, only one test failed.
>> test/CodeGen/X86/fold-pcmpeqd-0.ll
>>
> Eric commented that this should be fixed on the release branch right now.
>
>> On release_29 branch, all llvm tests can pass.
>
> Woo! :-)
>
>> (I don't mention clang tests :p )
>>
> >>.>
>
>> ...Takumi
>>
>>
>> Fedora release 12 (Constantine)
>> Linux speedking.localdomain 2.6.32.23-170.fc12.ppc64 #1 SMP Mon Sep 27
>> 17:09:35 UTC 2010 ppc64 ppc64 ppc64 GNU/Linux
>>
>> llvm config.status 2.9svn
>> configured by ../../llvm/configure, generated by GNU Autoconf 2.60,
>>  with options "'-C' '--build=ppc-redhat-linux' '--enable-targets=all'
>> '--enable-optimized' 'build_alias=ppc-redhat-linux'
>> '--with-optimize-option=-O3 -Werror'
>> '--prefix=/home/chapuni/BUILD/llvm-ppc-static/install/stage1'"
>>
>> Failing Tests (39):
>
> Hrm. Could you work with the Clang guys to prioritize these failures? If they're "real", we need to get them fixed soon.

There do appear to be "real" bugs in both the indexing code and the PCH code. The rest of the failures appear to be a combination of tests which are outright wrong and tests which make questionable assumptions about the host.

-Eli

> Thank you for the testing!
>
> -bw
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>

From rengolin at systemcall.org Fri Mar 18 04:25:08 2011
From: rengolin at systemcall.org (Renato Golin)
Date: Fri, 18 Mar 2011 09:25:08 +0000
Subject: [LLVMdev] Writing unit tests for DWARF?
In-Reply-To: <96F375C7-9B91-4184-AB61-D1AB0A489202@apple.com>
References: <7251472C-A7E8-4991-876C-FB61580151A2@apple.com> <96F375C7-9B91-4184-AB61-D1AB0A489202@apple.com>
Message-ID:

On 17 March 2011 21:54, Devang Patel wrote:
> I did not mean comparing dwarfdump output. It is never going to work. Sorry for the confusion. I meant letting dwarfdump verify the structure of dwarf info.

Yes, using dwarfdump to verify is fine, but producing correct Dwarf is not the same as producing THE correct Dwarf you need.
You still need some way of grepping for the symbols you want to have generated, or the test is not testing for the right thing. You could have regressions that do not break the dwarf, but break what the dwarf represents.

> Instead, isn't it easier and straight forward to do a FileCheck on debugger output in the first place ?

Oh, but that is only one level of testing, and it doesn't guarantee you're generating correct Dwarf, just "gdb compatible" Dwarf.

I'm doing all three levels (IR, Dwarf and gdb) and it's much easier to see it fail at the IR level, or even the Dwarf level, than to debug problems using gdb output. But I need some better validation for both IR and Dwarf.

One way of testing without comparing IR and Dwarf would be to have several tests, ALL of them validating with dwarfdump AND stepping through with gdb and testing every single detail in them. That way, you ensure that the Dwarf generated is gdb compatible and avoid regressions in that area. If all you care about is gdb compatibility, then you're safe.

But when you have a regression, you'll have two major problems:

1. You won't know where the regression started. There could have been multiple regressions and only the last one caused validation/gdb to fail, and the last one could even be an unrelated commit that changes ELF sections layout, for instance.

2. Even if there was only one regression, finding the bug would be tiresome. You won't have a "good version" of the IR or the Dwarf to compare and see what went wrong. Someone not used to Dwarf would take a long time to figure out what went wrong and how to fix it.

Depends on the level of compatibility you want...
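The "grep for the symbols you want" step discussed above can be sketched as an ordered substring check over a textual dump, in the spirit of consecutive FileCheck CHECK lines. This is only a minimal illustration: `firstMissingPattern` is a hypothetical helper, not part of FileCheck, dwarfdump or LLVM, and it assumes the dump has already been captured into a string. Real FileCheck supports regexes, variables and more.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: searches for each pattern in order, starting each
// search just past the end of the previous match. Returns the index of the
// first pattern that cannot be found, or patterns.size() if all are found.
std::size_t firstMissingPattern(const std::string &dump,
                                const std::vector<std::string> &patterns) {
  std::size_t pos = 0;
  for (std::size_t i = 0; i < patterns.size(); ++i) {
    const std::size_t found = dump.find(patterns[i], pos);
    if (found == std::string::npos)
      return i; // pattern i is missing, or appears only before an earlier match
    pos = found + patterns[i].size();
  }
  return patterns.size();
}
```

A test built this way would name only the tags and symbols it cares about (for example a `DW_TAG_subprogram` followed by the expected function name) rather than comparing whole dumps, which may make it less sensitive to the DIE reordering mentioned earlier in the thread, though it still cannot tell whether a location is right at run time.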
--
cheers,
--renato

http://systemcall.org/

Reclaim your digital rights, eliminate DRM, learn more at http://www.defectivebydesign.org/what_is_drm

From geek4civic at gmail.com Fri Mar 18 04:54:53 2011
From: geek4civic at gmail.com (NAKAMURA Takumi)
Date: Fri, 18 Mar 2011 18:54:53 +0900
Subject: [LLVMdev] [RC1] Status of Visual Studio 8, 9 and 10
Message-ID:

Good evening, guys!

First, I apologize that my report is a little rough; I have had little time to check everything in detail.

* RC1

RC1 can be built on VS8, 9, 10 with Debug|Release.
At one point, r127264 (in release_29/trunk) is needed to build with Debug on VS10.

RC1 can pass clang-test with any configuration.
RC1 fails many tests in llvm's check.

* RC1 and patches

ToT would be ready for a failure-free build with several patches.

- r127264 (in release_29/trunk)
- r127723 [PR6270] PathV1::makeUnique()
- r127731, r127732, r127733, r127734, r127775 [PR9234] test/CodeGen/X86 tweaks.
- r127872 [PR6745] format("%e")

Even with these patches, a few tests may fail.
It seems {VS8 | VS10} Release are good.
- VS8 Debug LLVM :: Transforms/SRETPromotion/basictest.ll LLVM-Unit :: support/debug/SupportTests.exe/CastingTest.cast - VS9 Debug LLVM :: Transforms/SRETPromotion/basictest.ll LLVM-Unit :: Support/Debug/SupportTests.exe/CastingTest.cast - VS9 Release LLVM :: CodeGen/ARM/bfi.ll LLVM :: CodeGen/ARM/va_arg.ll LLVM :: CodeGen/Thumb2/bfi.ll - VS10 Debug LLVM :: Transforms/SRETPromotion/basictest.ll LLVM-Unit :: Support/Debug/SupportTests.exe/CastingTest.cast ...Takumi -------------- next part -------------- Running Clang and LLVM regression tests -- Testing: 8733 tests, 8 threads -- FAIL: LLVM :: Transforms/SRETPromotion/basictest.ll (8221 of 8733) ******************** TEST 'LLVM :: Transforms/SRETPromotion/basictest.ll' FAILED ******************** Script: -- E:/llvm/build/cmake-vs8/bin/Debug/opt.EXE < E:/llvm/llvm/test/Transforms/SRETPromotion/basictest.ll -sretpromotion -S > E:/llvm/build/cmake-vs8/test/Transforms/SRETPromotion/Output/basictest.ll.tmp cat E:/llvm/build/cmake-vs8/test/Transforms/SRETPromotion/Output/basictest.ll.tmp | grep sret | E:/llvm/build/cmake-vs8/bin/Debug/count 1 -- Exit Code: 3 Command Output (stdout): -- Command 0: "E:/llvm/build/cmake-vs8/bin/Debug/opt.EXE" "-sretpromotion" "-S" Command 0 Result: 3 Command 0 Output: None Command 0 Stderr: CRT assert: D:\Program Files (x86)\Microsoft Visual Studio 8\VC\include\vector(238) : Assertion failed: vector iterators incompatible -- ******************** FAIL: LLVM-Unit :: support/debug/SupportTests.exe/CastingTest.cast (8621 of 8733) ******************** TEST 'LLVM-Unit :: support/debug/SupportTests.exe/CastingTest.cast' FAILED ******************** Note: Google Test filter = CastingTest.cast [==========] Running 1 test from 1 test case. [----------] Global test environment set-up. 
[----------] 1 test from CastingTest [ RUN ] CastingTest.cast ..\..\..\llvm\unittests\Support\Casting.cpp(93): error: Expected: (&F5) != (null_foo), actual: (null) vs (null) [ FAILED ] CastingTest.cast (0 ms) [----------] 1 test from CastingTest (1 ms total) [----------] Global test environment tear-down [==========] 1 test from 1 test case ran. (1 ms total) [ PASSED ] 0 tests. [ FAILED ] 1 test, listed below: [ FAILED ] CastingTest.cast 1 FAILED TEST Classof: 0x7aca7c Classof: 0x7aca7c Classof: 0x7aca7c Classof: 0x0 Classof: 0x7aca7c Classof: 0x0 Classof: 0x7aca7c ******************** Testing Time: 176.39s ******************** Failing Tests (2): LLVM :: Transforms/SRETPromotion/basictest.ll LLVM-Unit :: support/debug/SupportTests.exe/CastingTest.cast Expected Passes : 8103 Expected Failures : 75 Unsupported Tests : 553 Unexpected Failures: 2 lit.py: LitConfig.py:99: warning: Unable to find 'bash.exe'. lit.py: LitConfig.py:99: warning: Unable to find 'bash.exe'. lit.py: lit.cfg:143: note: using clang: 'E:/llvm/build/cmake-vs8/bin/Debug/clang.EXE' 2 warning(s) in tests. 
-------------- next part -------------- Running Clang and LLVM regression tests -- Testing: 8733 tests, 8 threads -- FAIL: LLVM :: Transforms/SRETPromotion/basictest.ll (8220 of 8733) ******************** TEST 'LLVM :: Transforms/SRETPromotion/basictest.ll' FAILED ******************** Script: -- E:/llvm/build/cmake-vs9/bin/Debug/opt.EXE < E:/llvm/llvm/test/Transforms/SRETPromotion/basictest.ll -sretpromotion -S > E:/llvm/build/cmake-vs9/test/Transforms/SRETPromotion/Output/basictest.ll.tmp cat E:/llvm/build/cmake-vs9/test/Transforms/SRETPromotion/Output/basictest.ll.tmp | grep sret | E:/llvm/build/cmake-vs9/bin/Debug/count 1 -- Exit Code: 3 Command Output (stdout): -- Command 0: "E:/llvm/build/cmake-vs9/bin/Debug/opt.EXE" "-sretpromotion" "-S" Command 0 Result: 3 Command 0 Output: None Command 0 Stderr: CRT assert: d:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\include\vector(251) : Assertion failed: vector iterators incompatible -- ******************** FAIL: LLVM-Unit :: Support/Debug/SupportTests.exe/CastingTest.cast (8619 of 8733) ******************** TEST 'LLVM-Unit :: Support/Debug/SupportTests.exe/CastingTest.cast' FAILED ******************** Note: Google Test filter = CastingTest.cast [==========] Running 1 test from 1 test case. [----------] Global test environment set-up. [----------] 1 test from CastingTest [ RUN ] CastingTest.cast ..\..\..\llvm\unittests\Support\Casting.cpp(93): error: Expected: (&F5) != (null_foo), actual: (null) vs (null) [ FAILED ] CastingTest.cast (0 ms) [----------] 1 test from CastingTest (0 ms total) [----------] Global test environment tear-down [==========] 1 test from 1 test case ran. (0 ms total) [ PASSED ] 0 tests. 
[ FAILED ] 1 test, listed below: [ FAILED ] CastingTest.cast 1 FAILED TEST Classof: 0xc64a64 Classof: 0xc64a64 Classof: 0xc64a64 Classof: 0x0 Classof: 0xc64a64 Classof: 0x0 Classof: 0xc64a64 ******************** Testing Time: 123.81s ******************** Failing Tests (2): LLVM :: Transforms/SRETPromotion/basictest.ll LLVM-Unit :: Support/Debug/SupportTests.exe/CastingTest.cast Expected Passes : 8103 Expected Failures : 75 Unsupported Tests : 553 Unexpected Failures: 2 lit.py: LitConfig.py:99: warning: Unable to find 'bash.exe'. lit.py: LitConfig.py:99: warning: Unable to find 'bash.exe'. lit.py: lit.cfg:143: note: using clang: 'E:/llvm/build/cmake-vs9/bin/Debug/clang.EXE' 2 warning(s) in tests. -------------- next part -------------- Running Clang and LLVM regression tests -- Testing: 8724 tests, 8 threads -- FAIL: LLVM :: CodeGen/ARM/bfi.ll (3435 of 8724) ******************** TEST 'LLVM :: CodeGen/ARM/bfi.ll' FAILED ******************** Script: -- E:/llvm/build/cmake-vs9/bin/Release/llc.EXE -march=arm -mattr=+v6t2 < E:/llvm/llvm/test/CodeGen/ARM/bfi.ll | E:/llvm/build/cmake-vs9/bin/Release/FileCheck.EXE E:/llvm/llvm/test/CodeGen/ARM/bfi.ll -- Exit Code: 1 Command Output (stdout): -- Command 0: "E:/llvm/build/cmake-vs9/bin/Release/llc.EXE" "-march=arm" "-mattr=+v6t2" Command 0 Result: -3 Command 0 Output: Command 0 Stderr: Stack dump: 0. Program arguments: E:/llvm/build/cmake-vs9/bin/Release/llc.EXE -march=arm -mattr=+v6t2 1. Running pass 'Function Pass Manager' on module ''. 2. 
Running pass 'ARM Instruction Selection' on function '@f1' 00374D64 (0x012D0A48 0x012E8598 0x01396D30 0x00000000) 012CDA60 (0x012E8598 0x01396D30 0x00000000 0x00000000) 012D0A48 (0x01396D30 0x00000000 0x00000000 0x012EE690) 012E8598 (0x00000000 0x00000000 0x012EE690 0x00000001) 01396D30 (0x012D7340 0x012D7330 0x012D41B0 0x012EAF10) 012CD688 (0x012D7330 0x012D41B0 0x012EAF10 0x012EB2C4) 012D7340 (0x012D41B0 0x012EAF10 0x012EB2C4 0x012EAFA0) 012D7330 (0x012EAF10 0x012EB2C4 0x012EAFA0 0x012EB220) 012D41B0 (0x012EB2C4 0x012EAFA0 0x012EB220 0x00000000) 012EAF10 (0x012EAFA0 0x012EB220 0x00000000 0x012E2C50) 012EB2C4 (0x00000000 0x012EBF10 0x012EBF18 0x012EC710) Command 1: "E:/llvm/build/cmake-vs9/bin/Release/FileCheck.EXE" "E:/llvm/llvm/test/CodeGen/ARM/bfi.ll" Command 1 Result: 1 Command 1 Output: Command 1 Stderr: FileCheck error: '-' is empty. -- ******************** FAIL: LLVM :: CodeGen/ARM/va_arg.ll (3611 of 8724) ******************** TEST 'LLVM :: CodeGen/ARM/va_arg.ll' FAILED ******************** Script: -- E:/llvm/build/cmake-vs9/bin/Release/llc.EXE < E:/llvm/llvm/test/CodeGen/ARM/va_arg.ll -mtriple=armv7-none-linux-gnueabi | E:/llvm/build/cmake-vs9/bin/Release/FileCheck.EXE E:/llvm/llvm/test/CodeGen/ARM/va_arg.ll -- Exit Code: 1 Command Output (stdout): -- Command 0: "E:/llvm/build/cmake-vs9/bin/Release/llc.EXE" "-mtriple=armv7-none-linux-gnueabi" Command 0 Result: -3 Command 0 Output: Command 0 Stderr: Stack dump: 0. Program arguments: E:/llvm/build/cmake-vs9/bin/Release/llc.EXE -mtriple=armv7-none-linux-gnueabi 1. Running pass 'Function Pass Manager' on module ''. 2. Running pass 'ARM Instruction Selection' on function '@test1' 01444D64 (0x0025FA2C 0x00C8F170 0x00C8F178 0x00C8F180) Command 1: "E:/llvm/build/cmake-vs9/bin/Release/FileCheck.EXE" "E:/llvm/llvm/test/CodeGen/ARM/va_arg.ll" Command 1 Result: 1 Command 1 Output: Command 1 Stderr: FileCheck error: '-' is empty. 
-- ******************** FAIL: LLVM :: CodeGen/Thumb2/bfi.ll (4555 of 8724) ******************** TEST 'LLVM :: CodeGen/Thumb2/bfi.ll' FAILED ******************** Script: -- E:/llvm/build/cmake-vs9/bin/Release/llc.EXE -march=thumb -mattr=+v6t2 < E:/llvm/llvm/test/CodeGen/Thumb2/bfi.ll | E:/llvm/build/cmake-vs9/bin/Release/FileCheck.EXE E:/llvm/llvm/test/CodeGen/Thumb2/bfi.ll -- Exit Code: 1 Command Output (stdout): -- Command 0: "E:/llvm/build/cmake-vs9/bin/Release/llc.EXE" "-march=thumb" "-mattr=+v6t2" Command 0 Result: -3 Command 0 Output: Command 0 Stderr: Stack dump: 0. Program arguments: E:/llvm/build/cmake-vs9/bin/Release/llc.EXE -march=thumb -mattr=+v6t2 1. Running pass 'Function Pass Manager' on module ''. 2. Running pass 'ARM Instruction Selection' on function '@f1' 01364D64 (0x00C4F7A0 0x00C68818 0x019F6D30 0x00000000) 00C4C7B8 (0x00C68818 0x019F6D30 0x00000000 0x00000000) 00C4F7A0 (0x019F6D30 0x00000000 0x00000000 0x00C6E910) 00C68818 (0x00000000 0x00000000 0x00C6E910 0x00000001) 019F6D30 (0x00C566E8 0x00C566D8 0x00C540A8 0x00C6B190) 00C4C4A8 (0x00C566D8 0x00C540A8 0x00C6B190 0x00C6B544) 00C566E8 (0x00C540A8 0x00C6B190 0x00C6B544 0x00C6B220) 00C566D8 (0x00C6B190 0x00C6B544 0x00C6B220 0x00C6B4A0) 00C540A8 (0x00C6B544 0x00C6B220 0x00C6B4A0 0x00000000) 00C6B190 (0x00C6B220 0x00C6B4A0 0x00000000 0x00C62B28) 00C6B544 (0x00000000 0x00C6C190 0x00C6C198 0x00C6C990) Command 1: "E:/llvm/build/cmake-vs9/bin/Release/FileCheck.EXE" "E:/llvm/llvm/test/CodeGen/Thumb2/bfi.ll" Command 1 Result: 1 Command 1 Output: Command 1 Stderr: FileCheck error: '-' is empty. -- ******************** Testing Time: 94.80s ******************** Failing Tests (3): LLVM :: CodeGen/ARM/bfi.ll LLVM :: CodeGen/ARM/va_arg.ll LLVM :: CodeGen/Thumb2/bfi.ll Expected Passes : 8093 Expected Failures : 75 Unsupported Tests : 553 Unexpected Failures: 3 lit.py: LitConfig.py:99: warning: Unable to find 'bash.exe'. lit.py: LitConfig.py:99: warning: Unable to find 'bash.exe'. 
lit.py: lit.cfg:143: note: using clang: 'E:/llvm/build/cmake-vs9/bin/Release/clang.EXE' 2 warning(s) in tests. -------------- next part -------------- Running Clang and LLVM regression tests -- Testing: 8763 tests, 8 threads -- FAIL: LLVM :: Transforms/SRETPromotion/basictest.ll (8221 of 8763) ******************** TEST 'LLVM :: Transforms/SRETPromotion/basictest.ll' FAILED ******************** Script: -- E:/llvm/build/cmake-vs10/bin/Debug/opt.EXE < E:/llvm/llvm/test/Transforms/SRETPromotion/basictest.ll -sretpromotion -S > E:/llvm/build/cmake-vs10/test/Transforms/SRETPromotion/Output/basictest.ll.tmp cat E:/llvm/build/cmake-vs10/test/Transforms/SRETPromotion/Output/basictest.ll.tmp | grep sret | E:/llvm/build/cmake-vs10/bin/Debug/count 1 -- Exit Code: 3 Command Output (stdout): -- Command 0: "E:/llvm/build/cmake-vs10/bin/Debug/opt.EXE" "-sretpromotion" "-S" Command 0 Result: 3 Command 0 Output: None Command 0 Stderr: CRT assert: D:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\vector(238) : Assertion failed: vector iterators incompatible -- ******************** FAIL: LLVM-Unit :: Support/Debug/SupportTests.exe/CastingTest.cast (8618 of 8763) ******************** TEST 'LLVM-Unit :: Support/Debug/SupportTests.exe/CastingTest.cast' FAILED ******************** Note: Google Test filter = CastingTest.cast [==========] Running 1 test from 1 test case. [----------] Global test environment set-up. [----------] 1 test from CastingTest [ RUN ] CastingTest.cast 161>..\..\..\llvm\unittests\Support\Casting.cpp(93): error : Expected: (&F5) != (null_foo), actual: (null) vs (null) [ FAILED ] CastingTest.cast (1 ms) [----------] 1 test from CastingTest (1 ms total) [----------] Global test environment tear-down [==========] 1 test from 1 test case ran. (1 ms total) [ PASSED ] 0 tests. 
[ FAILED ] 1 test, listed below: [ FAILED ] CastingTest.cast 1 FAILED TEST Classof: 0x6438fc Classof: 0x6438fc Classof: 0x6438fc Classof: 0x0 Classof: 0x6438fc Classof: 0x0 Classof: 0x6438fc ******************** Testing Time: 125.83s ******************** Failing Tests (2): LLVM :: Transforms/SRETPromotion/basictest.ll LLVM-Unit :: Support/Debug/SupportTests.exe/CastingTest.cast Expected Passes : 8133 Expected Failures : 75 Unsupported Tests : 553 Unexpected Failures: 2 lit.py: LitConfig.py:99: warning: Unable to find 'bash.exe'. lit.py: LitConfig.py:99: warning: Unable to find 'bash.exe'. lit.py: lit.cfg:143: note: using clang: 'E:/llvm/build/cmake-vs10/bin/Debug/clang.EXE' 2 warning(s) in tests. From geek4civic at gmail.com Fri Mar 18 05:44:04 2011 From: geek4civic at gmail.com (NAKAMURA Takumi) Date: Fri, 18 Mar 2011 19:44:04 +0900 Subject: [LLVMdev] [RC1] Status of Mingw MSYS Message-ID: Good evening, guys! I suppose mingw build would be stable, though, I would like some patches to be picked up. * RC1 LLVM and clang can be built on either msys/autoconf, msys/cmake and mingw/cmake. By CMake, all tests can run but 37 of LLVM and 5 of clang tests would fail. On mingw by configure tests cannot be executed. [PR9505] For compiling, I saw a warning, in llvm-bcanalyzer.cpp. (fixed in r127858) * RC1 3-stage build Please note configure and cmake will set optimizer option to -O2 by default even if clang++ were used. I overwrote for clang --with-optimize-option=-O3. Generated clang would be stable. But stage2-bin and stage3-bin will not match. Investigating. [PR6270] Without r127723, generated clang would be unavailable for parallel build. [PR9505] With -Asserts, I saw more two warnings, lib/Analysis/LazyValueInfo.cpp and lib/Transforms/Instrumentation/PathProfiling.cpp. * RC1 with patches. I have committed many patches for MSYS and mingw. With them, clang and llvm can pass tests without any failures. (On TOT, we can run mingw tests without failures!) 
[LLVM]
- r127239, r127240 Availability of Lit on MSYS configure.
- r127723 [PR6270] PathV1::makeUnique()
- r127726 Fix failure in test/Other/close-stderr.ll on Windows 7
- r127730 [PR9234] Fix test/CodeGen/X86/dyn-stackalloc.ll with MSYS bash
- r127731, r127732, r127733, r127734, r127775 [PR9234] test/CodeGen/X86 tweaks.
- r127858 [PR9505] Warning in llvm-bcanalyzer.cpp
- r127872 [PR6745] format("%e")

[clang]
- r127284 Availability of Lit on MSYS configure.
- r127729 test/Driver/hello.c: Tweak for cygming.
- r127860, r127861 tweak 2 tests for MSYS bash.

...Takumi

From atrick at apple.com Fri Mar 18 12:10:34 2011
From: atrick at apple.com (Andrew Trick)
Date: Fri, 18 Mar 2011 10:10:34 -0700
Subject: [LLVMdev] IndVarSimplify too aggressive ?
In-Reply-To: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1EF36C7@FRPAR1CL009.coe.adi.dibcom.com>
References: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1EF36C7@FRPAR1CL009.coe.adi.dibcom.com>
Message-ID: <5B577C73-1798-4997-B275-F1DC976FBA46@apple.com>

On Mar 13, 2011, at 2:01 PM, Arnaud Allard de Grandmaison wrote:

> Hi all,
>
> The IndVarSimplify pass seems to be too aggressive when it enlarges the induction variable type; this can pessimize the generated code when the new induction variable size is not natively supported by the target. This is probably not an issue for x86_64, which natively supports all types, but it is a real one for several embedded targets, with very few native types.
>
> I attached a patch to address this issue; if TargetData is available, the patch attempts to keep the induction variable to a native type when going through the induction variable users.
>
> Also attached is my test-case in C, as well as the resulting assembly output, with and without the patch applied, for arm and x86_32 targets. You will note the loop instruction count can be reduced by 30% in several cases.
>
> The patch could probably be made smarter: I welcome all suggestions.

Hi Arnaud,

This should be fixed in r127884.
In some cases, your patch could result in multiple IVs for the same recurrence, and LSR was not able to clean up afterward. Dan Gohman proposed an alternative, which seems to work great. See http://llvm.org/bugs/show_bug.cgi?id=9490.

-Andy

From dpatel at apple.com Fri Mar 18 13:14:27 2011
From: dpatel at apple.com (Devang Patel)
Date: Fri, 18 Mar 2011 11:14:27 -0700
Subject: [LLVMdev] Writing unit tests for DWARF?
In-Reply-To:
References: <7251472C-A7E8-4991-876C-FB61580151A2@apple.com> <96F375C7-9B91-4184-AB61-D1AB0A489202@apple.com>
Message-ID: <44CD7927-74E2-4722-8896-AEAB4F2EDA0E@apple.com>

On Mar 18, 2011, at 2:25 AM, Renato Golin wrote:

>> Instead, isn't it easier and straightforward to do a FileCheck on debugger output in the first place?
>
> Oh, but that is only one level of testing, and it doesn't guarantee
> you're generating correct Dwarf, just "gdb compatible" Dwarf.
>
> I'm doing all three levels (IR, Dwarf and gdb) and it's much easier to
> see it fail in the IR level, or even Dwarf than to debug problems
> using gdb output.
>
> But I need some better validation for both IR and Dwarf.
>

We have setup to test all these three levels you mention above. All you need is several thousands tests. Do you think that it will show up magically one day ?

- Devang
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110318/3eb9f3e7/attachment.html

From douglasdocouto at gmail.com Fri Mar 18 13:43:27 2011
From: douglasdocouto at gmail.com (Douglas do Couto Teixeira)
Date: Fri, 18 Mar 2011 15:43:27 -0300
Subject: [LLVMdev] How to integrate an analysis into LVI?
In-Reply-To:
References:
Message-ID:

Hi guys,

I am trying to figure out how to use your Lazy Value Info pass, but I have some questions.

First, it seems that the implementation contains infrastructure to deal with range intervals, but the main interface only gives the client information about which values are constants.
Is this true?

Second, reading the code I see that the ranges are being computed, but I could not tell whether this part of the code is already complete. Would it be possible to print the ranges of the values computed by LVI?

Third, besides JumpThreading.cpp, is there any other client that uses LVI?

I would like to couple my range analysis (http://homepages.dcc.ufmg.br/~douglas/projects/RangeAnalysis/RangeAnalysis.paper.pdf) with the LVI interface, which is why I am interested in it.

Kind regards,

Douglas

On Mon, Mar 14, 2011 at 6:55 PM, Douglas do Couto Teixeira <douglasdocouto at gmail.com> wrote:

> Hi guys,
>
> I have an analysis that is able to answer questions like this: given an
> integer variable, what is the interval of values that this variable can
> assume during the program's execution?
>
> I want to integrate this analysis into LLVM and it seems LVI (Lazy Value
> Info) is the best place to do this kind of stuff. Can someone give some
> hints about what I have to do to integrate my analysis into LVI?
>
> Best regards,
>
> Douglas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110318/0df3b6fa/attachment.html

From jlapre at gmail.com Fri Mar 18 14:33:22 2011
From: jlapre at gmail.com (Justin M. LaPre)
Date: Fri, 18 Mar 2011 15:33:22 -0400
Subject: [LLVMdev] Reversing a function's CFG?
Message-ID: <8BB50157-7FC9-4119-8E95-04384790A1EA@gmail.com>

Hello,

I was wondering if there was a quick way to reverse a function's CFG and, in turn, all basic blocks within it. Assuming all variables are globals, is there a quick way to generate a function's reversal? I highly doubt such functionality exists but I figured it was worth asking.

I'm trying to develop an "undo function" generator pass that would be able to restore system state (without state-saving) after determining an error occurred.
This is documented in, "Efficient Optimistic Parallel Simulations using Reverse Computation" by Carothers et al.[1] Thanks, -Justin [1] http://www.cs.rpi.edu/~chrisc/publications/carothers-tomacs-1999.html From rengolin at systemcall.org Fri Mar 18 15:08:42 2011 From: rengolin at systemcall.org (Renato Golin) Date: Fri, 18 Mar 2011 20:08:42 +0000 Subject: [LLVMdev] Writing unit tests for DWARF? In-Reply-To: <44CD7927-74E2-4722-8896-AEAB4F2EDA0E@apple.com> References: <7251472C-A7E8-4991-876C-FB61580151A2@apple.com> <96F375C7-9B91-4184-AB61-D1AB0A489202@apple.com> <44CD7927-74E2-4722-8896-AEAB4F2EDA0E@apple.com> Message-ID: On 18 March 2011 18:14, Devang Patel wrote: > We have setup to test all these three levels you mention above. All you need > is several thousands tests. Do you think that it will show up magically one > day ? Hehehe, it might... ;) I'm trying to get some debug tests here (maybe a few thousand, I don't know yet). I'll let you know how it goes when we get something. -- cheers, --renato http://systemcall.org/ Reclaim your digital rights, eliminate DRM, learn more at http://www.defectivebydesign.org/what_is_drm From clattner at apple.com Fri Mar 18 15:14:38 2011 From: clattner at apple.com (Chris Lattner) Date: Fri, 18 Mar 2011 13:14:38 -0700 Subject: [LLVMdev] [PATCH] OpenCL half support In-Reply-To: <000201cbe4c7$310977c0$931c6740$%Lokhmotov@arm.com> References: <000201cbd383$c35adec0$4a109c40$%Lokhmotov@arm.com> <8AF16F0D-A73A-455B-9141-C30C234299D8@apple.com> <000201cbe4c7$310977c0$931c6740$%Lokhmotov@arm.com> Message-ID: <98E4E9C8-732A-4FDC-8DF6-FAACBD6857BD@apple.com> On Mar 17, 2011, at 10:17 AM, Anton Lokhmotov wrote: > Hi Chris, > > So what do you think about this proposal? If you agree, it would be good to > include the patch into the 2.9 release (to avoid breaking compatibility > later). Hi Anton, I'm sorry I don't have the patch anymore. Please resend. It is too late for new features in 2.9 though. 
>> The last paragraph in section 9.6 says: "NOTE: Implementations may >> perform floating-point operations on half scalar or vector data types >> by converting the half values to single precision floating-point values >> and performing the operation in single precision floating-point. In >> this case, the implementation will use the half scalar or vector data >> type as a storage only format." Ok. >> That is, an implementation may perform operations on half scalar and >> vector values either using half-precision operations (if supported >> natively) or using single-precision operations (always supported >> natively). In either case, it's desirable to represent half operations >> in the IR, and let the backend make the decision. It doesn't impact the utility of your approach, but I could not disagree more here. It would be *absolutely* the wrong thing to do for backends to compile IR half float operations into full float operations. Doing this would cause all sorts of problems with constant folding being inconsistent etc. Adding half float to LLVM IR is *only* reasonable if you have hardware that supports half float, or if you want to add softfloat operations for these. If we have a fp16 datatype in the IR, code generation *must* codegen these to something that implements the correct fp16 semantics. C is not a portable language, and trying to make LLVM IR magically fix this is a bad approach. Just like C compilers need to know sizeof(long), sizeof(void*) and many many other target specific details, an OpenCL compiler would need to know whether to generate fp16 or not. -Chris From clattner at apple.com Fri Mar 18 15:19:20 2011 From: clattner at apple.com (Chris Lattner) Date: Fri, 18 Mar 2011 13:19:20 -0700 Subject: [LLVMdev] Long-Term ISel Design In-Reply-To: References: Message-ID: On Mar 17, 2011, at 9:32 AM, David A. Greene wrote: > Chris Lattner writes: >>> 1. We have special target-specific operators for certain shuffles in X86, >>> such as X86unpckl. 
>
>> It also eliminates a lot of fragility. Before doing this, X86
>> legalize would have to be very careful to specifically form shuffles
>> that it knew isel would turn into (e.g.) unpck operations. Now
>> instead of forming specific carefully constructed shuffle masks (not
>> making sure other code doesn't violate them) it can just directly form
>> the X86ISD node.
>
> Right. What I've presented would reverse this. Rather than making
> Legalize have to know about what table-driven isel can and cannot do,
> have table-driven isel run first, see what it can do and then leave
> the rest for manual selection.
>
> We would still keep the existing pre-table-driven-isel passes so we'd
> still have a chance to do some cleanup before the main table-driven
> isel.
>
> Obviously a lot of details have to be worked out.

I'm not seeing how this is useful for shuffles. Since tblgen doesn't generate table based matching for *any* shuffles, all of the matching code would end up as C++ code in X86ISelDagToDag, which would give us all of the problems we had before by moving to X86ISD nodes.

>>> 2. Sometimes DAGs are legal in some contexts but not others and it is a
>>> pain to deal with. A good example is VBROADCAST, where a <0,0,0,0>
>>> shuffle is natively supported if the source vector is in memory.
>>> Otherwise it's not legal and manual lowering is required. In this
>>> case the legality check is doing the DAG match by hand, replicating
>>> what TableGen-produced code already does.
>
>> Yes, this isn't good. Instead, the shuffle should be legalized to
>> something that takes a pointer (memory operand). That means that X86
>> isel would form the *fully legal* X86ISD node, and nothing would be
>> able to break it and it could never fail to match.
>
> Well, it doesn't _have_to_ form an X86ISD node. I don't do that now.
> But it's fragile in the sense that no one else should mess with that
> piece of the DAG.

I don't consider that an acceptable approach.
There is no way to prevent something from CSE'ing a load away and "breaking" the dag, or moving an add between the nodes etc. You're violating a design principle of selection dags.

> But the real point is that in forming the X86ISD node currently, I'm
> doing exactly what the tblgen-generated code already does. If the
> shuffle doesn't take a memory operand, then I have to lower it to
> something else. Where I do that (before or after table-driven isel)
> doesn't matter. I do the same work either way. But by doing it after I
> avoid writing duplicate DAG matching code in the case where the operand
> is in memory.

I don't agree, and I don't see why this is as bad as you're saying. The code creating these nodes is already target specific. You seem to be objecting to this because it is easier to write .td files (which turn into generated isel code) than it is to write legalize code in C++. If that is the problem you've identified, then why not make it possible to write legalize code in .td files?

>>> These two examples are related: we're duplicating functionality manually
>>> that's already available automatically.
>>
>> Not sure what you mean by this.
>
> I mean that in legalize/lowering we're massaging the DAG to get it into
> a state where table-driven isel can match it. There is a lot of code
> like this:
>
>   if (shuffle_is_MOVL)
>     do_nothing_and_return
>
> It's duplicating exactly the checks that the table-driven isel does
> later. In the VBROADCASTSS/D case, it's doing an entire DAG match
> to check whether it's implementable with VBROADCASTSS/D.
>
> Why not just let table-driven isel run first and take care of these
> checks just once? If something doesn't match, we then know it needs
> manual lowering and selection.

I think that this is just because the current code is in a half converted state. Bruno can say more, but the ultimate goal is to make ISD::SHUFFLE completely illegal on X86, so you'd never have this sort of thing.
>
>>> legalize
>>>    |
>>>    V
>>> manual lowering (X86ISelLowering)
>>>    |
>>>    V
>>> manual isel (X86ISelDAGToDAG)
>>>    |
>>>    V
>>> table-driven isel (.td files/X86GenDAGISel)
>>>    |
>>>    V
>>> manual isel (some to-be-designed piece)
>>
>
>> I'm not sure what you mean here. Are you suggesting that these be
>> completely separate passes over the dag? Why do "manual isel" and
>> "table driven isel" as separate passes? If they are interlaced, then
>> how is this different than what we already have?
>
> No, not as separate passes. Right now we have code like this in
> X86ISelDAGToDAG:
>
> X86DAGToDAGISel::Select(SDNode *Node) {
>   ...
>   switch (Opcode) {
>     ...do a bunch of manual selection...
>   }
>   // If we get here we didn't select manually.
>   SelectCode(); // Select via table-driven isel, abort if no match.
> }
>
> What I'm proposing is that we do this:
>
> X86DAGToDAGISel::Select(SDNode *Node) {
>   ...
>   switch (Opcode) {
>     ...do a bunch of manual selection, less than before...
>   }
>
>   // If we get here we didn't select manually.
>
>   result = SelectCode(); // Select via table-driven isel.
>
>   if (result_is_good()) {
>     return;
>   }
>
>   switch (Opcode) {
>     ...do a bunch of manual selection, some that used to be above and in
>     legalize/lowering...
>   }
>
>   cannot_select_abort();
> }

Ok, much better than separate passes, thanks for the clarification!

-Chris

From czhao at eecg.toronto.edu Fri Mar 18 15:21:06 2011
From: czhao at eecg.toronto.edu (Chuck Zhao)
Date: Fri, 18 Mar 2011 16:21:06 -0400
Subject: [LLVMdev] standard Data Flow Analysis available in LLVM?
Message-ID: <4D83BEB2.2010202@eecg.toronto.edu>

I am working on implementing an algorithm that needs one of the standard Data Flow Analyses as its precondition (VeryBusyExpression to be precise).

So I took a look at LLVM (2.8) to check their availability.
I expected to see all of the following standard ones:
- Reaching Definition (RD)
- Live Variable (LV)
- Available Expression (AE)
- Very Busy Expression (VBE)

To my surprise, I didn't find any. The only one that is kind of close to what I am looking for is:
lib/Analysis/LiveValue.cpp

I wonder what happened to all the standard DataFlow Analysis passes. Did they ever exist?

I know LLVM is based on SSA, which may utilize algorithms that bypass the standard Data Flows.

I am asking for suggestions in case I do need these DataFlow ones, preferably without having to write them myself.

Thank you very much

Chuck

From Micah.Villmow at amd.com Fri Mar 18 15:30:17 2011
From: Micah.Villmow at amd.com (Villmow, Micah)
Date: Fri, 18 Mar 2011 15:30:17 -0500
Subject: [LLVMdev] [PATCH] OpenCL half support
In-Reply-To: <98E4E9C8-732A-4FDC-8DF6-FAACBD6857BD@apple.com>
References: <000201cbd383$c35adec0$4a109c40$%Lokhmotov@arm.com> <8AF16F0D-A73A-455B-9141-C30C234299D8@apple.com> <000201cbe4c7$310977c0$931c6740$%Lokhmotov@arm.com> <98E4E9C8-732A-4FDC-8DF6-FAACBD6857BD@apple.com>
Message-ID:

> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Chris Lattner
> Sent: Friday, March 18, 2011 1:15 PM
> To: Anton.Lokhmotov at arm.com
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] [PATCH] OpenCL half support
>
>
> On Mar 17, 2011, at 10:17 AM, Anton Lokhmotov wrote:
>
> > Hi Chris,
> >
> > So what do you think about this proposal? If you agree, it would be good to
> > include the patch into the 2.9 release (to avoid breaking compatibility
> > later).
>
> Hi Anton, I'm sorry I don't have the patch anymore. Please resend. It
> is too late for new features in 2.9 though.
> > >> The last paragraph in section 9.6 says: "NOTE: Implementations may > >> perform floating-point operations on half scalar or vector data > types > >> by converting the half values to single precision floating-point > values > >> and performing the operation in single precision floating-point. In > >> this case, the implementation will use the half scalar or vector > data > >> type as a storage only format." > > Ok. > > >> That is, an implementation may perform operations on half scalar and > >> vector values either using half-precision operations (if supported > >> natively) or using single-precision operations (always supported > >> natively). In either case, it's desirable to represent half > operations > >> in the IR, and let the backend make the decision. > > It doesn't impact the utility of your approach, but I could not > disagree more here. It would be *absolutely* the wrong thing to do for > backends to compile IR half float operations into full float > operations. Doing this would cause all sorts of problems with constant > folding being inconsistent etc. > > Adding half float to LLVM IR is *only* reasonable if you have hardware > that supports half float, or if you want to add softfloat operations > for these. If we have a fp16 datatype in the IR, code generation > *must* codegen these to something that implements the correct fp16 > semantics. > > C is not a portable language, and trying to make LLVM IR magically fix > this is a bad approach. Just like C compilers need to know > sizeof(long), sizeof(void*) and many many other target specific > details, an OpenCL compiler would need to know whether to generate fp16 > or not. [Villmow, Micah] Chris, In OpenCL, the user has to explicitly state that they want to use fp16 and it is illegal to use the half data type for computation if it isn't natively supported. 
I think it would be useful to have fp16 in the IR for the reason that we support load/stores of the data type, but not operations on the data type. Right now we handle that by treating them like 16bit ints, but it would be nice to be able to represent them correctly. > > -Chris > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From dpatel at apple.com Fri Mar 18 15:44:23 2011 From: dpatel at apple.com (Devang Patel) Date: Fri, 18 Mar 2011 13:44:23 -0700 Subject: [LLVMdev] Writing unit tests for DWARF? In-Reply-To: References: <7251472C-A7E8-4991-876C-FB61580151A2@apple.com> <96F375C7-9B91-4184-AB61-D1AB0A489202@apple.com> <44CD7927-74E2-4722-8896-AEAB4F2EDA0E@apple.com> Message-ID: On Mar 18, 2011, at 1:08 PM, Renato Golin wrote: > On 18 March 2011 18:14, Devang Patel wrote: >> We have setup to test all these three levels you mention above. All you need >> is several thousands tests. Do you think that it will show up magically one >> day ? > > Hehehe, it might... ;) > > I'm trying to get some debug tests here (maybe a few thousand, I don't > know yet). I'll let you know how it goes when we get something. That'd be great! - Devang -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110318/803d1236/attachment.html From drb at dneg.com Fri Mar 18 15:47:42 2011 From: drb at dneg.com (Dan Bailey) Date: Fri, 18 Mar 2011 20:47:42 +0000 Subject: [LLVMdev] [PATCH] OpenCL half support In-Reply-To: References: <000201cbd383$c35adec0$4a109c40$%Lokhmotov@arm.com> <8AF16F0D-A73A-455B-9141-C30C234299D8@apple.com> <000201cbe4c7$310977c0$931c6740$%Lokhmotov@arm.com> <98E4E9C8-732A-4FDC-8DF6-FAACBD6857BD@apple.com> Message-ID: <4D83C4EE.7020602@dneg.com> An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110318/df8e72ee/attachment.html

From nojb at math.harvard.edu Fri Mar 18 16:56:33 2011
From: nojb at math.harvard.edu (Nicolas Ojeda Bar)
Date: Fri, 18 Mar 2011 17:56:33 -0400
Subject: [LLVMdev] LLVM ERROR: No such instruction: `vmovsd ...' ?
Message-ID: <908CFB1F-32B9-4327-83FE-C2BFFF5E5DBC@math.harvard.edu>

Hello,

I am running an i7 MacBook Pro 2011. If I write:

@g = global double 0.000000e+00

define i32 @main() {
entry:
  %0 = load double* @g
  %1 = fmul double 1.000000e+06, %0
  store double %1, double* @g
  ret i32 0
}

in test.ll and I run

> llc test.ll
> gcc test.s

I get:

test.s:12:no such instruction: `vmovsd _g(%rip), %xmm0'
test.s:13:no such instruction: `vmulsd LCPI0_0(%rip), %xmm0,%xmm0'
test.s:14:no such instruction: `vmovsd %xmm0, _g(%rip)'

I'm completely puzzled. Help?

Thanks!
N

From damien.llvm at gmail.com Fri Mar 18 17:16:00 2011
From: damien.llvm at gmail.com (Damien Vincent)
Date: Fri, 18 Mar 2011 15:16:00 -0700
Subject: [LLVMdev] Text or Data symbol
Message-ID:

I am again calling for help from LLVM developers ;)

For my DSP backend, at the lowering stage and also at the AsmPrinter stage, I need to know if a GlobalAddress is a code or a data address.

So I tried at the lowering stage to use:

  GlobalAddressSDNode *GSDN = cast<GlobalAddressSDNode>(Op);
  const GlobalValue *GV = GSDN->getGlobal();
  GV->hasSection() and GV->getSection()

But the section is not set at this stage (hasSection = false).

And at the AsmPrinter stage:

  const GlobalValue *GV = MO.getGlobal();
  SectionKind sectionKind = Mang->getSymbol(GV)->getSection().getKind();

But again the section does not seem to be set (sectionKind.isInSection() = false).

Do you know a way to tell if a global address corresponds to data or code? I have to process text and data addresses differently...

Thank you!

Damien
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110318/ac2258e0/attachment.html From j.wilhelmy at arcor.de Fri Mar 18 17:43:55 2011 From: j.wilhelmy at arcor.de (Jochen Wilhelmy) Date: Fri, 18 Mar 2011 23:43:55 +0100 Subject: [LLVMdev] new vector resize instruction could be useful Message-ID: <4D83E02B.80406@arcor.de> Hi! If I build a vector of some length (e.g. 4) from a vector of another length (e.g. 3) then I get tons of extractelement and insertelement instructions. since vectors of length 3 and 4 both map to an sse register it could be useful to introduce an instruction that changes the length of a vector, either truncating or extending by zero or undef values (whichever makes more sense). for lengths 3 and 4 this maps to no-op but could be useful for e.g. concatenating two vectors of length 8 to a vector of length 16 by first resizing to length 16 and then using shufflevector. what do you think? -jochen From eli.friedman at gmail.com Fri Mar 18 17:50:26 2011 From: eli.friedman at gmail.com (Eli Friedman) Date: Fri, 18 Mar 2011 15:50:26 -0700 Subject: [LLVMdev] LLVM ERROR: No such instruction: `vmovsd ...' ? In-Reply-To: <908CFB1F-32B9-4327-83FE-C2BFFF5E5DBC@math.harvard.edu> References: <908CFB1F-32B9-4327-83FE-C2BFFF5E5DBC@math.harvard.edu> Message-ID: On Fri, Mar 18, 2011 at 2:56 PM, Nicolas Ojeda Bar wrote: > Hello, > > I am running a i7 MacBook Pro 2011. If I write: > > @g = global double 0.000000e+00 > > define i32 @main() { > entry: > %0 = load double* @g > %1 = fmul double 1.000000e+06, %0 > store double %1, double* @g > ret i32 0 > } > > in test.ll and I run > >> llc test.ll >> gcc test.s > > I get: > > test.s:12:no such instruction: `vmovsd _g(%rip), %xmm0' > test.s:13:no such instruction: `vmulsd LCPI0_0(%rip), %xmm0,%xmm0' > test.s:14:no such instruction: `vmovsd %xmm0, _g(%rip)' > > I'm completely puzzled. Help? > > Thanks! > N It looks like llc is generating AVX instructions.
IIRC, it isn't supposed to at the moment if you don't explicitly request it; what version are you using? -Eli From eli.friedman at gmail.com Fri Mar 18 17:55:43 2011 From: eli.friedman at gmail.com (Eli Friedman) Date: Fri, 18 Mar 2011 15:55:43 -0700 Subject: [LLVMdev] new vector resize instruction could be useful In-Reply-To: <4D83E02B.80406@arcor.de> References: <4D83E02B.80406@arcor.de> Message-ID: On Fri, Mar 18, 2011 at 3:43 PM, Jochen Wilhelmy wrote: > Hi! > > If I build a vector of some length (e.g. 4) from a vector of another > length (e.g. 3) > then I get tons of extractelement and insertelement instructions. since > vectors of length 3 and 4 both map to an sse register it could be useful to > introduce an instruction that changes the length of a vector, either > truncating > or extending by zero or undef values (whichever makes more sense). > for lengths 3 and 4 this maps to no-op but could be useful for e.g. > concatenating > two vectors of length 8 to a vector of length 16 by first resizing to > length 16 > and then using shufflevector. > what do you think? You should already be able to use shufflevector with the output length different from the inputs. -Eli From j.wilhelmy at arcor.de Fri Mar 18 17:58:13 2011 From: j.wilhelmy at arcor.de (Jochen Wilhelmy) Date: Fri, 18 Mar 2011 23:58:13 +0100 Subject: [LLVMdev] new vector resize instruction could be useful In-Reply-To: References: <4D83E02B.80406@arcor.de> Message-ID: <4D83E385.5070007@arcor.de> > You should already be able to use shufflevector with the output length > different from the inputs. ah, that's cool -Jochen From resistor at mac.com Fri Mar 18 18:02:28 2011 From: resistor at mac.com (Owen Anderson) Date: Fri, 18 Mar 2011 16:02:28 -0700 Subject: [LLVMdev] How to integrate an analysis into LVI? 
In-Reply-To: References: Message-ID: On Mar 18, 2011, at 11:43 AM, Douglas do Couto Teixeira wrote: > Hi guys, > > I am trying to figure out how to use your Lazy Value Info pass, but I am having some questions. First, it seems that the implementation contains infra-structure to deal with range intervals, but the main interface only gives the client information about which values are constants. Is this true? Second, reading the code I see that the ranges are being computed, but I could not tell if this part of the code is already completely done. Would it be possible to print the ranges of the values computed by LVI? Third, besides JumpThreading.cpp is there any other client that uses LVI? LazyValueInfo reasons about ranges internally, but exposes a much more limited external interface. The results are used both in JumpThreading and in CorrelatedValuePropagation. --Owen From anton at korobeynikov.info Fri Mar 18 18:06:18 2011 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Sat, 19 Mar 2011 02:06:18 +0300 Subject: [LLVMdev] [PATCH] OpenCL half support In-Reply-To: <4D83C4EE.7020602@dneg.com> References: <000201cbd383$c35adec0$4a109c40$%Lokhmotov@arm.com> <8AF16F0D-A73A-455B-9141-C30C234299D8@apple.com> <000201cbe4c7$310977c0$931c6740$%Lokhmotov@arm.com> <98E4E9C8-732A-4FDC-8DF6-FAACBD6857BD@apple.com> <4D83C4EE.7020602@dneg.com> Message-ID: > Maybe worth pointing out that there are architectures that natively support > 16bit floating point in llvm. PTX, the new backend of which has just been > added to 2.9 can handle fp16 -> fp32 conversion in hardware. FWIW: there are already intrinsics for such conversions (currently only used in ARM backend). There is no need for new type if you want just to convert stuff. 
-- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From eli.friedman at gmail.com Fri Mar 18 18:56:34 2011 From: eli.friedman at gmail.com (Eli Friedman) Date: Fri, 18 Mar 2011 16:56:34 -0700 Subject: [LLVMdev] LLVM ERROR: No such instruction: `vmovsd ...' ? In-Reply-To: References: <908CFB1F-32B9-4327-83FE-C2BFFF5E5DBC@math.harvard.edu> Message-ID: On Fri, Mar 18, 2011 at 4:23 PM, Nicolas Ojeda Bar wrote: > Hi Eli, > > I'm using 2.8. Ouch; too late to fix 2.8 :(. If you can, use trunk or the 2.9 branch instead. Otherwise, passing -mattr=-avx to llc should do the trick. -Eli > On Mar 18, 2011, at 6:50 PM, Eli Friedman wrote: > >> On Fri, Mar 18, 2011 at 2:56 PM, Nicolas Ojeda Bar >> wrote: >>> Hello, >>> >>> I am running a i7 MacBook Pro 2011. If I write: >>> >>> @g = global double 0.000000e+00 >>> >>> define i32 @main() { >>> entry: >>> %0 = load double* @g >>> %1 = fmul double 1.000000e+06, %0 >>> store double %1, double* @g >>> ret i32 0 >>> } >>> >>> in test.ll and I run >>> >>>> llc test.ll >>>> gcc test.s >>> >>> I get: >>> >>> test.s:12:no such instruction: `vmovsd _g(%rip), %xmm0' >>> test.s:13:no such instruction: `vmulsd LCPI0_0(%rip), %xmm0,%xmm0' >>> test.s:14:no such instruction: `vmovsd %xmm0, _g(%rip)' >>> >>> I'm completely puzzled. Help? >>> >>> Thanks! >>> N >> >> It looks like llc is generating AVX instructions. IIRC, it isn't >> supposed to at the moment if you don't explicitly request it; what >> version are you using?
>> >> -Eli > > From Micah.Villmow at amd.com Fri Mar 18 19:11:14 2011 From: Micah.Villmow at amd.com (Villmow, Micah) Date: Fri, 18 Mar 2011 19:11:14 -0500 Subject: [LLVMdev] [PATCH] OpenCL half support In-Reply-To: References: <000201cbd383$c35adec0$4a109c40$%Lokhmotov@arm.com> <8AF16F0D-A73A-455B-9141-C30C234299D8@apple.com> <000201cbe4c7$310977c0$931c6740$%Lokhmotov@arm.com> <98E4E9C8-732A-4FDC-8DF6-FAACBD6857BD@apple.com> <4D83C4EE.7020602@dneg.com> Message-ID: > -----Original Message----- > From: Anton Korobeynikov [mailto:anton at korobeynikov.info] > Sent: Friday, March 18, 2011 4:06 PM > To: Dan Bailey > Cc: Villmow, Micah; llvmdev at cs.uiuc.edu; Anton.Lokhmotov at arm.com > Subject: Re: [LLVMdev] [PATCH] OpenCL half support > > > Maybe worth pointing out that there are architectures that natively > support > > 16bit floating point in llvm. PTX, the new backend of which has just > been > > added to 2.9 can handle fp16 -> fp32 conversion in hardware. > FWIW: there are already intrinsics for such conversions (currently > only used in ARM backend). > There is no need for new type if you want just to convert stuff. > [Villmow, Micah] I've looked into this, but the problem with the intrinsic is that they only support scalar types and do not support any saturation or rounding modes. If there was a way in the current approach to handle these cases, then I would say use what is there, but what is currently there is very basic and doesn't cover even all of the load/store + conversion cases. > -- > With best regards, Anton Korobeynikov > Faculty of Mathematics and Mechanics, Saint Petersburg State University From balicki.aleksander at gmail.com Fri Mar 18 19:15:05 2011 From: balicki.aleksander at gmail.com (Aleksander Balicki) Date: Sat, 19 Mar 2011 00:15:05 +0000 Subject: [LLVMdev] KLEE GSoC Project Message-ID: Is there a possibility of accepting a KLEE project as an LLVM GSoC project? 
-- Aleksander "Alistra" Balicki email: balicki.aleksander at gmail.com jabber: wszystkie.inne.byly.zajete at gmail.com From rob.nikander at gmail.com Fri Mar 18 21:26:31 2011 From: rob.nikander at gmail.com (Rob Nikander) Date: Fri, 18 Mar 2011 22:26:31 -0400 Subject: [LLVMdev] how to debug with interpreter Message-ID: Hi, I'm using the JIT execution engine for my language, and I'm finding it extremely painful to find bugs in generated code. I get a "seg fault" and I can't see where it happened. Writing a debugger or generating info for GDB seems like too much work at this point. Is there a way to use the Interpreter to run this code, stepping through and printing the LLVM instructions, to find the one that triggers the seg fault? I have not been writing any files to disk and have not yet used much in LLVM except the C++ API. thanks, Rob From clattner at apple.com Sat Mar 19 00:41:21 2011 From: clattner at apple.com (Chris Lattner) Date: Fri, 18 Mar 2011 22:41:21 -0700 Subject: [LLVMdev] KLEE GSoC Project In-Reply-To: References: Message-ID: On Mar 18, 2011, at 5:15 PM, Aleksander Balicki wrote: > Is there a possibility of accepting a KLEE project as an LLVM GSoC project? Yes, absolutely. -Chris From clattner at apple.com Sat Mar 19 00:59:24 2011 From: clattner at apple.com (Chris Lattner) Date: Fri, 18 Mar 2011 22:59:24 -0700 Subject: [LLVMdev] [PATCH] OpenCL half support In-Reply-To: <4D83C4EE.7020602@dneg.com> References: <000201cbd383$c35adec0$4a109c40$%Lokhmotov@arm.com> <8AF16F0D-A73A-455B-9141-C30C234299D8@apple.com> <000201cbe4c7$310977c0$931c6740$%Lokhmotov@arm.com> <98E4E9C8-732A-4FDC-8DF6-FAACBD6857BD@apple.com> <4D83C4EE.7020602@dneg.com> Message-ID: <1E03D67E-197E-4DAC-AD7A-0F2C08F945EA@apple.com> On Mar 18, 2011, at 1:47 PM, Dan Bailey wrote: >> [Villmow, Micah] Chris, In OpenCL, the user has to explicitly state that they want to use fp16 and it is illegal to use the half data type for computation if it isn't natively supported. 
I think it would be useful to have fp16 in the IR for the reason that we support load/stores of the data type, but not operations on the data type. Right now we handle that by treating them like 16bit ints, but it would be nice to be able to represent them correctly. My understanding is that OpenCL allows promoting to float (32-bit) types. OpenCL doesn't (afaik) require support for fp16. >> > Maybe worth pointing out that there are architectures that natively support 16bit floating point in llvm. PTX, the new backend of which has just been added to 2.9 can handle fp16 -> fp32 conversion in hardware. I agree we should have support for fp16 in the IR, it's fiddly trying to make do without this and gets used frequently in simulations and graphics in particular. LLVM already fully supports fp16 <-> fp32 conversions. If you want to add saturation support for these conversions, that is completely orthogonal to adding fp16 as a "native" llvm type: Adding fp16 as a "native" LLVM IR type doesn't give you saturating conversions. -Chris From rcsaba at gmail.com Sat Mar 19 03:44:05 2011 From: rcsaba at gmail.com (Csaba Raduly) Date: Sat, 19 Mar 2011 09:44:05 +0100 Subject: [LLVMdev] Apparent optimizer bug on X86_64 Message-ID: Compiling a simple automaton created by GNU bison with -O1 or -O2 resulted in the following machine code: 1300 /*-----------------------------. 1301 | yyreduce -- Do a reduction. | 1302 `-----------------------------*/ 1303 yyreduce: 1304 /* yyn is the number of a rule to reduce with. */ 1305 yylen = yyr2[yyn]; 0x0000000000400c14 : mov r15d,r14d 0x0000000000400c17 : movzx r12d,BYTE PTR [r15+0x4015e2] 0x0000000000400c1f : mov eax,0x1 0x0000000000400c24 : mov r13,rax 0x0000000000400c27 : sub r13,r12 0x0000000000400c2a : mov eax,r13d // assignment to zero-extends into rax 1306 1307 /* If YYLEN is nonzero, implement the default value of the action: 1308 `$$ = $1'. 1309 1310 Otherwise, the following line sets YYVAL to garbage. 
1311 This behavior is undocumented and Bison 1312 users should not rely upon it. Assigning to YYVAL 1313 unconditionally makes the parser a bit smaller, and it avoids a 1314 GCC warning that YYVAL may be used uninitialized. */ 1315 yyval = yyvsp[1-yylen]; => 0x0000000000400c2d : movsd xmm0,QWORD PTR [rbx+rax*8] 0x0000000000400c32 : movsd QWORD PTR [rbp-0x808],xmm0 As far as I understand it, assigning to eax zero-extends to rax. However, eax holds the result of "1-yylen" which is expected to be negative, so it should be sign-extended before using its value as rax. Indexing "in the wrong direction" causes a segfault at the instruction indicated by '=>' Here's the disassembly from -O0, which does a sign extension (movsxd): 1300 /*-----------------------------. 1301 | yyreduce -- Do a reduction. | 1302 `-----------------------------*/ 1303 yyreduce: 1304 /* yyn is the number of a rule to reduce with. */ 1305 yylen = yyr2[yyn]; 0x0000000000401069 <+1945>: movsxd rax,DWORD PTR [rbp-0x80c] 0x0000000000401070 <+1952>: movzx ecx,BYTE PTR [rax*1+0x401f0f] 0x0000000000401078 <+1960>: mov DWORD PTR [rbp-0x824],ecx 0x000000000040107e <+1966>: mov ecx,0x1 1306 1307 /* If YYLEN is nonzero, implement the default value of the action: 1308 `$$ = $1'. 1309 1310 Otherwise, the following line sets YYVAL to garbage. 1311 This behavior is undocumented and Bison 1312 users should not rely upon it. Assigning to YYVAL 1313 unconditionally makes the parser a bit smaller, and it avoids a 1314 GCC warning that YYVAL may be used uninitialized. */ 1315 yyval = yyvsp[1-yylen]; 0x0000000000401083 <+1971>: sub ecx,DWORD PTR [rbp-0x824] 0x0000000000401089 <+1977>: movsxd rax,ecx 0x000000000040108c <+1980>: mov rdx,QWORD PTR [rbp-0x800] 0x0000000000401093 <+1987>: movsd xmm0,QWORD PTR [rdx+rax*8] 0x0000000000401098 <+1992>: movsd QWORD PTR [rbp-0x820],xmm0 yylen is of type YYSIZE_T, which is a macro that expands to size_t or 'unsigned int'. Perhaps clang/LLVM considers "1-yylen" to be unsigned? 
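To make the question above concrete, here is a minimal sketch of the arithmetic involved, assuming YYSIZE_T expands to 'unsigned int' and the target has 64-bit pointers. The helper names are hypothetical and do not come from the bison skeleton; the sketch only evaluates the index expression and does not touch any array.

```cpp
#include <climits>

// Hypothetical helper: the index expression `1 - yylen` when yylen is
// 'unsigned int' (the assumption stated above). The int constant 1 is
// converted to unsigned int, so the result wraps instead of going negative.
unsigned int index_unsigned(unsigned int yylen) { return 1 - yylen; }

// Hypothetical helper: the same expression when yylen is a plain int.
int index_signed(int yylen) { return 1 - yylen; }

// Widening the 32-bit index to 64 bits, as pointer subscripting has to do
// on x86-64. An unsigned value is zero-extended, a signed one sign-extended.
long long widen_unsigned(unsigned int yylen) {
  return static_cast<long long>(index_unsigned(yylen));
}

long long widen_signed(int yylen) {
  return static_cast<long long>(index_signed(yylen));
}
```

With yylen equal to 2, the unsigned variant yields UINT_MAX and stays positive after widening, while the signed variant yields -1. If the assumption about YYSIZE_T holds, zero-extension would be what the source expression asks for; whether it holds here can only be checked against the preprocessed source.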
Am I completely off-base? This is clang version 3.0 (trunk 127463) Target: x86_64-unknown-linux-gnu Thread model: posix Csaba -- GCS a+ e++ d- C++ ULS$ L+$ !E- W++ P+++$ w++$ tv+ b++ DI D++ 5++ The Tao of math: The numbers you can count are not the real numbers. Life is complex, with real and imaginary parts. "Ok, it boots. Which means it must be bug-free and perfect. " -- Linus Torvalds "People disagree with me. I just ignore them." -- Linus Torvalds From eli.friedman at gmail.com Sat Mar 19 04:10:37 2011 From: eli.friedman at gmail.com (Eli Friedman) Date: Sat, 19 Mar 2011 02:10:37 -0700 Subject: [LLVMdev] Apparent optimizer bug on X86_64 In-Reply-To: References: Message-ID: On Sat, Mar 19, 2011 at 1:44 AM, Csaba Raduly wrote: > Compiling a simple automaton created by GNU bison with -O1 or -O2 > resulted in the following machine code: > > 1300 /*-----------------------------. > 1301 | yyreduce -- Do a reduction. | > 1302 `-----------------------------*/ > 1303 yyreduce: > 1304 /* yyn is the number of a rule to reduce with. */ > 1305 yylen = yyr2[yyn]; > 0x0000000000400c14 : mov r15d,r14d > 0x0000000000400c17 : movzx r12d,BYTE PTR > [r15+0x4015e2] > 0x0000000000400c1f : mov eax,0x1 > 0x0000000000400c24 : mov r13,rax > 0x0000000000400c27 : sub r13,r12 > 0x0000000000400c2a : mov eax,r13d // > assignment to zero-extends into rax > > 1306 > 1307 /* If YYLEN is nonzero, implement the default value of the action: > 1308 `$$ = $1'. > 1309 > 1310 Otherwise, the following line sets YYVAL to garbage. > 1311 This behavior is undocumented and Bison > 1312 users should not rely upon it. Assigning to YYVAL > 1313 unconditionally makes the parser a bit smaller, and it avoids a > 1314 GCC warning that YYVAL may be used uninitialized. */ > 1315 yyval = yyvsp[1-yylen]; > => 0x0000000000400c2d :
movsd xmm0,QWORD PTR > [rbx+rax*8] > 0x0000000000400c32 : movsd QWORD PTR > [rbp-0x808],xmm0 > > > As far as I understand it, assigning to eax zero-extends to rax. > However, eax holds the result of "1-yylen" which is expected to be > negative, so it should be sign-extended before using its value as rax. > Indexing "in the wrong direction" causes a segfault at the instruction > indicated by '=>' > > Here's the disassembly from -O0, which does a sign extension (movsxd): > > 1300 /*-----------------------------. > 1301 | yyreduce -- Do a reduction. | > 1302 `-----------------------------*/ > 1303 yyreduce: > 1304 /* yyn is the number of a rule to reduce with. */ > 1305 yylen = yyr2[yyn]; > 0x0000000000401069 <+1945>: movsxd rax,DWORD PTR [rbp-0x80c] > 0x0000000000401070 <+1952>: movzx ecx,BYTE PTR [rax*1+0x401f0f] > 0x0000000000401078 <+1960>: mov DWORD PTR [rbp-0x824],ecx > 0x000000000040107e <+1966>: mov ecx,0x1 > > 1306 > 1307 /* If YYLEN is nonzero, implement the default value of the action: > 1308 `$$ = $1'. > 1309 > 1310 Otherwise, the following line sets YYVAL to garbage. > 1311 This behavior is undocumented and Bison > 1312 users should not rely upon it. Assigning to YYVAL > 1313 unconditionally makes the parser a bit smaller, and it avoids a > 1314 GCC warning that YYVAL may be used uninitialized. */ > 1315 yyval = yyvsp[1-yylen]; > 0x0000000000401083 <+1971>: sub ecx,DWORD PTR [rbp-0x824] > 0x0000000000401089 <+1977>: movsxd rax,ecx > 0x000000000040108c <+1980>: mov rdx,QWORD PTR [rbp-0x800] > 0x0000000000401093 <+1987>: movsd xmm0,QWORD PTR [rdx+rax*8] > 0x0000000000401098 <+1992>: movsd QWORD PTR [rbp-0x820],xmm0 > > yylen is of type YYSIZE_T, which is a macro that expands to size_t or > 'unsigned int'. Perhaps clang/LLVM considers "1-yylen" to be unsigned? > Am I completely off-base?
> > > > This is > clang version 3.0 (trunk 127463) > Target: x86_64-unknown-linux-gnu > Thread model: posix Please file a bug in Bugzilla, attach the complete preprocessed source, and show the full steps required to reproduce the issue. You might be right, but it's hard to tell without context. -Eli From rcsaba at gmail.com Sat Mar 19 04:43:46 2011 From: rcsaba at gmail.com (Csaba Raduly) Date: Sat, 19 Mar 2011 10:43:46 +0100 Subject: [LLVMdev] Apparent optimizer bug on X86_64 In-Reply-To: References: Message-ID: Created http://llvm.org/bugs/show_bug.cgi?id=9512 Csaba -- GCS a+ e++ d- C++ ULS$ L+$ !E- W++ P+++$ w++$ tv+ b++ DI D++ 5++ The Tao of math: The numbers you can count are not the real numbers. Life is complex, with real and imaginary parts. "Ok, it boots. Which means it must be bug-free and perfect. " -- Linus Torvalds "People disagree with me. I just ignore them." -- Linus Torvalds From christian.perone at gmail.com Sat Mar 19 09:08:09 2011 From: christian.perone at gmail.com (Christian S. Perone) Date: Sat, 19 Mar 2011 11:08:09 -0300 Subject: [LLVMdev] Cyclic dependencies while building llvm shared libraries using CMake Message-ID: Hello, I'm facing some problems while building LLVM 2.8 shared libraries from source using CMake: cmake -DBUILD_SHARED_LIBS=true .. This is the output error: CMake Error: The inter-target dependency graph contains the following strongly connected component (cycle): "LLVMARMCodeGen" of type SHARED_LIBRARY depends on "LLVMARMAsmPrinter" "LLVMARMAsmPrinter" of type SHARED_LIBRARY depends on "LLVMARMCodeGen" At least one of these targets is not a STATIC_LIBRARY. Cyclic dependencies are allowed only among static libraries. Does anyone else had the same problem ? Thank you ! -- "Forgive, O Lord, my little jokes on Thee, and I'll forgive Thy great big joke on me." http://pyevolve.sourceforge.net/wordpress/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110319/786451e3/attachment.html From yueguoguo1024 at gmail.com Sat Mar 19 10:06:01 2011 From: yueguoguo1024 at gmail.com (Zhang Le) Date: Sat, 19 Mar 2011 23:06:01 +0800 Subject: [LLVMdev] How to get the operand value in the instruction Message-ID: Hi all, I am trying to get the value of operands inside an instruction or the value produced by the instruction. I noticed the method of getOperand(n) which can get the reference of an operand. However I don't know clearly if the address I get refers to the value of the operand. Could anyone please help? -- *with best regards* Zhang Le -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110319/38c41378/attachment.html From ezengbin at gmail.com Sat Mar 19 10:06:52 2011 From: ezengbin at gmail.com (Bin Zeng) Date: Sat, 19 Mar 2011 11:06:52 -0400 Subject: [LLVMdev] X86 instruction encoding Message-ID: <4D84C68C.2090407@gmail.com> Hi all, Just a quick question about the X86 instruction naming convention in LLVM. Most instruction names in LLVM are self-explanatory. Some are a little confusing. What is the difference between instruction MULSSrr and MULSSrr_Int? What does the suffix '_Int' stand for? Are these instructions exchangeable: "ST_F64m", "ST_FP64m", "ST_Fp64m" "ST_FpP64m32"? Any direction will be appreciated. Thanks a lot in advance. Bin From ofv at wanadoo.es Sat Mar 19 10:10:33 2011 From: ofv at wanadoo.es (=?utf-8?Q?=C3=93scar_Fuentes?=) Date: Sat, 19 Mar 2011 16:10:33 +0100 Subject: [LLVMdev] Cyclic dependencies while building llvm shared libraries using CMake References: Message-ID: <87tyez16za.fsf@wanadoo.es> "Christian S. Perone" writes: > Hello, I'm facing some problems while building LLVM 2.8 shared libraries > from source using CMake: > > cmake -DBUILD_SHARED_LIBS=true .. > > This is the output error: [snip] > Does anyone else had the same problem ? 
Building LLVM as shared libraries is not a widely used configuration and at the time there was an strange flip-flop of dependencies among asmprinters and codegens, so it is no surprise that 2.8 is broken. I've just checked that the 2.9 branch builds fine, though. From schaub.johannes at googlemail.com Sat Mar 19 12:05:03 2011 From: schaub.johannes at googlemail.com (Johannes Schaub (litb)) Date: Sat, 19 Mar 2011 18:05:03 +0100 Subject: [LLVMdev] [Patch] Fix bug in llvm::SmallVectorIml<>::insert Message-ID: This fixes a bug in SmallVectorImpl<>::insert, which were not behaving correctly on inserting an empty range into an empty vector: #include #include int main() { llvm::SmallVector v, w; llvm::SmallVector::iterator it = v.insert(v.end(), w.begin(), w.end()); assert(it == v.end()); } The insert function(s) would incorrectly return "this->end()-1". I attached the patch which I diff'ed from trunk. Is it important enough to be backported to llvm2.9 ? I would like to ask someone to commit it to wherever it fits. Thanks! -------------- next part -------------- A non-text attachment was scrubbed... Name: fix_smallvector.patch Type: text/x-patch Size: 906 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110319/9943fcd2/attachment.bin From Matthieu.Moy at grenoble-inp.fr Sat Mar 19 12:12:17 2011 From: Matthieu.Moy at grenoble-inp.fr (Matthieu Moy) Date: Sat, 19 Mar 2011 18:12:17 +0100 Subject: [LLVMdev] [PATCH] Fix weak/linkonce linkage in execution engine In-Reply-To: (Matthieu Moy's message of "Tue, 15 Mar 2011 19:39:44 +0100") References: Message-ID: Hi, I sent this a few days ago, but got no reply, so re-sending to make sure the patches are not dropped. To summarize, the weak_odr and linkonce_odr linkage are badly managed in the JIT Execution engine, the patches add testcases and fix the problem (more details in the commit messages within the patches). 
Regards, -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-add-testcase-for-weak_odr-and-linkonce_odr-in-JIT.patch Type: text/x-diff Size: 2683 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110319/428f9b22/attachment.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: 0002-JIT-fix-linkage-of-weak-and-linkonce-symbols.patch Type: text/x-diff Size: 2880 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110319/428f9b22/attachment-0001.bin -------------- next part -------------- Matthieu Moy writes: > Hi, > > I've had problem with a program using LLVM that tried to dynamic_cast > objects created in the JIT execution engine, from the native part of the > program (for the curious, the program is PinaVM > http://gitorious.org/pinavm/pages/Home). > > I've narrowed down the issue to the linkage of weak_odr and linkonce_odr > symbols, used for the vtables, and that _must_ be unique for > dynamic_cast to work. > > Attached are two patches: the first adds a (failing) testcase, the > second fixes the issue. > > I'm not familiar with patch submission on this list, let me know if > there's a better way to submit patches. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ From schaub.johannes at googlemail.com Sat Mar 19 13:26:32 2011 From: schaub.johannes at googlemail.com (Johannes Schaub (litb)) Date: Sat, 19 Mar 2011 19:26:32 +0100 Subject: [LLVMdev] [Patch] Fix bug in llvm::SmallVectorIml<>::insert References: Message-ID: Johannes Schaub (litb) wrote: > This fixes a bug in SmallVectorImpl<>::insert, which were not behaving > correctly on inserting an empty range into an empty vector: > > #include > #include > > int main() { > llvm::SmallVector v, w; > llvm::SmallVector::iterator it = > v.insert(v.end(), w.begin(), w.end()); > assert(it == v.end()); > } > > The insert function(s) would incorrectly return "this->end()-1". 
I > attached the patch which I diff'ed from trunk. Is it important enough to > be backported to llvm2.9 ? > > I would like to ask someone to commit it to wherever it fits. Thanks! Hmm, I don't understand the rationale of the return value of the range- insert and repeated-insert. The range-insert returns an iterator pointing exactly at the last value inserted. But the repeated-insert (i.e Insert(Position, RepeatCount, Value)) apparently returns an iterator pointing to the first value inserted. Is this actually intended? My patch is inconsistent with this in mind, because it always returns a pointer to the first value inserted. Please don't apply it yet. We need to clear this up first and fix my patch. From clattner at apple.com Sat Mar 19 13:43:30 2011 From: clattner at apple.com (Chris Lattner) Date: Sat, 19 Mar 2011 11:43:30 -0700 Subject: [LLVMdev] X86 instruction encoding In-Reply-To: <4D84C68C.2090407@gmail.com> References: <4D84C68C.2090407@gmail.com> Message-ID: <179FA178-5E5F-485D-8C56-F1DC030FE4E9@apple.com> On Mar 19, 2011, at 8:06 AM, Bin Zeng wrote: > Hi all, > > Just a quick question about the X86 instruction naming convention in > LLVM. Most instruction names in LLVM are self-explanatory. Some are a > little confusing. What is the difference between instruction MULSSrr and > MULSSrr_Int? What does the suffix '_Int' stand for? "_Int" stands for "intrinsic", because the pattern matches an intrinsic instead of normal nodes. That said, this is old and bad. These instructions should be replaced with Pat<> patterns, to avoid duplicating the encoding an other information about the pattern. > Are these > instructions exchangeable: "ST_F64m", "ST_FP64m", No, these generate different mnemonics (fst vs fstp) the difference is that the "p" version pops the floating point stack. > "ST_Fp64m" "ST_FpP64m32"? Any direction will be appreciated. Thanks a lot in advance. These ones are related to register classes. 
The later one does a 32-bit store of a "64-bit floating point register". This is a modeling artifact of how we represent the floating point stack registers. -Chris From schaub.johannes at googlemail.com Sat Mar 19 14:10:57 2011 From: schaub.johannes at googlemail.com (Johannes Schaub (litb)) Date: Sat, 19 Mar 2011 20:10:57 +0100 Subject: [LLVMdev] [Patch] Fix for PR9499 (confusing behavior of FastFoldingSetNode). Message-ID: I've made a patch which provides one possible fix for the unintuitive behavior of FastFoldingSetNode. I'm not sure whether that's a good way to solve this. It's based largely on the documentation found in the FastFoldingSet.h file, which says Profile functions should use "Add" instead of simply setting their parameter. Patch is attached on this mail and on the PR. Thanks! -------------- next part -------------- A non-text attachment was scrubbed... Name: fixes.patch Type: text/x-patch Size: 1452 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110319/5b644ea0/attachment.bin From reid.kleckner at gmail.com Sun Mar 20 10:56:06 2011 From: reid.kleckner at gmail.com (Reid Kleckner) Date: Sun, 20 Mar 2011 08:56:06 -0700 Subject: [LLVMdev] how to debug with interpreter In-Reply-To: References: Message-ID: x/20i $pc - 20 or so is your friend. In general, you can't disassemble x86 backwards, but gdb does a reasonably good job if you just guess. Alternatively, if you're on Linux, there's the gdb-jit interface, which should give you symbols and unwind tables without any extra effort on your part: http://llvm.org/docs/DebuggingJITedCode.html To answer your original question about the interpreter, you can pass -force-interpreter to lli or pass args to the execution engine creation, but the interpreter is considered incomplete and buggy. Its original purpose was to help debug the JIT, but now the JIT works reliably, so it hasn't been maintained. 
Reid On Fri, Mar 18, 2011 at 7:26 PM, Rob Nikander wrote: > Hi, > > I'm using the JIT execution engine for my language, and I'm finding it > extremely painful to find bugs in generated code. I get a "seg fault" > and I can't see where it happened. Writing a debugger or generating > info for GDB seems like too much work at this point. Is there a way > to use the Interpreter to run this code, stepping through and printing > the LLVM instructions, to find the one that triggers the seg fault? I > have not been writing any files to disk and have not yet used much in > LLVM except the C++ API. > > thanks, > Rob > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From mitnick.lyu at gmail.com Sun Mar 20 15:24:09 2011 From: mitnick.lyu at gmail.com (Lyu Mitnick) Date: Mon, 21 Mar 2011 04:24:09 +0800 Subject: [LLVMdev] CDECL Calling Convention Message-ID: Hello all, I am a beginner of LLVM and I want to add a new backend into LLVM. The calling convention of the target I ported is CDECL. I am wondering to know whether there is already CDECL calling convention implemented in LLVM?? Which CallingConv.td file should I copy and modify for my target?? thanks a lot Mitnick -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110321/a6df3edc/attachment.html From anton at korobeynikov.info Sun Mar 20 16:05:06 2011 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Mon, 21 Mar 2011 00:05:06 +0300 Subject: [LLVMdev] CDECL Calling Convention In-Reply-To: References: Message-ID: Hello > I am a beginner of LLVM and I want to add a new backend into LLVM. The > calling convention of the target I ported is CDECL. I am wondering to know > whether there is already CDECL calling convention implemented in LLVM??
> Which CallingConv.td file should I copy and modify for my target??

Everything depends on what exactly "cdecl" means for your target. Usually it's just the ordinary C calling convention, which is surely implemented for a bunch of different LLVM targets.

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From jgu222 at gmail.com Sun Mar 20 19:05:32 2011
From: jgu222 at gmail.com (Junjie Gu)
Date: Sun, 20 Mar 2011 17:05:32 -0700
Subject: [LLVMdev] linkage type
In-Reply-To: <4D8104EC.3050201@klickverbot.at>
References: <4D8104EC.3050201@klickverbot.at>
Message-ID:

On Wed, Mar 16, 2011 at 11:43 AM, David Nadlinger wrote:
> There is a description of all the possible linkage types at
> http://llvm.org/docs/LangRef.html#linkage - does this answer your
> question? (Basically, an extern_weak resp. ExternalWeakLinkage symbol
> becomes null instead of being an undefined reference)
>

Well, I was trying to find out what in the source gets translated to WeakAnyLinkage or ExternalWeakLinkage.

I recently had a problem related to weak linkage handling in llvm on Windows 7. Here is the problem. I have two files: m.c and foo.c

m.c
-----
#include <stdio.h>
__attribute__ ((weak)) int foo (void)
{
  printf ("weak foo()\n");
  return 1;
}

int main()
{
  return foo();
}

foo.c
-----
#include <stdio.h>
int foo (void)
{
  printf ("strong foo()\n");
  return 2;
}

ubuntu linux
  %llvm-gcc m.c foo.c; ./a.out
  strong foo()
  %llvm-gcc m.c; ./a.out
  weak foo()

which is what I expect. However, on Windows 7, llvm-gcc (built using mingw) gives:

mingw/msys
  %llvm-gcc m.c; ./a.exe
  weak foo()
  %llvm-gcc m.c foo.c
  C:/Users/jugu/AppData/Local/Temp/ccGcz9jg.o:fake:(.text+0x0): multiple definition of `foo'
  C:/Users/jugu/AppData/Local/Temp/ccUZmVfP.o:fake:(.text$foo[_foo]+0x0): first defined here
  collect2: ld returned 1 exit status

which is not what I expect. Somehow, "weak" is missing.
m.s generated by this mingw llvm-gcc is

        .section .text$foo,"xr"
        .linkonce discard
        .globl _foo
        .align 16, 0x90
_foo:
Lllvm$workaround$fake$stub$_foo:

"weak" isn't there! It is a bug, no?

(By the way, the older llvm generates the following for code similar to m.c

        .section .text$linkonce_foo,"xrn"
        .linkonce same_contents
        .globl _foo

I think that the section flag "xrn" is wrong, and "n" should not be there; fortunately, this got fixed in the latest llvm.)

Thanks
Junjie

> David
>
>
> On 3/16/11 7:06 PM, Junjie Gu wrote:
>> What is the difference between WeakAnyLinkage and ExternalWeakLinkage?
>> They are defined in GlobalValue.h. Thanks
>>
>> Junjie
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

From Anton.Lokhmotov at arm.com Mon Mar 21 05:44:21 2011
From: Anton.Lokhmotov at arm.com (Anton Lokhmotov)
Date: Mon, 21 Mar 2011 10:44:21 -0000
Subject: [LLVMdev] [PATCH] OpenCL half support
In-Reply-To: <98E4E9C8-732A-4FDC-8DF6-FAACBD6857BD@apple.com>
References: <000201cbd383$c35adec0$4a109c40$%Lokhmotov@arm.com> <8AF16F0D-A73A-455B-9141-C30C234299D8@apple.com> <000201cbe4c7$310977c0$931c6740$%Lokhmotov@arm.com> <98E4E9C8-732A-4FDC-8DF6-FAACBD6857BD@apple.com>
Message-ID: <000201cbe7b4$efeb83f0$cfc28bd0$@Lokhmotov@arm.com>

> Adding half float to LLVM IR is *only* reasonable if you have hardware
> that supports half float, or if you want to add softfloat operations
> for these.
Yes, our graphics hardware natively supports some fp16 arithmetic operations.
> Just like C compilers need to know sizeof(long), sizeof(void*) and > many many other target specific details, an OpenCL compiler would need > to know whether to generate fp16 or not. Yes, it's just another example of LLVM-IR non-portability. Basically, any fp16 arithmetic code can be generated only if the cl_khr_fp16 extension is supported (otherwise, the frontend would reject even declaring fp16 variables, leave alone performing arithmetic on them). Anton. From sarevokcc at gmail.com Mon Mar 21 05:46:40 2011 From: sarevokcc at gmail.com (Dongrui She) Date: Mon, 21 Mar 2011 11:46:40 +0100 Subject: [LLVMdev] How to get register liveness information for each MachineBasicBlock Message-ID: Hi all, I try to print the live-in and live-out registers for each basic block in a backend for my own target. And I can get a list of live-in registers directly in MachineBasicBlock. Is there a quick way to also get the list of live-out registers without redoing the analysis. I think this information is computed and stored somewhere. -- Regards, Dongrui -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110321/d57dea89/attachment.html From rob.nikander at gmail.com Mon Mar 21 07:42:44 2011 From: rob.nikander at gmail.com (Rob Nikander) Date: Mon, 21 Mar 2011 08:42:44 -0400 Subject: [LLVMdev] how to debug with interpreter In-Reply-To: References: Message-ID: On Sun, Mar 20, 2011 at 11:56 AM, Reid Kleckner wrote: > Alternatively, if you're on Linux, there's the gdb-jit interface, > which should give you symbols and unwind tables without any extra > effort on your part: > http://llvm.org/docs/DebuggingJITedCode.html This page says to run `lli -jit-emit-debug myfile.bc' under gdb. Does the entire program need to be .bc files running through lli, or is there a way to pass that -jit-emit-debug flag to the C++ api, inside a normally compiled program? 
thanks, Rob From reid.kleckner at gmail.com Mon Mar 21 09:19:45 2011 From: reid.kleckner at gmail.com (Reid Kleckner) Date: Mon, 21 Mar 2011 07:19:45 -0700 Subject: [LLVMdev] how to debug with interpreter In-Reply-To: References: Message-ID: On Mon, Mar 21, 2011 at 5:42 AM, Rob Nikander wrote: > On Sun, Mar 20, 2011 at 11:56 AM, Reid Kleckner wrote: > >> Alternatively, if you're on Linux, there's the gdb-jit interface, >> which should give you symbols and unwind tables without any extra >> effort on your part: >> http://llvm.org/docs/DebuggingJITedCode.html > > This page says to run `lli -jit-emit-debug myfile.bc' under gdb. ?Does > the entire program need to be .bc files running through lli, or is > there a way to pass that -jit-emit-debug flag to the C++ api, inside a > normally compiled program? I thought there was a mechanism for doing that, but I can't find it. If you build LLVM in debug mode, it will be on by default. Or wherever you wrap LLVM's parsing of command line flags you can pass it in. Reid From Arnaud.AllardDeGrandMaison at dibcom.com Mon Mar 21 11:21:14 2011 From: Arnaud.AllardDeGrandMaison at dibcom.com (Arnaud Allard de Grandmaison) Date: Mon, 21 Mar 2011 17:21:14 +0100 Subject: [LLVMdev] IndVarSimplify too aggressive ? In-Reply-To: <5B577C73-1798-4997-B275-F1DC976FBA46@apple.com> References: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1EF36C7@FRPAR1CL009.coe.adi.dibcom.com> <5B577C73-1798-4997-B275-F1DC976FBA46@apple.com> Message-ID: <57C38DA176A0A34A9B9F3CCCE33D3C4A0136E1F3BC08@FRPAR1CL009.coe.adi.dibcom.com> Thanks Andy & Dan for looking into this. Tested it on my own backends, and it works great ! Best regards, -- Arnaud de Grandmaison -----Original Message----- From: Andrew Trick [mailto:atrick at apple.com] Sent: Friday, March 18, 2011 6:11 PM To: Arnaud Allard de Grandmaison Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] IndVarSimplify too aggressive ? 
On Mar 13, 2011, at 2:01 PM, Arnaud Allard de Grandmaison wrote: > Hi all, > > The IndVarSimplify pass seems to be too aggressive when it enlarge the induction variable type ; this can pessimize the generated code when the new induction variable size is not natively supported by the target. This is probably not an issue for x86_64, which supports natively all types, but it is a real one for several embedded targets, with very few native types. > > I attached a patch to address this issue; if TargetData is available, the patch attempts to keep the induction variable to a native type when going thru the induction variable users. > > Also attached my test-case in C, as well as the resulting assembly output, with and without the patch applied, for arm and x86_32 targets. You will note the loop instructions count can be reduced by 30% in several cases. > > The patch could probably be made smarter : I am welcoming all suggestions. Hi Arnaud, This should be fixed in r127884. In some cases, your patch could result in multiple IVs for the same recurrence, and LSR was not able to cleanup afterward. Dan Gohman proposed an alternative, which seems to work great. See http://llvm.org/bugs/show_bug.cgi?id=9490. -Andy From ermeleh at hotmail.com Mon Mar 21 11:47:31 2011 From: ermeleh at hotmail.com (NaJeM ErMeLeH) Date: Mon, 21 Mar 2011 16:47:31 +0000 Subject: [LLVMdev] Profiling support in LLVM Message-ID: Hello LLVM developers, I'm assisting my doctor who is doing a research and he wants to use the llvm compiler, my job is to profile build the benchmarks using llvm-prof. What i want to know is the following 1- does llvm support profile feedback optimizations!? 2- when i've used the llvm-prof it's input is an object file (not binary as other compilers) my question is how could I profile a whole benchmark program using the llvm-prof ? 3- is there a way to print the spill code information (e.g. spill code count in a single function or basic block) ? your help is appreciated. 
regards, ~Najem -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110321/4f2db9e3/attachment.html From baah at cc.gatech.edu Mon Mar 21 13:16:29 2011 From: baah at cc.gatech.edu (George Baah) Date: Mon, 21 Mar 2011 14:16:29 -0400 Subject: [LLVMdev] newbie questions Message-ID: <821CE6D6-D952-4B55-B82A-B78C343D6782@cc.gatech.edu> Hi Everyone, I am new to LLVM. I have two questions. 1) Is there a program dependence graph implementation in llvm? 2) I am trying to instrument a program to print line number, function containing computation, and variable values. For example, print(function-name:line-number: k : y : ">"); if(k > y) x = a + b; print(function-name: line-number: x); How can I go about doing this? Thanks. George From raghesh.a at gmail.com Mon Mar 21 13:20:51 2011 From: raghesh.a at gmail.com (raghesh) Date: Mon, 21 Mar 2011 23:50:51 +0530 Subject: [LLVMdev] Contributing to Polly with GSOC 2011 Message-ID: Dear all, I am Raghesh, a student pursuing M.Tech at Indian Institute of Technology, Madras, India. I would like to make contribution to the Polly project (http://wiki.llvm.org/Polyhedral_optimization_framework) as part of GSOC 2011. I have gained some experience working in OpenMP Code generation for Polly. This is almost stable now and planning to test with the polybench benchmarks. Some of the ideas that can be implemented in Polly is listed below. Please let me know your comments on this. 1. Increasing the Stability and Coverage of Polly ------------------------------------------------------------- Polly can show good speedup on several test cases. there are still many programs that cannot be optimized by it. One reason is that Polly does not yet support casts like (sext, zext, trunk). Those often appear implicit in programs, as they often have 32 bit induction variables, but require 64 bit indexes for array subscripts. 
For example:

for (int i = 0; i < N; i++)
  A[i] =

If we translate this to LLVM-IR and keep i as an i32, but use an i64 to calculate the access to A[i], a sext will be necessary. Polly currently does not handle this.

2. Testing Real Programs.
---------------------------------
Testing with well known benchmarks like SPEC CPU2006. This will help us to find cases that Polly cannot optimize and to improve the coverage of Polly on existing programs.

3. Profiling in polly
-----------------------
The idea is explained below with a few examples. Consider the following code.

scanf("%d", &b);
for (i = 0; i < N; i += b) {
  body;
}

Polly may not detect this as a SCoP because the variable b is read as user input. So to detect this as a SCoP we instrument the IR with the information provided by profiling. Suppose that using profiling we figure out that most of the time the value of b is, say, 2. We can convert the above code as follows.

scanf("%d", &b);
if (b == 2) {
  for (i = 0; i < N; i += 2) {
    body;
  }
} else {
  for (i = 0; i < N; i += b) {
    body;
  }
}

Now with the transformed code the for loop inside the "if" will be detected as a SCoP and can be parallelised. Since the value of b is 2 most of the time, the overall performance will be improved.

Consider another scenario.

for (i = 0; i < N; i++) {
  body;
}

Suppose that using profiling we know that N is always very small, so there won't be much gain from parallelising it. We then have to tell Polly not to detect this as a SCoP if N is less than a specific value.

Some other immediate applications would be
* Automatically deriving the best scheduling strategy supported by OpenMP.
* Adding simple profiling support, to understand how much time is spent inside each SCoP. Andreas Simbuerger has done some significant work on this, and it can be extended.

5. Porting Polly to Various architectures.
------------------------------------------------- Currently Polly generates everything as 64 bit integer, which is problamatic for embedded platforms. 6. Vectorization in Polly ------------------------------ Lot of work needed to be done in this area. -- Raghesh II MTECH Room No: 0xFF Mahanadhi Hostel IIT Madras From clattner at apple.com Mon Mar 21 13:26:14 2011 From: clattner at apple.com (Chris Lattner) Date: Mon, 21 Mar 2011 11:26:14 -0700 Subject: [LLVMdev] [PATCH] OpenCL half support In-Reply-To: <000201cbe7b4$efeb83f0$cfc28bd0$%Lokhmotov@arm.com> References: <000201cbd383$c35adec0$4a109c40$%Lokhmotov@arm.com> <8AF16F0D-A73A-455B-9141-C30C234299D8@apple.com> <000201cbe4c7$310977c0$931c6740$%Lokhmotov@arm.com> <98E4E9C8-732A-4FDC-8DF6-FAACBD6857BD@apple.com> <000201cbe7b4$efeb83f0$cfc28bd0$%Lokhmotov@arm.com> Message-ID: On Mar 21, 2011, at 3:44 AM, Anton Lokhmotov wrote: >> Adding half float to LLVM IR is *only* reasonable if you have hardware >> that supports half float, or if you want to add softfloat operations >> for these. > Yes, our graphics hardware natively supports some fp16 arithmetic > operations. Ok. >> Just like C compilers need to know sizeof(long), sizeof(void*) and >> many many other target specific details, an OpenCL compiler would need >> to know whether to generate fp16 or not. > Yes, it's just another example of LLVM-IR non-portability. Basically, any > fp16 arithmetic code can be generated only if the cl_khr_fp16 extension is > supported (otherwise, the frontend would reject even declaring fp16 > variables, leave alone performing arithmetic on them). If the backend generates softfloat (or some other expansion) for fp16, then a native fp16 type would be perfectly portable. This is just not the "portability" that you're looking for (which is not behavior preserving, so it isn't portability by its standard definition). 
-Chris From minjang at gatech.edu Mon Mar 21 13:31:16 2011 From: minjang at gatech.edu (Minjang Kim) Date: Mon, 21 Mar 2011 14:31:16 -0400 Subject: [LLVMdev] Efficient instrumentation of loads and stores Message-ID: Hello, I'd like to listen your opinions regarding my research with LLVM. My work is a dynamic analysis of data dependences [1]. Briefly speaking, I'm instrumenting memory loads/stores and loop entries/exits/back edges, and then calculating data dependences in runtime, especially focusing on loop-carried dependences. So far, I have been working with a binary-level instrumentation tool (Pin). However, doing sophisticated static analysis with binaries is really daunting, so I'm porting my code to LLVM. The implementation for LLVM is actually simple because my core analysis module is already orthogonal to instrumentation frameworks. So, all I need to do is inserting calls of stub functions that send events to the analysis module (e.g., loop A has been started). First, instrumenting loop events such as entry/exit/back edges was very straightforward and simple, comparing to my previous binary-level approach, though there was quite a learning curve to use LLVM methods correctly and efficiently. FYI, in a binary-level approach, even extracting loops is challenging. No single pre-header and hard to capture loop exits as well. However, instrumenting loads and stores efficiently is somewhat tricky. An obvious way is instrumenting every loads and stores, which is what I did with binaries. But, I'd like to filter loads and stores whose dependences can be decided in static time to minimize runtime overhead. Notable examples would be (1) induction variables and (2) local temporary variables: for (int i = 0; i < N; ++i) { int temp = 10; ... } In the above example, it is obvious 'i' and 'temp' do not need to be analyzed in runtime. So, I'm writing the code that skip instrumentations for such local scalar variables. I do believe this is doable. 
But, as a non-expert in static and compiler-level analysis, implementing such instrumentation code isn't straightforward so far. One reason that makes this problem tricky would be I can't do any significant optimizations before instrumentation, because optimizations heavily change code such as loop interchanges, instruction reordering and elimination of variables. I'm currently doing instrumentation before any optimizations. Such restriction makes some problems: before any optimizations, the IR code isn't a SSA-form, so sophisticated loop analysis is impossible such as identifying induction variable easily. Without SSA form, there are bunch of loads and stores, so extracting such variables require dirty hack. I think I can implement this loads/stores instrumentation code anyway. However, I'd like to hear opinions from experts on this domain. What would be the most elegant and best approach for this problem? I will appreciate any comments for my work. Thank you for reading such a long article, Minjang [1] http://www.cc.gatech.edu/~minjang/micro10-mjkim.pdf -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110321/c95aed00/attachment.html From damien.llvm at gmail.com Mon Mar 21 14:00:19 2011 From: damien.llvm at gmail.com (Damien Vincent) Date: Mon, 21 Mar 2011 12:00:19 -0700 Subject: [LLVMdev] Text or Data symbol In-Reply-To: References: Message-ID: I reply to myself... I didn't go in the right direction in my previous email. There is an easy way to tell if a GlobalValue corresponds to data or code: const GlobalValue *GV; if(Function::classof(GV)) ... // process the global value as a function else ... 
// process the global value as data

Damien

On Fri, Mar 18, 2011 at 3:16 PM, Damien Vincent wrote:
>
> I am again calling for help from LLVM developers ;)
>
> For my DSP backend, at the lowering stage and also at the AsmPrinter stage,
> I need to know if a GlobalAddress is a code or a data address.
>
> So I tried at the lowering stage to use:
> GlobalAddressSDNode *GSDN = cast<GlobalAddressSDNode>(Op);
> const GlobalValue *GV = GSDN->getGlobal();
> GV->hasSection() and GV->getSection()
> But the section is not set at this stage (hasSection = false)
>
> And at the AsmPrinter stage:
> const GlobalValue *GV = MO.getGlobal();
> SectionKind sectionKind = Mang->getSymbol(GV)->getSection().getKind();
> But again the section does not seem to be set (sectionKind.isInSection() =
> false)
>
> Do you know a way to tell if a global address corresponds to data or code ?
> I have to process differently text and data address...
>
> Thank you !
>
> Damien
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110321/d0bfa164/attachment.html

From criswell at illinois.edu Mon Mar 21 14:08:56 2011
From: criswell at illinois.edu (John Criswell)
Date: Mon, 21 Mar 2011 14:08:56 -0500
Subject: [LLVMdev] Text or Data symbol
In-Reply-To:
References:
Message-ID: <4D87A248.80208@illinois.edu>

On 3/21/11 2:00 PM, Damien Vincent wrote:
> I reply to myself... I didn't go in the right direction in my previous
> email.
>
> There is an easy way to tell if a GlobalValue corresponds to data or code:
> const GlobalValue *GV;
> if(Function::classof(GV))
> ... // process the global value as a function
> else
> ... // process the global value as data
>
> Damien

You should be able to use isa<Function>(GV) to determine if GV is a function. You may have to put in an additional check to see if GV is a GlobalAlias and to determine if the GlobalAlias is an alias for a function:

if (GlobalAlias * GA = dyn_cast<GlobalAlias>(GV)) {
  ...
Check to see if GA is an alias for a function } I recommend looking at the doxygen documentation on llvm.org to learn the class hierarchy relationships between GlobalValue, GlobalVariable, GlobalAlias, and Function. You should also read the Programmer's Guide to get familiar with the isa<>() and dyn_cast<>() functions if you are not familiar with them already. -- John T. > > > > > On Fri, Mar 18, 2011 at 3:16 PM, Damien Vincent > wrote: > > > I am again calling for help from LLVM developers ;) > > For my DSP backend, at the lowering stage and also at the > AsmPrinter stage, I need to know if a GlobalAddress is a code or a > data address. > > So I tried at the lowering stage to use: > GlobalAddressSDNode *GSDN = cast(Op); > const GlobalValue *GV = GSDN->getGlobal(); > GV->hasSection() and GV->getSection() > But the section is not set at this stage (hasSection = false) > > And at the AsmPrinter stage: > const GlobalValue *GV = MO.getGlobal(); > SectionKind sectionKind = Mang->getSymbol(GV)->getSection().getKind(); > But again the section does not seem to be set > (sectionKind.isInSection() = false) > > Do you know a way to tell if a global address corresponds to data > or code ? I have to process differently text and data address... > > Thank you ! > > Damien > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110321/8227224c/attachment.html From reid.kleckner at gmail.com Mon Mar 21 14:20:49 2011 From: reid.kleckner at gmail.com (Reid Kleckner) Date: Mon, 21 Mar 2011 12:20:49 -0700 Subject: [LLVMdev] Efficient instrumentation of loads and stores In-Reply-To: References: Message-ID: Hello Minjang, Exactly what information do you need to get in your instrumentation? Is it essentially all loop entry-continue-exit events, interspersed with the memory access trace (address, read/write, pc)? 
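To make the question concrete, here is a minimal sketch of such a trace together with a loop-carried read-after-write check over it. Every name in it (`EventKind`, `Event`, `countLoopCarriedRAW`) is hypothetical and is not taken from LLVM, Pin, or the tool described in [1]; it also assumes a single, non-nested loop, which a real analysis could not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical event kinds emitted by the instrumentation stubs.
enum class EventKind { LoopEntry, LoopBackEdge, LoopExit, Load, Store };

// One trace record; addr and pc are only meaningful for Load and Store.
struct Event {
  EventKind kind;
  std::uintptr_t addr;
  std::uintptr_t pc;
};

// Counts loads that read an address whose most recent store happened in an
// earlier iteration of the same loop invocation, i.e. loop-carried
// read-after-write dependences. Accesses outside the loop are ignored.
std::size_t countLoopCarriedRAW(const std::vector<Event> &trace) {
  std::unordered_map<std::uintptr_t, std::size_t> lastStoreIter;
  bool inLoop = false;
  std::size_t iter = 0;
  std::size_t carried = 0;
  for (const Event &e : trace) {
    switch (e.kind) {
    case EventKind::LoopEntry:
      // A new loop invocation starts with a clean history.
      inLoop = true;
      iter = 0;
      lastStoreIter.clear();
      break;
    case EventKind::LoopBackEdge:
      ++iter;
      break;
    case EventKind::LoopExit:
      inLoop = false;
      break;
    case EventKind::Store:
      if (inLoop)
        lastStoreIter[e.addr] = iter;
      break;
    case EventKind::Load:
      if (inLoop) {
        auto it = lastStoreIter.find(e.addr);
        if (it != lastStoreIter.end() && it->second < iter)
          ++carried;
      }
      break;
    }
  }
  return carried;
}
```

Feeding it the trace of a loop like `a[i] = a[i-1] + 1` reports one carried dependence per iteration after the first, while `a[i] = a[i] + 1` reports none, since each load only sees a store from its own iteration.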
It seems the Thread Sanitizer people have already done something similar: http://code.google.com/p/data-race-test/source/browse/trunk/llvm/opt/ThreadSanitizer/ThreadSanitizer.cpp Not sure what state it's in, I was just browsing their repo. I would say it's a good idea to do your instrumentation after the optimization of the program. The optimized version is probably closer to what would have been produced in the final binary than the unoptimized IR. If you don't run the optimizations, you'll have to do a lot of work to sort out what memory accesses are obviously not dependent on the previous iteration. Optimizations will clean this up a lot. The only issue then is that there are some dependencies that stay only in registers and not in memory, like the temporary in a summation loop. It will be a phi node between the base case and the last iteration. There should be some machinery in LLVM for finding these loop carried dependencies, and you should be able to record them in your instrumentation another way. --- Finally, if you're strongly concerned with the performance of your tool, I think doing something similar to PiPA with DynamoRIO would give you the best performance: http://dynamorio.org/pubs/PiPA-pipelined-profiling-cgo08.pdf Disclaimer: I work on DynamoRIO. :) It would be a very large amount of work, but I think the approach is very clever. The gist of it is that you try to precompute a table of all of the memory access offsets, read-write info, and application pc for every BB, and you record only the base registers and the id of the bb's table every time the basic block is executed. Beyond that, you just pipeline the analysis, which is straightforward. Good luck, Reid On Mon, Mar 21, 2011 at 11:31 AM, Minjang Kim wrote: > Hello, I'd like to listen your opinions regarding my research with LLVM. > > My work is a dynamic analysis of data dependences [1]. 
Briefly speaking, > I'm instrumenting memory loads/stores and loop entries/exits/back edges, and > then calculating data dependences in runtime, especially focusing on > loop-carried dependences. > > So far, I have been working with a binary-level instrumentation tool (Pin). > However, doing sophisticated static analysis with binaries is really > daunting, so I'm porting my code to LLVM. > > The implementation for LLVM is actually simple because my core analysis > module is already orthogonal to instrumentation frameworks. So, all I need > to do is inserting calls of stub functions that send events to the analysis > module (e.g., loop A has been started). > > First, instrumenting loop events such as entry/exit/back edges was very > straightforward and simple, comparing to my previous binary-level approach, > though there was quite a learning curve to use LLVM methods correctly and > efficiently. FYI, in a binary-level approach, even extracting loops is > challenging. No single pre-header and hard to capture loop exits as well. > > > However, instrumenting loads and stores efficiently is somewhat tricky. An > obvious way is instrumenting every loads and stores, which is what I did > with binaries. But, I'd like to filter loads and stores whose dependences > can be decided in static time to minimize runtime overhead. Notable examples > would be (1) induction variables and (2) local temporary variables: > > for (int i = 0; i < N; ++i) { > int temp = 10; > ... > } > > In the above example, it is obvious 'i' and 'temp' do not need to be > analyzed in runtime. So, I'm writing the code that skip instrumentations for > such local scalar variables. > > I do believe this is doable. But, as a non-expert in static and > compiler-level analysis, implementing such instrumentation code > isn't straightforward so far. 
> > One reason that makes this problem tricky would be I can't do any > significant optimizations before instrumentation, because optimizations > heavily change code such as loop interchanges, instruction reordering and > elimination of variables. I'm currently doing instrumentation before any > optimizations. > > Such restriction makes some problems: before any optimizations, the IR code > isn't a SSA-form, so sophisticated loop analysis is impossible such as > identifying induction variable easily. Without SSA form, there are bunch of > loads and stores, so extracting such variables require dirty hack. > > I think I can implement this loads/stores instrumentation code anyway. > However, I'd like to hear opinions from experts on this domain. What would > be the most elegant and best approach for this problem? I will appreciate > any comments for my work. > > > Thank you for reading such a long article, > Minjang > > [1] http://www.cc.gatech.edu/~minjang/micro10-mjkim.pdf > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110321/9648c9dd/attachment.html From damien.llvm at gmail.com Mon Mar 21 15:00:24 2011 From: damien.llvm at gmail.com (Damien Vincent) Date: Mon, 21 Mar 2011 13:00:24 -0700 Subject: [LLVMdev] Text or Data symbol In-Reply-To: <4D87A248.80208@illinois.edu> References: <4D87A248.80208@illinois.edu> Message-ID: In fact, my initial idea was to use "dyn_cast(GV)", but I didn't use the resulting pointer (except for testing if the pointer is null). That's why I used classof... but I forgot about the existence of isa<>. Thank you for this clarification, Damien On Mon, Mar 21, 2011 at 12:08 PM, John Criswell wrote: > On 3/21/11 2:00 PM, Damien Vincent wrote: > > I reply to myself... 
I didn't go in the right direction in my previous
> email.
>
> There is an easy way to tell if a GlobalValue corresponds to data or code:
> const GlobalValue *GV;
> if(Function::classof(GV))
> ... // process the global value as a function
> else
> ... // process the global value as data
>
> Damien
>
>
> You should be able to use isa<Function>(GV) to determine if GV is a
> function. You may have to put in an additional check to see if GV is a
> GlobalAlias and to determine if the GlobalAlias is an alias for a function:
>
> if (GlobalAlias * GA = dyn_cast<GlobalAlias>(GV)) {
> ... Check to see if GA is an alias for a function
> }
>
> I recommend looking at the doxygen documentation on llvm.org to learn the
> class hierarchy relationships between GlobalValue, GlobalVariable,
> GlobalAlias, and Function. You should also read the Programmer's Guide to
> get familiar with the isa<>() and dyn_cast<>() functions if you are not
> familiar with them already.
>
> -- John T.
>
>
> On Fri, Mar 18, 2011 at 3:16 PM, Damien Vincent wrote:
>
>>
>> I am again calling for help from LLVM developers ;)
>>
>> For my DSP backend, at the lowering stage and also at the AsmPrinter
>> stage, I need to know if a GlobalAddress is a code or a data address.
>>
>> So I tried at the lowering stage to use:
>> GlobalAddressSDNode *GSDN = cast<GlobalAddressSDNode>(Op);
>> const GlobalValue *GV = GSDN->getGlobal();
>> GV->hasSection() and GV->getSection()
>> But the section is not set at this stage (hasSection = false)
>>
>> And at the AsmPrinter stage:
>> const GlobalValue *GV = MO.getGlobal();
>> SectionKind sectionKind = Mang->getSymbol(GV)->getSection().getKind();
>> But again the section does not seem to be set (sectionKind.isInSection() =
>> false)
>>
>> Do you know a way to tell if a global address corresponds to data or code
>> ? I have to process differently text and data address...
>>
>> Thank you !
>>
>> Damien
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110321/02fd7db1/attachment.html From atrick at apple.com Mon Mar 21 15:22:08 2011 From: atrick at apple.com (Andrew Trick) Date: Mon, 21 Mar 2011 13:22:08 -0700 Subject: [LLVMdev] LLVM Register allocation In-Reply-To: <4D7F7C45.4030509@nus.edu.sg> References: <4D7F7C45.4030509@nus.edu.sg> Message-ID: <51EA532B-7771-4D95-804A-E8CE9C9C777F@apple.com> On Mar 15, 2011, at 7:48 AM, Vijayaraghavan Murali wrote: > I'm relatively a newcomer to this forum and LLVM. I wish to do the > following: > > 1) play with LLVM's register allocation without any other optimizations > performed, such as inlining. This is because I'm trying to observe the > effects of our path-sensitive tool on register allocation but other > optimizations could influence the results. In other words, I would like > to perform register allocation with -O0, if that's possible. > > 2) view the results of register allocation. That is, the mapping of > variables to physical registers. I'm comfortable with reading dwarf > information. For eg, using gcc I would do: gcc -c -g hello.c ; dwarfdump > hello.o > > Kindly guide me in performing the above steps. If they are not possible, > is there any workaround? I'm not sure the best way to do what you're asking, but I haven't seen any other responses... For regalloc experiments, you probably want to suppress individual optimizations rather than using -O0. I always refer to the gcc docs and assume clang supports the option, because I haven't found any equivalent doc for clang. You can get some undocumented help text using the secret command "clang -cc1 -help", which is kind of like "gcc -v -help t.c". e.g. clang -fno-inline ... If you want more control, you can split up the compilation path into these steps: I'm not sure if -O0 is needed here but it should ensure clang won't inline before generating bitcode. $ clang -O0 -emit-llvm -c t.c -o t.bc -emit-llvm is the equivalent of -flto that shows up in --help. 
Normally, you would optimize bitcode using this.

$ opt -std-compile-opts t.bc -o t.bc

But for true -O0, you can skip "opt" altogether. For regalloc, I suggest at least running "opt -mem2reg". See opt -help for more passes.

You may then want to run optimal codegen:

llc -O3 t.bc -o t.s

If not, you can override regalloc:

llc -O0 -regalloc=linearscan t.bc -o t.s

See llc -help for more options.

You can trace regalloc using:

$ llc -debug-only=regalloc

Other traceable modules are: liveintervals, virtregmap, spiller, virtregrewriter

-Andy

From bob.wilson at apple.com Mon Mar 21 15:47:34 2011
From: bob.wilson at apple.com (Bob Wilson)
Date: Mon, 21 Mar 2011 13:47:34 -0700
Subject: [LLVMdev] GIT mirroring
In-Reply-To:
References:
Message-ID: <8E63ECD1-47ED-4870-B2A8-5FC71993783B@apple.com>

I've been using Jakob's commands below, but it has stopped working for me. This happened roughly around the time when Anton added the other branches, but I'm not sure if that was the cause or not. The symptom is that the "git svn rebase -l" command does nothing except say that the master branch is already up to date, and "git svn dcommit" complains that files are out of date. In both cases, "git svn rebase" (without "-l") solves the problem, but without using the GIT mirror, so it's slow.

I've tried re-creating my git repos from scratch but that did not fix the problem. Any ideas?

On Feb 1, 2011, at 1:54 PM, Jakob Stoklund Olesen wrote:
>
> On Feb 1, 2011, at 12:20 PM, Anton Korobeynikov wrote:
>
>> Hello Everyone,
>>
>> It seems given the decent amount of discussions it's time to make
>> small announcement.
>>
>> So, official git mirrors are available for some subset of LLVM
>> projects. They were used by some LLVM developers for couple of months
>> already and seem to be stable enough.
>
> Thank you for setting this up, Anton!
> > This is how I use the Git mirror with git-svn: > > For the initial clone and setup: > > $ git clone http://llvm.org/git/llvm.git > $ cd llvm > $ git config --add remote.origin.fetch '+refs/remotes/git-svn:refs/remotes/git-svn' > $ git fetch > $ git svn init https://llvm.org/svn/llvm-project/llvm/trunk > $ git svn rebase -l > > This will quickly build the git-svn metadata by using the magical remotes/git-svn branch fetched from the origin. > > To update I run: > > $ git fetch > $ git svn rebase -l > > And to commit: > > $ git svn dcommit > $ git fetch > $ git svn rebase -l > > I have sometimes seen git-svn refusing to dcommit, claiming that I have uncommitted files in my tree. I think this happens when I forget to resynchronize the metadata after committing. Anyway, the solution is to wipe away all of .git/svn and rebuild it with "git svn rebase -l" > > /jakob > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From stoklund at 2pi.dk Mon Mar 21 15:54:10 2011 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Mon, 21 Mar 2011 13:54:10 -0700 Subject: [LLVMdev] GIT mirroring In-Reply-To: <8E63ECD1-47ED-4870-B2A8-5FC71993783B@apple.com> References: <8E63ECD1-47ED-4870-B2A8-5FC71993783B@apple.com> Message-ID: <8918B53B-0A4C-48CE-897F-BA43F637690C@2pi.dk> On Mar 21, 2011, at 1:47 PM, Bob Wilson wrote: > I've been using Jakob's commands below, but it has stopped working for me. This happened roughly around the time when Anton added the other branches, but I'm not sure if that was the cause or not. The symptom is that the "git svn rebase -l" command does nothing except say that the master branch is already up to date, and "git svn dcommit" complains that files are out of date. In both cases, "git svn rebase" (without "-l") solves the problem, but without using the GIT mirror, so it's slow. 
>
> I've tried re-creating my git repos from scratch but that did not fix the problem. Any ideas?

The trick with 'git config --add remote.origin.fetch' doesn't work anymore. I tried changing it to refer to the remote's master branch instead, but that has caused strange problems.

I now run 'git update-ref' every time I fetch from the mirror:

  git fetch -p origin
  git update-ref refs/remotes/git-svn origin/master
  git svn rebase -l

Same for committing:

  git svn dcommit
  git fetch -p origin
  git update-ref refs/remotes/git-svn origin/master
  git svn rebase -l

For the initial clone and setup:

  git clone http://llvm.org/git/llvm.git
  cd llvm
  git svn init https://llvm.org/svn/llvm-project/llvm/trunk
  git update-ref refs/remotes/git-svn origin/master
  git svn rebase -l

/jakob

From czhang at qualcomm.com Mon Mar 21 15:59:16 2011
From: czhang at qualcomm.com (Zhang, Chihong)
Date: Mon, 21 Mar 2011 13:59:16 -0700
Subject: [LLVMdev] [PATCH] OpenCL half support
In-Reply-To:
References: <000201cbd383$c35adec0$4a109c40$%Lokhmotov@arm.com> <8AF16F0D-A73A-455B-9141-C30C234299D8@apple.com> <000201cbe4c7$310977c0$931c6740$%Lokhmotov@arm.com> <98E4E9C8-732A-4FDC-8DF6-FAACBD6857BD@apple.com> <000201cbe7b4$efeb83f0$cfc28bd0$%Lokhmotov@arm.com>
Message-ID:

Hi Chris,

It is important for embedded/mobile computation to have efficient fp16 support, otherwise those users will suffer from the merging problem with their local LLVM with native fp16 type they add (locally). So we should either add full fp16 support as a basic floating point type or enhance the LLVM infrastructure to make floating point type as scalable as int type.
-Chihong -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Chris Lattner Sent: Monday, March 21, 2011 11:26 AM To: Anton.Lokhmotov at arm.com Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] [PATCH] OpenCL half support On Mar 21, 2011, at 3:44 AM, Anton Lokhmotov wrote: >> Adding half float to LLVM IR is *only* reasonable if you have hardware >> that supports half float, or if you want to add softfloat operations >> for these. > Yes, our graphics hardware natively supports some fp16 arithmetic > operations. Ok. >> Just like C compilers need to know sizeof(long), sizeof(void*) and >> many many other target specific details, an OpenCL compiler would need >> to know whether to generate fp16 or not. > Yes, it's just another example of LLVM-IR non-portability. Basically, any > fp16 arithmetic code can be generated only if the cl_khr_fp16 extension is > supported (otherwise, the frontend would reject even declaring fp16 > variables, leave alone performing arithmetic on them). If the backend generates softfloat (or some other expansion) for fp16, then a native fp16 type would be perfectly portable. This is just not the "portability" that you're looking for (which is not behavior preserving, so it isn't portability by its standard definition). -Chris _______________________________________________ LLVM Developers mailing list LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From anton at korobeynikov.info Mon Mar 21 16:16:46 2011 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Tue, 22 Mar 2011 00:16:46 +0300 Subject: [LLVMdev] GIT mirroring In-Reply-To: <8918B53B-0A4C-48CE-897F-BA43F637690C@2pi.dk> References: <8E63ECD1-47ED-4870-B2A8-5FC71993783B@apple.com> <8918B53B-0A4C-48CE-897F-BA43F637690C@2pi.dk> Message-ID: >> I've tried re-creating my git repos from scratch but that did not fix the problem. ?Any ideas? 
>
> The trick with 'git config --add remote.origin.fetch' doesn't work anymore. I tried changing it to refer to the remote's master branch instead, but that has caused strange problems.

I'm not a git-svn expert, but I suspect the real problem is that git-svn automagically updates master from git-svn remote. Right now we're exporting just master and thus stuff appears to form a cycle.

This is just a random thought though :)

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From grosser at fim.uni-passau.de Mon Mar 21 16:27:44 2011
From: grosser at fim.uni-passau.de (Tobias Grosser)
Date: Mon, 21 Mar 2011 17:27:44 -0400
Subject: [LLVMdev] GIT mirroring
In-Reply-To: <8E63ECD1-47ED-4870-B2A8-5FC71993783B@apple.com>
References: <8E63ECD1-47ED-4870-B2A8-5FC71993783B@apple.com>
Message-ID: <4D87C2D0.50509@fim.uni-passau.de>

On 03/21/2011 04:47 PM, Bob Wilson wrote:
> I've been using Jakob's commands below, but it has stopped working for me. This happened roughly around the time when Anton added the other branches, but I'm not sure if that was the cause or not. The symptom is that the "git svn rebase -l" command does nothing except say that the master branch is already up to date, and "git svn dcommit" complains that files are out of date. In both cases, "git svn rebase" (without "-l") solves the problem, but without using the GIT mirror, so it's slow.
>
> I've tried re-creating my git repos from scratch but that did not fix the problem. Any ideas?

Yes, here are the changes necessary:

>> $ git clone http://llvm.org/git/llvm.git
>> $ cd llvm
>> $ git config --add remote.origin.fetch '+refs/remotes/git-svn:refs/remotes/git-svn'
>> $ git fetch

Skip the last two lines.
>> $ git svn init https://llvm.org/svn/llvm-project/llvm/trunk

Add here:

git config svn-remote.svn.fetch ':refs/remotes/origin/master'

>> $ git svn rebase -l

And here I personally just use a 'git svn fetch'

This is the complete sequence

$ git clone http://llvm.org/git/llvm.git
$ cd llvm
$ git svn init https://llvm.org/svn/llvm-project/llvm/trunk
$ git config svn-remote.svn.fetch ':refs/remotes/origin/master'
$ git svn fetch

LLVM trunk is now in origin/master and can be accessed e.g. by

$ git log origin/master

It can be updated by using

$ git remote update
$ git fetch
$ git pull

Cheers
Tobi

From grosser at fim.uni-passau.de Mon Mar 21 16:29:43 2011
From: grosser at fim.uni-passau.de (Tobias Grosser)
Date: Mon, 21 Mar 2011 17:29:43 -0400
Subject: [LLVMdev] GIT mirroring
In-Reply-To: <8918B53B-0A4C-48CE-897F-BA43F637690C@2pi.dk>
References: <8E63ECD1-47ED-4870-B2A8-5FC71993783B@apple.com> <8918B53B-0A4C-48CE-897F-BA43F637690C@2pi.dk>
Message-ID: <4D87C347.60503@fim.uni-passau.de>

On 03/21/2011 04:54 PM, Jakob Stoklund Olesen wrote:
>
> On Mar 21, 2011, at 1:47 PM, Bob Wilson wrote:
>
>> I've been using Jakob's commands below, but it has stopped working for me. This happened roughly around the time when Anton added the other branches, but I'm not sure if that was the cause or not. The symptom is that the "git svn rebase -l" command does nothing except say that the master branch is already up to date, and "git svn dcommit" complains that files are out of date. In both cases, "git svn rebase" (without "-l") solves the problem, but without using the GIT mirror, so it's slow.
>>
>> I've tried re-creating my git repos from scratch but that did not fix the problem. Any ideas?
>
> The trick with 'git config --add remote.origin.fetch' doesn't work anymore. I tried changing it to refer to the remote's master branch instead, but that has caused strange problems.
>
> I now run 'git update-ref' every time I fetch from the mirror:

You should be able to get rid of this by calling:

git config svn-remote.svn.fetch ':refs/remotes/origin/master'

Now the default svn branch points to refs/remotes/origin/master instead of refs/remotes/git-svn and everything should work automatically.

Let me know if there are any problems with this approach.

Tobi

From atrick at apple.com Mon Mar 21 16:46:37 2011
From: atrick at apple.com (Andrew Trick)
Date: Mon, 21 Mar 2011 14:46:37 -0700
Subject: [LLVMdev] Profiling support in LLVM
In-Reply-To:
References:
Message-ID:

Hi Najem,

On Mar 21, 2011, at 9:47 AM, NaJeM ErMeLeH wrote:
> I'm assisting my doctor who is doing a research and he wants to use the llvm compiler, my job is to profile build the benchmarks using llvm-prof.
>
> What i want to know is the following
>
> 1- does llvm support profile feedback optimizations!?

Not yet. Please see Bob's proposal:
http://article.gmane.org/gmane.comp.compilers.llvm.devel/37107/match=profile

> 2- when i've used the llvm-prof it's input is an object file (not binary as other compilers) my question is how could I profile a whole benchmark program using the llvm-prof ?

I haven't done it, but I think the correct answer is to use llvm-ld to generate a single bitcode file, then run llvm-prof.

> 3- is there a way to print the spill code information (e.g. spill code count in a single function or basic block) ?

-stats give you aggregate counts. Unfortunately, I don't know a way to do per-function reporting without using llvm-extract.

You might be able to scrape -debug-only=spiller output for block info.

-Andy

> your help is appreciated.
>
> regards,
> ~Najem
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110321/ef0bf5ca/attachment.html From criswell at illinois.edu Mon Mar 21 17:19:53 2011 From: criswell at illinois.edu (John Criswell) Date: Mon, 21 Mar 2011 17:19:53 -0500 Subject: [LLVMdev] Profiling support in LLVM In-Reply-To: References: Message-ID: <4D87CF09.7000902@illinois.edu> Dear Najem, You might want to read the llvm-prof documentation if you haven't already: http://llvm.org/cmds/llvm-prof.html. The documentation mentions a script in the utils directory that automates some of the profiling tasks for you. I suspect the way that llvm-prof works is to compile your whole program to a single LLVM bitcode file, run a transform on it, and then generate native code, link in the LLVM profiling run-time library, and then run the program. You then use llvm-prof to analyze the original bitcode file and the output from running the program to get the report. That's just a guess, though; I've never used llvm-prof myself. I bet looking at the script in the utils directory will shed light on how to use llvm-prof. -- John T. On 3/21/11 4:46 PM, Andrew Trick wrote: > Hi Najem, > > On Mar 21, 2011, at 9:47 AM, NaJeM ErMeLeH wrote: >> I'm assisting my doctor who is doing a research and he wants to use >> the llvm compiler, my job is to profile build the benchmarks using >> llvm-prof. >> >> What i want to know is the following >> >> 1- does llvm support profile feedback optimizations!? > > Not yet. Please see Bob's proposal: > http://article.gmane.org/gmane.comp.compilers.llvm.devel/37107/match=profile > >> 2- when i've used the llvm-prof it's input is an object file (not >> binary as other compilers) my question is how could I profile a whole >> benchmark program using the llvm-prof ? > > I haven't done it, but I think the correct answer is to use llvm-ld to > generate a single bitcode file, then run llvm-prof. > >> 3- is there a way to print the spill code information (e.g. 
spill >> code count in a single function or basic block) ? > > -stats give you aggregate counts. Unfortunately, I don't know a way to > do per-function reporting without using llvm-extract. > > You might be able to scrape -debug-only=spiller output for block info. > > -Andy > >> your help is appreciated. >> >> regards, >> ~Najem >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110321/b407a361/attachment-0001.html From logytech at gmail.com Mon Mar 21 17:34:09 2011 From: logytech at gmail.com (Andrei Alvares) Date: Mon, 21 Mar 2011 19:34:09 -0300 Subject: [LLVMdev] Profiling support in LLVM In-Reply-To: <4D87CF09.7000902@illinois.edu> References: <4D87CF09.7000902@illinois.edu> Message-ID: Hi Najem, Our compiler group had a meeting sometime ago which we discussed how to profile a simple program with gcc and llvm. You can find our discussion here: http://www2.dcc.ufmg.br/laboratorios/llp/wiki/doku.php?id=blog:optmeetings:2009_05_13_-_profiling Please note that it is a very basic tutorial, with a simple example. But it can be used as a first contact with llvm profiling framework. I hope it can be useful. Best regards, Andrei On Mon, Mar 21, 2011 at 7:19 PM, John Criswell wrote: > Dear Najem, > > You might want to read the llvm-prof documentation if you haven't already: > http://llvm.org/cmds/llvm-prof.html.? The documentation mentions a script in > the utils directory that automates some of the profiling tasks for you. > > I suspect the way that llvm-prof works is to compile your whole program to a > single LLVM bitcode file, run a transform on it, and then generate native > code, link in the LLVM profiling run-time library, and then run the > program.? 
You then use llvm-prof to analyze the original bitcode file and > the output from running the program to get the report.? That's just a guess, > though; I've never used llvm-prof myself. > > I bet looking at the script in the utils directory will shed light on how to > use llvm-prof. > > -- John T. > > > On 3/21/11 4:46 PM, Andrew Trick wrote: > > Hi Najem, > On Mar 21, 2011, at 9:47 AM, NaJeM ErMeLeH wrote: > > I'm assisting my doctor who is doing a research and he wants to use the llvm > compiler, my job is to profile build the benchmarks using llvm-prof. > > What i want to know is the following > > 1- does llvm support profile feedback optimizations!? > > Not yet. Please see Bob's proposal: > http://article.gmane.org/gmane.comp.compilers.llvm.devel/37107/match=profile > > 2- when i've used the llvm-prof it's input is an object file (not binary as > other compilers) my question is how could I profile a whole benchmark > program using the llvm-prof ? > > I haven't done it, but I think the correct answer is to use llvm-ld to > generate a single bitcode file, then run llvm-prof. > > 3- is there a way to print the spill code information (e.g. spill code count > in a single function or basic block) ? > > -stats give you aggregate counts. Unfortunately, I don't know a way to do > per-function reporting without using llvm-extract. > You might be able to scrape -debug-only=spiller output for block info. > -Andy > > your help is appreciated. > > regards, > ~Najem > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu?????????http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu ? ? ? ? 
http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > From wendling at apple.com Mon Mar 21 18:09:11 2011 From: wendling at apple.com (Bill Wendling) Date: Mon, 21 Mar 2011 16:09:11 -0700 Subject: [LLVMdev] Announcement: 2.9rc2 Delay Message-ID: <54392E98-6036-4F41-8684-C088003A81BF@apple.com> Hi, There is still one open bug that is worrisome: http://llvm.org/bugs/show_bug.cgi?id=9469 It shows a regression in the test-suite. However, it can't be replicated in mainline. I won't be tagging the release candidate 2 today as scheduled because I would first like to understand what the error here is. If you have a Linux machine and can help out, please do. It may involve performing a bisect of mainline between the release revision (~r127210) and ToT. As a side note, please make sure that all patches which should be merged into the 2.9 branch are indeed there. I.e., make sure that I got the emails and did my job. :-) If the code owners haven't yet looked at the patches, please ping them to do so. This is only a small delay and shouldn't affect the projected release date. Thanks! -bw From clattner at apple.com Mon Mar 21 19:04:32 2011 From: clattner at apple.com (Chris Lattner) Date: Mon, 21 Mar 2011 17:04:32 -0700 Subject: [LLVMdev] [PATCH] OpenCL half support In-Reply-To: References: <000201cbd383$c35adec0$4a109c40$%Lokhmotov@arm.com> <8AF16F0D-A73A-455B-9141-C30C234299D8@apple.com> <000201cbe4c7$310977c0$931c6740$%Lokhmotov@arm.com> <98E4E9C8-732A-4FDC-8DF6-FAACBD6857BD@apple.com> <000201cbe7b4$efeb83f0$cfc28bd0$%Lokhmotov@arm.com> Message-ID: <1EC88802-72BD-4509-A815-100429552C8C@apple.com> On Mar 21, 2011, at 1:59 PM, Zhang, Chihong wrote: > Hi Chris, > > It is important for embedded/mobile computation to have efficient fp16 support, otherwise those users will suffer from the merging problem with their local LLVM with native fp16 type they add (locally). 
So we should either add full fp16 support as a basic floating point type or enhance the LLVM infrastructure to make floating point type as scalable as int type. As I've said several times now :), I'm ok with having fp16 as a native LLVM type so long as there is hardware that implements fp16 arithmetic operations like add and sub with correct fp16 rounding etc. -Chris > > > -Chihong > > -----Original Message----- > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Chris Lattner > Sent: Monday, March 21, 2011 11:26 AM > To: Anton.Lokhmotov at arm.com > Cc: llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] [PATCH] OpenCL half support > > > On Mar 21, 2011, at 3:44 AM, Anton Lokhmotov wrote: > >>> Adding half float to LLVM IR is *only* reasonable if you have hardware >>> that supports half float, or if you want to add softfloat operations >>> for these. >> Yes, our graphics hardware natively supports some fp16 arithmetic >> operations. > > Ok. > >>> Just like C compilers need to know sizeof(long), sizeof(void*) and >>> many many other target specific details, an OpenCL compiler would need >>> to know whether to generate fp16 or not. >> Yes, it's just another example of LLVM-IR non-portability. Basically, any >> fp16 arithmetic code can be generated only if the cl_khr_fp16 extension is >> supported (otherwise, the frontend would reject even declaring fp16 >> variables, leave alone performing arithmetic on them). > > If the backend generates softfloat (or some other expansion) for fp16, then a native fp16 type would be perfectly portable. This is just not the "portability" that you're looking for (which is not behavior preserving, so it isn't portability by its standard definition). 
>
> -Chris
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From sangmin.park at gmail.com Mon Mar 21 20:13:16 2011
From: sangmin.park at gmail.com (Sangmin Park)
Date: Mon, 21 Mar 2011 21:13:16 -0400
Subject: [LLVMdev] gold plugin example
Message-ID:

Hi all,

I got an error from the gold plugin example in the following link:
http://llvm.org/docs/GoldPlugin.html#example1

Here is the error message:

sangmin@sangmin-desktop:/tmp$ llvm-gcc -use-gold-plugin a.a b.o -o main
/usr/bin/ld: error: a.a: no archive symbol table (run ranlib)
/usr/bin/ld: /usr/lib/crt1.o:(.text+0x18): error: undefined reference to 'main'
/usr/bin/ld: b.o: in function foo1:b.c(.text+0x4): error: undefined reference to 'foo2'
collect2: ld returned 1 exit status

I followed the instructions from the previous thread below, but I still have the problem.
http://old.nabble.com/llvm-gold-plugin-example-td28140005.html

I got the error with both 2.7 and 2.8. Here are my settings:

LLVMgold.so in (with LLVM 2.8)
- $HOME/llvm-gcc-4.2-2.8-i686-linux/libexec/gcc/i686-pc-linux-gnu/4.2.1/LLVMgold.so
- $HOME/llvm-gcc-4.2-2.8-i686-linux/lib/bfd-plugins/LLVMgold.so
- /usr/lib/bfd-plugins/LLVMgold.so

libLLVMgold.so in (with LLVM 2.7)
- $HOME/llvm-gcc-4.2-2.7-i686-linux/libexec/gcc/i686-pc-linux-gnu/4.2.1/libLLVMgold.so
- $HOME/llvm-gcc-4.2-2.7-i686-linux/lib/bfd-plugins/libLLVMgold.so
- /usr/lib/bfd-plugins/libLLVMgold.so

The compiled ld-new, ar and nm-new are linked to /usr/bin/ld, /usr/bin/ar and /usr/bin/nm, respectively.

sangmin@sangmin-desktop:/tmp$ ld -v
GNU gold (GNU Binutils 2.21.51.20110316) 1.11

Thanks in advance for your help.

Sangmin

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110321/40b27f1f/attachment.html

From zhaber at yopmail.com Mon Mar 21 20:36:48 2011
From: zhaber at yopmail.com (stackunderflow)
Date: Mon, 21 Mar 2011 18:36:48 -0700 (PDT)
Subject: [LLVMdev] -emit-llvm on ubuntu is broken
Message-ID: <31206382.post@talk.nabble.com>

I try to generate a human readable .ll file on Linux. I installed llvm-gcc but as I see it can generate only assembly code (-S option). Is there any way to get something like what is generated by llvm online compiler?

That's what I get with llvm-gcc -S -emit-llvm hello.c on Ubuntu 10.10:

	.file	"hello.c"
	.ident	"GCC: (Ubuntu/Linaro 4.5.1-7ubuntu2) 4.5.1 LLVM: "
	.text
	.globl	main
	.align	16, 0x90
	.type	main,@function
main:
	pushl	%ebp
	movl	%esp, %ebp
	subl	$8, %esp
	movl	$.L.str, 4(%esp)
	movl	$1, (%esp)
	call	__printf_chk
	xorl	%eax, %eax
	addl	$8, %esp
	popl	%ebp
	ret
.Ltmp0:
	.size	main, .Ltmp0-main
	.type	.L.str,@object
	.section	.rodata.str1.1,"aMS",@progbits,1
.L.str:
	.asciz	"hello world\n"
	.size	.L.str, 13
	.section	.note.GNU-stack,"",@progbits

That's what I am trying to get:

; ModuleID = '/tmp/webcompile/_7829_0.bc'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-linux-gnu"

@.str = private constant [12 x i8] c"hello world\00", align 1 ; <[12 x i8]*> [#uses=1]

define i32 @main() nounwind {
entry:
  %0 = tail call i32 @puts(i8* getelementptr inbounds ([12 x i8]* @.str, i64 0, i64 0)) nounwind ; <i32> [#uses=0]
  ret i32 0
}

declare i32 @puts(i8* nocapture) nounwind

On Windows I successfully get this file with the same command: llvm-gcc -S -emit-llvm hello.c.

llvm
--
View this message in context: http://old.nabble.com/-emit-llvm-on-ubuntu-is-broken-tp31206382p31206382.html
Sent from the LLVM - Dev mailing list archive at Nabble.com.
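
The two listings in the message above differ in ways that are easy to check mechanically: the native assembly consists of directives such as .file, .text and .globl, while the LLVM assembly has a "; ModuleID" comment, a "target triple" line and define/declare statements. Below is a minimal sketch of such a check in Python. The function name classify_output and the marker patterns are made up for illustration and taken only from the two listings in this thread; this is a rough heuristic, not something llvm-gcc or LLVM provides.

```python
import re

# Heuristic markers only, chosen from the two listings quoted in this thread.
IR_PATTERNS = [
    re.compile(r'^\s*target (triple|datalayout) = "'),
    re.compile(r"^\s*(define|declare) .*@"),
    re.compile(r"^; ModuleID = "),
]
ASM_PATTERNS = [
    re.compile(r"^\s*\.(file|text|globl|section|type|size)\b"),
]


def classify_output(text):
    """Return 'llvm-ir', 'native-asm' or 'unknown' for the text of a -S output file."""
    ir_hits = 0
    asm_hits = 0
    for line in text.splitlines():
        if any(p.search(line) for p in IR_PATTERNS):
            ir_hits += 1
        elif any(p.search(line) for p in ASM_PATTERNS):
            asm_hits += 1
    if ir_hits > asm_hits:
        return "llvm-ir"
    if asm_hits > ir_hits:
        return "native-asm"
    return "unknown"
```

Calling classify_output(open("hello.s").read()) on the file from the failing Ubuntu run would be expected to report native assembly, which is a quick way to confirm that -emit-llvm was not honoured before digging into the driver.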
From echristo at apple.com Mon Mar 21 20:44:39 2011 From: echristo at apple.com (Eric Christopher) Date: Mon, 21 Mar 2011 18:44:39 -0700 Subject: [LLVMdev] -emit-llvm on ubuntu is broken In-Reply-To: <31206382.post@talk.nabble.com> References: <31206382.post@talk.nabble.com> Message-ID: On Mar 21, 2011, at 6:36 PM, stackunderflow wrote: > > I try to generate a human readable .ll file on Linux. I installed llvm-gcc > but as I see it can generate only assembly code (-S option). Is there any > way to get something like what is generated by llvm online compiler? > > That's what I get with llvm-gcc -S -emit-llvm hello.c on Ubuntu 10.10: llvm-gcc -v ? -eric From zhaber at yopmail.com Mon Mar 21 21:05:29 2011 From: zhaber at yopmail.com (stackunderflow) Date: Mon, 21 Mar 2011 19:05:29 -0700 (PDT) Subject: [LLVMdev] -emit-llvm on ubuntu is broken In-Reply-To: References: <31206382.post@talk.nabble.com> Message-ID: <31206493.post@talk.nabble.com> Hi Eric, here is my -emit-llvm -S -v output: Using built-in specs. 
COLLECT_GCC=gcc-4.5 COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-linux-gnu/4.5.1/lto-wrapper Target: i686-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.5.1-7ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-4.5/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.5 --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.5 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-gold --with-plugin-ld=ld.gold --enable-objc-gc --enable-targets=all --disable-werror --with-arch-32=i686 --with-tune=generic --enable-checking=release --build=i686-linux-gnu --host=i686-linux-gnu --target=i686-linux-gnu Thread model: posix gcc version 4.5.1 (Ubuntu/Linaro 4.5.1-7ubuntu2) COLLECT_GCC_OPTIONS='-fplugin=/usr/lib/gcc/i686-linux-gnu/4.5/plugin/dragonegg.so' '-emit-llvm' '-S' '-v' '-mtune=generic' '-march=i686' /usr/lib/gcc/i686-linux-gnu/4.5.1/cc1 -quiet -v -iplugindir=/usr/lib/gcc/i686-linux-gnu/4.5.1/plugin hello.c -D_FORTIFY_SOURCE=2 -iplugindir=/usr/lib/gcc/i686-linux-gnu/4.5.1/plugin -quiet -dumpbase hello.c -mtune=generic -march=i686 -auxbase hello -version -fplugin=/usr/lib/gcc/i686-linux-gnu/4.5/plugin/dragonegg.so -o hello.s -fstack-protector GNU C (Ubuntu/Linaro 4.5.1-7ubuntu2) version 4.5.1 (i686-linux-gnu) compiled by GNU C version 4.5.1, GMP version 4.3.2, MPFR version 3.0.0-p3, MPC version 0.8.2 GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 Versions of loaded plugins: dragonegg: ignoring nonexistent directory "/usr/local/include/i686-linux-gnu" ignoring nonexistent directory "/usr/lib/gcc/i686-linux-gnu/4.5.1/../../../../i686-linux-gnu/include" #include "..." 
search starts here: #include <...> search starts here: /usr/local/include /usr/lib/gcc/i686-linux-gnu/4.5.1/include /usr/lib/gcc/i686-linux-gnu/4.5.1/include-fixed /usr/include/i686-linux-gnu /usr/include End of search list. GNU C (Ubuntu/Linaro 4.5.1-7ubuntu2) version 4.5.1 (i686-linux-gnu) compiled by GNU C version 4.5.1, GMP version 4.3.2, MPFR version 3.0.0-p3, MPC version 0.8.2 GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 Versions of loaded plugins: dragonegg: Compiler executable checksum: ee807c30bb3adc8f3aa917a64443d0ec COMPILER_PATH=/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/:/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/ LIBRARY_PATH=/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/4.5.1/../../../../lib/:/lib/../lib/:/usr/lib/../lib/:/usr/lib/gcc/i686-linux-gnu/4.5.1/../../../:/lib/:/usr/lib/:/usr/lib/i686-linux-gnu/ COLLECT_GCC_OPTIONS='-fplugin=/usr/lib/gcc/i686-linux-gnu/4.5/plugin/dragonegg.so' '-emit-llvm' '-S' '-v' '-mtune=generic' '-march=i686' Eric Christopher-2 wrote: > > > On Mar 21, 2011, at 6:36 PM, stackunderflow wrote: > >> >> I try to generate a human readable .ll file on Linux. I installed >> llvm-gcc >> but as I see it can generate only assembly code (-S option). Is there any >> way to get something like what is generated by llvm online compiler? >> >> That's what I get with llvm-gcc -S -emit-llvm hello.c on Ubuntu 10.10: > > llvm-gcc -v ? > > -eric > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > -- View this message in context: http://old.nabble.com/-emit-llvm-on-ubuntu-is-broken-tp31206382p31206493.html Sent from the LLVM - Dev mailing list archive at Nabble.com. 
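
The -v output above already carries the relevant hints: the driver is gcc-4.5 (COLLECT_GCC=gcc-4.5) and cc1 is started with -fplugin=.../dragonegg.so, which suggests that this "llvm-gcc" is GCC 4.5 with the DragonEgg plugin rather than a 4.2-based llvm-gcc build. Below is a minimal sketch in Python for pulling those facts out of a saved -v log. The function name summarize_driver_log is hypothetical, and the code assumes the log still has its original line breaks (one setting per line), unlike the flattened copy in this archive.

```python
import re


def summarize_driver_log(log):
    """Extract driver name, GCC version and loaded plugins from gcc -v output."""
    driver = re.search(r"^COLLECT_GCC=(\S+)", log, re.MULTILINE)
    version = re.search(r"^gcc version (\S+)", log, re.MULTILINE)
    # Collect every distinct -fplugin=<path>.so argument mentioned in the log.
    plugins = sorted(set(re.findall(r"-fplugin=(\S+?\.so)", log)))
    return {
        "driver": driver.group(1) if driver else None,
        "gcc_version": version.group(1) if version else None,
        "plugins": plugins,
        "uses_dragonegg": any("dragonegg" in p for p in plugins),
    }
```

If uses_dragonegg comes back true, the question is really about DragonEgg's options rather than llvm-gcc's, which matters when asking for help on the list.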
From echristo at apple.com Mon Mar 21 21:07:11 2011 From: echristo at apple.com (Eric Christopher) Date: Mon, 21 Mar 2011 19:07:11 -0700 Subject: [LLVMdev] -emit-llvm on ubuntu is broken In-Reply-To: <31206493.post@talk.nabble.com> References: <31206382.post@talk.nabble.com> <31206493.post@talk.nabble.com> Message-ID: <14116FB0-C4F5-4A3F-B76F-4D9C7850D5DC@apple.com> Looks like something wonky with DragonEgg. Duncan? -eric On Mar 21, 2011, at 7:05 PM, stackunderflow wrote: > > Hi Eric, > > here is my -emit-llvm -S -v output: > > Using built-in specs. > COLLECT_GCC=gcc-4.5 > COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-linux-gnu/4.5.1/lto-wrapper > Target: i686-linux-gnu > Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro > 4.5.1-7ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-4.5/README.Bugs > --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr > --program-suffix=-4.5 --enable-shared --enable-multiarch > --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib > --without-included-gettext --enable-threads=posix > --with-gxx-include-dir=/usr/include/c++/4.5 --libdir=/usr/lib --enable-nls > --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug > --enable-libstdcxx-time=yes --enable-plugin --enable-gold > --with-plugin-ld=ld.gold --enable-objc-gc --enable-targets=all > --disable-werror --with-arch-32=i686 --with-tune=generic > --enable-checking=release --build=i686-linux-gnu --host=i686-linux-gnu > --target=i686-linux-gnu > Thread model: posix > gcc version 4.5.1 (Ubuntu/Linaro 4.5.1-7ubuntu2) > COLLECT_GCC_OPTIONS='-fplugin=/usr/lib/gcc/i686-linux-gnu/4.5/plugin/dragonegg.so' > '-emit-llvm' '-S' '-v' '-mtune=generic' '-march=i686' > /usr/lib/gcc/i686-linux-gnu/4.5.1/cc1 -quiet -v > -iplugindir=/usr/lib/gcc/i686-linux-gnu/4.5.1/plugin hello.c > -D_FORTIFY_SOURCE=2 -iplugindir=/usr/lib/gcc/i686-linux-gnu/4.5.1/plugin > -quiet -dumpbase hello.c -mtune=generic -march=i686 -auxbase hello -version > 
-fplugin=/usr/lib/gcc/i686-linux-gnu/4.5/plugin/dragonegg.so -o hello.s > -fstack-protector > GNU C (Ubuntu/Linaro 4.5.1-7ubuntu2) version 4.5.1 (i686-linux-gnu) > compiled by GNU C version 4.5.1, GMP version 4.3.2, MPFR version 3.0.0-p3, > MPC version 0.8.2 > GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 > Versions of loaded plugins: > dragonegg: > ignoring nonexistent directory "/usr/local/include/i686-linux-gnu" > ignoring nonexistent directory > "/usr/lib/gcc/i686-linux-gnu/4.5.1/../../../../i686-linux-gnu/include" > #include "..." search starts here: > #include <...> search starts here: > /usr/local/include > /usr/lib/gcc/i686-linux-gnu/4.5.1/include > /usr/lib/gcc/i686-linux-gnu/4.5.1/include-fixed > /usr/include/i686-linux-gnu > /usr/include > End of search list. > GNU C (Ubuntu/Linaro 4.5.1-7ubuntu2) version 4.5.1 (i686-linux-gnu) > compiled by GNU C version 4.5.1, GMP version 4.3.2, MPFR version 3.0.0-p3, > MPC version 0.8.2 > GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 > Versions of loaded plugins: > dragonegg: > Compiler executable checksum: ee807c30bb3adc8f3aa917a64443d0ec > COMPILER_PATH=/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/:/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/ > LIBRARY_PATH=/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/4.5.1/../../../../lib/:/lib/../lib/:/usr/lib/../lib/:/usr/lib/gcc/i686-linux-gnu/4.5.1/../../../:/lib/:/usr/lib/:/usr/lib/i686-linux-gnu/ > COLLECT_GCC_OPTIONS='-fplugin=/usr/lib/gcc/i686-linux-gnu/4.5/plugin/dragonegg.so' > '-emit-llvm' '-S' '-v' '-mtune=generic' '-march=i686' > > > > > > Eric Christopher-2 wrote: >> >> >> On Mar 21, 2011, at 6:36 PM, stackunderflow wrote: >> >>> >>> I try to generate a human readable .ll file on Linux. I installed >>> llvm-gcc >>> but as I see it can generate only assembly code (-S option). 
Is there any >>> way to get something like what is generated by llvm online compiler? >>> >>> That's what I get with llvm-gcc -S -emit-llvm hello.c on Ubuntu 10.10: >> >> llvm-gcc -v ? >> >> -eric >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> >> > > -- > View this message in context: http://old.nabble.com/-emit-llvm-on-ubuntu-is-broken-tp31206382p31206493.html > Sent from the LLVM - Dev mailing list archive at Nabble.com. > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From zhaber at yopmail.com Mon Mar 21 21:23:02 2011 From: zhaber at yopmail.com (stackunderflow) Date: Mon, 21 Mar 2011 19:23:02 -0700 (PDT) Subject: [LLVMdev] -emit-llvm on ubuntu is broken In-Reply-To: <14116FB0-C4F5-4A3F-B76F-4D9C7850D5DC@apple.com> References: <31206382.post@talk.nabble.com> <31206493.post@talk.nabble.com> <14116FB0-C4F5-4A3F-B76F-4D9C7850D5DC@apple.com> Message-ID: <31206558.post@talk.nabble.com> I am new to LLVM, what do you mean by Duncan? Btw, I installed llvm from the repository: sudo apt-get install llvm llvm-gcc Eric Christopher-2 wrote: > > Looks like something wonky with DragonEgg. > > Duncan? > > -eric > > On Mar 21, 2011, at 7:05 PM, stackunderflow wrote: > >> >> Hi Eric, >> >> here is my -emit-llvm -S -v output: >> >> Using built-in specs. 
>> COLLECT_GCC=gcc-4.5 >> COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-linux-gnu/4.5.1/lto-wrapper >> Target: i686-linux-gnu >> Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro >> 4.5.1-7ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-4.5/README.Bugs >> --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr >> --program-suffix=-4.5 --enable-shared --enable-multiarch >> --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib >> --without-included-gettext --enable-threads=posix >> --with-gxx-include-dir=/usr/include/c++/4.5 --libdir=/usr/lib >> --enable-nls >> --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug >> --enable-libstdcxx-time=yes --enable-plugin --enable-gold >> --with-plugin-ld=ld.gold --enable-objc-gc --enable-targets=all >> --disable-werror --with-arch-32=i686 --with-tune=generic >> --enable-checking=release --build=i686-linux-gnu --host=i686-linux-gnu >> --target=i686-linux-gnu >> Thread model: posix >> gcc version 4.5.1 (Ubuntu/Linaro 4.5.1-7ubuntu2) >> COLLECT_GCC_OPTIONS='-fplugin=/usr/lib/gcc/i686-linux-gnu/4.5/plugin/dragonegg.so' >> '-emit-llvm' '-S' '-v' '-mtune=generic' '-march=i686' >> /usr/lib/gcc/i686-linux-gnu/4.5.1/cc1 -quiet -v >> -iplugindir=/usr/lib/gcc/i686-linux-gnu/4.5.1/plugin hello.c >> -D_FORTIFY_SOURCE=2 -iplugindir=/usr/lib/gcc/i686-linux-gnu/4.5.1/plugin >> -quiet -dumpbase hello.c -mtune=generic -march=i686 -auxbase hello >> -version >> -fplugin=/usr/lib/gcc/i686-linux-gnu/4.5/plugin/dragonegg.so -o hello.s >> -fstack-protector >> GNU C (Ubuntu/Linaro 4.5.1-7ubuntu2) version 4.5.1 (i686-linux-gnu) >> compiled by GNU C version 4.5.1, GMP version 4.3.2, MPFR version >> 3.0.0-p3, >> MPC version 0.8.2 >> GGC heuristics: --param ggc-min-expand=100 --param >> ggc-min-heapsize=131072 >> Versions of loaded plugins: >> dragonegg: >> ignoring nonexistent directory "/usr/local/include/i686-linux-gnu" >> ignoring nonexistent directory >> 
"/usr/lib/gcc/i686-linux-gnu/4.5.1/../../../../i686-linux-gnu/include" >> #include "..." search starts here: >> #include <...> search starts here: >> /usr/local/include >> /usr/lib/gcc/i686-linux-gnu/4.5.1/include >> /usr/lib/gcc/i686-linux-gnu/4.5.1/include-fixed >> /usr/include/i686-linux-gnu >> /usr/include >> End of search list. >> GNU C (Ubuntu/Linaro 4.5.1-7ubuntu2) version 4.5.1 (i686-linux-gnu) >> compiled by GNU C version 4.5.1, GMP version 4.3.2, MPFR version >> 3.0.0-p3, >> MPC version 0.8.2 >> GGC heuristics: --param ggc-min-expand=100 --param >> ggc-min-heapsize=131072 >> Versions of loaded plugins: >> dragonegg: >> Compiler executable checksum: ee807c30bb3adc8f3aa917a64443d0ec >> COMPILER_PATH=/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/:/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/ >> LIBRARY_PATH=/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/4.5.1/../../../../lib/:/lib/../lib/:/usr/lib/../lib/:/usr/lib/gcc/i686-linux-gnu/4.5.1/../../../:/lib/:/usr/lib/:/usr/lib/i686-linux-gnu/ >> COLLECT_GCC_OPTIONS='-fplugin=/usr/lib/gcc/i686-linux-gnu/4.5/plugin/dragonegg.so' >> '-emit-llvm' '-S' '-v' '-mtune=generic' '-march=i686' >> >> >> >> >> >> Eric Christopher-2 wrote: >>> >>> >>> On Mar 21, 2011, at 6:36 PM, stackunderflow wrote: >>> >>>> >>>> I try to generate a human readable .ll file on Linux. I installed >>>> llvm-gcc >>>> but as I see it can generate only assembly code (-S option). Is there >>>> any >>>> way to get something like what is generated by llvm online compiler? >>>> >>>> That's what I get with llvm-gcc -S -emit-llvm hello.c on Ubuntu 10.10: >>> >>> llvm-gcc -v ? 
>>> >>> -eric >>> _______________________________________________ >>> LLVM Developers mailing list >>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>> >>> >> >> -- >> View this message in context: >> http://old.nabble.com/-emit-llvm-on-ubuntu-is-broken-tp31206382p31206493.html >> Sent from the LLVM - Dev mailing list archive at Nabble.com. >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > -- View this message in context: http://old.nabble.com/-emit-llvm-on-ubuntu-is-broken-tp31206382p31206558.html Sent from the LLVM - Dev mailing list archive at Nabble.com. From echristo at apple.com Mon Mar 21 21:30:14 2011 From: echristo at apple.com (Eric Christopher) Date: Mon, 21 Mar 2011 19:30:14 -0700 Subject: [LLVMdev] -emit-llvm on ubuntu is broken In-Reply-To: <31206558.post@talk.nabble.com> References: <31206382.post@talk.nabble.com> <31206493.post@talk.nabble.com> <14116FB0-C4F5-4A3F-B76F-4D9C7850D5DC@apple.com> <31206558.post@talk.nabble.com> Message-ID: <004A1F05-64DE-436F-8016-AF29DC2EFE33@apple.com> On Mar 21, 2011, at 7:23 PM, stackunderflow wrote: > > I am new to LLVM, what do you mean by Duncan? > The guy I cc'd on the last email. > Btw, I installed llvm from the repository: sudo apt-get install llvm > llvm-gcc > I have no idea how ubuntu packages anything. -eric From sangmin.park at gmail.com Mon Mar 21 21:35:16 2011 From: sangmin.park at gmail.com (Sangmin Park) Date: Mon, 21 Mar 2011 22:35:16 -0400 Subject: [LLVMdev] gold plugin example In-Reply-To: References: Message-ID: Hi, I fixed error by using different version of ld with LLVM 2.7. I used ld.gold that comes with Ubuntu 10.04. 
Here is the version info:

sangmin at sangmin-desktop:/tmp$ ld -v
GNU gold (GNU Binutils for Ubuntu 2.20.1-system.20100303) 1.9

My experience with the gold plugin is as follows (O = works, X = fails):

LLVM 2.7 (with libLLVMgold.so) + ld 2.20.1 (from Ubuntu 10.04)   : O
LLVM 2.7 (with libLLVMgold.so) + ld 2.21.51 (manually compiled)  : X
LLVM 2.8 (with LLVMgold.so)    + ld 2.20.1 (from Ubuntu 10.04)   : X
LLVM 2.8 (with LLVMgold.so)    + ld 2.21.51 (manually compiled)  : X

When compiling the (lib)LLVMgold.so file, I used the source file from ld 2.21.51.

I can work with LLVM 2.7 and ld 2.20.1 now, but I am curious about the result.
Can you explain what I did wrong?

Thanks,
Sangmin

On Mon, Mar 21, 2011 at 9:13 PM, Sangmin Park wrote:
> Hi all,
>
> I got an error from the gold plugin example in the following link:
> http://llvm.org/docs/GoldPlugin.html#example1
>
> Here is the error message:
>
> sangmin at sangmin-desktop:/tmp$ llvm-gcc -use-gold-plugin a.a b.o -o main
> /usr/bin/ld: error: a.a: no archive symbol table (run ranlib)
> /usr/bin/ld: /usr/lib/crt1.o:(.text+0x18): error: undefined reference to 'main'
> /usr/bin/ld: b.o: in function foo1:b.c(.text+0x4): error: undefined reference to 'foo2'
> collect2: ld returned 1 exit status
>
> I followed the instructions from the previous thread below, but I still
> have the problem.
> http://old.nabble.com/llvm-gold-plugin-example-td28140005.html
>
> I got the error with both 2.7 and 2.8.
> Here are my settings: > > LLVMgold.so in (with LLVM 2.8) > > - $HOME/llvm-gcc-4.2-2.8-i686-linux/libexec/gcc/i686-pc-linux-gnu/4.2.1/LLVMgold.so > - $HOME/llvm-gcc-4.2-2.8-i686-linux/lib/bfd-plugins/LLVMgold.so > - /usr/lib/bfd-plugins/LLVMgold.so > > libLLVMgold.so in (with LLVM 2.7) > > - $HOME/llvm-gcc-4.2-2.7-i686-linux/libexec/gcc/i686-pc-linux-gnu/4.2.1/libLLVMgold.so > - $HOME/llvm-gcc-4.2-2.7-i686-linux/lib/bfd-plugins/libLLVMgold.so > - /usr/lib/bfd-plugins/libLLVMgold.so > > Compiled ld-new, ar, nm-new are linked to /usr/bin/ld, /usr/bin/ar, > /uar/bin/nm, respectively. > > sangmin at sangmin-desktop:/tmp$ ld -v > GNU gold (GNU Binutils 2.21.51.20110316) 1.11 > > > Thanks in advance for your help. > > Sangmin > -- Sangmin Park / Ph.D. student College of Computing Georgia Institute of Technology -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110321/07386d26/attachment.html From baldrick at free.fr Tue Mar 22 03:30:05 2011 From: baldrick at free.fr (Duncan Sands) Date: Tue, 22 Mar 2011 09:30:05 +0100 Subject: [LLVMdev] -emit-llvm on ubuntu is broken In-Reply-To: <14116FB0-C4F5-4A3F-B76F-4D9C7850D5DC@apple.com> References: <31206382.post@talk.nabble.com> <31206493.post@talk.nabble.com> <14116FB0-C4F5-4A3F-B76F-4D9C7850D5DC@apple.com> Message-ID: <4D885E0D.7040904@free.fr> Hi Eric, > Looks like something wonky with DragonEgg. you need to use -fplugin-arg-dragonegg-emit-ir or -flto with dragonegg, not -emit-llvm. Also, you currently have to use -S (getting human readable IR) rather than -c because with -c gcc will run cc1 with -S (getting human readable IR) then pass the result to the system assembler which of course barfs. This is documented on the web-page and in the README. Ciao, Duncan. > > Duncan? > > -eric > > On Mar 21, 2011, at 7:05 PM, stackunderflow wrote: > >> >> Hi Eric, >> >> here is my -emit-llvm -S -v output: >> >> Using built-in specs. 
>> COLLECT_GCC=gcc-4.5 >> COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-linux-gnu/4.5.1/lto-wrapper >> Target: i686-linux-gnu >> Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro >> 4.5.1-7ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-4.5/README.Bugs >> --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr >> --program-suffix=-4.5 --enable-shared --enable-multiarch >> --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib >> --without-included-gettext --enable-threads=posix >> --with-gxx-include-dir=/usr/include/c++/4.5 --libdir=/usr/lib --enable-nls >> --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug >> --enable-libstdcxx-time=yes --enable-plugin --enable-gold >> --with-plugin-ld=ld.gold --enable-objc-gc --enable-targets=all >> --disable-werror --with-arch-32=i686 --with-tune=generic >> --enable-checking=release --build=i686-linux-gnu --host=i686-linux-gnu >> --target=i686-linux-gnu >> Thread model: posix >> gcc version 4.5.1 (Ubuntu/Linaro 4.5.1-7ubuntu2) >> COLLECT_GCC_OPTIONS='-fplugin=/usr/lib/gcc/i686-linux-gnu/4.5/plugin/dragonegg.so' >> '-emit-llvm' '-S' '-v' '-mtune=generic' '-march=i686' >> /usr/lib/gcc/i686-linux-gnu/4.5.1/cc1 -quiet -v >> -iplugindir=/usr/lib/gcc/i686-linux-gnu/4.5.1/plugin hello.c >> -D_FORTIFY_SOURCE=2 -iplugindir=/usr/lib/gcc/i686-linux-gnu/4.5.1/plugin >> -quiet -dumpbase hello.c -mtune=generic -march=i686 -auxbase hello -version >> -fplugin=/usr/lib/gcc/i686-linux-gnu/4.5/plugin/dragonegg.so -o hello.s >> -fstack-protector >> GNU C (Ubuntu/Linaro 4.5.1-7ubuntu2) version 4.5.1 (i686-linux-gnu) >> compiled by GNU C version 4.5.1, GMP version 4.3.2, MPFR version 3.0.0-p3, >> MPC version 0.8.2 >> GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 >> Versions of loaded plugins: >> dragonegg: >> ignoring nonexistent directory "/usr/local/include/i686-linux-gnu" >> ignoring nonexistent directory >> "/usr/lib/gcc/i686-linux-gnu/4.5.1/../../../../i686-linux-gnu/include" 
>> #include "..." search starts here: >> #include<...> search starts here: >> /usr/local/include >> /usr/lib/gcc/i686-linux-gnu/4.5.1/include >> /usr/lib/gcc/i686-linux-gnu/4.5.1/include-fixed >> /usr/include/i686-linux-gnu >> /usr/include >> End of search list. >> GNU C (Ubuntu/Linaro 4.5.1-7ubuntu2) version 4.5.1 (i686-linux-gnu) >> compiled by GNU C version 4.5.1, GMP version 4.3.2, MPFR version 3.0.0-p3, >> MPC version 0.8.2 >> GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 >> Versions of loaded plugins: >> dragonegg: >> Compiler executable checksum: ee807c30bb3adc8f3aa917a64443d0ec >> COMPILER_PATH=/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/:/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/ >> LIBRARY_PATH=/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/4.5.1/../../../../lib/:/lib/../lib/:/usr/lib/../lib/:/usr/lib/gcc/i686-linux-gnu/4.5.1/../../../:/lib/:/usr/lib/:/usr/lib/i686-linux-gnu/ >> COLLECT_GCC_OPTIONS='-fplugin=/usr/lib/gcc/i686-linux-gnu/4.5/plugin/dragonegg.so' >> '-emit-llvm' '-S' '-v' '-mtune=generic' '-march=i686' >> >> >> >> >> >> Eric Christopher-2 wrote: >>> >>> >>> On Mar 21, 2011, at 6:36 PM, stackunderflow wrote: >>> >>>> >>>> I try to generate a human readable .ll file on Linux. I installed >>>> llvm-gcc >>>> but as I see it can generate only assembly code (-S option). Is there any >>>> way to get something like what is generated by llvm online compiler? >>>> >>>> That's what I get with llvm-gcc -S -emit-llvm hello.c on Ubuntu 10.10: >>> >>> llvm-gcc -v ? 
>>> >>> -eric >>> _______________________________________________ >>> LLVM Developers mailing list >>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>> >>> >> >> -- >> View this message in context: http://old.nabble.com/-emit-llvm-on-ubuntu-is-broken-tp31206382p31206493.html >> Sent from the LLVM - Dev mailing list archive at Nabble.com. >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From echristo at apple.com Tue Mar 22 03:31:53 2011 From: echristo at apple.com (Eric Christopher) Date: Tue, 22 Mar 2011 01:31:53 -0700 Subject: [LLVMdev] -emit-llvm on ubuntu is broken In-Reply-To: <4D885E0D.7040904@free.fr> References: <31206382.post@talk.nabble.com> <31206493.post@talk.nabble.com> <14116FB0-C4F5-4A3F-B76F-4D9C7850D5DC@apple.com> <4D885E0D.7040904@free.fr> Message-ID: On Mar 22, 2011, at 1:30 AM, Duncan Sands wrote: > Hi Eric, > >> Looks like something wonky with DragonEgg. > > you need to use -fplugin-arg-dragonegg-emit-ir or -flto with dragonegg, > not -emit-llvm. Also, you currently have to use -S (getting human readable > IR) rather than -c because with -c gcc will run cc1 with -S (getting human > readable IR) then pass the result to the system assembler which of course > barfs. This is documented on the web-page and in the README. Interesting. Makes complete sense. 
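A minimal sketch of the flag change Duncan describes, written as the kind of argument rewriting a wrapper could do before invoking gcc with dragonegg. This is only an illustration and not any packaged script: the name translate_args is hypothetical, and turning -c into -S is an assumption that simply follows the note above that -c currently ends up handing IR to the system assembler.

```python
# Hypothetical helper: rewrite an llvm-gcc style argument list for gcc + dragonegg.
# The dragonegg flag name is the one given in the message above.
DRAGONEGG_EMIT_IR = "-fplugin-arg-dragonegg-emit-ir"


def translate_args(args):
    """Return a copy of args with -emit-llvm replaced by the dragonegg flag."""
    out = [DRAGONEGG_EMIT_IR if arg == "-emit-llvm" else arg for arg in args]
    # Assumption for this sketch: when IR output was requested, swap -c for -S,
    # because -c would pass the textual IR on to the system assembler.
    if DRAGONEGG_EMIT_IR in out:
        out = ["-S" if arg == "-c" else arg for arg in out]
    return out


if __name__ == "__main__":
    # Show the rewritten argument list for the command from the original report.
    print(" ".join(translate_args(["-S", "-emit-llvm", "hello.c"])))
```

Arguments that do not ask for IR are passed through unchanged, so ordinary compiles would not be affected by such a rewrite.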
-eric

From baldrick at free.fr Tue Mar 22 03:33:00 2011
From: baldrick at free.fr (Duncan Sands)
Date: Tue, 22 Mar 2011 09:33:00 +0100
Subject: [LLVMdev] -emit-llvm on ubuntu is broken
In-Reply-To: <14116FB0-C4F5-4A3F-B76F-4D9C7850D5DC@apple.com>
References: <31206382.post@talk.nabble.com> <31206493.post@talk.nabble.com> <14116FB0-C4F5-4A3F-B76F-4D9C7850D5DC@apple.com>
Message-ID: <4D885EBC.1040901@free.fr>

PS: I assume this is with Debian's dragonegg package, which has a script called
"llvm-gcc" that runs gcc-4.5+dragonegg. It would be neat if the script
intercepted -emit-llvm and turned it into the right thing, but last time I looked
this wasn't implemented.

From echristo at apple.com Tue Mar 22 03:35:34 2011
From: echristo at apple.com (Eric Christopher)
Date: Tue, 22 Mar 2011 01:35:34 -0700
Subject: [LLVMdev] -emit-llvm on ubuntu is broken
In-Reply-To: <4D885EBC.1040901@free.fr>
References: <31206382.post@talk.nabble.com> <31206493.post@talk.nabble.com> <14116FB0-C4F5-4A3F-B76F-4D9C7850D5DC@apple.com> <4D885EBC.1040901@free.fr>
Message-ID: <935020F4-71F9-470C-A01E-E92D2BB8379A@apple.com>

On Mar 22, 2011, at 1:33 AM, Duncan Sands wrote:

> PS: I assume this is with Debian's dragonegg package, which has a script called
> "llvm-gcc" that runs gcc-4.5+dragonegg. It would be neat if the script
> intercepted -emit-llvm and turned it into the right thing, but last time I looked
> this wasn't implemented.

It is. I haven't played with it at all so I wasn't sure what the correct thing would be. Apparently there's an open bug on it in the Ubuntu system.

-eric

From syoyofujita at gmail.com Tue Mar 22 06:29:12 2011
From: syoyofujita at gmail.com (Syoyo Fujita)
Date: Tue, 22 Mar 2011 20:29:12 +0900
Subject: [LLVMdev] sitofp inst selection in x86/AVX target [PR9473]
Message-ID:

Hello LLVMers,

I am now trying to fix bug PR9473.
The sitofp instruction in LLVM IR is converted to vcvtsi2sd (the same applies to
vcvtsi2ss) by the x86/AVX backend, but vcvtsi2sd has a somewhat odd
instruction format.

  VCVTSI2SD xmm1, xmm2, r/m32
  VCVTSI2SD xmm1, xmm2, r/m64

Bits 127:64 of xmm2 are copied to the corresponding bits of xmm1, so in many
cases xmm1 and xmm2 can be the same register.

Currently, the definition of VCVTSI2SD in X86InstrSSE.td expects 3 operands
(1 dst, 2 srcs). This is OK for the asm parser, but it does not work for LLVM IR
instruction selection, since the sitofp DAG node takes just 1 dst and 1 src.

I am not so familiar with the .td format yet, but after some investigation it
seems impossible to share one .td definition of vcvtsi2sd between the asm
parser and isel.

I succeeded in fixing PR9473 by giving VCVTSI2SD a separate .td definition:
add a new definition of VCVTSI2SD for isel in the isAsmParserOnly = 0 block, and
move the existing VCVTSI2SD definition from the isAsmParserOnly = 0 block into the
isAsmParserOnly = 1 block, so that the existing definition takes effect only in
the asm parser.

An example solution is as follows.

lib/Target/X86/X86InstrSSE.td

...
multiclass sse12_vcvt_avx_s opc, RegisterClass SrcRC, RegisterClass DstRC,
                            SDNode OpNode, X86MemOperand x86memop,
                            PatFrag ld_frag, string asm> {
  def rr : SI;
  def rm : SI;
}

let isAsmParserOnly = 0 in {
  defm SInt_VCVTSI2SD : sse12_vcvt_avx_s<0x2A, GR32, FR64, sint_to_fp, i32mem,
                        loadi32, "cvtsi2sd\t{$src, $dst, $dst|$dst, $dst, $src}">,
                        XD, VEX;
  ...
}

let isAsmParserOnly = 1 in {
  defm VCVTSI2SD : sse12_vcvt_avx<0x2A, GR32, FR64, i32mem, "cvtsi2sd">,
                   XD, VEX_4V;
  ...
}

If this style of modification is OK for the people working on the x86/AVX .td
files, I am ready to provide a patch. Or is there a better way?
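To make the operand mismatch concrete, here is a small Python model of what the three-operand r/m32 form computes, going by the description above: the low 64 bits of the result hold the converted integer and bits 127:64 come from xmm2. The function name and the 16-byte little-endian register image are assumptions made for illustration; this is not LLVM code.

```python
import struct


def vcvtsi2sd(xmm2: bytes, src: int) -> bytes:
    """Model VCVTSI2SD xmm1, xmm2, r/m32 on a 16-byte little-endian register image."""
    if len(xmm2) != 16:
        raise ValueError("xmm2 must be a 16-byte register image")
    if not -2**31 <= src < 2**31:
        raise ValueError("src must fit in a signed 32-bit integer")
    low = struct.pack("<d", float(src))  # bits 63:0: the integer converted to double
    high = xmm2[8:16]                    # bits 127:64: copied unchanged from xmm2
    return low + high


if __name__ == "__main__":
    # xmm2 holds two doubles; only its upper one survives in the result.
    image = struct.pack("<dd", 1.5, 2.5)
    print(struct.unpack("<dd", vcvtsi2sd(image, 7)))
```

The value sitofp produces lives entirely in the low element, so whatever register is supplied as xmm2 does not change it. That appears to be why the isel variant in the example can pass $dst as the second operand.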
-- Syoyo From rafael.espindola at gmail.com Tue Mar 22 08:19:02 2011 From: rafael.espindola at gmail.com (Rafael Avila de Espindola) Date: Tue, 22 Mar 2011 09:19:02 -0400 Subject: [LLVMdev] gold plugin example In-Reply-To: References: Message-ID: <4D88A1C6.90007@gmail.com> On 11-03-21 10:35 PM, Sangmin Park wrote: > Hi, > > I fixed error by using different version of ld with LLVM 2.7. > I used ld.gold that comes with Ubuntu 10.04. > Here is the version info: > > sangmin at sangmin-desktop:/tmp$ ld -v > GNU gold (GNU Binutils for Ubuntu 2.20.1-system.20100303) 1.9 > > My experience of gold plugin is as follows: > > LLVM 2.7 (with libLLVMgold.so) + ld 2.20.1 (from Ubuntu 10.04) : O > LLVM 2.7 (with libLLVMgold.so) + ld 2.21.51 (manually compiled) : X > LLVM 2.8 (with LLVMgold.so) + ld 2.20.1 (from Ubuntu 10.04): X > LLVM 2.8 (with LLVMgold.so) + ld 2.21.51 (manually compiled): X > > When compiling (lib)LLVMgold.so file, I used the source file from ld > 2.21.51. > > I can work with LLVM 2.7 and ld 2.20.1 now, but am curious with the result. > Can you explain what part I did wrong? > > Thanks, > Sangmin > > > > On Mon, Mar 21, 2011 at 9:13 PM, Sangmin Park > wrote: > > Hi all, > > I got an error from the gold plugin example in the following link: > http://llvm.org/docs/GoldPlugin.html#example1 > > Here is the error message: > > sangmin at sangmin-desktop:/tmp$ llvm-gcc -use-gold-plugin a.a b.o -o main > /usr/bin/ld: error: a.a: no archive symbol table (run ranlib) Looks like ar is not using the plugin. Have you created the links in bfd-plugins? Can you run nm in a IL file? Cheers, Rafael From ramaswamy at cse.iitb.ac.in Tue Mar 22 10:19:32 2011 From: ramaswamy at cse.iitb.ac.in (Gokul Ramaswamy) Date: Tue, 22 Mar 2011 20:49:32 +0530 (IST) Subject: [LLVMdev] Parallelization Message-ID: <53675.10.14.11.19.1300807172.squirrel@mail.cse.iitb.ac.in> Hi All, I am new to LLVM. So please help me out. 
Here is what I am trying to achieve:

If there are 2 statements in a source program -
    S1;
    S2;

and I know there is no data or control dependency between them, and
both take a large amount of time to execute, then I want to execute them
in parallel.

So as S1 starts executing, I want to launch another thread and
execute S2 in parallel.

I need help on how to launch a new thread and schedule some specific
code on this new thread. I searched for it but did not get satisfactory
results. Please help me out, LLVM Developers.

Regards,
Gokul Ramaswamy H.C

From stoklund at 2pi.dk Tue Mar 22 11:36:08 2011
From: stoklund at 2pi.dk (Jakob Stoklund Olesen)
Date: Tue, 22 Mar 2011 09:36:08 -0700
Subject: [LLVMdev] GIT mirroring
In-Reply-To: <4D87C347.60503@fim.uni-passau.de>
References: <8E63ECD1-47ED-4870-B2A8-5FC71993783B@apple.com> <8918B53B-0A4C-48CE-897F-BA43F637690C@2pi.dk> <4D87C347.60503@fim.uni-passau.de>
Message-ID: <0D54EF78-D610-4ABA-90CA-1E94CF4EB223@2pi.dk>

On Mar 21, 2011, at 2:29 PM, Tobias Grosser wrote:

> On 03/21/2011 04:54 PM, Jakob Stoklund Olesen wrote:
>>
>> On Mar 21, 2011, at 1:47 PM, Bob Wilson wrote:
>>
>>> I've been using Jakob's commands below, but it has stopped working for me. This happened roughly around the time when Anton added the other branches, but I'm not sure if that was the cause or not. The symptom is that the "git svn rebase -l" command does nothing except say that the master branch is already up to date, and "git svn dcommit" complains that files are out of date. In both cases, "git svn rebase" (without "-l") solves the problem, but without using the GIT mirror, so it's slow.
>>>
>>> I've tried re-creating my git repos from scratch but that did not fix the problem. Any ideas?
>>
>> The trick with 'git config --add remote.origin.fetch' doesn't work anymore. I tried changing it to refer to the remote's master branch instead, but that has caused strange problems.
>> >> I now run 'git update-ref' every time I fetch from the mirror: > > You should be able to get rid of this by calling: > > git config svn-remote.svn.fetch ':refs/remotes/origin/master' That would work too. /jakob From baldrick at free.fr Tue Mar 22 11:38:22 2011 From: baldrick at free.fr (Duncan Sands) Date: Tue, 22 Mar 2011 17:38:22 +0100 Subject: [LLVMdev] Parallelization In-Reply-To: <53675.10.14.11.19.1300807172.squirrel@mail.cse.iitb.ac.in> References: <53675.10.14.11.19.1300807172.squirrel@mail.cse.iitb.ac.in> Message-ID: <4D88D07E.5020408@free.fr> Hi Gokul Ramaswamy, > I am new to LLVM. So please help me out. Here is what I am trying to > achieve: > > If there are 2 statements in a source program - > S1; > S2; > > and I know these is no data and control dependency between them and > both take large amount of time to execute. So I want to execute them > in parallel. > > So as S1 starts executing, I want to launch another thread and > execute S2 in parallel. > > I need help on how to launch a new thread and schedule some specific > code on this new thread. I searched for it but did not get satisfiable > results. Please help me out LLVM Developers. llvm-gcc and dragonegg support GOMP (gnu open-mp). The way it works is that the front-end lowers parallel constructs into library calls, extra functions and so on. Ciao, Duncan. 
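As a rough illustration of the shape Duncan describes (parallel constructs turned into outlined functions plus runtime calls), here is a fork/join sketch in Python. Python's threading module merely stands in for the runtime library; run_in_parallel, s1 and s2 are hypothetical names, and this is not what llvm-gcc or dragonegg actually emit.

```python
import threading


def run_in_parallel(s1, s2):
    """Run s1 on the calling thread and s2 on a new thread, then join."""
    results = {}

    def worker():
        # The statement S2, outlined into its own function body.
        results["s2"] = s2()

    thread = threading.Thread(target=worker)
    thread.start()           # fork: S2 starts on the new thread
    results["s1"] = s1()     # S1 runs meanwhile on the calling thread
    thread.join()            # join: wait for S2 before using its result
    return results["s1"], results["s2"]


if __name__ == "__main__":
    # Two independent computations standing in for S1 and S2.
    print(run_in_parallel(lambda: sum(range(1000)), lambda: max(range(1000))))
```

The sketch only shows the control structure: it assumes s1 and s2 really are independent, and it does no error handling if s2 raises.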
> > Regards, > Gokul Ramaswamy H.C > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From czhang at qualcomm.com Tue Mar 22 12:02:34 2011 From: czhang at qualcomm.com (Zhang, Chihong) Date: Tue, 22 Mar 2011 10:02:34 -0700 Subject: [LLVMdev] [PATCH] OpenCL half support In-Reply-To: <1EC88802-72BD-4509-A815-100429552C8C@apple.com> References: <000201cbd383$c35adec0$4a109c40$%Lokhmotov@arm.com> <8AF16F0D-A73A-455B-9141-C30C234299D8@apple.com> <000201cbe4c7$310977c0$931c6740$%Lokhmotov@arm.com> <98E4E9C8-732A-4FDC-8DF6-FAACBD6857BD@apple.com> <000201cbe7b4$efeb83f0$cfc28bd0$%Lokhmotov@arm.com> <1EC88802-72BD-4509-A815-100429552C8C@apple.com> Message-ID: Sorry. I should have clearly said: "there are already quite some embedded/mobile chips providing fp16 ALU operations for performance". Thanks, Chihong -----Original Message----- From: Chris Lattner [mailto:clattner at apple.com] Sent: Monday, March 21, 2011 5:05 PM To: Zhang, Chihong Cc: Anton.Lokhmotov at arm.com; llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] [PATCH] OpenCL half support On Mar 21, 2011, at 1:59 PM, Zhang, Chihong wrote: > Hi Chris, > > It is important for embedded/mobile computation to have efficient fp16 support, otherwise those users will suffer from the merging problem with their local LLVM with native fp16 type they add (locally). So we should either add full fp16 support as a basic floating point type or enhance the LLVM infrastructure to make floating point type as scalable as int type. As I've said several times now :), I'm ok with having fp16 as a native LLVM type so long as there is hardware that implements fp16 arithmetic operations like add and sub with correct fp16 rounding etc. 
-Chris > > > -Chihong > > -----Original Message----- > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Chris Lattner > Sent: Monday, March 21, 2011 11:26 AM > To: Anton.Lokhmotov at arm.com > Cc: llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] [PATCH] OpenCL half support > > > On Mar 21, 2011, at 3:44 AM, Anton Lokhmotov wrote: > >>> Adding half float to LLVM IR is *only* reasonable if you have hardware >>> that supports half float, or if you want to add softfloat operations >>> for these. >> Yes, our graphics hardware natively supports some fp16 arithmetic >> operations. > > Ok. > >>> Just like C compilers need to know sizeof(long), sizeof(void*) and >>> many many other target specific details, an OpenCL compiler would need >>> to know whether to generate fp16 or not. >> Yes, it's just another example of LLVM-IR non-portability. Basically, any >> fp16 arithmetic code can be generated only if the cl_khr_fp16 extension is >> supported (otherwise, the frontend would reject even declaring fp16 >> variables, leave alone performing arithmetic on them). > > If the backend generates softfloat (or some other expansion) for fp16, then a native fp16 type would be perfectly portable. This is just not the "portability" that you're looking for (which is not behavior preserving, so it isn't portability by its standard definition). 
> > -Chris > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From sangmin.park at gmail.com Tue Mar 22 12:20:08 2011 From: sangmin.park at gmail.com (Sangmin Park) Date: Tue, 22 Mar 2011 13:20:08 -0400 Subject: [LLVMdev] gold plugin example In-Reply-To: <4D88A1C6.90007@gmail.com> References: <4D88A1C6.90007@gmail.com> Message-ID: On Tue, Mar 22, 2011 at 9:19 AM, Rafael Avila de Espindola < rafael.espindola at gmail.com> wrote: > On 11-03-21 10:35 PM, Sangmin Park wrote: > > Hi, > > > > I fixed error by using different version of ld with LLVM 2.7. > > I used ld.gold that comes with Ubuntu 10.04. > > Here is the version info: > > > > sangmin at sangmin-desktop:/tmp$ ld -v > > GNU gold (GNU Binutils for Ubuntu 2.20.1-system.20100303) 1.9 > > > > My experience of gold plugin is as follows: > > > > LLVM 2.7 (with libLLVMgold.so) + ld 2.20.1 (from Ubuntu 10.04) : O > > LLVM 2.7 (with libLLVMgold.so) + ld 2.21.51 (manually compiled) : X > > LLVM 2.8 (with LLVMgold.so) + ld 2.20.1 (from Ubuntu 10.04): X > > LLVM 2.8 (with LLVMgold.so) + ld 2.21.51 (manually compiled): X > > > > When compiling (lib)LLVMgold.so file, I used the source file from ld > > 2.21.51. > > > > I can work with LLVM 2.7 and ld 2.20.1 now, but am curious with the > result. > > Can you explain what part I did wrong? > > > > Thanks, > > Sangmin > > > > > > > > On Mon, Mar 21, 2011 at 9:13 PM, Sangmin Park > > wrote: > > > > Hi all, > > > > I got an error from the gold plugin example in the following link: > > http://llvm.org/docs/GoldPlugin.html#example1 > > > > Here is the error message: > > > > sangmin at sangmin-desktop:/tmp$ llvm-gcc -use-gold-plugin a.a b.o -o > main > > /usr/bin/ld: error: a.a: no archive symbol table (run ranlib) > > Looks like ar is not using the plugin. Have you created the links in > bfd-plugins? Can you run nm in a IL file? 
>
>
I have created the links in bfd-plugins, but I made some mistakes in handling
versions. Now I find that all combinations of versions work well for the example.

By the way, how can I manually check whether ar and nm work?

Thanks,
Sangmin

> Cheers,
> Rafael

From gokulhcramaswamy at gmail.com Tue Mar 22 12:36:28 2011
From: gokulhcramaswamy at gmail.com (Gokul Ramaswamy)
Date: Tue, 22 Mar 2011 23:06:28 +0530
Subject: [LLVMdev] Parallelization
In-Reply-To: <4D88D07E.5020408@free.fr>
References: <53675.10.14.11.19.1300807172.squirrel@mail.cse.iitb.ac.in> <4D88D07E.5020408@free.fr>
Message-ID:

Hi Duncan Sands,

As I understand it, GOMP and OpenMP provide support for parallelizing a
program at the source level. But I am working at the IR level; that is, I am
trying to parallelize the IR code. This is a case of automatic
parallelization: the programmer writing the code has no idea of the
parallelization going on under the hood.

So my question is: instead of support at the source program level, is there
any support at the LLVM IR level to parallelize things?

Regards,
Gokul Ramaswamy H.C

On Tue, Mar 22, 2011 at 10:08 PM, Duncan Sands wrote:

> Hi Gokul Ramaswamy,
>
> > I am new to LLVM. So please help me out. Here is what I am trying to
> > achieve:
> >
> > If there are 2 statements in a source program -
> > S1;
> > S2;
> >
> > and I know these is no data and control dependency between them and
> > both take large amount of time to execute. So I want to execute them
> > in parallel.
> >
> > So as S1 starts executing, I want to launch another thread and
> > execute S2 in parallel.
> > > > I need help on how to launch a new thread and schedule some specific > > code on this new thread. I searched for it but did not get satisfiable > > results. Please help me out LLVM Developers. > > llvm-gcc and dragonegg support GOMP (gnu open-mp). The way it works is > that the > front-end lowers parallel constructs into library calls, extra functions > and so > on. > > Ciao, Duncan. > > > > > Regards, > > Gokul Ramaswamy H.C > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110322/78030cfb/attachment.html From Anton.Lokhmotov at arm.com Tue Mar 22 12:40:06 2011 From: Anton.Lokhmotov at arm.com (Anton Lokhmotov) Date: Tue, 22 Mar 2011 17:40:06 -0000 Subject: [LLVMdev] [PATCH] OpenCL half support In-Reply-To: <98E4E9C8-732A-4FDC-8DF6-FAACBD6857BD@apple.com> References: <000201cbd383$c35adec0$4a109c40$%Lokhmotov@arm.com> <8AF16F0D-A73A-455B-9141-C30C234299D8@apple.com> <000201cbe4c7$310977c0$931c6740$%Lokhmotov@arm.com> <98E4E9C8-732A-4FDC-8DF6-FAACBD6857BD@apple.com> Message-ID: <000001cbe8b8$2ef97d00$8cec7700$@Lokhmotov@arm.com> Chris Lattner wrote: > I'm sorry I don't have the patch anymore. Please resend. Attached. (Copying to cfe-dev, as the patch is dual Clang/LLVM.) Anton Korobeynikov wrote: > PS: my 2 cents: do not forget to handle the existing half fp <-> float > conversion intrinsics. We are not quite sure what to do with them. Can anyone help? Best wishes, Anton. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 00004-half-llvm.patch Type: application/octet-stream Size: 11977 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110322/f3d5813b/attachment.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: 00004-half-clang.patch Type: application/octet-stream Size: 24681 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110322/f3d5813b/attachment-0001.obj From katokop1 at gmail.com Tue Mar 22 12:41:07 2011 From: katokop1 at gmail.com (Jonathan Ragan-Kelley) Date: Tue, 22 Mar 2011 13:41:07 -0400 Subject: [LLVMdev] Xcode 4 autocomplete of LLVM includes Message-ID: Slightly off-topic, but I imagine this crowd must have some experience using Xcode 4 for projects linking to LLVM. I've actually started using Xcode 4 as an IDE for C/C++ development thanks to the vastly improved code analysis-based tools it's inherited largely thanks to LLVM. But, ironically, I am particularly struggling to get the tools to parse and analyze LLVM (as a client, not for direct development in the LLVM tree). My current setup is extremely vanilla: - LLVM 2.8 is installed (by Homebrew) in /usr/local/[lib,include] - Header Search Paths for the target is set to /usr/local/include - The LLVM headers are included as - The libs relevant to the project are added as library deps in the "Link Binary With Libraries" Build Phase Compilation (with Xcode's llvm-gcc 4.2) works, but the tools seemingly don't parse the LLVM includes for analysis, so: - autocomplete on LLVM types returns "no completions" - cmd-clicking on the LLVM #include lines (e.g. #include ) does nothing, seemingly indicating that the parser cannot find them to open them Any ideas? Have others had more luck? Thanks. 
From reid.kleckner at gmail.com Tue Mar 22 12:56:49 2011
From: reid.kleckner at gmail.com (Reid Kleckner)
Date: Tue, 22 Mar 2011 13:56:49 -0400
Subject: [LLVMdev] Parallelization
In-Reply-To:
References: <53675.10.14.11.19.1300807172.squirrel@mail.cse.iitb.ac.in> <4D88D07E.5020408@free.fr>
Message-ID:

On Tue, Mar 22, 2011 at 1:36 PM, Gokul Ramaswamy wrote:

> Hi Duncan Sands,
>
> As I have understood, GOMP and OpenMP provides support for
> parallelizing program at source program level. But I am at the IR level.
> That is I am trying to parallelize the IR code. This is the case of
> automatic parallelization. The programmer writing the code does not have any
> idea of parallelization going behind the hood.
>
> So my question is instead of support at the source program level, is the an
> support at the LLVM IR level to parallelize things ??
>

No, you have to insert calls to things like pthreads or GOMP or OpenMP or
whatever threading runtime you choose.

Reid
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110322/7ac5ab96/attachment.html

From peter.zion at fabric-engine.com Tue Mar 22 13:51:26 2011
From: peter.zion at fabric-engine.com (Peter Zion)
Date: Tue, 22 Mar 2011 14:51:26 -0400
Subject: [LLVMdev] LLVM optimization passes crash when running on second thread
Message-ID:

Hello,

I am trying to modify my LLVM-based compiler to perform an initial, no-optimization compilation synchronously on startup and then perform an asynchronous, optimized recompilation in the background, and I am getting in one of the optimization passes.

- I am using the official release of LLVM 2.8
- I have compiled LLVM with threading enabled; I am running llvm::llvm_start_multithreaded() on application startup and checking that that result is true.
- The foreground compilation works fine.
- The background compilation also works fine if I comment out the addition of the optimization passes.
- The optimization passes are being added as follows:
        llvm::OwningPtr passManager( new llvm::PassManager );
        if ( optimize )
        {
          llvm::createStandardFunctionPasses( passManager.get(), 2 );
          llvm::createStandardModulePasses( passManager.get(), 2, false, true, true, true, false, llvm::createFunctionInliningPass() );
          llvm::createStandardLTOPasses( passManager.get(), true, true, false );
        }
        passManager->run( *module );
- If I *don't* comment out the optimization passes (inside the if statement above) LLVM crashes with what appears to be a stack overflow; I've attached the stack trace below.
- The code above is in the Fabric::DG::Code::compileAST() function shown in the stack trace below; its child in the stack is the passManager->run() call.
- I have added a global mutex lock around all my accesses to LLVM, just to try to debug the problem, and it doesn't make any difference.
- Both compilations are using the same LLVMContext. It is unclear from the LLVM docs whether this is allowed. I would be surprised, however, if the global lock I added wouldn't have sorted out that issue if it was the problem.
- In case it makes any difference, I am running on OS X and using Grand Central Dispatch to execute the background compilation using dispatch_group_async_f()

Is this a known problem? Can anyone suggest anything I can do to fix or further debug this problem?
Thanks in advance,
Peter Zion

#0 0x965221a6 in szone_malloc_should_clear ()
#1 0x00916795 in ChromeMain ()
#2 0x96522148 in malloc_zone_malloc ()
#3 0x96520218 in malloc ()
#4 0x953da617 in operator new ()
#5 0x169f5872 in std::_Rb_tree, std::pair const, (anonymous namespace)::LVILatticeVal>, std::_Select1st const, (anonymous namespace)::LVILatticeVal> >, std::less >, std::allocator const, (anonymous namespace)::LVILatticeVal> > >::_M_insert () at ctype.h:275
#6 0x169f5a5a in std::_Rb_tree, std::pair const, (anonymous namespace)::LVILatticeVal>, std::_Select1st const, (anonymous namespace)::LVILatticeVal> >, std::less >, std::allocator const, (anonymous namespace)::LVILatticeVal> > >::_M_insert_unique () at ctype.h:275
#7 0x169f5bfd in std::map, (anonymous namespace)::LVILatticeVal, std::less >, std::allocator const, (anonymous namespace)::LVILatticeVal> > >::operator[] () at ctype.h:275
#8 0x169f9661 in (anonymous namespace)::LVIQuery::getBlockValue () at ctype.h:275
#9 0x169fc2da in (anonymous namespace)::LVIQuery::getEdgeValue () at ctype.h:275
#10 0x169f9a25 in (anonymous namespace)::LVIQuery::getBlockValue () at ctype.h:275
#11 0x169fc2da in (anonymous namespace)::LVIQuery::getEdgeValue () at ctype.h:275
...
#657 0x169fc2da in (anonymous namespace)::LVIQuery::getEdgeValue () at ctype.h:275
#658 0x169f9a25 in (anonymous namespace)::LVIQuery::getBlockValue () at ctype.h:275
#659 0x169fc2da in (anonymous namespace)::LVIQuery::getEdgeValue () at ctype.h:275
#660 0x169f9a25 in (anonymous namespace)::LVIQuery::getBlockValue () at ctype.h:275
#661 0x169fc2da in (anonymous namespace)::LVIQuery::getEdgeValue () at ctype.h:275
#662 0x169f9a25 in (anonymous namespace)::LVIQuery::getBlockValue () at ctype.h:275
#663 0x169fc2da in (anonymous namespace)::LVIQuery::getEdgeValue () at ctype.h:275
#664 0x169f9a25 in (anonymous namespace)::LVIQuery::getBlockValue () at ctype.h:275
#665 0x169fc2da in (anonymous namespace)::LVIQuery::getEdgeValue () at ctype.h:275
#666 0x169f9a25 in (anonymous namespace)::LVIQuery::getBlockValue () at ctype.h:275
#667 0x169fc2da in (anonymous namespace)::LVIQuery::getEdgeValue () at ctype.h:275
#668 0x169f9a25 in (anonymous namespace)::LVIQuery::getBlockValue () at ctype.h:275
#669 0x169fc2da in (anonymous namespace)::LVIQuery::getEdgeValue () at ctype.h:275
#670 0x169f9a25 in (anonymous namespace)::LVIQuery::getBlockValue () at ctype.h:275
#671 0x169fc2da in (anonymous namespace)::LVIQuery::getEdgeValue () at ctype.h:275
#672 0x169f9a25 in (anonymous namespace)::LVIQuery::getBlockValue () at ctype.h:275
#673 0x169fb6ee in (anonymous namespace)::LazyValueInfoCache::getValueInBlock () at ctype.h:275
#674 0x169fbf64 in llvm::LazyValueInfo::getConstant () at ctype.h:275
#675 0x168dcc70 in (anonymous namespace)::CorrelatedValuePropagation::runOnFunction () at ctype.h:275
#676 0x16b626b5 in llvm::FPPassManager::runOnFunction () at ctype.h:275
#677 0x169c32ce in (anonymous namespace)::CGPassManager::runOnModule () at ctype.h:275
#678 0x16b63cae in llvm::MPPassManager::runOnModule () at ctype.h:275
#679 0x16b63f32 in llvm::PassManagerImpl::run () at ctype.h:275
#680 0x16b63fcb in llvm::PassManager::run () at ctype.h:275
#681 0x162ba04c in Fabric::DG::Code::compileAST (this=0x1b0cfba0, optimize=true) at build/osx/debug/Fabric/DG/Code.cpp:118
#682 0x162bc86d in Fabric::DG::Code::CompileOptimizedAST (userdata=0x1b0cfba0) at Code.h:74
#683 0x96544271 in _dispatch_worker_thread2 ()
#684 0x96543d21 in _pthread_wqthread ()
#685 0x96543b66 in start_wqthread ()
(gdb) Handling SIGTERM in renderer.
Wrote signal to shutdown pipe.

From eli.friedman at gmail.com Tue Mar 22 14:27:49 2011
From: eli.friedman at gmail.com (Eli Friedman)
Date: Tue, 22 Mar 2011 12:27:49 -0700
Subject: [LLVMdev] LLVM optimization passes crash when running on second thread
In-Reply-To:
References:
Message-ID:

On Tue, Mar 22, 2011 at 11:51 AM, Peter Zion wrote:
> Hello,
>
> I am trying to modify my LLVM-based compiler to perform an initial, no-optimization compilation synchronously on startup and then perform an asynchronous, optimized recompilation in the background, and I am getting in one of the optimization passes.
>
> - I am using the official release of LLVM 2.8
> - I have compiled LLVM with threading enabled; I am running llvm::llvm_start_multithreaded() on application startup and checking that that result is true.
> - The foreground compilation works fine.
> - The background compilation also works fine if I comment out the addition of the optimization passes.
> - The optimization passes are being added as follows:
>        llvm::OwningPtr passManager( new llvm::PassManager );
>        if ( optimize )
>        {
>          llvm::createStandardFunctionPasses( passManager.get(), 2 );
>          llvm::createStandardModulePasses( passManager.get(), 2, false, true, true, true, false, llvm::createFunctionInliningPass() );
>          llvm::createStandardLTOPasses( passManager.get(), true, true, false );
>        }
>        passManager->run( *module );
> - If I *don't* comment out the optimization passes (inside the if statement above) LLVM crashes with what appears to be a stack overflow; I've attached the stack trace below.
> - The code above is in the Fabric::DG::Code::compileAST() function shown in the stack trace below; its child in the stack is the passManager->run() call.
> - I have added a global mutex lock around all my accesses to LLVM, just to try to debug the problem, and it doesn't make any difference.
> - Both compilations are using the same LLVMContext. It is unclear from the LLVM docs whether this is allowed. I would be surprised, however, if the global lock I added wouldn't have sorted out that issue if it was the problem.
> - In case it makes any difference, I am running on OS X and using Grand Central Dispatch to execute the background compilation using dispatch_group_async_f()
>
> Is this a known problem? Can anyone suggest anything I can do to fix or further debug this problem?

My best guess is that your background thread has less stack space than
the main thread. The version of LazyValueInfo which is in 2.8 was
recursive, and can use a lot of stack space in extreme cases (this was
fixed for 2.9, which should be released soon). If you need to use
2.8, I would suggest either allocating more stack space, or
customizing the passes you use not to include -jump-threading and
-correlated-propagation.

-Eli

From grosser at fim.uni-passau.de Tue Mar 22 14:28:12 2011
From: grosser at fim.uni-passau.de (Tobias Grosser)
Date: Tue, 22 Mar 2011 15:28:12 -0400
Subject: [LLVMdev] Parallelization
In-Reply-To:
References: <53675.10.14.11.19.1300807172.squirrel@mail.cse.iitb.ac.in> <4D88D07E.5020408@free.fr>
Message-ID: <4D88F84C.4070904@fim.uni-passau.de>

On 03/22/2011 01:56 PM, Reid Kleckner wrote:
> On Tue, Mar 22, 2011 at 1:36 PM, Gokul Ramaswamy wrote:
>
> Hi Duncan Sands,
>
> As I have understood, GOMP and OpenMP provides support for
> parallelizing program at source program level. But I am at the IR
> level. That is I am trying to parallelize the IR code. This is the
> case of automatic parallelization.
> The programmer writing the code
> does not have any idea of parallelization going behind the hood.
>
> So my question is instead of support at the source program level, is
> the an support at the LLVM IR level to parallelize things ??
>
>
>
> No, you have to insert calls to things like pthreads or GOMP or OpenMP
> or whatever threading runtime you choose.

Which is what we also do in Polly.

In case you just have the simple case of two statements you want to
execute in parallel, I propose to write this as OpenMP annotated C code,
compile the code with dragonegg to LLVM-IR and have a look at what code is
generated. You will need to create similar code and similar function
calls if you want to do it at the LLVM-IR level.

One thing that might simplify the code is to specify in OpenMP that you
want to be able to select choices at runtime. A common construct is:

SCHEDULE(runtime)

This will stop dragonegg from inlining some OpenMP runtime calls, which
could complicate the code unnecessarily.

Cheers
Tobi

P.S.: In case of directly inserting OpenMP function calls, it would be
nice to have support for a set of LLVM intrinsics that will
automatically be lowered to the relevant OpenMP/mpc.sf.net function
calls. Let me know when you think about working on such a thing.

From peter.zion at fabric-engine.com Tue Mar 22 15:23:42 2011
From: peter.zion at fabric-engine.com (Peter Zion)
Date: Tue, 22 Mar 2011 16:23:42 -0400
Subject: [LLVMdev] LLVM optimization passes crash when running on second thread
In-Reply-To:
References:
Message-ID: <6EE252C7-1227-4AC4-9C72-E91643F93946@fabric-engine.com>

That was exactly the problem, thank you!
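
For readers hitting the same crash, here is a minimal sketch of the first workaround suggested in this thread (giving the background thread more stack). It assumes the background job is started with plain POSIX threads rather than Grand Central Dispatch. `Job`, `deep_recursion` and `run_with_stack` are hypothetical names used only for illustration; `deep_recursion` merely stands in for a stack-hungry recursive analysis and is not LLVM code.

```cpp
#include <pthread.h>
#include <cstddef>

// Hypothetical stand-in for a deeply recursive optimization pass.
// Each frame keeps a small buffer alive so the recursion really uses stack.
static int deep_recursion(int depth) {
    volatile char pad[256];
    pad[0] = 0;
    if (depth == 0) return pad[0];
    return deep_recursion(depth - 1) + 1;
}

// Hypothetical job description: input depth and a slot for the result.
struct Job {
    int depth;
    int result;
};

// Thread entry point: runs the job and stores its result.
static void* run_job(void* arg) {
    Job* job = static_cast<Job*>(arg);
    job->result = deep_recursion(job->depth);
    return nullptr;
}

// Runs `job` on a new thread whose stack is `stack_bytes` large and waits
// for it to finish. Returns 0 on success or a pthread error code.
int run_with_stack(Job* job, std::size_t stack_bytes) {
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0) {
        pthread_t tid;
        rc = pthread_create(&tid, &attr, run_job, job);
        if (rc == 0) rc = pthread_join(tid, nullptr);
    }
    pthread_attr_destroy(&attr);
    return rc;
}
```

A caller would fill in a `Job` and pass a stack size such as 16 MB; that figure is arbitrary here, and the size actually needed depends on the module being optimized.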
pz On 2011-03-22, at 3:27 PM, Eli Friedman wrote: > On Tue, Mar 22, 2011 at 11:51 AM, Peter Zion > wrote: >> Hello, >> >> I am trying to modify my LLVM-based compiler to perform an initial, no-optimization compilation synchronously on startup and then perform an asynchronous, optimized recompilation in the background, and I am getting in one of the optimization passes. >> >> - I am using the official release of LLVM 2.8 >> - I have compiled LLVM with threading enabled; I am running llvm::llvm_start_multithreaded() on application startup and checking that that result is true. >> - The foreground compilation works fine. >> - The background compilation also works fine if I comment out the addition of the optimization passes. >> - The optimization passes are being added as follows: >> llvm::OwningPtr passManager( new llvm::PassManager ); >> if ( optimize ) >> { >> llvm::createStandardFunctionPasses( passManager.get(), 2 ); >> llvm::createStandardModulePasses( passManager.get(), 2, false, true, true, true, false, llvm::createFunctionInliningPass() ); >> llvm::createStandardLTOPasses( passManager.get(), true, true, false ); >> } >> passManager->run( *module ); >> - If I *don't* comment out the optimization passes (inside the if statement above) LLVM crashes with what appears to be a stack overflow; I've attached the stack trace below. >> - The code above is in the Fabric::DG::Code::compileAST() function shown in the stack trace below; its child in the stack is the passManager->run() call. >> - I have added a global mutex lock around all my accesses to LLVM, just to try to debug the problem, and it doesn't make any difference. >> - Both compilations are using the same LLVMContext. It is unclear from the LLVM docs whether this is allowed. I would be surprised, however, if the global lock I added wouldn't have sorted out that issue if it was the problem. 
>> - In case it makes any difference, I am running on OS X and using Grand Central Dispatch to execute the background compilation using dispatch_group_async_f() >> >> Is this a known problem? Can anyone suggest anything I can do to fix or further debug this problem? > > My best guess is that your background thread has less stack space than > the main thread. The version of LazyValueInfo which is in 2.8 was > recursive, and can use a lot of stack space in extreme cases (this was > fixed for 2.9, which should be released soon). If you need to use > 2.8, I would suggest either allocating more stack space, or > customizing the passes you use not to include -jump-threading and > -correlated-propagation. > > -Eli From nipun2512 at gmail.com Tue Mar 22 18:56:33 2011 From: nipun2512 at gmail.com (Nipun Arora) Date: Tue, 22 Mar 2011 19:56:33 -0400 Subject: [LLVMdev] Parallelization In-Reply-To: <4D88F84C.4070904@fim.uni-passau.de> References: <53675.10.14.11.19.1300807172.squirrel@mail.cse.iitb.ac.in> <4D88D07E.5020408@free.fr> <4D88F84C.4070904@fim.uni-passau.de> Message-ID: Hi, I am looking into something similar as well for auto-parallelization i.e. some sort of low level support at the IR level for parallelization. I'd be interested in collaborating with anyone who is working on the same. >From a brief look at the architectural overview of Polly, it seems as if the parallel code generation is being done at the IR level since the input file is an LLVM IR file? Would it be possible to re-utilize that functionality for building something to this end? Thanks Nipun On Tue, Mar 22, 2011 at 3:28 PM, Tobias Grosser wrote: > On 03/22/2011 01:56 PM, Reid Kleckner wrote: > > On Tue, Mar 22, 2011 at 1:36 PM, Gokul Ramaswamy > > > wrote: > > > > Hi Duncan Sands, > > > > As I have understood, GOMP and OpenMP provides support for > > parallelizing program at source program level. But I am at the IR > > level. That is I am trying to parallelize the IR code. 
This is the > > case of automatic parallelization. The programmer writing the code > > does not have any idea of parallelization going behind the hood. > > > > So my question is instead of support at the source program level, is > > the an support at the LLVM IR level to parallelize things ?? > > > > > > > > No, you have to insert calls to things like pthreads or GOMP or OpenMP > > or whatever threading runtime you choose. > > Which is what we also do in Polly. > > In case you just have the simple case of two statements you want to > execute in parallel, I propose to write this as OpenMP annotated C code, > compile the code with dragonegg to LLVM-IR and have a look what code is > generated. You will need to create similar code and similar function > calls if you want to do it at the LLVM-IR level. > > One thing that might simplify the code is to specify in OpenMP that you > want to be able to select choices at runtime. A common construct is: > > SCHEDULE(runtime) > > This will stop dragonegg from inlining some OpenMP runtime calls, which > could complicate the code unnecessarily. > > Cheers > Tobi > > P.S.: In case of directly inserting OpenMP function callsn it would be > nice to have support for a set of LLVM intrinsics that will > automatically be lowered to the relevant OpenMP/mpc.sf.net function > calls. Let me know when you think about working on such a thing. > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110322/a20e99ae/attachment.html From stoklund at 2pi.dk Tue Mar 22 21:21:45 2011 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Tue, 22 Mar 2011 19:21:45 -0700 Subject: [LLVMdev] How to get register liveness information for each MachineBasicBlock In-Reply-To: References: Message-ID: On Mar 21, 2011, at 3:46 AM, Dongrui She wrote: > Hi all, > > I try to print the live-in and live-out registers for each basic block in a backend for my own target. > And I can get a list of live-in registers directly in MachineBasicBlock. > > Is there a quick way to also get the list of live-out registers without redoing the analysis. I think > this information is computed and stored somewhere. This information is not available. You can compute the live-out set as the union of the live-in sets of all successor blocks. Note that the live-in lists only contain physical registers. Register allocation computes liveness information for virtual registers in the LiveIntervals analysis. /jakob From zhaber at yopmail.com Tue Mar 22 21:22:28 2011 From: zhaber at yopmail.com (stackunderflow) Date: Tue, 22 Mar 2011 19:22:28 -0700 (PDT) Subject: [LLVMdev] -emit-llvm on ubuntu is broken In-Reply-To: <4D885E0D.7040904@free.fr> References: <31206382.post@talk.nabble.com> <31206493.post@talk.nabble.com> <14116FB0-C4F5-4A3F-B76F-4D9C7850D5DC@apple.com> <4D885E0D.7040904@free.fr> Message-ID: <31216254.post@talk.nabble.com> Thanks, it works Duncan Sands wrote: > > Hi Eric, > >> Looks like something wonky with DragonEgg. > > you need to use -fplugin-arg-dragonegg-emit-ir or -flto with dragonegg, > not -emit-llvm. Also, you currently have to use -S (getting human > readable > IR) rather than -c because with -c gcc will run cc1 with -S (getting human > readable IR) then pass the result to the system assembler which of course > barfs. This is documented on the web-page and in the README. > > Ciao, Duncan. > >> >> Duncan? 
>> >> -eric >> >> On Mar 21, 2011, at 7:05 PM, stackunderflow wrote: >> >>> >>> Hi Eric, >>> >>> here is my -emit-llvm -S -v output: >>> >>> Using built-in specs. >>> COLLECT_GCC=gcc-4.5 >>> COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-linux-gnu/4.5.1/lto-wrapper >>> Target: i686-linux-gnu >>> Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro >>> 4.5.1-7ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-4.5/README.Bugs >>> --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr >>> --program-suffix=-4.5 --enable-shared --enable-multiarch >>> --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib >>> --without-included-gettext --enable-threads=posix >>> --with-gxx-include-dir=/usr/include/c++/4.5 --libdir=/usr/lib >>> --enable-nls >>> --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug >>> --enable-libstdcxx-time=yes --enable-plugin --enable-gold >>> --with-plugin-ld=ld.gold --enable-objc-gc --enable-targets=all >>> --disable-werror --with-arch-32=i686 --with-tune=generic >>> --enable-checking=release --build=i686-linux-gnu --host=i686-linux-gnu >>> --target=i686-linux-gnu >>> Thread model: posix >>> gcc version 4.5.1 (Ubuntu/Linaro 4.5.1-7ubuntu2) >>> COLLECT_GCC_OPTIONS='-fplugin=/usr/lib/gcc/i686-linux-gnu/4.5/plugin/dragonegg.so' >>> '-emit-llvm' '-S' '-v' '-mtune=generic' '-march=i686' >>> /usr/lib/gcc/i686-linux-gnu/4.5.1/cc1 -quiet -v >>> -iplugindir=/usr/lib/gcc/i686-linux-gnu/4.5.1/plugin hello.c >>> -D_FORTIFY_SOURCE=2 -iplugindir=/usr/lib/gcc/i686-linux-gnu/4.5.1/plugin >>> -quiet -dumpbase hello.c -mtune=generic -march=i686 -auxbase hello >>> -version >>> -fplugin=/usr/lib/gcc/i686-linux-gnu/4.5/plugin/dragonegg.so -o hello.s >>> -fstack-protector >>> GNU C (Ubuntu/Linaro 4.5.1-7ubuntu2) version 4.5.1 (i686-linux-gnu) >>> compiled by GNU C version 4.5.1, GMP version 4.3.2, MPFR version >>> 3.0.0-p3, >>> MPC version 0.8.2 >>> GGC heuristics: --param ggc-min-expand=100 --param >>> ggc-min-heapsize=131072 >>> Versions of 
loaded plugins: >>> dragonegg: >>> ignoring nonexistent directory "/usr/local/include/i686-linux-gnu" >>> ignoring nonexistent directory >>> "/usr/lib/gcc/i686-linux-gnu/4.5.1/../../../../i686-linux-gnu/include" >>> #include "..." search starts here: >>> #include<...> search starts here: >>> /usr/local/include >>> /usr/lib/gcc/i686-linux-gnu/4.5.1/include >>> /usr/lib/gcc/i686-linux-gnu/4.5.1/include-fixed >>> /usr/include/i686-linux-gnu >>> /usr/include >>> End of search list. >>> GNU C (Ubuntu/Linaro 4.5.1-7ubuntu2) version 4.5.1 (i686-linux-gnu) >>> compiled by GNU C version 4.5.1, GMP version 4.3.2, MPFR version >>> 3.0.0-p3, >>> MPC version 0.8.2 >>> GGC heuristics: --param ggc-min-expand=100 --param >>> ggc-min-heapsize=131072 >>> Versions of loaded plugins: >>> dragonegg: >>> Compiler executable checksum: ee807c30bb3adc8f3aa917a64443d0ec >>> COMPILER_PATH=/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/:/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/ >>> LIBRARY_PATH=/usr/lib/gcc/i686-linux-gnu/4.5.1/:/usr/lib/gcc/i686-linux-gnu/4.5.1/../../../../lib/:/lib/../lib/:/usr/lib/../lib/:/usr/lib/gcc/i686-linux-gnu/4.5.1/../../../:/lib/:/usr/lib/:/usr/lib/i686-linux-gnu/ >>> COLLECT_GCC_OPTIONS='-fplugin=/usr/lib/gcc/i686-linux-gnu/4.5/plugin/dragonegg.so' >>> '-emit-llvm' '-S' '-v' '-mtune=generic' '-march=i686' >>> >>> >>> >>> >>> >>> Eric Christopher-2 wrote: >>>> >>>> >>>> On Mar 21, 2011, at 6:36 PM, stackunderflow wrote: >>>> >>>>> >>>>> I try to generate a human readable .ll file on Linux. I installed >>>>> llvm-gcc >>>>> but as I see it can generate only assembly code (-S option). Is there >>>>> any >>>>> way to get something like what is generated by llvm online compiler? >>>>> >>>>> That's what I get with llvm-gcc -S -emit-llvm hello.c on Ubuntu 10.10: >>>> >>>> llvm-gcc -v ? 
>>>> >>>> -eric >>>> _______________________________________________ >>>> LLVM Developers mailing list >>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>> >>>> >>> >>> -- >>> View this message in context: >>> http://old.nabble.com/-emit-llvm-on-ubuntu-is-broken-tp31206382p31206493.html >>> Sent from the LLVM - Dev mailing list archive at Nabble.com. >>> >>> _______________________________________________ >>> LLVM Developers mailing list >>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > -- View this message in context: http://old.nabble.com/-emit-llvm-on-ubuntu-is-broken-tp31206382p31216254.html Sent from the LLVM - Dev mailing list archive at Nabble.com. From mohitbansal111 at gmail.com Wed Mar 23 01:18:00 2011 From: mohitbansal111 at gmail.com (mohitbansal111 at gmail.com) Date: Wed, 23 Mar 2011 06:18:00 +0000 Subject: [LLVMdev] new at LLVM Message-ID: <000e0cd32f6ea040f8049f205261@google.com> Hi I am new at LLVM.. I am not able to run a simple program just compile by LLVM.. also what is IR and how i convert a simple c program to IR. Regards, Mohit -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110323/885550f6/attachment.html From echristo at apple.com Wed Mar 23 01:21:19 2011 From: echristo at apple.com (Eric Christopher) Date: Tue, 22 Mar 2011 23:21:19 -0700 Subject: [LLVMdev] new at LLVM In-Reply-To: <000e0cd32f6ea040f8049f205261@google.com> References: <000e0cd32f6ea040f8049f205261@google.com> Message-ID: <75802AFA-D512-4664-B4E1-D65BCE76C368@apple.com> On Mar 22, 2011, at 11:18 PM, mohitbansal111 at gmail.com wrote: > Hi > > I am new at LLVM.. 
> I am not able to run a simple program just compile by LLVM..
> also what is IR and how i convert a simple c program to IR.

http://llvm.org/docs/GettingStarted.html#tutorial

-eric

From raghesh.a at gmail.com Wed Mar 23 01:45:55 2011
From: raghesh.a at gmail.com (raghesh)
Date: Wed, 23 Mar 2011 12:15:55 +0530
Subject: [LLVMdev] Parallelization
In-Reply-To:
References: <53675.10.14.11.19.1300807172.squirrel@mail.cse.iitb.ac.in> <4D88D07E.5020408@free.fr> <4D88F84C.4070904@fim.uni-passau.de>
Message-ID:

On Wed, Mar 23, 2011 at 5:26 AM, Nipun Arora wrote:
> Hi,
> I am looking into something similar as well for auto-parallelization i.e.
> some sort of low level support at the IR level for parallelization.
> I'd be interested in collaborating with anyone who is working on the same.
> From a brief look at the architectural overview of Polly, it seems as if the
> parallel code generation is being done at the IR level since the input file
> is an LLVM IR file?
> Would it be possible to re-utilize that functionality for building something
> to this end?

Adding to Tobias' comments, the following is what Polly with OpenMP support
does. If Polly detects that two statements (preferably for loops) can be
parallelized, it will generate the required GOMP calls automatically. As
of now the interface is not designed in such a way that it can be
reused. If we find that designing such OpenMP intrinsics is useful for
people we can think about that.
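
As a starting point for the two-statement case discussed in this thread, here is a small OpenMP-annotated sketch of the kind of source one could compile and then inspect at the IR level. The names `s1_sum`, `s2_sum_squares`, `Results` and `run_s1_s2` are hypothetical stand-ins for the independent statements S1 and S2, not anything taken from Polly or dragonegg.

```cpp
#include <numeric>
#include <vector>

// Hypothetical statement S1: sum of the elements.
long s1_sum(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0L);
}

// Hypothetical statement S2: sum of the squared elements.
long s2_sum_squares(const std::vector<int>& v) {
    long total = 0;
    for (int x : v) total += static_cast<long>(x) * x;
    return total;
}

struct Results {
    long s1;
    long s2;
};

// S1 and S2 have no data or control dependence on each other, so each one
// is placed in its own OpenMP section. When the compiler has OpenMP enabled
// the sections may run on different threads; otherwise the pragmas are
// ignored and the two statements simply run one after the other.
Results run_s1_s2(const std::vector<int>& v) {
    Results r = {0, 0};
    #pragma omp parallel sections
    {
        #pragma omp section
        r.s1 = s1_sum(v);
        #pragma omp section
        r.s2 = s2_sum_squares(v);
    }
    return r;
}
```

Each section writes a different member of `r`, so the two statements do not race with each other. The result is the same with or without OpenMP, which makes it easy to compare the serial and parallel IR side by side.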
Regards, -- Raghesh II MTECH Room No: 0xFF Mahanadhi Hostel IIT Madras From echristo at apple.com Wed Mar 23 01:58:27 2011 From: echristo at apple.com (Eric Christopher) Date: Tue, 22 Mar 2011 23:58:27 -0700 Subject: [LLVMdev] new at LLVM In-Reply-To: <001636e0aec7633e69049f20d439@google.com> References: <001636e0aec7633e69049f20d439@google.com> Message-ID: <53455140-6722-41EA-B2BA-3E20267F0734@apple.com> On Mar 22, 2011, at 11:54 PM, mohitbansal111 at gmail.com wrote: > hii, > > On running hello world program : > Its working fine with "llvm-gcc hello.c -o hello" and then "./hello" command.. > but with "llvm-gcc -O3 -emit-llvm hello.c -c -o hello.bc" and then "lli hello.bc" its give error : > > > lli: BitcodeReader.cpp:863: bool llvm::BitcodeReader::ParseMetadata(): Assertion `Kind == NewKind && "FIXME: Unable to handle custom metadata mismatch!"' failed. > 0 lli 0x08721a28 > Stack dump: > 0. Program arguments: lli hello.bc > Aborted > > > now what i had to do lli probably doesn't deal well with some of the new metadata. There's not a lot you can do about it unless you want to hack on llvm (though a bug report would be nice). That said, do you need the interpreter? -eric From chenwj at iis.sinica.edu.tw Wed Mar 23 04:34:46 2011 From: chenwj at iis.sinica.edu.tw (=?utf-8?B?6Zmz6Z+L5Lu7?=) Date: Wed, 23 Mar 2011 17:34:46 +0800 Subject: [LLVMdev] Calling external functions failed on PowerPC Message-ID: <20110323093446.GC10793@cs.nctu.edu.tw> Hi, all I have a trouble with calling external functions on PowerPC. What I am doing is generating a LLVM IR first like this, - x86 call void @helper_shack_flush(%struct.CPUX86State* %62) noinline, !flags !12 - ppc call void @helper_shack_flush(%struct.CPUX86State* %62) noinline, !flags !10 After lowering above LLVM IR for x86 and ppc, it becomes: - x86 %RAX = MOV64ri %RDI = COPY %RBX CALL64r %RAX, %RDI, %RAX, %RCX, %RDX, %RSI, %%RDI, %R8, %EFLAGS, %RSP, ... 
- ppc

%X4 = LDtoc , %X2

The x86 JIT can call the external function correctly, but the ppc JIT gives me the error below:

%X4 = LDtoc , %X2
UNREACHABLE executed!
Stack dump:
0. Running pass 'PowerPC Machine Code Emitter' on function '@"8048150"'
Aborted

Is this a bug in the ppc JIT? Or do I have to do something else so that the ppc JIT can call external functions?

Thanks!

Regards,
chenwj

--
Wei-Ren Chen (陳韋任)
Parallel Processing Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667

From sanjoy at playingwithpointers.com Wed Mar 23 05:07:02 2011
From: sanjoy at playingwithpointers.com (Sanjoy Das)
Date: Wed, 23 Mar 2011 15:37:02 +0530
Subject: [LLVMdev] RFC: GSoC Project
Message-ID: <4D89C646.6080509@playingwithpointers.com>

Hi All!

I will be applying to the LLVM project for this GSoC, and I wanted some preliminary sanity check on my project idea.

I intend to implement split (segmented) stacks for LLVM (like we have in Go, and as being implemented for GCC [1]). A lot of what follows is lifted from [1]; I will progressively add more details as I get more familiar with the LLVM codebase.

I intend to start with the simplest possible approach - representing the stack as a doubly linked list of _block_s, the size of each _block_ being a power of two. This can later be modified to improve performance and accommodate other factors. Blocks will be chained together into a doubly linked list structure (using the first two words in the block as the next and previous pointers).

In the prologue, a function will check whether the current block has enough stack space. This is easily done for functions which don't have variable sized allocas, and for ones which do, we can assume some worst-case upper bound.
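As a rough illustration of the block list and prologue check described above, here is a toy Python model. It is only a sketch under stated assumptions: the names `Block`, `SplitStack`, `enter` and `leave` and the 4096-byte default are hypothetical, frames are modelled as plain byte counts, and nothing here corresponds to actual LLVM or GCC code.

```python
class Block:
    """One stack block; its size is a power of two and it is linked both ways."""

    def __init__(self, size, prev=None):
        assert size > 0 and size & (size - 1) == 0, "size must be a power of two"
        self.size = size
        self.used = 0
        self.prev = prev   # models the 'previous' pointer word
        self.next = None   # models the 'next' pointer word


def round_up_pow2(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    p = 1
    while p < n:
        p *= 2
    return p


class SplitStack:
    MIN_BLOCK = 4096  # assumed default block size, purely illustrative

    def __init__(self):
        self.current = Block(self.MIN_BLOCK)
        self.allocations = 1  # number of blocks ever allocated

    def enter(self, frame_size):
        """Model of the prologue check; returns the block holding the frame."""
        blk = self.current
        if blk.size - blk.used < frame_size:
            nxt = blk.next
            if nxt is None or nxt.size < frame_size:
                # No reusable block: allocate one large enough and chain it.
                size = max(self.MIN_BLOCK, round_up_pow2(frame_size))
                nxt = Block(size, prev=blk)
                blk.next = nxt
                self.allocations += 1
            self.current = blk = nxt
        blk.used += frame_size
        return blk

    def leave(self, frame_size):
        """Model of the epilogue; an emptied block stays linked for reuse."""
        blk = self.current
        blk.used -= frame_size
        if blk.used == 0 and blk.prev is not None:
            self.current = blk.prev
```

Keeping the emptied block linked means a later deep call can reuse it without a fresh allocation, at the cost of holding on to the memory.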
The prologue can then call an intrinsic (let's call it llvm.adjust_stack) which allocates a new block (possibly by delegating this to a user-provided callback), copies the arguments, saves the previous stack pointer (in the new block), and adjusts the next and previous pointers. It will also have to adjust the stack pointer, and the frame pointer, if it is being maintained. Cleanup can be done by hijacking the return value, as also mentioned in [1]. It might make sense to leave the allocated blocks around, to prevent re-allocating the next time the program needs more stack space.

DWARF info can be generated as follows: since we know the offset of the base of the stack frame from the stack pointer (or we are maintaining a frame pointer), we can always say whether the concerned call frame is the first call frame in its block or not. If it is not, all the previous register values can be computed as usual; if it is, we will add an extra indirection, involving looking up the stack pointer saved in this block's header.

One thing I'd really like some input on is whether implementing split stacks would be useful enough to warrant the effort (especially keeping in mind that this is pretty useless on 64 bit architectures).

[1] http://gcc.gnu.org/wiki/SplitStacks

--
Sanjoy Das
http://playingwithpointers.com

From krvladislav at gmail.com Wed Mar 23 07:07:11 2011
From: krvladislav at gmail.com (=?KOI8-R?B?69LZzM/XIPfMwcTJ08zB1w==?=)
Date: Wed, 23 Mar 2011 15:07:11 +0300
Subject: [LLVMdev] [GSoC] Interface layer for optimizers
Message-ID:

Hi folks,

I like open technologies, especially the LLVM compiler. I want to implement a new interface layer in LLVM to plug-in optimizers as a part of GSoC, and then load the interface with optimizers.

This would improve LLVM application for people who want to use their optimizations in compilers.

The first "educative" step is to add Doxygen (for .h files) to the build and integrate it into the programmer manual [1].
Then I will try to clean up interface layer to the optimizer so it can be potentially replaced. This task is close to "superoptimizer" task from "Miscellaneous Additions" list, so I believe there are guys here who could mentor my GSoC project. If there are any of you who can mentor the project, I prepare and send detailed implementation plan here. [1] http://llvm.org/docs/ProgrammersManual.html -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110323/fa6cdbae/attachment.html From joerg at britannica.bec.de Wed Mar 23 07:10:54 2011 From: joerg at britannica.bec.de (Joerg Sonnenberger) Date: Wed, 23 Mar 2011 13:10:54 +0100 Subject: [LLVMdev] RFC: GSoC Project In-Reply-To: <4D89C646.6080509@playingwithpointers.com> References: <4D89C646.6080509@playingwithpointers.com> Message-ID: <20110323121054.GC982@britannica.bec.de> On Wed, Mar 23, 2011 at 03:37:02PM +0530, Sanjoy Das wrote: > I intend to start with the simplest possible approach - representing the > stack as a doubly linked list of _block_s, the size of each _block_ > being a power of two. This can later be modified to improve performance > and accommodate other factors. Blocks will be chained together into a > doubly linked list structure (using the first two words in the block as > the next and previous pointers). Where do you plan to store the pointers? What changes to the runtime environment does this require? > In the prologue, a function will check whether the current block has > enough stack space. This is easily done for function which don't have > variable sized allocas, and for ones which do, we can assume some > worst-case upper bound. If there are allocas involved, it is quite likely they are inside loops etc., in which case there are no simple stack boundaries. This shouldn't be a problem if the space for all static variables was allocated at the beginning. 
> The prologue can then call an intrinsic (let's > call it llvm.adjust_stack) which allocates a new block (possibly by > delegating this to a user-provided callback), copies the arguments, > saves the previous stack pointer (in the new block), and adjusts the > next and previous pointers. Why do you need to copy the arguments? In fact, why do you think you can actually copy the arguments? Consider printf for this. > It will also have to adjust the stack pointer, and the frame pointer, > if it is being maintained. Cleanup can be done by hijacking the return > value, as also mentioned in [1]. It might make sense to leave the > allocated blocks around, to prevent re-allocating the next time the > program needs more stack space. Hijacking the return value is a nice trick. I'm not sure about freeing / not-freeing the unused block, this again has implications for the runtime environment. Have you considered how much work call graph optimisations require? Especially with LTO it should be possible to kill many of the checks by computing used stack space across segments of the call graph. One of the papers on service scalability discussed this for light weight threads. Forgot which one, it's been a while. > One thing I'd really like some input on is whether implementing split > stacks would be useful enough to warrant the effort (especially keeping > in mind that this is pretty useless on 64 bit architectures). I don't think it is useless on 64bit architectures. You can't always make arbitrary large reservations of address space (e.g. optimised output of Chicken). They also don't come without a price. Also consider something like a kernel environment, where you really want to have a minimal default stack, but be able to fallback gracefully. 
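The call-graph idea mentioned above (computing the stack space used across parts of the call graph so that some checks can be dropped) could be sketched roughly as follows. This is a minimal Python illustration, not an LLVM pass: `max_stack_need`, the `frame_size` table and the `callees` map are all hypothetical, and indirect calls are assumed not to occur.

```python
def max_stack_need(func, frame_size, callees, _memo=None, _active=None):
    """Worst-case bytes of stack needed from entry to `func`.

    Returns None when no static bound exists because `func` is recursive
    or can reach a recursive function.
    """
    memo = {} if _memo is None else _memo
    active = set() if _active is None else _active
    if func in memo:
        return memo[func]
    if func in active:
        return None  # func is already on the current call path: recursion
    active.add(func)
    deepest = 0
    bounded = True
    for callee in callees.get(func, ()):
        need = max_stack_need(callee, frame_size, callees, memo, active)
        if need is None:
            bounded = False
            break
        deepest = max(deepest, need)
    active.discard(func)
    result = frame_size[func] + deepest if bounded else None
    memo[func] = result
    return result
```

With such a bound, a single check at the entry of a call-graph segment could in principle stand in for the per-function checks inside it; functions without a bound would keep their own checks.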
Joerg

From douglasdocouto at gmail.com Wed Mar 23 08:06:11 2011
From: douglasdocouto at gmail.com (Douglas do Couto Teixeira)
Date: Wed, 23 Mar 2011 10:06:11 -0300
Subject: [LLVMdev] Range Analysis GSoC 2011 Proposal
Message-ID:

Dear LLVM community,

I would like to contribute to LLVM in the Google Summer of Code project. My proposal is listed below. Please let me know your comments.

Adding Range Analysis to LLVM

Abstract

The objective of this work is to patch our implementation of range analysis into LLVM. I have a running implementation of range analysis in LLVM, but it is not currently part of the main distribution. I propose to integrate our range analysis implementation into the Lazy Value Info (LVI) interface that LLVM provides. Range analysis finds the intervals of values that may be bound to the integer variables during the execution of programs and is useful in several scenarios: constant propagation, detection of potential buffer overflow attacks, dead branch elimination, array bound check elimination, elimination of overflow tests in scripting languages such as JavaScript and Lua, etc.

Objective

The objective of this project is to augment LLVM with range analysis. We will do this integration by patching the current implementation of range analysis that we have onto the Lazy Value Info (LVI) interface that LLVM already provides. In addition, we will develop new optimizations using LVI. In particular, we will provide a pass that performs conditional constant propagation [5], and elimination of dead branches.

Criteria of Success

- To improve substantially the precision of the current implementation of LVI. Currently, LVI's interface only allows a client to know if a variable contains a constant. We want to allow LVI to report that a variable either contains a constant, or a known range.
- To improve the current implementation of constant propagation that LLVM uses.
We hope to obtain a small performance gain on the C benchmarks in the LLVM test suite, and a larger gain on Java programs that are compiled using VMKit.
- To improve the implementation of JumpThreading, in such a way that more dead branches will be eliminated. Again, we hope to achieve a small speed-up on the C benchmarks, and a larger speed-up on the Java benchmarks.

Background

Range Analysis is a technique that maps integer variables to the possible ranges of values that they may assume throughout the execution of a program. Thus, for each integer variable, a range analysis determines its lower and upper limits. A very simple range analysis would, for instance, map each variable to the limits imposed by its type. That is, an 8-bit unsigned integer variable can be correctly mapped to the interval [0, 255], and a 16-bit signed integer can be mapped to [-32768, 32767]. However, the precision of this analysis can greatly be improved from information inferred from the program text.

Ideally this range should be as constrained as possible, so that an optimizing compiler could learn more information about each variable. However, the range analysis must be conservative, that is, it will only constrain the range of a variable if it can prove that it is safe to do so. As an example, consider the program:

i = read();
if (i < 10) {
  print (i + 1);
} else {
  print(i - 1);
}

In this program we know, from the conditional test, that the value of "i" in the true side of the branch is in the range [-INF, 9], and in the false side is in the range [10, +INF].

During the Summer of Code 2010 I designed and implemented, under the orientation of Duncan Sands, a non-iterative range analysis algorithm. Our implementation is currently fully functional, being able to analyze the whole LLVM test suite. For more details, see [4]. However, this implementation has never been integrated into the LLVM main trunk, for three reasons:

1.
We use an intermediate representation called extended static single assignment form [6], which the LLVM contributors were reluctant to use;
2. During the SoC 2010 we did not have time to completely finish our implementation, and runtime numbers were available only by the end of 2010.
3. There was not really an infrastructure already in place in LLVM to take benefit of our analysis.

In order to address the first item, we propose to integrate the intermediate representation directly into our analysis, yet as a module that can be used separately by other clients, if necessary. We are splitting the live ranges of variables using single-arity phi-functions, which are automatically handled by the SSA elimination pass that LLVM already includes. This live range splitting is only necessary for greater precision. We can do our range analysis without it, although the results are less precise. A previous Summer of Code project, by Andre Tavares, has shown that the e-SSA form increases the number of phi-functions in the program code by less than 10%, and it is very fast to build [6].

The second item of our list of hindrances is no longer a problem. Our implementation is ready for use. We have been able to analyze the whole LLVM test suite, plus SPEC CPU 2006 - over 4 million LLVM bytecodes - in 44 seconds on a 2.4GHz machine. We obtain non-trivial bit size reductions for the small benchmarks, having results that match those found by previous, non-conservative works, such as Stephenson et al.'s [2]. Moreover, our implementation is based on a very modern algorithm, by Su and Wagner [1], augmented with Gawlitza's technique to handle cycles [3]. We believe that this is the fastest implementation of such an analysis.

Finally, the third item is also no longer a problem. Presently LLVM offers the Lazy Value Info interface that reports when variables are constants. The current LVI implementation also provides infrastructure to deal with ranges of integer intervals.
However, it does not contain a fully functional implementation yet, an omission that we hope to fix with this project.

Timeline and Testing Methodology

1. Change the LVI interface, adding a new method to it: getRange(Value *V), so that we can not only know if a variable is a constant, but also know its range of values whenever it is not a constant.
2. Change the implementation of the prototype range analysis that LVI already uses, so that it will use our implementation.
3. Implement the sparse conditional constant propagation to use LVI. This, in addition to JumpThreading, will be another client that will use LVI.
4. Run experiments on the LLVM test suite to verify that our new optimization improves on the current dead branch elimination pass that LLVM already uses.
5. Run experiments using VMKit, to check the impact of dead branch elimination on Java programs. We hope to deliver better performance numbers in this case because Java, as a memory safe language, is notorious for using many checks to ensure that memory is used only in ways that obey the contract of the variable types.

Biography

I am currently a third year Computer Science student at the Federal University of Minas Gerais, Brazil, and I work as a research assistant at the Programming Languages Lab (LLP), in that university. I believe I am a good candidate to work on the proposed project because I already have a good knowledge of LLVM. I have implemented range analysis, and currently it can find the ranges of integer variables used in the programs. In addition to this, I worked as a programmer for three years before joining the LLP as a research assistant, and I am experienced with C++, the language in which LLVM is implemented. Furthermore, I am a student at a very good university (UFMG). To ground this statement I would like to point out that the Department of Computer Science of UFMG got the best mark in the Brazilian National Undergrad Exam (ENADE).
Finally, I work in a lab in which three other students work with LLVM. We had three Summer of Code projects on LLVM in the past, and a number of papers have been published out of these experiences.

References

1. A Class of Polynomially Solvable Range Constraints for Interval Analysis without Widenings and Narrowings, Zhendong Su and David Wagner, In Theoretical Computer Science, Volume 345, Issue 1, 2005.
2. Bitwidth Analysis with Application to Silicon Compilation, Mark Stephenson, Jonathan Babb, and Saman Amarasinghe, In Proceedings of the SIGPLAN conference on Programming Language Design and Implementation, 2000.
3. Polynomial Precise Interval Analysis Revisited. Thomas Gawlitza, Jérôme Leroux, Jan Reineke, Helmut Seidl, Grégoire Sutre, Reinhard Wilhelm: Efficient Algorithms 2009: 422-437
4. Linear Time Range Analysis with Affine Constraints. Douglas do Couto Teixeira and Fernando Magno Quintao Pereira ( http://homepages.dcc.ufmg.br/~douglas/projects/RangeAnalysis/RangeAnalysis.paper.pdf )
5. Constant propagation with conditional branches. Mark N. Wegman and F. Kenneth Zadeck, In ACM Transactions on Programming Languages and Systems (TOPLAS), 181-210, 1991.
6. Efficient SSI Conversion. André Luiz C. Tavares, Fernando Magno Quintão Pereira, Mariza A. S. Bigonha and Roberto S. Bigonha. Simpósio Brasileiro de Linguagens de Programação. 2010.
7. ABCD: eliminating array bounds checks on demand, Rastislav Bodik, Rajiv Gupta and Vivek Sarkar, In Proceedings of the SIGPLAN conference on Programming Language Design and Implementation, 2000.

Contact Info

Name: Douglas do Couto Teixeira
e-mail 1: douglas at dcc dot ufmg dot br
e-mail 2: douglasdocouto at gmail dot com

--
Douglas do Couto Teixeira
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110323/e315130c/attachment.html

From criswell at illinois.edu Tue Mar 22 22:18:27 2011
From: criswell at illinois.edu (John Criswell)
Date: Tue, 22 Mar 2011 21:18:27 -0600
Subject: [LLVMdev] [GSoC] Interface layer for optimizers
In-Reply-To:
References:
Message-ID: <4D896683.1040205@illinois.edu>

On 3/23/2011 6:07 AM, Крылов Владислав wrote:
> Hi folks,
>
> I like open technologies, epecially LLVM compiler. I want to implement
> a new interface layer in LLVM to plug-in optimizers as a part of GSoC,
> and then load the interface with optimizers.

LLVM already has a plug-in framework for loading new analysis and optimization passes (http://llvm.org/docs/WritingAnLLVMPass.html). What makes your proposal different?

>
> This would improve LLVM application for people who want to use their
> optimizations in compilers.
>
> The first "educative" step is to add Doxygen (for .h files) to the
> build and integrate it into the programmer manual [1]. Then I will try
> to clean up interface layer to the optimizer so it can be potentially
> replaced.

Doxygen docs are already available for LLVM (http://llvm.org/doxygen). Personally, I don't see a need to make them part of the Programmer's Manual, although having a link from the Programmer's Manual to the doxygen docs is probably a good idea (if it doesn't exist already).

>
> This task is close to "superoptimizer" task from "Miscellaneous
> Additions" list, so I believe there are guys here who could mentor my
> GSoC project. If there are any of you who can mentor the project, I
> prepare and send detailed implementation plan here.

The superoptimizer idea looks kinda cool.

-- John T.

>
> [1] http://llvm.org/docs/ProgrammersManual.html
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110322/f424af40/attachment-0001.html From lgratian at gmail.com Wed Mar 23 10:35:04 2011 From: lgratian at gmail.com (Gratian Lup) Date: Wed, 23 Mar 2011 17:35:04 +0200 Subject: [LLVMdev] GSOC Project Proposal: Profile-guided optimizations Message-ID: Hi! My name is Gratian and I would like to participate to GSOC 2011. I'm interested in profile-guided optimizations, and I want to implement two optimizations that can bring tangible benefits for most applications: profile-guided function inlining and basic block positioning. Inlining can be greatly improved if we take into consideration how many times the function we want to inline was actually called. Functions that are not called at all, or are called infrequently should not be inlined, while the ones that are frequently called should have a higher chance of being inlined. The algorithm I want to use is based on a benefit/cost ratio, so it can take advantage of the existing InlineCost analysis. The algorithm is a variation of the one found in JikesRVM, of course adapted and tuned for LLVM. According to the paper, the performance of the application can be up to 57% better than with inlining done without any profile info. The second optimization is intended to replace the current BasicBlockPlacement pass, which uses a naive algorithm, with an algorithm that performs better in practice. It will be based on Algo2 from Pettis&Hansen, while the current one is Algo1. I think this is more useful than forming superblocks, because they may actually degrade the performance if new optimization opportunities are not discovered (the size of the code increases). If you think that superblocks are more useful, I could change the proposal (I read two papers about superblock formation and optimization, so I'm a bit familiar). The only problem would be that this, together with inlining, may be too much for the scope of GSOC. Any suggestions are welcome. 
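The bottom-up block positioning mentioned above can be illustrated with a small Python sketch of greedy chain merging: edges are visited from hottest to coldest, and two chains are joined when the edge runs from the tail of one to the head of the other. This is a simplified reading of the Pettis and Hansen approach and not the existing BasicBlockPlacement code; `build_chains` and its inputs are hypothetical, and the final step of ordering the resulting chains is left out.

```python
def build_chains(blocks, edge_weights):
    """Greedily merge basic blocks into chains along the hottest edges.

    blocks:       list of block names in their original order
    edge_weights: dict mapping (src, dst) to a profile count
    Returns a list of chains, each a list of block names.
    """
    chain_of = {b: [b] for b in blocks}
    # Heaviest edges first; ties are broken by edge name for determinism.
    ordered = sorted(edge_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    for (src, dst), _weight in ordered:
        a, b = chain_of[src], chain_of[dst]
        if a is b:
            continue  # both ends are already in the same chain
        if a[-1] == src and b[0] == dst:
            # src ends its chain and dst starts its chain: lay them out
            # back to back so the hot edge becomes a fall-through.
            a.extend(b)
            for blk in b:
                chain_of[blk] = a
    # Collect each distinct chain once, in order of first appearance.
    seen, chains = set(), []
    for blk in blocks:
        c = chain_of[blk]
        if id(c) not in seen:
            seen.add(id(c))
            chains.append(c)
    return chains
```

For a diamond-shaped CFG where the `hot` arm is taken 90% of the time, the hot path ends up as one contiguous chain and the cold block is left in a chain of its own.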
Gratian -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110323/89b127ad/attachment.html From joshuawarner32 at gmail.com Wed Mar 23 11:24:35 2011 From: joshuawarner32 at gmail.com (Joshua Warner) Date: Wed, 23 Mar 2011 10:24:35 -0600 Subject: [LLVMdev] Reversing a function's CFG? In-Reply-To: <8BB50157-7FC9-4119-8E95-04384790A1EA@gmail.com> References: <8BB50157-7FC9-4119-8E95-04384790A1EA@gmail.com> Message-ID: Hi Justin, I take the fact that nobody has replied as a sign that nobody really understands what you are asking. > I was wondering if there was a quick way to reverse a function's CFG and, > in turn, all basic blocks within it. Assuming all variables are globals, is > there a quick way to generate a function's reversal? I highly doubt such > functionality exists but I figured it was worth asking. I'm trying to > develop an "undo function" generator pass that would be able to restore > system state (without state-saving) after determining an error occurred. > This is documented in, "Efficient Optimistic Parallel Simulations using > Reverse Computation" by Carothers et al.[1] > I'm unaware of an easy way to "reverse the CFG" - but even if there was, I don't think that would solve your problem. First, you will only be able to generate "undo" functions for a small subset of all possible functions - specifically, invertible functions. Second, my intuition is that computing an inverse function will be significantly more involved than just reversing the CFG. Could you be a little more specific about the problem you are dealing with? How do Carothers et al. do the inversion? -Joshua -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110323/da092c7a/attachment.html From justin at lapre.com Wed Mar 23 12:52:40 2011 From: justin at lapre.com (Justin M. 
LaPre) Date: Wed, 23 Mar 2011 13:52:40 -0400 Subject: [LLVMdev] Reversing a function's CFG? In-Reply-To: References: <8BB50157-7FC9-4119-8E95-04384790A1EA@gmail.com> Message-ID: <5B806BA1-94DB-4EF0-B526-511F6EDFB28B@gmail.com> On Mar 23, 2011, at 12:24 PM, Joshua Warner wrote: > Hi Justin, > > I take the fact that nobody has replied as a sign that nobody really understands what you are asking. Most likely. I was trying not to write a ten page e-mail to the list but quite possibly simplified too much. I'll try to elaborate a little more: In large-scale parallel simulations, sometimes you process an event speculatively. If it turns out you should not have processed that event (and such situations can be detected by our system), you need to undo all of the changes you made in your event handler. Most approaches use some form of state-saving. We opt for an approach coined by my advisor (Chris Carothers, actually) called "reverse computation." The reverse function is basically the inverse of the forward event handler e.g. if in the forward event handler you increment a variable, the reverse handler would need to decrement it. That's an extremely simple case. A more complicated case: if you have an "if" in your forward handler, you must augment it with a bitfield to remember which path you took so the reverse handler can find its way back. So we're basically saving our control state as opposed to our data state as the control state is often significantly smaller. I like to think of it as a trail of breadcrumbs. Using different approaches we can handle loops, ifs, switches, etc. by following our "breadcrumbs" back. > I was wondering if there was a quick way to reverse a function's CFG and, in turn, all basic blocks within it. Assuming all variables are globals, is there a quick way to generate a function's reversal? I highly doubt such functionality exists but I figured it was worth asking. 
I'm trying to develop an "undo function" generator pass that would be able to restore system state (without state-saving) after determining an error occurred. This is documented in, "Efficient Optimistic Parallel Simulations using Reverse Computation" by Carothers et al.[1] > > > I'm unaware of an easy way to "reverse the CFG" - but even if there was, I don't think that would solve your problem. First, you will only be able to generate "undo" functions for a small subset of all possible functions - specifically, invertible functions. Second, my intuition is that computing an inverse function will be significantly more involved than just reversing the CFG. Reversing the CFG is just a piece of the puzzle. To generate our reverse handler, I thought a good place to start would be the reverse CFG. Specifically, I need to: 1. instrument the forward handler control path 2. emit the reverse handler given the forward handler (which I believe can be achieved by the following) 2a. create the reverse CFG 2b. for each basic block in (2a), invert the instructions Unfortunately, not all functions are invertible as you said. In those cases, we may opt to fall back on state-saving or use some heuristics to try and bypass state-saving (memory operations can be expensive on some of the machines we run simulations on). Assuming the above steps go off without a hitch (which is a big assumption), we should have our reverse event handler generated for us. I do have some questions: is there any way to differentiate IR I inserted while instrumenting the forward path from IR that was there before? I don't want to invert my instrumentation instructions. Also, due to this problem, I've been attempting to write a Module pass as opposed to a Function pass because I couldn't differentiate between the two. > Could you be a little more specific about the problem you are dealing with? How do Carothers et al. do the inversion? Carothers did it all by hand! 
:) In the paper he talks about automating it. As I have taken a few compiler courses, apparently I'm the compiler guy in our HPC group. Anyway, thanks for the response. If I was unclear in any way, please ask questions! -Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110323/c9906f2f/attachment.html From joshuawarner32 at gmail.com Wed Mar 23 13:58:22 2011 From: joshuawarner32 at gmail.com (Joshua Warner) Date: Wed, 23 Mar 2011 12:58:22 -0600 Subject: [LLVMdev] Reversing a function's CFG? In-Reply-To: References: <8BB50157-7FC9-4119-8E95-04384790A1EA@gmail.com> <5B806BA1-94DB-4EF0-B526-511F6EDFB28B@gmail.com> Message-ID: Forgot to CC the list. On Wed, Mar 23, 2011 at 12:51 PM, Joshua Warner wrote: > >> In large-scale parallel simulations, sometimes you process an event >> speculatively. If it turns out you should not have processed that event >> (and such situations can be detected by our system), you need to undo all of >> the changes you made in your event handler. Most approaches use some form >> of state-saving. We opt for an approach coined by my advisor (Chris >> Carothers, actually) called "reverse computation." The reverse function is >> basically the inverse of the forward event handler e.g. if in the forward >> event handler you increment a variable, the reverse handler would need to >> decrement it. That's an extremely simple case. A more complicated case: if >> you have an "if" in your forward handler, you must augment it with a >> bitfield to remember which path you took so the reverse handler can find its >> way back. So we're basically saving our control state as opposed to our >> data state as the control state is often significantly smaller. I like to >> think of it as a trail of breadcrumbs. Using different approaches we can >> handle loops, ifs, switches, etc. by following our "breadcrumbs" back. >> >> Reversing the CFG is just a piece of the puzzle. 
To generate our reverse >> handler, I thought a good place to start would be the reverse CFG. >> Specifically, I need to: >> >> 1. instrument the forward handler control path >> 2. emit the reverse handler given the forward handler (which I believe can >> be achieved by the following) >> 2a. create the reverse CFG >> 2b. for each basic block in (2a), invert the instructions >> >> Unfortunately, not all functions are invertible as you said. In those >> cases, we may opt to fall back on state-saving or use some heuristics to try >> and bypass state-saving (memory operations can be expensive on some of the >> machines we run simulations on). >> >> Assuming the above steps go off without a hitch (which is a big >> assumption), we should have our reverse event handler generated for us. >> >> I do have some questions: is there any way to differentiate IR I inserted >> while instrumenting the forward path from IR that was there before? I don't >> want to invert my instrumentation instructions. Also, due to this problem, >> I've been attempting to write a Module pass as opposed to a Function pass >> because I couldn't differentiate between the two. >> >> Carothers did it all by hand! :) In the paper he talks about automating >> it. As I have taken a few compiler courses, apparently I'm the compiler guy >> in our HPC group. >> >> > Thanks - that makes much more sense. Sounds intriguing! I would be very > interested in seeing a pass like this get into the LLVM trunk - it would go > a long way to providing reverse debugging (wikipedia it) for all languages > that target LLVM. There are very few (if any) good, free reverse debuggers > available for higher-level languages like Java and Python, let alone one for > C and C++. That would be amazing! > > I don't think there could ever be the sort of pre-built procedure for > reversing the control flow that you want - it would be heavily dependent on > the format of the data you produce in the forward computation. 
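To make the discussion concrete, the increment/decrement and branch-recording ideas from this thread might look roughly like the following hand-written pair of handlers. This is only an illustrative Python sketch: `State`, `forward`, `reverse` and the `trail` list are hypothetical names, and it assumes the reverse handler is given the same event value that the forward handler saw.

```python
class State:
    """Simulation state touched by the event handler."""

    def __init__(self):
        self.counter = 0
        self.total = 0


def forward(state, event, trail):
    """Forward handler: applies the event and records the branch taken."""
    state.counter += 1
    if event > 0:
        trail.append(1)       # breadcrumb: the 'then' branch ran
        state.total += event
    else:
        trail.append(0)       # breadcrumb: the 'else' branch ran
        state.total -= 1


def reverse(state, event, trail):
    """Reverse handler: undoes forward() in the opposite order."""
    took_then = trail.pop()   # consume the most recent breadcrumb
    if took_then:
        state.total -= event  # inverse of 'total += event'
    else:
        state.total += 1      # inverse of 'total -= 1'
    state.counter -= 1        # inverse of 'counter += 1'
```

Only one bit per branch is stored, which is the point of saving control state instead of data state; an operation that destroys information (for example `total = 0`) would still need its old value saved.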
The best way > to do this is probably just to translate a function block-by-block. Compute > the predecessors of the original block, then add a switch instruction to the > end of the new block that computes how control flow reached that block when > it was run in the forward. > > Inverting each instruction should be just as easy - if its invertible > (including a call to an instrumented function), just invert, otherwise read > the recorded state and fix it up. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110323/88f4bdbb/attachment.html From rafael.espindola at gmail.com Wed Mar 23 15:00:08 2011 From: rafael.espindola at gmail.com (=?ISO-8859-1?Q?Rafael_=C1vila_de_Esp=EDndola?=) Date: Wed, 23 Mar 2011 16:00:08 -0400 Subject: [LLVMdev] gold plugin example In-Reply-To: References: <4D88A1C6.90007@gmail.com> Message-ID: <4D8A5148.80406@gmail.com> > > I have created the links in bfd-plugins, but I had some mistakes in > handling versions. Now, I found that all combinations of versions work > well for the example. > > By the way, how can I manually check whether ar and nm work? Both take --plugin options. You can run nm in a plugin file to see if it can print the symbols. You can also create an archive with some IL file in it and check the armap by running "nm -s" in the produced .a file. > Thanks, > Sangmin Cheers, Rafael From criswell at illinois.edu Wed Mar 23 17:24:59 2011 From: criswell at illinois.edu (John Criswell) Date: Wed, 23 Mar 2011 17:24:59 -0500 Subject: [LLVMdev] Range Analysis GSoC 2011 Proposal In-Reply-To: References: Message-ID: <4D8A733B.9030402@illinois.edu> Dear Douglas, Comments below. On 3/23/11 8:06 AM, Douglas do Couto Teixeira wrote: > Dear LLVM community, > > I would like to contribute to LLVM in the Google Summer of Code > project. My proposal is listed below. Please let me know your comments. 
>
>
> Adding Range Analysis to LLVM
>
>
>
> Abstract
>
> The objective of this work is to patch our implementation of range
> analysis into LLVM. I have a running implementation of range

Say "working implementation" instead of "running implementation."

> analysis in LLVM, but it is not currently part of the main
> distribution. I propose to integrate our range analysis implementation
> into the Lazy Value Info (LVI) interface that LLVM provides. Range
> analysis finds the intervals of values that may be bound to the
> integer variables during the execution of programs and is useful in
> several scenarios: constant propagation, detection of potential buffer
> overflow attacks, dead branch elimination, array bound check
> elimination, elimination of overflow tests in scripting languages such
> as JavaScript and Lua, etc.
>
>
> Objective
>
> The objective of this project is to augment LLVM with range analysis.
> We will do this integration by patching the current implementation of
> range analysis that we have onto the Lazy Value Info (LVI) interface
> that LLVM already provides. In addition, we will develop new
> optimizations using LVI. In particular, we will provide a pass that
> performs conditional constant propagation [5], and elimination of
> dead-branches.
>
> Criteria of Success
>
>   * To improve substantially the precision of the current
>     implementation of LVI. Currently, LVI's interface only allows a
>     client to know if a variable contains a constant. We want to
>     allow LVI to report that a variable either contains a constant,
>     or a known-range.
>   * To improve the current implementation of constant propagation
>     that LLVM uses. We hope to obtain a small performance gain on
>     the C benchmarks in the LLVM test suite, and a larger gain on
>     Java programs that are compiled using VMKit.
>   * To improve the implementation of JumpThreading, in such a way
>     that more dead-branches will be eliminated.
>     Again, we hope to
>     achieve a small speed-up on the C benchmarks, and a larger
>     speed-up on the Java benchmarks.
>
>
>
> Background
>
> Range Analysis is a technique that maps integer variables to the
> possible ranges of values that they may assume through out

"throughout" is a single word in this instance.

> the execution of a program. Thus, for each integer variable, a range
> analysis determines its lower and upper limits. A very simple range
> analysis would, for instance, map each variable to the limits imposed
> by its type. That is, an 8-bit unsigned integer variable can be
> correctly mapped to the interval [0, 255], and a 16-bit signed
> integer can be mapped to [-32768, 32767].

You probably want to be consistent in how numbers are interpreted. Use
the unsigned or signed interpretations in both examples above.

> However, the precision of this analysis can greatly be improved from
> information inferred from the program text.
>
> Ideally this range should be as constrained as possible, so that an
> optimizing compiler could learn more information about each variable.
> However, the range analysis must be conservative, that is, it will
> only constraint the range of a variable if it can

"constrain the range"

> prove that it is safe to do so. As an example, consider the program:
>
>   i = read();
>   if (i < 10) {
>     print(i + 1);
>   } else {
>     print(i - 1);
>   }
>
> In this program we know, from the conditional test, that the value of
> 'i' in the true side of the branch is in the range [-INF, 9], and in
> the false side is in the range [10, +INF].
>
> During the Summer of Code 2010 I have designed and implemented, under
> the orientation of Duncan Sands, a non-iterative

Remove the word "have" in the sentence above.

> range analysis algorithm. Our implementation is currently fully
> functional, being able to analyze the whole LLVM test suite. For more
> details, see [4].
> However, this implementation has never been
> integrated into the LLVM main trunk, for three reasons:
>
>   1. We use an intermediate representation called extended static
>      single assignment form [6], which the LLVM contributors were
>      reluctant to use;
>   2. During the SoC 2010 we did not have time to completely finish
>      our implementation, and runtime numbers were available only by
>      the end of 2010.
>   3. There was not really an infra-structure already in place in LLVM
>      to take benefit of our analysis.
>

I think your text is a little unclear about point #3. From your text
below, it sounds like LLVM lacks optimizations that utilize value range
analysis. If that is the case, then you should state below (like you do
in the beginning of your proposal) what optimizations you'll write.

As an aside, we'd be interested in trying out value-range analysis in
SAFECode (http://safecode.cs.illinois.edu) for use in static array
bounds checking. Creating a static array bounds checking pass for
SAFECode using your analysis would probably be trivial. The only
difficulty is that SAFECode is currently built using LLVM 2.7, and I'm
guessing that you're using a newer version of LLVM.

>
>
> In order to address the first item, we propose to integrate the
> intermediate representation directly into our analysis, yet, as a
> module that can be used in separate by other clients, if necessary.

This part isn't completely clear to me. It sounds like what you're
suggesting is to write one analysis pass that internally constructs
e-SSA form and a second analysis that actually does the value-range
analysis. It also sounds like you're writing the value-range analysis so
that it can be used with and without e-SSA form. Is this correct, or
have I misinterpreted what you're saying?

Either way, I think the text above and the paragraph below about e-SSA
form could be made a little more clear.
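
The branch refinement in the proposal's `if (i < 10)` example can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions: the `Range` type and the helper names are hypothetical and are not part of LVI or of the proposed implementation. Each variable starts at the limits of its type (for a signed 16-bit integer that is [-32768, 32767]) and is then narrowed on each side of the comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical closed interval [lo, hi]; lo > hi encodes the empty range.
struct Range {
    std::int64_t lo;
    std::int64_t hi;
    bool empty() const { return lo > hi; }
};

// Limits imposed by a signed type of the given bit width.
Range signedTypeLimits(unsigned bits) {
    assert(bits >= 1 && bits <= 63);
    const std::int64_t half = std::int64_t(1) << (bits - 1);
    return Range{-half, half - 1};
}

// Intersection of two ranges; the result may be empty.
Range intersect(Range a, Range b) {
    return Range{std::max(a.lo, b.lo), std::min(a.hi, b.hi)};
}

// Range of x on the edge where (x < bound) is true.
Range refineLessThanTrue(Range x, std::int64_t bound) {
    const std::int64_t minVal = std::numeric_limits<std::int64_t>::min();
    if (bound == minVal) return Range{1, 0};  // nothing is below the minimum
    return intersect(x, Range{minVal, bound - 1});
}

// Range of x on the edge where (x < bound) is false, i.e. x >= bound.
Range refineLessThanFalse(Range x, std::int64_t bound) {
    const std::int64_t maxVal = std::numeric_limits<std::int64_t>::max();
    return intersect(x, Range{bound, maxVal});
}
```

Starting from `signedTypeLimits(32)`, `refineLessThanTrue(i, 10)` yields [-2147483648, 9] and `refineLessThanFalse(i, 10)` yields [10, 2147483647], the finite-width counterparts of the [-INF, 9] and [10, +INF] ranges quoted from the proposal. An empty result means the corresponding edge can never be taken.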
> We are splitting the live ranges of variables using single-arity
> phi-functions, which are automatically handled by the SSA elimination
> pass that LLVM already includes. This live range splitting is only
> necessary for greater precision. We can do our live range analysis
> without it, although the results are less precise. A previous Summer
> of Code, authored by Andre Tavares, has shown that the e-SSA form
> increases the number of phi-functions in the program code by less than
> 10%, and it is very fast to build [6].
>
> The second item of our list of hindrances is no longer a problem. Our
> implementation is ready for use. We have been able to analyze the
> whole LLVM test suite, plus SPEC CPU 2006 - over 4 million LLVM
> bytecodes - in 44 seconds on a 2.4GHz machine. We obtain non-trivial
> bit size reductions for the small benchmarks, having results that
> match those found by previous, non-conservative works, such as
> Stephenson et al.'s [2].

What kind of improvements do you see in large benchmarks? If small
benchmarks yield good results and large benchmarks yield poor results,
then your analysis probably needs more work to be useful for real-world
programs.

> Moreover, our implementation is based on a very modern algorithm, by
> Su and Wagner [1], augmented with Gawlitza's technique to handle
> cycles [3]. We believe that this is the fastest implementation of such
> an analysis.
>
> Finally, the third item is also no longer a problem. Presently LLVM
> offers the Lazy Value Info interface that reports when variables are
> constants. The current LVI implementation also provides
> infra-structure to deal with ranges of integer intervals. However, it
> does not contain a fully functional implementation yet, an omission
> that we hope to fix with this project.

Again, you need to be clear about whether any optimizations use this for
anything beyond seeing if a value is constant, and if not, describe what
optimizations you plan to write to fix that.
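
To make the question above concrete: once a range is known for a variable, a client such as a dead-branch eliminator can try to decide a comparison statically, which a constant-only query cannot do. The sketch below is standalone C++ under that assumption; `Interval`, `Fold`, `foldLessThan`, and `foldEqual` are hypothetical names used for illustration and do not correspond to the LVI or JumpThreading interfaces.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical closed interval [lo, hi] known to contain a variable's value.
struct Interval {
    std::int64_t lo;
    std::int64_t hi;
};

// Outcome of trying to decide a comparison from range information alone.
enum class Fold { AlwaysTrue, AlwaysFalse, Unknown };

// Decide (x < c) for every x in the interval, if possible.
Fold foldLessThan(Interval x, std::int64_t c) {
    assert(x.lo <= x.hi);
    if (x.hi < c) return Fold::AlwaysTrue;    // even the largest value is below c
    if (x.lo >= c) return Fold::AlwaysFalse;  // even the smallest value is not below c
    return Fold::Unknown;                     // the interval straddles c
}

// Decide (x == c): true only for a singleton interval equal to c,
// false when c lies outside the interval.
Fold foldEqual(Interval x, std::int64_t c) {
    assert(x.lo <= x.hi);
    if (x.lo == c && x.hi == c) return Fold::AlwaysTrue;
    if (c < x.lo || c > x.hi) return Fold::AlwaysFalse;
    return Fold::Unknown;
}
```

A result of `AlwaysTrue` or `AlwaysFalse` means one successor of the branch is dead. The constant-only case corresponds to a singleton interval (`lo == hi`); the extra cases a range analysis enables are the non-singleton intervals that still lie entirely on one side of the comparison.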
>
> Timeline and Testing Methodology
>
>   1. Change the LVI interface, adding a new method to it:
>      getRange(Value *V), so that we can, not only know if a variable is

No comma after can.

>      a constant, but also know its range of values whenever it is not
>      a constant.
>   2. Change the implementation of the prototype range analysis that
>      LVI already uses, so that it will use our implementation.
>   3. Implement the sparse conditional constant propagation to use
>      LVI. This, in addition to JumpThreading will be another client
>      that will use LVI.
>

Is there an advantage to using your analysis for conditional constant
propagation? I see the value in range analysis, but I don't see what
your algorithm adds above what is already there. Please specify.

>   4. Run experiments on the LLVM test suite to verify that our new
>      optimization improves on the current dead branch elimination
>      pass that LLVM already uses.
>   5. Run experiments using VMKit, to check the impact of dead branch
>      elimination on Java programs. We hope to deliver better
>      performance numbers in this case because Java, as a memory safe
>      language, is notorious for using many checks to ensure that
>      memory is used only in ways that obey the contract of the
>      variable types.
>

You may want to check that VMKit is working with the version of LLVM
that you're using.

>
>
> Biograph

Biography

>
> I am currently a third year Computer Science student at the Federal
> University of Minas Gerais, Brazil, and
> I work as a research assistant at the Programming Languages Lab
> (LLP), in that university.

You might want to state more explicitly whether you're an undergraduate
or graduate student. I'm guessing you're an undergraduate but am not
sure.

>
> I believe I am a good candidate to work in the proposed project
> because I have already a good knowledge of LLVM.
> I have

"on the proposed project" and "because I am already knowledgeable about
LLVM"

> implemented range analysis, and currently it can find the ranges of
> integer variables used in the programs. In addition to this, I have
> been working as a programmer for three years, before joining the LLP
> as a research assistant, and I am experienced

I think you want to say that you worked as a programmer for three years
before joining LLP.

> with C++, the language in which LLVM is implemented. Furthermore, I am
> a student at a very good university (UFMG). To ground this statement I
> would like to point that the Department of Computer Science of UFMG
> got the best mark in the

"To justify" instead of "to ground" and "point out" instead of just
"point"

> Brazilian National Undergrad Exam (ENADE). Finally, I work on a lab in
> which three other students work with LLVM. We had

"work in a lab"

> three Summer of Codes on LLVM in the past, and a number of papers have
> been published out of these experiences.

I think you should just point out the SoC's that you've been involved
in. The ones from your lab mates seem less relevant to me.

All in all, I think you wrote a good proposal. You may want to send a
revised version to the list; others may want to comment on the things
that I think need to be clarified.

BTW, are you looking for a mentor, or has Duncan volunteered for this
year again?

Good luck!

-- John T.

>
> References
>
> 1. A Class of Polynomially Solvable Range Constraints for Interval
> Analysis without Widenings and Narrowings, Zhendong Su and David
> Wagner, In Theoretical Computer Science, Volume 345, Issue 1, 2005.
>
> 2. Bitwidth Analysis with Application to Silicon Compilation, Mark
> Stephenson, Jonathan Babb, and Saman Amarasinghe, In Proceedings of
> the SIGPLAN conference on Programming Language Design and
> Implementation, 2000.
>
> 3.
>    Polynomial Precise Interval Analysis Revisited. Thomas Gawlitza,
> Jérôme Leroux, Jan Reineke, Helmut Seidl, Grégoire Sutre, Reinhard
> Wilhelm: Efficient Algorithms 2009: 422-437
>
> 4. Linear Time Range Analysis with Affine Constraints. Douglas do Couto
> Teixeira and Fernando Magno Quintão Pereira
> (http://homepages.dcc.ufmg.br/~douglas/projects/RangeAnalysis/RangeAnalysis.paper.pdf)
>
> 5. Constant propagation with conditional branches. Mark N. Wegman and
> F. Kenneth Zadeck, In ACM Transactions on Programming Languages and
> Systems (TOPLAS), 181-210, 1991.
>
> 6. Efficient SSI Conversion. André Luiz C. Tavares, Fernando Magno
> Quintão Pereira, Mariza A. S. Bigonha and Roberto S. Bigonha. Simpósio
> Brasileiro de Linguagens de Programação. 2010.
>
> 7. ABCD: eliminating array bounds checks on demand, Rastislav Bodik,
> Rajiv Gupta and Vivek Sarkar, In Proceedings of the SIGPLAN conference
> on Programming Language Design and Implementation, 2000.
>
> Contact Info
>
> Name: Douglas do Couto Teixeira
> e-mail 1: douglas at dcc dot ufmg dot br
> e-mail 2: douglasdocouto at gmail dot com
>
>
>
> --
> Douglas do Couto Teixeira

From etherzhhb at gmail.com Wed Mar 23 20:58:10 2011
From: etherzhhb at gmail.com (ether zhhb)
Date: Thu, 24 Mar 2011 09:58:10 +0800
Subject: [LLVMdev] Contributing to Polly with GSOC 2011
In-Reply-To:
References:
Message-ID:

hi raghesh,

>
> 5. Porting Polly to Various architectures.
> -------------------------------------------------
>
> Currently Polly generates everything as 64 bit integer, which is
> problematic for embedded platforms.
>

you may try something like this:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-January/037277.html

I am already planning to implement this, and it would be great if you
join :)

best regards
ether

From grosser at fim.uni-passau.de Wed Mar 23 22:49:26 2011
From: grosser at fim.uni-passau.de (Tobias Grosser)
Date: Wed, 23 Mar 2011 23:49:26 -0400
Subject: [LLVMdev] Contributing to Polly with GSOC 2011
In-Reply-To:
References:
Message-ID: <4D8ABF46.6070208@fim.uni-passau.de>

On 03/23/2011 09:58 PM, ether zhhb wrote:
> hi raghesh,
>
>
>>
>> 5. Porting Polly to Various architectures.
>> -------------------------------------------------
>>
>> Currently Polly generates everything as 64 bit integer, which is
>> problematic for embedded platforms.
>>
> you may try something like this:
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-January/037277.html
>
> I am already planning to implement this, and it would be great if you
> join :)

Great. How do you plan to derive the minimal size needed for an
induction variable? Do you already have concrete plans?

Cheers
Tobi

From kgondi2 at uic.edu Wed Mar 23 23:15:33 2011
From: kgondi2 at uic.edu (Gondi, Kalpana)
Date: Wed, 23 Mar 2011 23:15:33 -0500
Subject: [LLVMdev] regarding LLVM Pass
Message-ID: <792fc4e850461df5e3d61f3e1ed57604.squirrel@webmail.uic.edu>

Hi All,
I am a newbie to LLVM and I would like to write an LLVM pass where I can
transform C code. Say, I would like to introduce a print statement after
every loop. Could you please provide me any hints as to how I should
proceed to write such a transformation using LLVM?
Also, I would like to analyze C code and transform it. Say, I would like
to use alias analysis and decide to introduce print statements for some
pointer variables inside the code itself. Any suggestion as to where I
should start?

I've been going through documentation like "Writing an LLVM Pass", but I
guess I need much more exposure than just going through such
documentation.

Also, do I need to look at CLANG?
How is it different from writing the LLVM pass to perform the above mentioned tasks? Finally, did anyone compile Linux kernel using LLVM and booted the same? I am facing the error like "unsupported inline asm:...". Please help me with all these. And I appreciate your support and patience. Thanks, GK From chenwj at iis.sinica.edu.tw Wed Mar 23 23:44:59 2011 From: chenwj at iis.sinica.edu.tw (=?utf-8?B?6Zmz6Z+L5Lu7?=) Date: Thu, 24 Mar 2011 12:44:59 +0800 Subject: [LLVMdev] regarding LLVM Pass In-Reply-To: <792fc4e850461df5e3d61f3e1ed57604.squirrel@webmail.uic.edu> References: <792fc4e850461df5e3d61f3e1ed57604.squirrel@webmail.uic.edu> Message-ID: <20110324044458.GA31409@cs.nctu.edu.tw> Hi, Gondi > Finally, did anyone compile Linux kernel using LLVM and booted the same? I > am facing the error like "unsupported inline asm:...". It seems that LLVM does not support all inline assembly. Regards, chenwj -- Wei-Ren Chen (???) Parallel Processing Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667 From criswell at cs.uiuc.edu Wed Mar 23 23:47:41 2011 From: criswell at cs.uiuc.edu (John Criswell) Date: Wed, 23 Mar 2011 23:47:41 -0500 Subject: [LLVMdev] regarding LLVM Pass In-Reply-To: <20110324044458.GA31409@cs.nctu.edu.tw> References: <792fc4e850461df5e3d61f3e1ed57604.squirrel@webmail.uic.edu> <20110324044458.GA31409@cs.nctu.edu.tw> Message-ID: <4D8ACCED.7090606@cs.uiuc.edu> On 3/23/11 11:44 PM, ??? wrote: > Hi, Gondi > >> Finally, did anyone compile Linux kernel using LLVM and booted the same? I >> am facing the error like "unsupported inline asm:...". > It seems that LLVM does not support all inline assembly. LLVM does support inline assembly, although some inline asm constraints may not be supported yet. You might find some documentation on the level of support by searching through the bug database. -- John T. 
> Regards,
> chenwj
>

From eli.friedman at gmail.com Wed Mar 23 23:49:40 2011
From: eli.friedman at gmail.com (Eli Friedman)
Date: Wed, 23 Mar 2011 21:49:40 -0700
Subject: [LLVMdev] regarding LLVM Pass
In-Reply-To: <792fc4e850461df5e3d61f3e1ed57604.squirrel@webmail.uic.edu>
References: <792fc4e850461df5e3d61f3e1ed57604.squirrel@webmail.uic.edu>
Message-ID:

On Wed, Mar 23, 2011 at 9:15 PM, Gondi, Kalpana wrote:
> Hi All,
> I am a newbie to LLVM and I would like to write an LLVM pass where I can
> transform C code. Say, I would like to introduce a print statement after
> every loop. Could you please provide me any hints as how I should proceed
> to write such transformation using LLVM?
> Also, I would like to analyze C Code and transform. Say, I would like to
> use Alias analysis and decide to introduce print statements for some
> pointer variables inside the code itself. Any suggestion as where I should
> start?
>
> I've been going through documentation like, writing an LLVM Pass , but I
> guess, I need much more exposure than just going through such
> documentation.

If you haven't looked at http://llvm.org/docs/tutorial/ , I would
suggest taking a look; it's only partially relevant to what you're
asking, but should give you a better feel for how stuff works.

> Also, do I need to look at CLANG? How is it different from writing the
> LLVM pass to perform the above mentioned tasks?

It's different: clang has a rewriter you can use, for example, to
insert a call after every loop. You end up with a different
definition of "loop", though: at the clang level, you'll see the AST
nodes for "for" and "while" loops; at the IR level, you'll see
constructs which are structurally loops.

> Finally, did anyone compile Linux kernel using LLVM and booted the same? I
> am facing the error like "unsupported inline asm:...".

IIRC, llvm-gcc is affected by some bugs related to inline asm which
will likely never be fixed that affect the Linux kernel (llvm-gcc is
considered deprecated).
See http://llvm.org/bugs/attachment.cgi?id=3486 for building it with
clang; the kernel tends to use all sorts of obscure/nasty gcc flags
and constructs, which makes things tricky.

-Eli

From eli.friedman at gmail.com Wed Mar 23 23:51:15 2011
From: eli.friedman at gmail.com (Eli Friedman)
Date: Wed, 23 Mar 2011 21:51:15 -0700
Subject: [LLVMdev] regarding LLVM Pass
In-Reply-To:
References: <792fc4e850461df5e3d61f3e1ed57604.squirrel@webmail.uic.edu>
Message-ID:

On Wed, Mar 23, 2011 at 9:49 PM, Eli Friedman wrote:
> On Wed, Mar 23, 2011 at 9:15 PM, Gondi, Kalpana wrote:
>> Hi All,
>> I am a newbie to LLVM and I would like to write an LLVM pass where I can
>> transform C code. Say, I would like to introduce a print statement after
>> every loop. Could you please provide me any hints as how I should proceed
>> to write such transformation using LLVM?
>> Also, I would like to analyze C Code and transform. Say, I would like to
>> use Alias analysis and decide to introduce print statements for some
>> pointer variables inside the code itself. Any suggestion as where I should
>> start?
>>
>> I've been going through documentation like, writing an LLVM Pass , but I
>> guess, I need much more exposure than just going through such
>> documentation.
>
> If you haven't looked at http://llvm.org/docs/tutorial/ , I would
> suggest taking a look; it's only partially relevant to what you're
> asking, but should give you a better feel for how stuff works.
>
>> Also, do I need to look at CLANG? How is it different from writing the
>> LLVM pass to perform the above mentioned tasks?
>
> It's different: clang has a rewriter you can use, for example, to
> insert a call after every loop. You end up with a different
> definition of "loop", though: at the clang level, you'll see the AST
> nodes for "for" and "while" loops; at the IR level, you'll see
> constructs which are structurally loops.
>
>> Finally, did anyone compile Linux kernel using LLVM and booted the same?
>> I am facing the error like "unsupported inline asm:...".
>
> IIRC, llvm-gcc is affected by some bugs related to inline asm which
> will likely never be fixed that affect the Linux kernel (llvm-gcc is
> considered deprecated). See
> http://llvm.org/bugs/attachment.cgi?id=3486 for building it with
> clang; the kernel tends to use all sorts of obscure/nasty gcc flags
> and constructs, which makes things tricky.

Err, make that http://llvm.org/bugs/show_bug.cgi?id=4068 ;
accidentally copy-pasted the wrong link.

-Eli

From criswell at cs.uiuc.edu Wed Mar 23 23:58:31 2011
From: criswell at cs.uiuc.edu (John Criswell)
Date: Wed, 23 Mar 2011 23:58:31 -0500
Subject: [LLVMdev] regarding LLVM Pass
In-Reply-To: <792fc4e850461df5e3d61f3e1ed57604.squirrel@webmail.uic.edu>
References: <792fc4e850461df5e3d61f3e1ed57604.squirrel@webmail.uic.edu>
Message-ID: <4D8ACF77.4080206@cs.uiuc.edu>

On 3/23/11 11:15 PM, Gondi, Kalpana wrote:
> Hi All,
> I am a newbie to LLVM and I would like to write an LLVM pass where I can
> transform C code. Say, I would like to introduce a print statement after
> every loop. Could you please provide me any hints as how I should proceed
> to write such transformation using LLVM?

You essentially want to create a call instruction to printf. Look for
the doxygen documentation on the llvm.org web site and look for the
llvm::CallInst class. The Create() method of CallInst is what you want
to use.

> Also, I would like to analyze C Code and transform. Say, I would like to
> use Alias analysis and decide to introduce print statements for some
> pointer variables inside the code itself. Any suggestion as where I should
> start?

Try to find examples that use the AliasAnalysis interface. The
interface itself should be defined in a header file; it should be
pretty easy to use, although the underlying implementation is still
pretty simple, as far as I know.
> I've been going through documentation like, writing an LLVM Pass , but I > guess, I need much more exposure than just going through such > documentation. > > Also, do I need to look at CLANG? How is it different from writing the > LLVM pass to perform the above mentioned tasks? Clang is useful for working with source-level ASTs. > Finally, did anyone compile Linux kernel using LLVM and booted the same? I > am facing the error like "unsupported inline asm:...". Yes and no, depending on one's perspective. I ported Linux 2.4 to a virtual architecture, meaning that I ripped out all the inline asm code and replaced it with calls to my VM which implemented it own assembly code. The C code parts of the kernel were compiled with LLVM. I think other people have compiled Linux 2.6 out-of-the-box (or pretty close to it) with newer versions of LLVM. I'll let them comment. -- John T. > Please help me with all these. And I appreciate your support and patience. > > Thanks, > GK > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From chenwj at iis.sinica.edu.tw Thu Mar 24 00:57:38 2011 From: chenwj at iis.sinica.edu.tw (=?utf-8?B?6Zmz6Z+L5Lu7?=) Date: Thu, 24 Mar 2011 13:57:38 +0800 Subject: [LLVMdev] llvm-gcc handles inline assembly incorrectly on PowerPC Message-ID: <20110324055738.GA32997@cs.nctu.edu.tw> Hi, folks I don't know if this is a bug or not? But the way how llvm-gcc on ppc handles inline assembly is different from the one on x86. 
For example, here is the example code:

-----------------------------------------
register int *a asm("r10");

int main() {
  *a = 1;
  return 0;
}
-----------------------------------------

llvm-gcc on x86 produces the LLVM IR below,

-----------------------------------------
define i32 @main() nounwind {
entry:
  %0 = tail call i32* asm "", "={r10}"() nounwind ; [#uses=1]
  store i32 1, i32* %0, align 4
  ret i32 undef
}
-----------------------------------------

But on ppc, llvm-gcc misses "r" in "r10",

  %0 = tail call i32* asm "", "={10}"() nounwind ; [#uses=1]
                                ^^^^ missing "r"

Any idea? Thanks.

Regards,
chenwj

--
Wei-Ren Chen (陳韋任)
Parallel Processing Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667

From chenwj at iis.sinica.edu.tw Thu Mar 24 01:21:34 2011
From: chenwj at iis.sinica.edu.tw (陳韋任)
Date: Thu, 24 Mar 2011 14:21:34 +0800
Subject: [LLVMdev] Make PPC JIT support inline assembly?
Message-ID: <20110324062134.GC29511@cs.nctu.edu.tw>

Hi, all

It seems PPC JIT does not recognize inline assembly.
For example, when I give the LLVM IR below to PPC JIT,

  %0 = tail call i32* asm "", "={r10}"() nounwind ; [#uses=1]

it complains that inline assembly is not a supported
instruction. x86 JIT works fine, however.

Is there a reason that makes PPC JIT not support inline
assembly? Currently, we modify PPCGenCodeEmitter.inc, then
rebuild LLVM to let ppc recognize inline assembly (see attachment).
Is there a better way to make PPC JIT support inline assembly?

Any suggestion appreciated.

Regards,
chenwj

--
Wei-Ren Chen (陳韋任)
Parallel Processing Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PPCGenCodeEmitter.inc.patch
Type: text/x-diff
Size: 348 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110324/77b505e8/attachment.bin

From baldrick at free.fr Thu Mar 24 01:53:15 2011
From: baldrick at free.fr (Duncan Sands)
Date: Thu, 24 Mar 2011 07:53:15 +0100
Subject: [LLVMdev] Make PPC JIT support inline assembly?
In-Reply-To: <20110324062134.GC29511@cs.nctu.edu.tw>
References: <20110324062134.GC29511@cs.nctu.edu.tw>
Message-ID: <4D8AEA5B.7030209@free.fr>

Hi 陳韋任,

> It seems PPC JIT does not recognize inline assembly.
> For example, when I give the LLVM IR below to PPC JIT,
>
>   %0 = tail call i32* asm "", "={r10}"() nounwind ; [#uses=1]
>
> it complains that inline assembly is not a supported
> instruction. x86 JIT works fine, however.

I'm surprised this worked with the x86 JIT - I thought the JIT didn't
support any inline assembler on any platform, and that the plan was to
solve this with the new MC-JIT, see
http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html

Ciao, Duncan.

From tera219 at gmail.com Thu Mar 24 02:21:38 2011
From: tera219 at gmail.com (Ding-Yong Hong)
Date: Thu, 24 Mar 2011 15:21:38 +0800
Subject: [LLVMdev] Make PPC JIT support inline assembly?
In-Reply-To: <4D8AEA5B.7030209@free.fr>
References: <20110324062134.GC29511@cs.nctu.edu.tw> <4D8AEA5B.7030209@free.fr>
Message-ID:

Hi,

Very few inline assembly constructs are supported by the X86 backend.
As far as I can see from X86ISelLowering.cpp, the LLVM JIT recognizes
only bswap, rorw, xchgl and simple register selections (e.g. ={r10}).
But for the PPC backend, I am not sure why the PPC JIT treats all
inline assembly IR as an error.

Ding-Yong

On Thu, Mar 24, 2011 at 2:53 PM, Duncan Sands wrote:

> Hi 陳韋任,
>
> > It seems PPC JIT does not recognize inline assembly.
> > For example, when I give the LLVM IR below to PPC JIT,
> >
> >   %0 = tail call i32* asm "", "={r10}"() nounwind ; [#uses=1]
> >
> > it complains that inline assembly is not a supported
> > instruction.
x86 JIT works fine, however. > > I'm surprised this worked with the x86 JIT - I thought the JIT didn't > support any inline assembler on any platform, and that the plan was to > solve this with the new MC-JIT, see > http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html > > Ciao, Duncan. > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20110324/0c7ab201/attachment.html From chenwj at iis.sinica.edu.tw Thu Mar 24 02:35:40 2011 From: chenwj at iis.sinica.edu.tw (=?utf-8?B?6Zmz6Z+L5Lu7?=) Date: Thu, 24 Mar 2011 15:35:40 +0800 Subject: [LLVMdev] Make PPC JIT support inline assembly? In-Reply-To: <4D8AEA5B.7030209@free.fr> References: <20110324062134.GC29511@cs.nctu.edu.tw> <4D8AEA5B.7030209@free.fr> Message-ID: <20110324073540.GA35595@cs.nctu.edu.tw> Hi, Duncan > support any inline assembler on any platform, and that the plan was to > solve this with the new MC-JIT, see > http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html At the first glance, I think what llvm-mc does is, given an input, llvm-mc will disassemble the input into assembly. I don't know the MC-JIT you mentioned can be used as a JIT. Currently, a JIT is created by ExecutionEngine::createJIT. Can you give more information about the MC-JIT? Is it already included in llvm-2.8? Thank! Regards, chenwj -- Wei-Ren Chen (???) Parallel Processing Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667 From baldrick at free.fr Thu Mar 24 03:14:08 2011 From: baldrick at free.fr (Duncan Sands) Date: Thu, 24 Mar 2011 09:14:08 +0100 Subject: [LLVMdev] Make PPC JIT support inline assembly? 
In-Reply-To: <20110324073540.GA35595@cs.nctu.edu.tw> References: <20110324062134.GC29511@cs.nctu.edu.tw> <4D8AEA5B.7030209@free.fr> <20110324073540.GA35595@cs.nctu.edu.tw> Message-ID: <4D8AFD50.1060300@free.fr> Hi chenwj, >> support any inline assembler on any platform, and that the plan was to >> solve this with the new MC-JIT, see >> http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html > > At the first glance, I think what llvm-mc does is, given an input, > llvm-mc will disassemble the input into assembly. I don't know the > MC-JIT you mentioned can be used as a JIT. Currently, a JIT is > created by ExecutionEngine::createJIT. > > Can you give more information about the MC-JIT? Is it already > included in llvm-2.8? it is not in llvm-2.8, it will not even be in llvm-2.9. This is why you couldn't find it :) Some patches went into the subversion repository lately if you are interested. Ciao, Duncan. From baldrick at free.fr Thu Mar 24 03:16:54 2011 From: baldrick at free.fr (Duncan Sands) Date: Thu, 24 Mar 2011 09:16:54 +0100 Subject: [LLVMdev] Make PPC JIT support inline assembly? In-Reply-To: References: <20110324062134.GC29511@cs.nctu.edu.tw> <4D8AEA5B.7030209@free.fr> Message-ID: <4D8AFDF6.6090404@free.fr> Hi Ding-Yong, > Very few inline assembly are supported for the X86 backend. As I see from > X86ISelLowering.cpp, only bswap, rorw, xchgl and simple register selections > (e.g. {=r10}) LLVM JIT can recoginze. But for PPC backend, I am not > sure why PPC JIT see all inline assembly IRs as an error. probably because no-one was interested enough to add the analogous logic to the PPC backend (the PowerPC target does not implement ExpandInlineAsm). Ciao, Duncan. > Ding-Yong > On Thu, Mar 24, 2011 at 2:53 PM, Duncan Sands > wrote: > > Hi ???, > > > It seems PPC JIT does not recognize inline assembly. 
> > For example, when I give LLVM IR belows to PPC JIT, > > > > %0 = tail call i32* asm "", "={r10}"() nounwind ; [#uses=1] > > > > it complaints that inline assembly is not a supported > > instruction. x86 JIT works fine, however. > > I'm surprised this worked with the x86 JIT - I thought the JIT didn't > support any inline assembler on any platform, and that the plan was to > solve this with the new MC-JIT, see > http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html > > Ciao, Duncan. > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > From chenwj at iis.sinica.edu.tw Thu Mar 24 03:42:13 2011 From: chenwj at iis.sinica.edu.tw (=?utf-8?B?6Zmz6Z+L5Lu7?=) Date: Thu, 24 Mar 2011 16:42:13 +0800 Subject: [LLVMdev] Make PPC JIT support inline assembly? In-Reply-To: <4D8AFD50.1060300@free.fr> References: <20110324062134.GC29511@cs.nctu.edu.tw> <4D8AEA5B.7030209@free.fr> <20110324073540.GA35595@cs.nctu.edu.tw> <4D8AFD50.1060300@free.fr> Message-ID: <20110324084213.GB36929@cs.nctu.edu.tw> Hi, Duncan > it is not in llvm-2.8, it will not even be in llvm-2.9. This is why you > couldn't find it :) Some patches went into the subversion repository > lately if you are interested. You mentioned "the plan was to solve this with the new MC-JIT". You mean that MC-JIT can handle inline assembly as an input, and generate target binary code? Regards, chenwj -- Wei-Ren Chen (???) Parallel Processing Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667 From baldrick at free.fr Thu Mar 24 04:14:57 2011 From: baldrick at free.fr (Duncan Sands) Date: Thu, 24 Mar 2011 10:14:57 +0100 Subject: [LLVMdev] Make PPC JIT support inline assembly? 
In-Reply-To: <20110324084213.GB36929@cs.nctu.edu.tw> References: <20110324062134.GC29511@cs.nctu.edu.tw> <4D8AEA5B.7030209@free.fr> <20110324073540.GA35595@cs.nctu.edu.tw> <4D8AFD50.1060300@free.fr> <20110324084213.GB36929@cs.nctu.edu.tw> Message-ID: <4D8B0B91.7080500@free.fr> Hi chenwj, >> it is not in llvm-2.8, it will not even be in llvm-2.9. This is why you >> couldn't find it :) Some patches went into the subversion repository >> lately if you are interested. > > You mentioned "the plan was to solve this with the new MC-JIT". You > mean that MC-JIT can handle inline assembly as an input, and generate > target binary code? MC allows you to turn LLVM IR into object code or assembler. It also enables you to turn assembler into object code. Consider now how a JIT works. When a function is to be run, the LLVM IR for it needs to be turned into object code in memory; the object code will then be executed by the processor. I hope it is clear that MC could be used to generate the object code in memory, rather than the hand-crafted assembly snippets currently used. Now consider what is required to have the JIT execute inline assembler. Essentially an inline asm is a string containing assembly code. In order for the JIT to execute it, it needs to convert that assembly code into object code in memory. Thus the JIT needs to have a built-in assembler. Currently the JIT does not have a built-in assembler (though in the X86 case it does have an assembler that can only handle a few special cases). Since MC can also act as an assembler, it could be used to turn inline asm into object code. Ciao, Duncan. From chenwj at iis.sinica.edu.tw Thu Mar 24 04:23:51 2011 From: chenwj at iis.sinica.edu.tw (陳韋任) Date: Thu, 24 Mar 2011 17:23:51 +0800 Subject: [LLVMdev] Make PPC JIT support inline assembly?
In-Reply-To: <4D8B0B91.7080500@free.fr> References: <20110324062134.GC29511@cs.nctu.edu.tw> <4D8AEA5B.7030209@free.fr> <20110324073540.GA35595@cs.nctu.edu.tw> <4D8AFD50.1060300@free.fr> <20110324084213.GB36929@cs.nctu.edu.tw> <4D8B0B91.7080500@free.fr> Message-ID: <20110324092351.GA37956@cs.nctu.edu.tw> Hi, I see. Thanks. :-) One more question, how mature is MC-JIT? Can it be used on x86 or other architectures? Regards, chenwj -- Wei-Ren Chen (陳韋任) Parallel Processing Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667 From baldrick at free.fr Thu Mar 24 04:27:59 2011 From: baldrick at free.fr (Duncan Sands) Date: Thu, 24 Mar 2011 10:27:59 +0100 Subject: [LLVMdev] Make PPC JIT support inline assembly? In-Reply-To: <20110324092351.GA37956@cs.nctu.edu.tw> References: <20110324062134.GC29511@cs.nctu.edu.tw> <4D8AEA5B.7030209@free.fr> <20110324073540.GA35595@cs.nctu.edu.tw> <4D8AFD50.1060300@free.fr> <20110324084213.GB36929@cs.nctu.edu.tw> <4D8B0B91.7080500@free.fr> <20110324092351.GA37956@cs.nctu.edu.tw> Message-ID: <4D8B0E9F.2050709@free.fr> Hi chenwj, > One more question, how mature is MC-JIT? Can it be used on > x86 or other architectures? my understanding is that it is not yet usable. Ciao, Duncan. From david.lightstone at prodigy.net Thu Mar 24 06:21:41 2011 From: david.lightstone at prodigy.net (David Lightstone) Date: Thu, 24 Mar 2011 07:21:41 -0400 Subject: [LLVMdev] Reversing a function's CFG? Message-ID: <006f01cbea15$a75ddb70$f6199250$@prodigy.net> This is in reply to the posting below. I am not a compiler writer type, so I am probably in over my head a bit. Several years ago I was a bit interested in something called code slicers. An example of one (probably no longer supported) is UNRAVEL (http://www.itl.nist.gov/div897/sqg/unravel/unravel.html). Their basic idea is to identify the algorithm which serves to determine the value of a variable at a specific location in the code.
That is, they seek to determine all the paths which influence a computation result. They do not make any assumptions about the possible paths which lead to the specific location (ergo many such paths). They accomplish this by walking the computation backwards and pruning out stuff which is just not relevant (you unfortunately cannot prune out the stuff which is not relevant). You appear to have the very same problem (except the Boolean evaluation which serves to indicate a need for backtracking), with a minor twist. You have instrumentation which allows you to determine the path. The strategy which they use (and the problems which they experience) are likely to be the same problems as you will experience. This is what I see: (1) There will be functions which cannot be inverted. Identifying them is the principal task. They are the locations where, in addition to control information, data checks are necessary. (2) In the instrumented code you know the locations where the results have to be reversed. You can infer the location in the actual code. Construct the paths based not on the instrumented code, but on the actual code. (3) There are probably only a finite number of paths, each identified by a different instrumentation configuration (i.e. a state variable). You reverse the erroneous result by running a pre-compiled procedure generated from the reverse path. (4) Associate the reversing algorithm with the state and execute when appropriate. Dave Lightstone //////////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////// / Date: Wed, 23 Mar 2011 13:52:40 -0400 From: "Justin M. LaPre" Subject: Re: [LLVMdev] Reversing a function's CFG?
To: joshuawarner32 at gmail.com Cc: llvmdev at cs.uiuc.edu Message-ID: <5B806BA1-94DB-4EF0-B526-511F6EDFB28B at gmail.com> Content-Type: text/plain; charset="us-ascii" On Mar 23, 2011, at 12:24 PM, Joshua Warner wrote: > Hi Justin, > > I take the fact that nobody has replied as a sign that nobody really understands what you are asking. Most likely. I was trying not to write a ten page e-mail to the list but quite possibly simplified too much. I'll try to elaborate a little more: In large-scale parallel simulations, sometimes you process an event speculatively. If it turns out you should not have processed that event (and such situations can be detected by our system), you need to undo all of the changes you made in your event handler. Most approaches use some form of state-saving. We opt for an approach coined by my advisor (Chris Carothers, actually) called "reverse computation." The reverse function is basically the inverse of the forward event handler e.g. if in the forward event handler you increment a variable, the reverse handler would need to decrement it. That's an extremely simple case. A more complicated case: if you have an "if" in your forward handler, you must augment it with a bitfield to remember which path you took so the reverse handler can find its way back. So we're basically saving our control state as opposed to our data state as the control state is often significantly smaller. I like to think of it as a trail of breadcrumbs. Using different approaches we can handle loops, ifs, switches, etc. by following our "breadcrumbs" back. > I was wondering if there was a quick way to reverse a function's CFG > and, in turn, all basic blocks within it. Assuming all variables are > globals, is there a quick way to generate a function's reversal? I > highly doubt such functionality exists but I figured it was worth > asking.
I'm trying to develop an "undo function" generator pass that > would be able to restore system state (without state-saving) after > determining an error occurred. This is documented in, "Efficient > Optimistic Parallel Simulations using Reverse Computation" by > Carothers et al.[1] > > > I'm unaware of an easy way to "reverse the CFG" - but even if there was, I don't think that would solve your problem. First, you will only be able to generate "undo" functions for a small subset of all possible functions - specifically, invertible functions. Second, my intuition is that computing an inverse function will be significantly more involved than just reversing the CFG. Reversing the CFG is just a piece of the puzzle. To generate our reverse handler, I thought a good place to start would be the reverse CFG. Specifically, I need to: 1. instrument the forward handler control path 2. emit the reverse handler given the forward handler (which I believe can be achieved by the following) 2a. create the reverse CFG 2b. for each basic block in (2a), invert the instructions Unfortunately, not all functions are invertible as you said. In those cases, we may opt to fall back on state-saving or use some heuristics to try and bypass state-saving (memory operations can be expensive on some of the machines we run simulations on). Assuming the above steps go off without a hitch (which is a big assumption), we should have our reverse event handler generated for us. I do have some questions: is there any way to differentiate IR I inserted while instrumenting the forward path from IR that was there before? I don't want to invert my instrumentation instructions. Also, due to this problem, I've been attempting to write a Module pass as opposed to a Function pass because I couldn't differentiate between the two. > Could you be a little more specific about the problem you are dealing with? How do Carothers et al. do the inversion? Carothers did it all by hand! 
:) In the paper he talks about automating it. As I have taken a few compiler courses, apparently I'm the compiler guy in our HPC group. Anyway, thanks for the response. If I was unclear in any way, please ask questions! -Justin From douglasdocouto at gmail.com Thu Mar 24 06:39:38 2011 From: douglasdocouto at gmail.com (Douglas do Couto Teixeira) Date: Thu, 24 Mar 2011 08:39:38 -0300 Subject: [LLVMdev] Range Analysis GSoC 2011 Proposal In-Reply-To: <4D8A733B.9030402@illinois.edu> References: <4D8A733B.9030402@illinois.edu> Message-ID: On Wed, Mar 23, 2011 at 7:24 PM, John Criswell wrote: > Dear Douglas, > > Comments below. > > > Dear John, I'm planning to send a revised version of the proposal to the list today, but before that let me try to answer some of your questions. > On 3/23/11 8:06 AM, Douglas do Couto Teixeira wrote: > > Dear LLVM community, > > I would like to contribute to LLVM in the Google Summer of Code project. My > proposal is listed below. Please let me know your comments. > > > Adding Range Analysis to LLVM > > Abstract > > The objective of this work is to patch our implementation of range analysis > into LLVM. I have a running implementation of range > > > Say "working implementation" instead of "running implementation." > > > analysis in LLVM, but it is not currently part of the main distribution. I > propose to integrate our range analysis implementation into the Lazy Value > Info (LVI) interface that LLVM provides. Range analysis finds the intervals > of values that may be bound to the integer variables during the execution of > programs and is useful in several scenarios: constant propagation, detection > of potential buffer overflow attacks, dead branch elimination, array bound > check elimination, elimination of overflow tests in scripting languages such > as JavaScript and Lua, etc. > > > Objective > > The objective of this project is to augment LLVM with range analysis.
We > will do this integration by patching the current implementation of range > analysis that we have onto the Lazy Value Info (LVI) interface that LLVM > already provides. In addition, we will develop new optimizations using LVI. > In particular, we will provide a pass that performs conditional constant > propagation [5], and elimination of dead-branches. > > Criteria of Success > > - To improve substantially the precision of the current implementation > of LVI. Currently, LVI's interface only allows a client to know if a > variable contains a constant. We want to allow LVI to report that a variable > either contains a constant, or a known-range. > - To improve the current implementation of constant propagation that > LLVM uses. We hope to obtain a small performance gain on the C benchmarks in > the LLVM test suite, and a larger gain on Java programs that are compiled > using VMKit. > - To improve the implementation of JumpThreading, in such a way that > more dead-branches will be eliminated. Again, we hope to achieve a small > speed-up on the C benchmarks, and a larger speed-up on the Java benchmarks. > > > > Background > > Range Analysis is a technique that maps integer variables to the possible > ranges of values that they may assume through out > > > throughout is a single word in this instance. > > > the execution of a program. Thus, for each integer variable, a range > analysis determines its lower and upper limits. A very simple range analysis > would, for instance, map each variable to the limits imposed by its type. > That is, an 8-bit unsigned integer variable can be correctly mapped to the > interval [0, 255], and a 16-bit signed integer can be mapped to [-32768, > 32767]. > > > You probably want to be consistent in how numbers are interpreted. Use the > unsigned or signed interpretations in both examples above. > > > However, the precision of this analysis can greatly be improved from > information inferred from the program text.
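As an editorial illustration of the paragraph above, here is a minimal sketch of the two ideas it describes: the trivial analysis that maps a variable to the limits of its type, and a refinement that narrows that interval using a comparison against a constant. The names `Interval`, `type_range`, `refine_less_than` and `refine_not_less_than` are hypothetical and are not part of LLVM or of the proposed implementation; two's-complement integers are assumed.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Closed integer interval [lo, hi]; lo > hi denotes an empty interval."""
    lo: int
    hi: int


def type_range(bits: int, signed: bool) -> Interval:
    """Widest interval a two's-complement integer of this width can hold."""
    if bits <= 0:
        raise ValueError("bit width must be positive")
    if signed:
        return Interval(-(1 << (bits - 1)), (1 << (bits - 1)) - 1)
    return Interval(0, (1 << bits) - 1)


def refine_less_than(r: Interval, c: int) -> Interval:
    """Interval of x where the test `x < c` holds, given x was in r."""
    return Interval(r.lo, min(r.hi, c - 1))


def refine_not_less_than(r: Interval, c: int) -> Interval:
    """Interval of x where the test `x < c` fails (x >= c), given x was in r."""
    return Interval(max(r.lo, c), r.hi)


# Type-based ranges: the least precise, always-safe answer.
print(type_range(8, signed=False))
print(type_range(16, signed=True))

# Narrowing a 32-bit signed variable with the test `x < 10`.
x = type_range(32, signed=True)
print(refine_less_than(x, 10))
print(refine_not_less_than(x, 10))
```

The refinement only ever shrinks the interval it is given, which is what keeps such an analysis conservative: without information from the program text, the type-based range is the fallback.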
> > Ideally this range should be as constrained as possible, so that an > optimizing compiler could learn more information about each variable. > However, the range analysis must be conservative, that is, it will only > constraint the range of a variable if it can > > > "constrain the range" > > > prove that it is safe to do so. As an example, consider the program: > > i = read(); > if (i < 10) { > print(i + 1); > } else { > print(i - 1); > } > > In this program we know, from the conditional test, that the value of 'i' > in the true side of the branch is in the range [-INF, 9], and in the false > side is in the range [10, +INF]. > > During the Summer of Code 2010 I have designed and implemented, under the > orientation of Duncan Sands, a non-iterative > > > Remove the word "have" in the sentence above. > > > range analysis algorithm. Our implementation is currently fully functional, > being able to analyze the whole LLVM test suite. For more details, see [4]. > However, this implementation has never been integrated into the LLVM main > trunk, for three reasons: > > 1. We use an intermediate representation called extended static single > assignment form [6], which the LLVM contributors were reluctant to use; > 2. During the SoC 2010 we did not have time to completely finish our > implementation, and runtime numbers were available only by the end of 2010. > 3. There was not really an infrastructure already in place in LLVM to > take advantage of our analysis. > > > I think your text is a little unclear about point #3. From your text > below, it sounds like LLVM lacks optimizations that utilize value range > analysis. If that is the case, then you should state below (like you do in > the beginning of your proposal) what optimizations you'll write. > > I meant: There was not really an infrastructure already in place in LLVM to take advantage of our analysis. There were no clients for this pass, and no common interface that those clients could use.
Now, LLVM provides the LVI interface, and there are some clients that use it: JumpThreading, etc. > As an aside, we'd be interested in trying out value-range analysis in > SAFECode (http://safecode.cs.illinois.edu) for use in static array bounds > checking. Creating a static array bounds checking pass for SAFECode using > your analysis would probably be trivial. The only difficulty is that > SAFECode is currently built using LLVM 2.7, and I'm guessing that you're > using a newer version of LLVM. > > > No, I'm using LLVM 2.7 too. That's because the pass that produces e-SSA form was not ported to the newer LLVM versions. > > > > > > In order to address the first item, we propose to integrate the > intermediate representation directly into our analysis, yet, as a module > that can be used separately by other clients, if necessary. > > > This part isn't completely clear to me. It sounds like what you're > suggesting is to write one analysis pass that internally constructs e-SSA > form and a second analysis that actually does the value-range analysis. It > also sounds like you're writing the value-range analysis so that it can be > used with and without e-SSA form. Is this correct, or have I misinterpreted > what you're saying? > > Either way, I think the text above and the paragraph below about e-SSA form > could be made a little more clear. > > > Yes, we use e-SSA form to gain precision, but it is not a requirement. Compare, for instance, Figures 1, 4 and 5 in our report (http://homepages.dcc.ufmg.br/~douglas/projects/RangeAnalysis/RangeAnalysis.paper.pdf). e-SSA greatly increases the precision of the analysis, but we can still work without it. > We are splitting the live ranges of variables using single-arity > phi-functions, which are automatically handled by the SSA elimination pass > that LLVM already includes. This live range splitting is only necessary for > greater precision.
We can do our live range analysis without it, although > the results are less precise. A previous Summer of Code project, by Andre > Tavares, has shown that the e-SSA form increases the number of phi-functions > in the program code by less than 10%, and it is very fast to build [6]. > > The second item of our list of hindrances is no longer a problem. Our > implementation is ready for use. We have been able to analyze the whole LLVM > test suite, plus SPEC CPU 2006 - over 4 million LLVM bytecodes - in 44 > seconds on a 2.4GHz machine. We obtain non-trivial bit size reductions for > the small benchmarks, having results that match those found by previous, > non-conservative works, such as Stephenson et al.'s [2]. > > > What kind of improvements do you see in large benchmarks? If small > benchmarks yield good results and large benchmarks yield poor results, then > your analysis probably needs more work to be useful for real-world programs. > > > > This happens because the analysis is intra-procedural. I guess the other optimizations that LLVM uses suffer from this limitation too. For SPEC CPU 2006 we have been able to reduce the bitwidth of the variables by 8%. This would happen with any type of range analysis algorithm. Every time we have a function, like: foo(int n) {...} we must assume that the range of n is [-inf, +inf]. One of my lab mates is working on an inter-procedural version of the analysis. His project is at a very early stage though. > Moreover, our implementation is based on a very modern algorithm, by Su > and Wagner [1], augmented with Gawlitza's technique to handle cycles [3]. We > believe that this is the fastest implementation of such an analysis. > > Finally, the third item is also no longer a problem. Presently LLVM offers > the Lazy Value Info interface that reports when variables are constants. The > current LVI implementation also provides infrastructure to deal with ranges > of integer intervals.
However, it does not contain a fully functional > implementation yet, an omission that we hope to fix with this project. > > > Again, you need to be clear about whether any optimizations use this for > anything beyond seeing if a value is constant, and if not, describe what > optimizations you plan to write to fix that. > > > I think the big client would be dead-code elimination: if (x > 10) {...} If we know that x is in [0, 10], for instance, then this branch is dead. This kind of optimization is stronger than Zadeck's conditional constant propagation, and I believe that it would be good for array-bounds check elimination in memory safe languages such as Java. > > Timeline and Testing Methodology > > 1. Change the LVI interface, adding a new method to it: getRange(Value > *V), so that we can, not only know if a variable is > > > No comma after can. > > > > 1. a constant, but also know its range of values whenever it is not a > constant. > 2. Change the implementation of the prototype range analysis that LVI > already uses, so that it will use our implementation. > 3. Implement the sparse conditional constant propagation to use LVI. > This, in addition to JumpThreading will be another client that will use LVI. > > > Is there an advantage to using your analysis for conditional constant > propagation? I see the value in range analysis, but I don't see what your > algorithm adds above what is already there. Please specify. > > > Yes! I think I was not clear here. The main contribution, again, would be being able to implement Zadeck's optimization with ranges, instead of simple constants. So, instead of only being able to eliminate: if (x != 10) {...} when we know that x is, say, 11, we could also eliminate branches that use other relational operators, e.g., <, <=, >, >=, !=. > > 1. Run experiments on the LLVM test suite to verify that our new > optimization improves on the current dead branch elimination pass that LLVM > already uses. > 2.
Run experiments using VMKit, to check the impact of dead branch > elimination on Java programs. We hope to deliver better performance numbers > in this case because Java, as a memory safe language, is notorious for using > many checks to ensure that memory is used only in ways that obey the > contract of the variable types. > > > You may want to check that VMKit is working with the version of LLVM that > you're using. > > > > Biograph > > > Biography > > > > I am currently a third year Computer Science student at the Federal > University of Minas Gerais, Brazil, and I > work as a research assistant at the Programming Languages Lab (LLP), in that university. > > > You might want to state more explicitly whether you're an undergraduate or > graduate student. I'm guessing you're an undergraduate but am not sure. > > > I'm an undergraduate student. > > I believe I am a good candidate to work in the proposed project because I > have already a good knowledge of LLVM. I have > > > "on the proposed project" and "because I am already knowledgeable about > LLVM" > > > > implemented range analysis, and currently it can find the ranges of integer > variables used in the programs. In addition to this, I have been working as > a programmer for three years, before joining the LLP as a research > assistant, and I am experienced > > > I think you want to say that you worked as a programmer for three years > before joining LLP. > > > with C++, the language in which LLVM is implemented. Furthermore, I am a > student at a very good university (UFMG). To ground this statement I would > like to point that the Department of Computer Science of UFMG got the best > mark in the > > > "To justify" instead of "to ground" and "point out" instead of just "point" > > > Brazilian National Undergrad Exam (ENADE). Finally, I work on a lab in > which three other students work with LLVM. We had
We had > > > work in a lab > > > three Summer of Codes on LLVM in the past, and a number of papers have been > published out of these experiences. > > > I think you should just point out the SoC's that you've been involved in. > The ones from your lab mates seems less relevant to me. > > All in all, I think your wrote a good proposal. You may want to send a > revised version to the list; others may want to comment on the things that I > think need to be clarified. > > BTW, are you looking for a mentor, or has Duncan volunteered for this year > again? > > Well, I made contact with Duncan some weeks ago but he doesn't officially volunteered to be my mentor this year. So, yes, I'm looking for a mentor. With best wishes, Douglas Good luck! > > -- John T. > > > > References > > 1. A Class of Polynomially Solvable Range Constraints for Interval > Analysis without Widenings and Narrowings, Zhendong Su and David Wagner, > In Theoretical Computer Science, Volume 345 , Issue 1, 2005. > > 2. Bitwidth Analysis with Application to Silicon Compilation, Mark > Stephenson, Jonathan Babb, and Saman Amarasinghe, In Proceedings of the >