From eli.friedman at gmail.com Fri Jan 1 01:59:58 2010
From: eli.friedman at gmail.com (Eli Friedman)
Date: Thu, 31 Dec 2009 23:59:58 -0800
Subject: [LLVMdev] How does JIT/lli work with bc file?
In-Reply-To: <3048b3a50912312103n2c611e8ax49b8c40ef4151755@mail.gmail.com>
References: <3048b3a50912301953q16517e2w78a6cf9c1914c92a@mail.gmail.com> <3048b3a50912310226lba05266tbbe51b4eed293b2@mail.gmail.com> <3048b3a50912312103n2c611e8ax49b8c40ef4151755@mail.gmail.com>
Message-ID:

On Thu, Dec 31, 2009 at 9:03 PM, Heming Cui wrote:
> Hi Eli,
>     I think the llvm configure has already configured with ffi.
>
> ../llvm-2.6/configure -help | grep ffi
>   --enable-libffi         Check for the presence of libffi (default is YES)

Umm, that just means it checks; if you don't have the headers installed, it doesn't use it.

>     In addition, the printf() can work in program, reflecting that libffi is
> working, right?
>     How can I make Interpreter work with getpid(), fork(), and clone()?

I believe printf() is special-cased; I forget exactly where.

-Eli

From wendling at apple.com Fri Jan 1 06:18:34 2010
From: wendling at apple.com (Bill Wendling)
Date: Fri, 1 Jan 2010 04:18:34 -0800
Subject: [LLVMdev] Void vs int
In-Reply-To: <200912310737.08031.jon@ffconsultancy.com>
References: <200912310737.08031.jon@ffconsultancy.com>
Message-ID: <20A95D7D-CB74-413D-9CBF-7092F65B3E0E@apple.com>

On Dec 30, 2009, at 11:37 PM, Jon Harrop wrote:

> Is it more efficient to return void rather than the int 0, e.g. does it reduce
> register pressure?
>

If you don't use the value after the call, then it will produce the same LLVM code for both functions from llvm-gcc and clang:

$ cat t.c
int foo();
void bar() {
  foo();
}

void baz();
void qux() {
  baz();
}

$ llvm-gcc -o - -S t.c -mllvm -disable-llvm-optzns -emit-llvm
define void @bar() nounwind ssp {
entry:
  %0 = call i32 (...)* @foo() nounwind            ; [#uses=0]
  br label %return

return:                                           ; preds = %entry
  ret void
}

declare i32 @foo(...)

define void @qux() nounwind ssp {
entry:
  call void (...)* @baz() nounwind
  br label %return

return:                                           ; preds = %entry
  ret void
}

$ clang -o - -S t.c -emit-llvm
define void @bar() nounwind ssp {
entry:
  %call = call i32 (...)* @foo()                  ; [#uses=0]
  ret void
}

declare i32 @foo(...)

define void @qux() nounwind ssp {
entry:
  call void (...)* @baz()
  ret void
}

So there should be no advantage in this situation. However, if you use the "int 0" value, then it will produce different code for the two functions. And then the register allocator will have to get involved.

Caveat: This was tested on a Mac.

-bw

From jon at ffconsultancy.com Fri Jan 1 14:35:19 2010
From: jon at ffconsultancy.com (Jon Harrop)
Date: Fri, 1 Jan 2010 20:35:19 +0000
Subject: [LLVMdev] Parallelism in HLVM
Message-ID: <201001012035.20048.jon@ffconsultancy.com>

The HLVM project is a high-level VM optimized for scientific computing:

http://www.ffconsultancy.com/ocaml/hlvm/

I implemented the first-working version of a garbage collector capable of collecting from threads that run in parallel in November. Initial performance was awful due to the overhead of accessing thread-local data via POSIX pthreads.

I just completed optimizing HLVM so thread-local data are now passed everywhere as an auxiliary argument to every HLVM function. This has dramatically improved performance and single-threaded code now runs within 25% of the performance of the serial collector.

However, one test fails with a segfault when JIT compiled with TCO enabled and I don't know why.

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

From mmms1841 at gmail.com Fri Jan 1 14:51:34 2010
From: mmms1841 at gmail.com (mmms1841)
Date: Fri, 1 Jan 2010 12:51:34 -0800
Subject: [LLVMdev] Assembly Printer
Message-ID:

I am trying to understand how LLVM does code generation and I have a couple of questions.
I am using LLVM 2.6.
First, if I want to change the name of an instruction, all I need to do is to modify the XXXInstrInfo.td, right? Using Sparc as an example, if I wanted to output "mysra" instead of "sra", in SparcInstrInfo.td, I would write, defm SRA : F3_12<"mysra", 0b100111, sra>; Is this correct? When I run llc with option -march=sparc, after I make the modification, it still outputs "sra", not "mysra". I looked into SparcGenAsmWriter.inc, and made sure that string AsmStrs includes "mysra". However, when I run gdb and do "print AsmStrs + (Bits & 1023)", it prints "sra". Does this make sense or am I just overlooking something? The second question is about pattern matching of instructions. I found that some of the target instructions do not have corresponding patterns to match. For example, in SparcInstrInfo.td, "udiv" and "sdiv" don't seem to have any patterns specified. defm UDIV : F3_12np<"udiv", 0b001110>; defm SDIV : F3_12np<"sdiv", 0b001111>; Is this because these instructions are handled differently from other instructions in SparcISelDAGToDAG.cpp? In function SparcDAGToDAGISel::Select(SDValue Op), instruction selection for "sdiv" and "udiv" is done in the switch-case statement, while SelectCode(Op) takes care of the other instructions*. * Thank you.. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100101/e762fa15/attachment.html From giannismantz at gmail.com Sat Jan 2 09:52:04 2010 From: giannismantz at gmail.com (Yannis Mantzouratos) Date: Sat, 2 Jan 2010 17:52:04 +0200 Subject: [LLVMdev] Adding a new instruction? Message-ID: Hi, We 're working on an llvm interpreter. We perform some static analysis to detect some blocks with a specific property, and we need the interpreter to be able to recognise these blocks fast in time it reaches them. 
We thought of adding a new instruction in the LLVM instruction set and put it in the beginning of such blocks, so that the interpreter would be instantly alerted that the current block is 'special'. Is there an easier/quicker way to do this? Cheers, yannis From gvenn.cfe.dev at gmail.com Sat Jan 2 12:16:38 2010 From: gvenn.cfe.dev at gmail.com (Garrison Venn) Date: Sat, 2 Jan 2010 13:16:38 -0500 Subject: [LLVMdev] Adding a new instruction? In-Reply-To: References: Message-ID: <0A611ABA-4435-4624-9B1A-79F8C47D0896@gmail.com> Sorry, forgot to post to list. For 2.7 I'm wondering if you could use custom metadata attached to the first instruction of a "special" block? You could register a unique kind (not sure how to guarantee uniqueness), and attach a metadata node via the context to the first instruction with this kind. Your pass would look for this. I have never tried this, so I don't know if predecessor passes that your pass would depend on would affect this metadata; if different threads with their own context would see metadata attached via a specific context; and what the resultant performance effect would be. Just a thought Garrison On Jan 2, 2010, at 10:52, Yannis Mantzouratos wrote: > Hi, > > We 're working on an llvm interpreter. We perform some static analysis > to detect some blocks with a specific property, and we need the > interpreter to be able to recognise these blocks fast in time it > reaches them. We thought of adding a new instruction in the LLVM > instruction set and put it in the beginning of such blocks, so that > the interpreter would be instantly alerted that the current block is > 'special'. Is there an easier/quicker way to do this? 
>
> Cheers,
> yannis
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From dllaurence at dslextreme.com Sat Jan 2 13:03:16 2010
From: dllaurence at dslextreme.com (Dustin Laurence)
Date: Sat, 02 Jan 2010 11:03:16 -0800
Subject: [LLVMdev] indirectbr
Message-ID: <4B3F9874.4010102@laurences.net>

Hello,

I have a question about the indirectbr instruction. I attempted to use it according to the example in the Assembly Language Reference manual, but got an "expected instruction opcode" error. Poking about on the web I found this document:

http://nondot.org/sabre/LLVMNotes/IndirectGoto.txt

which appears to be a Nov 2, 2009 proposal to add indirectbr and blockaddress() to the IR language. So I suspect indirectbr is not supported in the llvm-as 2.5 packaged in Fedora 11, is that correct?

Dustin

From bob.wilson at apple.com Sat Jan 2 13:24:16 2010
From: bob.wilson at apple.com (Bob Wilson)
Date: Sat, 2 Jan 2010 11:24:16 -0800
Subject: [LLVMdev] indirectbr
In-Reply-To: <4B3F9874.4010102@laurences.net>
References: <4B3F9874.4010102@laurences.net>
Message-ID: <093F1788-0CA8-4A7A-8724-3E114B83150B@apple.com>

On Jan 2, 2010, at 11:03 AM, Dustin Laurence wrote:

> Hello,
>
> I have a question about the indirectbr instruction. I attempted to use
> it according to the example in the Assembly Language Reference manual,
> but got an "expected instruction opcode" error. Poking about on the web
> I found this document:
>
> http://nondot.org/sabre/LLVMNotes/IndirectGoto.txt
>
> which appears to be a Nov 2, 2009 proposal to add indirectbr and
> blockaddress() to the IR language. So I suspect indirectbr is not
> supported in the llvm-as 2.5 packaged in Fedora 11, is that correct?

Yes, that is correct. It is supported in the trunk sources, but it has not yet been released.

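For readers stuck on a release without indirectbr, here is a minimal sketch of the portable alternative: keep an integer state code and dispatch on it with a switch, which front ends typically lower to the long-supported `switch` instruction rather than to indirectbr. The toy lexer below is made up for illustration and does not come from this thread; the names `lex_state`, `lex_counts`, and `lex_count` are hypothetical. It only counts identifier and integer tokens.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical automaton states for a toy lexer (illustrative names only). */
enum lex_state { ST_START, ST_IDENT, ST_NUMBER, ST_DONE };

/* Token tallies returned by the sketch. */
struct lex_counts { int idents; int numbers; };

/* Dispatch on an integer state code with a switch, instead of jumping
 * through a table of label addresses (which would need indirectbr). */
static struct lex_counts lex_count(const char *s)
{
    struct lex_counts c = {0, 0};
    enum lex_state st = ST_START;
    size_t i = 0;

    while (st != ST_DONE) {
        unsigned char ch = (unsigned char)s[i];
        switch (st) {
        case ST_START:
            if (ch == '\0') {
                st = ST_DONE;              /* end of input */
            } else if (isalpha(ch) || ch == '_') {
                c.idents++;                /* first character of an identifier */
                st = ST_IDENT;
                i++;
            } else if (isdigit(ch)) {
                c.numbers++;               /* first digit of an integer */
                st = ST_NUMBER;
                i++;
            } else {
                i++;                       /* skip separators */
            }
            break;
        case ST_IDENT:
            if (isalnum(ch) || ch == '_')
                i++;                       /* stay inside the identifier */
            else
                st = ST_START;             /* re-examine ch from the start state */
            break;
        case ST_NUMBER:
            if (isdigit(ch))
                i++;                       /* stay inside the integer */
            else
                st = ST_START;             /* re-examine ch from the start state */
            break;
        case ST_DONE:
            break;                         /* unreachable: loop exits first */
        }
    }
    return c;
}
```

Under these assumptions, `lex_count("foo 42 bar7 9")` yields two identifiers and two numbers. The cost relative to a jump table of label addresses is one extra dispatch through the switch per transition, which is the trade-off the indirectbr proposal is meant to remove.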
From dllaurence at dslextreme.com Sat Jan 2 13:42:22 2010
From: dllaurence at dslextreme.com (Dustin Laurence)
Date: Sat, 02 Jan 2010 11:42:22 -0800
Subject: [LLVMdev] indirectbr
In-Reply-To: <093F1788-0CA8-4A7A-8724-3E114B83150B@apple.com>
References: <4B3F9874.4010102@laurences.net> <093F1788-0CA8-4A7A-8724-3E114B83150B@apple.com>
Message-ID: <4B3FA19E.5040409@laurences.net>

On 01/02/2010 11:24 AM, Bob Wilson wrote:

> Yes, that is correct.

That *would* explain why I couldn't figure out how to make llvm-as understand it. :-)

> ...It is supported in the trunk sources, but it has
> not yet been released.

OK. I'll stick with my workaround of using an integer state code and a switch statement for the moment just in the interests of minimizing the extra complexity beyond what I already have learning the LLVM IR, but at some point should think about just installing the trunk version (I did a trial build and had no problems, so I don't anticipate that would be a huge headache).

Dustin

From dllaurence at dslextreme.com Sat Jan 2 16:31:06 2010
From: dllaurence at dslextreme.com (Dustin Laurence)
Date: Sat, 02 Jan 2010 14:31:06 -0800
Subject: [LLVMdev] inbounds (was Re: indirectbr)
In-Reply-To: <093F1788-0CA8-4A7A-8724-3E114B83150B@apple.com>
References: <4B3F9874.4010102@laurences.net> <093F1788-0CA8-4A7A-8724-3E114B83150B@apple.com>
Message-ID: <4B3FC92A.2040103@laurences.net>

On 01/02/2010 11:24 AM, Bob Wilson wrote:

> Yes, that is correct. It is supported in the trunk sources, but it has
> not yet been released.

Hmm. Would the same also be true of the "inbounds" keyword for GEP? It doesn't seem to be recognized ("expected type").

Dustin

From eli.friedman at gmail.com Sat Jan 2 16:36:31 2010
From: eli.friedman at gmail.com (Eli Friedman)
Date: Sat, 2 Jan 2010 14:36:31 -0800
Subject: [LLVMdev] inbounds (was Re: indirectbr)
In-Reply-To: <4B3FC92A.2040103@laurences.net>
References: <4B3F9874.4010102@laurences.net> <093F1788-0CA8-4A7A-8724-3E114B83150B@apple.com> <4B3FC92A.2040103@laurences.net>
Message-ID:

On Sat, Jan 2, 2010 at 2:31 PM, Dustin Laurence wrote:
> On 01/02/2010 11:24 AM, Bob Wilson wrote:
>
>> Yes, that is correct. It is supported in the trunk sources, but it has
>> not yet been released.
>
> Hmm. Would the same also be true of the "inbounds" keyword for GEP? It
> doesn't seem to be recognized ("expected type").

The version of LangRef corresponding to LLVM 2.5 is at http://llvm.org/releases/2.5/docs/LangRef.html .
-Eli From dllaurence at dslextreme.com Sat Jan 2 16:45:37 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Sat, 02 Jan 2010 14:45:37 -0800 Subject: [LLVMdev] inbounds (was Re: indirectbr) In-Reply-To: References: <4B3F9874.4010102@laurences.net> <093F1788-0CA8-4A7A-8724-3E114B83150B@apple.com> <4B3FC92A.2040103@laurences.net> Message-ID: <4B3FCC91.7080406@laurences.net> On 01/02/2010 02:36 PM, Eli Friedman wrote: > The version of LangRef corresponding to LLVM 2.5 is at > http://llvm.org/releases/2.5/docs/LangRef.html . Thanks much, I've bookmarked it. That would have answered both questions had I had the wit to go looking for release-specific docs. :-( Dustin From clattner at apple.com Sun Jan 3 01:00:22 2010 From: clattner at apple.com (Chris Lattner) Date: Sat, 2 Jan 2010 23:00:22 -0800 Subject: [LLVMdev] Assembly Printer In-Reply-To: References: Message-ID: On Jan 1, 2010, at 12:51 PM, mmms1841 wrote: > I am trying to understand how LLVM does code generation and I have a couple of questions. > I am using LLVM 2.6. > > First, > if I want to change the name of an instruction, all I need to do is to modify the XXXInstrInfo.td, right? > Using Sparc as an example, if I wanted to output "mysra" instead of "sra", in SparcInstrInfo.td, I would write, > > defm SRA : F3_12<"mysra", 0b100111, sra>; > > Is this correct? Yes. > When I run llc with option -march=sparc, after I make the modification, it still outputs "sra", not "mysra". I looked into SparcGenAsmWriter.inc, and made sure that string AsmStrs includes "mysra". However, when I run gdb and do "print AsmStrs + (Bits & 1023)", it prints "sra". > Does this make sense or am I just overlooking something? Sounds like something is being overlooked. Perhaps tblgen didn't get rerun or something didn't get relinked. > The second question is about pattern matching of instructions. > I found that some of the target instructions do not have corresponding patterns to match. 
> For example, in SparcInstrInfo.td, "udiv" and "sdiv" don't seem to have any patterns specified. > > defm UDIV : F3_12np<"udiv", 0b001110>; > defm SDIV : F3_12np<"sdiv", 0b001111>; > > Is this because these instructions are handled differently from other instructions in SparcISelDAGToDAG.cpp? > In function SparcDAGToDAGISel::Select(SDValue Op), instruction selection for "sdiv" and "udiv" is done in the switch-case statement, while SelectCode(Op) takes care of the other instructions. Yep, exactly, -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100102/0304a1c8/attachment.html From clattner at apple.com Sun Jan 3 02:33:21 2010 From: clattner at apple.com (Chris Lattner) Date: Sun, 3 Jan 2010 00:33:21 -0800 Subject: [LLVMdev] 'Address of Label and Indirect Branches in LLVM IR' blog post Message-ID: <7A865D7E-83FA-4E1C-8BDB-FDB9F5661926@apple.com> If you're interested in this new extension, here is some more information with some less-than-obvious aspects of the design: http://blog.llvm.org/2010/01/address-of-label-and-indirect-branches.html This feature was added to LLVM by Bob Wilson, Dan Gohman and I to mainline back in November. 
If you have questions or comments about the post, this is a good thread to discuss them on :) -Chris From edwintorok at gmail.com Sun Jan 3 03:17:51 2010 From: edwintorok at gmail.com (=?ISO-8859-1?Q?T=F6r=F6k_Edwin?=) Date: Sun, 03 Jan 2010 11:17:51 +0200 Subject: [LLVMdev] 'Address of Label and Indirect Branches in LLVM IR' blog post In-Reply-To: <7A865D7E-83FA-4E1C-8BDB-FDB9F5661926@apple.com> References: <7A865D7E-83FA-4E1C-8BDB-FDB9F5661926@apple.com> Message-ID: <4B4060BF.4080706@gmail.com> On 2010-01-03 10:33, Chris Lattner wrote: > If you're interested in this new extension, here is some more information with some less-than-obvious aspects of the design: > http://blog.llvm.org/2010/01/address-of-label-and-indirect-branches.html > > This feature was added to LLVM by Bob Wilson, Dan Gohman and I to mainline back in November. If you have questions or comments about the post, this is a good thread to discuss them on :) > Can a label be listed multiple times in indirectbr? Clang generates this: foo: ; preds = %indirectgoto, %indirectgoto, %indirectgoto, %indirectgoto, %indirectgoto store i32 1, i32* %retval br label %return indirectbr i8* %indirect.goto.dest, [label %foo, label %foo, label %bar, label %foo, label %hack, label %foo, label %foo] For this code taken from the gcc manual: .... static const int array[] = { &&foo - &&foo, &&bar - &&foo, &&hack - &&foo }; goto *(&&foo + array[i]); ..... If I remove &&foo - &&foo from the array, clang still thinks that &&foo is a possible destination, even if I run some optimizers on it. Since the argument to goto is an array of constants, it should be possible for an optimizer to determine the exact list of destinations. 
Also the intent of that code is to allow it to go into a readonly section, however with Clang it only goes to a .data.rel.ro section (with -fPIC): .section .data.rel.ro,"aw", at progbits .align 4 foo.array: .long (.LBA3_foo_return) - (.LBA3_foo_return) .long (.LBA3_foo_bar) - (.LBA3_foo_return) .long (.LBA3_foo_hack) - (.LBA3_foo_return) .size foo.array, 12 While gcc does put it into a readonly section (with -fPIC): .section .rodata .align 4 .type array.1248, @object .size array.1248, 12 array.1248: .long 0 .long .L4-.L2 .long .L5-.L2 Best regards, --Edwin From anton at korobeynikov.info Sun Jan 3 05:14:21 2010 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Sun, 3 Jan 2010 14:14:21 +0300 Subject: [LLVMdev] 'Address of Label and Indirect Branches in LLVM IR' blog post In-Reply-To: <4B4060BF.4080706@gmail.com> References: <7A865D7E-83FA-4E1C-8BDB-FDB9F5661926@apple.com> <4B4060BF.4080706@gmail.com> Message-ID: Hello, Edwin > Also the intent of that code is to allow it to go into a readonly > section, however with Clang it only goes to a .data.rel.ro section (with > -fPIC): Sounds like a bug. Fill a PR and assign to me. I will look into it when I return from vacations. 
--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From eli.friedman at gmail.com Sun Jan 3 05:33:29 2010
From: eli.friedman at gmail.com (Eli Friedman)
Date: Sun, 3 Jan 2010 03:33:29 -0800
Subject: [LLVMdev] 'Address of Label and Indirect Branches in LLVM IR' blog post
In-Reply-To: <4B4060BF.4080706@gmail.com>
References: <7A865D7E-83FA-4E1C-8BDB-FDB9F5661926@apple.com> <4B4060BF.4080706@gmail.com>
Message-ID:

2010/1/3 Török Edwin:
> On 2010-01-03 10:33, Chris Lattner wrote:
>> If you're interested in this new extension, here is some more information with some less-than-obvious aspects of the design:
>> http://blog.llvm.org/2010/01/address-of-label-and-indirect-branches.html
>>
>> This feature was added to LLVM by Bob Wilson, Dan Gohman and I to mainline back in November. If you have questions or comments about the post, this is a good thread to discuss them on :)
>>
>
> Can a label be listed multiple times in indirectbr?

Yes; it's not particularly meaningful, but it's not difficult to construct a case where the optimizer will introduce such a construct.

> Clang generates this:
> foo:                                              ; preds =
> %indirectgoto, %indirectgoto, %indirectgoto, %indirectgoto, %indirectgoto
>   store i32 1, i32* %retval
>   br label %return
>
> indirectbr i8* %indirect.goto.dest, [label %foo, label %foo, label %bar,
> label %foo, label %hack, label %foo, label %foo]
>
> For this code taken from the gcc manual:
> ....
>     static const int array[] = { &&foo - &&foo, &&bar - &&foo,
>                                  &&hack - &&foo };
>     goto *(&&foo + array[i]);
> .....
>
> If I remove &&foo - &&foo from the array, clang still thinks that &&foo
> is a possible destination, even if I run some optimizers on it.
> Since the argument to goto is an array of constants, it should be
> possible for an optimizer to determine the exact list of destinations.

Missed optimization, I guess...
put it into lib/Target/README.txt if you think it's an interesting case to try to optimize. (It doesn't strike me as particularly interesting because anyone using indirect gotos is going to coding carefully anyway.) > Also the intent of that code is to allow it to go into a readonly > section, however with Clang it only goes to a .data.rel.ro section (with > -fPIC): Another missed optimization; this one seems pretty important, though. -Eli From edwintorok at gmail.com Sun Jan 3 05:45:43 2010 From: edwintorok at gmail.com (=?ISO-8859-1?Q?T=F6r=F6k_Edwin?=) Date: Sun, 03 Jan 2010 13:45:43 +0200 Subject: [LLVMdev] 'Address of Label and Indirect Branches in LLVM IR' blog post In-Reply-To: References: <7A865D7E-83FA-4E1C-8BDB-FDB9F5661926@apple.com> <4B4060BF.4080706@gmail.com> Message-ID: <4B408367.3020203@gmail.com> On 2010-01-03 13:33, Eli Friedman wrote: > 2010/1/3 T?r?k Edwin : > >> On 2010-01-03 10:33, Chris Lattner wrote: >> >>> If you're interested in this new extension, here is some more information with some less-than-obvious aspects of the design: >>> http://blog.llvm.org/2010/01/address-of-label-and-indirect-branches.html >>> >>> This feature was added to LLVM by Bob Wilson, Dan Gohman and I to mainline back in November. If you have questions or comments about the post, this is a good thread to discuss them on :) >>> >>> >> Can a label be listed multiple times in indirectbr? >> > > Yes; it's not particularly meaningful, but it's not difficult to > construct a case where the optimizer will introduce such a construct. > Ok. > >> Clang generates this: >> foo: ; preds = >> %indirectgoto, %indirectgoto, %indirectgoto, %indirectgoto, %indirectgoto >> store i32 1, i32* %retval >> br label %return >> >> indirectbr i8* %indirect.goto.dest, [label %foo, label %foo, label %bar, >> label %foo, label %hack, label %foo, label %foo] >> >> For this code taken from the gcc manual: >> .... 
>> static const int array[] = { &&foo - &&foo, &&bar - &&foo, >> &&hack - &&foo }; >> goto *(&&foo + array[i]); >> ..... >> >> If I remove &&foo - &&foo from the array, clang still thinks that &&foo >> is a possible destination, even if I run some optimizers on it. >> Since the argument to goto is an array of constants, it should be >> possible for an optimizer to determine the exact list of destinations. >> > > Missed optimization, I guess... put it into lib/Target/README.txt if > you think it's an interesting case to try to optimize. (It doesn't > strike me as particularly interesting because anyone using indirect > gotos is going to coding carefully anyway.) > If the code generator isn't confused by the multiple destinations then its fine. I can't think of a situation where the presence or the lack of that one extra edge would matter. > >> Also the intent of that code is to allow it to go into a readonly >> section, however with Clang it only goes to a .data.rel.ro section (with >> -fPIC): >> > > Another missed optimization; this one seems pretty important, though. > On 2010-01-03 13:14, Anton Korobeynikov wrote: > Hello, Edwin > > >> Also the intent of that code is to allow it to go into a readonly >> section, however with Clang it only goes to a .data.rel.ro section (with >> -fPIC): >> > Sounds like a bug. Fill a PR and assign to me. I will look into it > when I return from vacations. > > Done, PR5929. Best regards, --Edwin From jay.foad at gmail.com Sun Jan 3 06:54:23 2010 From: jay.foad at gmail.com (Jay Foad) Date: Sun, 3 Jan 2010 12:54:23 +0000 Subject: [LLVMdev] safe to speculatively execute load of malloc? 
Message-ID: I've just noticed this, in Instruction::isSafeToSpeculativelyExecute(): http://llvm.org/doxygen/Instruction_8cpp-source.html#l00408 00430 case Load: { 00431 if (cast(this)->isVolatile()) 00432 return false; 00433 if (isa(getOperand(0)) || isMalloc(getOperand(0))) 00434 return true; This says that it's safe to speculatively execute a load from the pointer returned by malloc(). But surely that's not true if malloc() returns NULL. Thanks, Jay. From clattner at apple.com Sun Jan 3 12:09:59 2010 From: clattner at apple.com (Chris Lattner) Date: Sun, 3 Jan 2010 10:09:59 -0800 Subject: [LLVMdev] 'Address of Label and Indirect Branches in LLVM IR' blog post In-Reply-To: <4B4060BF.4080706@gmail.com> References: <7A865D7E-83FA-4E1C-8BDB-FDB9F5661926@apple.com> <4B4060BF.4080706@gmail.com> Message-ID: <95155A66-9982-482D-967A-91D35494F556@apple.com> On Jan 3, 2010, at 1:17 AM, T?r?k Edwin wrote: On 2010-01-03 10:33, Chris Lattner wrote: >> If you're interested in this new extension, here is some more information with some less-than-obvious aspects of the design: >> http://blog.llvm.org/2010/01/address-of-label-and-indirect-branches.html >> >> This feature was added to LLVM by Bob Wilson, Dan Gohman and I to mainline back in November. If you have questions or comments about the post, this is a good thread to discuss them on :) >> > > Can a label be listed multiple times in indirectbr? Yep. > Also the intent of that code is to allow it to go into a readonly > section, however with Clang it only goes to a .data.rel.ro section (with > -fPIC): Nice catch, fixed in r92450! -Chris From clattner at apple.com Sun Jan 3 12:14:44 2010 From: clattner at apple.com (Chris Lattner) Date: Sun, 3 Jan 2010 10:14:44 -0800 Subject: [LLVMdev] safe to speculatively execute load of malloc? In-Reply-To: References: Message-ID: You're right, fixed in r92452. 
-Chris

On Jan 3, 2010, at 4:54 AM, Jay Foad wrote:

> I've just noticed this, in Instruction::isSafeToSpeculativelyExecute():
>
> http://llvm.org/doxygen/Instruction_8cpp-source.html#l00408
>
> 00430   case Load: {
> 00431     if (cast(this)->isVolatile())
> 00432       return false;
> 00433     if (isa(getOperand(0)) || isMalloc(getOperand(0)))
> 00434       return true;
>
> This says that it's safe to speculatively execute a load from the
> pointer returned by malloc(). But surely that's not true if malloc()
> returns NULL.
>
> Thanks,
> Jay.

From ambika at cse.iitb.ac.in Sun Jan 3 12:41:13 2010
From: ambika at cse.iitb.ac.in (ambika)
Date: Mon, 04 Jan 2010 00:11:13 +0530
Subject: [LLVMdev] [Fwd: Help Required for LLVM]
Message-ID: <4B40E4C9.8020600@cse.iitb.ac.in>

-------------- next part --------------
An embedded message was scrubbed...
From: ambika
Subject: Help Required for LLVM
Date: Mon, 04 Jan 2010 00:08:36 +0530
Size: 984
Url: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100104/50013902/attachment.eml

From ambika at cse.iitb.ac.in Sun Jan 3 12:38:36 2010
From: ambika at cse.iitb.ac.in (ambika)
Date: Mon, 04 Jan 2010 00:08:36 +0530
Subject: [LLVMdev] Help Required for LLVM
Message-ID: <4B40E42C.5020500@cse.iitb.ac.in>

Sir/Ma'am,

I am a MTech student at IIT Bombay, doing my thesis in compiler optimization. I am working on "Profile Based Pointer Analysis to Perform Optimization", and would like to implement the optimization in llvm for performing experiments.

When I was trying to build and install the compiler after configuring llvm-gcc I got the following error:

gcc: gengtype-lex.c: No such file or directory

Where can I find this file. Or am I missing something else. Please reply as soon as possible, I will be highly obliged.
Thanks and Regards, Ambika Agarwal 08305037 CSE, MTech IIT Bombay From dllaurence at dslextreme.com Sun Jan 3 14:22:00 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Sun, 03 Jan 2010 12:22:00 -0800 Subject: [LLVMdev] 'Address of Label and Indirect Branches in LLVM IR' blog post In-Reply-To: <7A865D7E-83FA-4E1C-8BDB-FDB9F5661926@apple.com> References: <7A865D7E-83FA-4E1C-8BDB-FDB9F5661926@apple.com> Message-ID: <4B40FC68.20304@laurences.net> On 01/03/2010 12:33 AM, Chris Lattner wrote: > If you're interested in this new extension, here is some more > information with some less-than-obvious aspects of the design: > http://blog.llvm.org/2010/01/address-of-label-and-indirect-branches.html > > This feature was added to LLVM by Bob Wilson, Dan Gohman and I to > mainline back in November. If you have questions or comments about > the post, this is a good thread to discuss them on :) My only comment is that I tripped over inadvertent version skew with the docs while tryin to code exactly the case discussed. I am building a small lexer which I naturally wanted to implement with a jump table to labels corresponding to automaton states, and when I couldn't get it to work I finally fell back on the switch solution too. But it offends my moral sensibilities. :-) Since I'm a newcomer to LLVM I can't comment on the implementation of the extension at all, but will make a wild guess that all the "creative" uses of label addresses are going to be like the Linux case you describe--it sounds like some sort of debugging hackery. If they interfere with optimizations, perhaps you can support them only with optimizations shut off (at least for that bit of code), or better just tell the relevant optimizer stages to leave that code alone when weird usages are detected. I for one wouldn't expect you to kill yourself so my debugging code could be aggressively optimized. Now that I've said that I'll no doubt think of some other use I *would* like optimized. 
It would have to be pretty strange, though. Dustin From haruki.zaemon at gmail.com Sun Jan 3 19:12:55 2010 From: haruki.zaemon at gmail.com (Simon Harris) Date: Mon, 4 Jan 2010 12:12:55 +1100 Subject: [LLVMdev] Tail Call Optimisation Message-ID: <4283E0E3-797D-4C7A-A885-6B8BD335C213@gmail.com> I'm investigating "improving" the TCO facilities in LLVM to provide for "hard" tail calls. Specifically, this would involve extending the existing implementation to discard the stack frame for the caller before executing the callee. I would then like to extend this further by performing hard tail calls on _all_ returning calls that no longer require the stack frame. A colleague of mine and I have looked into it and prima facie believe that it's entirely feasible. I wanted to ask the list if there was any interest, any objections, and of course, anything pointers/tips that may prove useful. Regards -- Simon Harris w: http://www.harukizaemon.com/ e: haruki.zaemon at gmail.com m: +61 417 505 611 t: @haruki_zaemon From jon at ffconsultancy.com Sun Jan 3 22:01:41 2010 From: jon at ffconsultancy.com (Jon Harrop) Date: Mon, 4 Jan 2010 04:01:41 +0000 Subject: [LLVMdev] Tail Call Optimisation In-Reply-To: <4283E0E3-797D-4C7A-A885-6B8BD335C213@gmail.com> References: <4283E0E3-797D-4C7A-A885-6B8BD335C213@gmail.com> Message-ID: <201001040401.42168.jon@ffconsultancy.com> On Monday 04 January 2010 01:12:55 Simon Harris wrote: > I'm investigating "improving" the TCO facilities in LLVM to provide for > "hard" tail calls. Specifically, this would involve extending the existing > implementation to discard the stack frame for the caller before executing > the callee. I would then like to extend this further by performing hard > tail calls on _all_ returning calls that no longer require the stack frame. > > A colleague of mine and I have looked into it and prima facie believe that > it's entirely feasible. 
I wanted to ask the list if there was any interest, > any objections, and of course, anything pointers/tips that may prove > useful. I am certainly interested in tail calls because my HLVM project relies upon LLVM's tail call elimination. However, I do not understand what tail calls LLVM is not currently eliminating that you plan to eliminate? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From zhu.heyu at gmail.com Sun Jan 3 20:59:40 2010 From: zhu.heyu at gmail.com (Heyu Zhu) Date: Mon, 4 Jan 2010 10:59:40 +0800 Subject: [LLVMdev] How to bind a register variable with a given general purpose register? Message-ID: Hi everyone, There are 16 GPRs in my RISC, but in fact GPR13 is read-only and connected to output of an A/D converter. It would be very convenient if i could bind a register variable with GPR13. Because i am a newbie i don't know how my llvm backend can support that. I plan to implement it as below. A. first declare a global variable in c-code int ADC asm("GPR13"); B. If backend finds a variable is loaded from "GPR13" use GPR13 instead. C. backend can't allocate GPR13 to other variable Is it a foolish method? Is there a better one? Please give me some guidance Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100104/4fe5ac84/attachment.html From haruki.zaemon at gmail.com Sun Jan 3 21:33:06 2010 From: haruki.zaemon at gmail.com (Simon Harris) Date: Mon, 4 Jan 2010 14:33:06 +1100 Subject: [LLVMdev] Tail Call Optimisation In-Reply-To: <201001040401.42168.jon@ffconsultancy.com> References: <4283E0E3-797D-4C7A-A885-6B8BD335C213@gmail.com> <201001040401.42168.jon@ffconsultancy.com> Message-ID: On 04/01/2010, at 3:01 PM, Jon Harrop wrote: > On Monday 04 January 2010 01:12:55 Simon Harris wrote: >> I'm investigating "improving" the TCO facilities in LLVM to provide for >> "hard" tail calls. 
Specifically, this would involve extending the existing >> implementation to discard the stack frame for the caller before executing >> the callee. I would then like to extend this further by performing hard >> tail calls on _all_ returning calls that no longer require the stack frame. >> >> A colleague of mine and I have looked into it and prima facie believe that >> it's entirely feasible. I wanted to ask the list if there was any interest, >> any objections, and of course, anything pointers/tips that may prove >> useful. > > I am certainly interested in tail calls because my HLVM project relies upon > LLVM's tail call elimination. However, I do not understand what tail calls > LLVM is not currently eliminating that you plan to eliminate? Mutual recursion for a start: def a(n) n <= 0 ? "DONE" : b(n - 1) end def b(n) n <= 0 ? "DONE" : a(n - 1) end a(10000000) Boom! -- Simon Harris w: http://www.harukizaemon.com/ e: haruki.zaemon at gmail.com m: +61 417 505 611 t: @haruki_zaemon From jon at ffconsultancy.com Sun Jan 3 22:50:07 2010 From: jon at ffconsultancy.com (Jon Harrop) Date: Mon, 4 Jan 2010 04:50:07 +0000 Subject: [LLVMdev] Tail Call Optimisation In-Reply-To: References: <4283E0E3-797D-4C7A-A885-6B8BD335C213@gmail.com> <201001040401.42168.jon@ffconsultancy.com> Message-ID: <201001040450.07699.jon@ffconsultancy.com> On Monday 04 January 2010 03:33:06 Simon Harris wrote: > On 04/01/2010, at 3:01 PM, Jon Harrop wrote: > > I am certainly interested in tail calls because my HLVM project relies > > upon LLVM's tail call elimination. However, I do not understand what tail > > calls LLVM is not currently eliminating that you plan to eliminate? > > Mutual recursion for a start: > > def a(n) > n <= 0 ? "DONE" : b(n - 1) > end > > def b(n) > n <= 0 ? "DONE" : a(n - 1) > end > > a(10000000) > > Boom! LLVM's TCO already handles mutual recursion. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. 
http://www.ffconsultancy.com/?e From haruki.zaemon at gmail.com Sun Jan 3 21:37:39 2010 From: haruki.zaemon at gmail.com (Simon Harris) Date: Mon, 4 Jan 2010 14:37:39 +1100 Subject: [LLVMdev] Tail Call Optimisation In-Reply-To: <201001040450.07699.jon@ffconsultancy.com> References: <4283E0E3-797D-4C7A-A885-6B8BD335C213@gmail.com> <201001040401.42168.jon@ffconsultancy.com> <201001040450.07699.jon@ffconsultancy.com> Message-ID: On 04/01/2010, at 3:50 PM, Jon Harrop wrote: > On Monday 04 January 2010 03:33:06 Simon Harris wrote: >> On 04/01/2010, at 3:01 PM, Jon Harrop wrote: >>> I am certainly interested in tail calls because my HLVM project relies >>> upon LLVM's tail call elimination. However, I do not understand what tail >>> calls LLVM is not currently eliminating that you plan to eliminate? >> >> Mutual recursion for a start: >> >> def a(n) >> n <= 0 ? "DONE" : b(n - 1) >> end >> >> def b(n) >> n <= 0 ? "DONE" : a(n - 1) >> end >> >> a(10000000) >> >> Boom! > > LLVM's TCO already handles mutual recursion. Hmm... OK. Perhaps it's the way it's being used by MacRuby and Rubinius. Drat! Back to the drawing board :( From jyasskin at google.com Sun Jan 3 23:16:40 2010 From: jyasskin at google.com (Jeffrey Yasskin) Date: Sun, 3 Jan 2010 23:16:40 -0600 Subject: [LLVMdev] Tail Call Optimisation In-Reply-To: <201001040450.07699.jon@ffconsultancy.com> References: <4283E0E3-797D-4C7A-A885-6B8BD335C213@gmail.com> <201001040401.42168.jon@ffconsultancy.com> <201001040450.07699.jon@ffconsultancy.com> Message-ID: On Sun, Jan 3, 2010 at 10:50 PM, Jon Harrop wrote: > On Monday 04 January 2010 03:33:06 Simon Harris wrote: >> On 04/01/2010, at 3:01 PM, Jon Harrop wrote: >> > I am certainly interested in tail calls because my HLVM project relies >> > upon LLVM's tail call elimination. However, I do not understand what tail >> > calls LLVM is not currently eliminating that you plan to eliminate? >> >> Mutual recursion for a start: >> >> def a(n) >> ? n <= 0 ? 
"DONE" : b(n - 1) >> end >> >> def b(n) >> ? n <= 0 ? "DONE" : a(n - 1) >> end >> >> a(10000000) >> >> Boom! > > LLVM's TCO already handles mutual recursion. Only for fastcc functions compiled with -tailcallopt, right? http://llvm.org/docs/CodeGenerator.html#tailcallopt I believe gcc manages to support tail calls in many more cases, and this restriction in llvm confuses lots of newcomers. It would be very worthwhile if someone wanted to remove it. From foom at fuhm.net Mon Jan 4 00:10:38 2010 From: foom at fuhm.net (James Y Knight) Date: Mon, 4 Jan 2010 01:10:38 -0500 Subject: [LLVMdev] ASM output with JIT / codegen barriers Message-ID: In working on an LLVM backend for SBCL (a lisp compiler), there are certain sequences of code that must be atomic with regards to async signals. So, for example, on x86, a single SUB on a memory location should be used, not a load/sub/store sequence. LLVM's IR doesn't currently have any way to express this kind of constraint (...and really, that's essentially impossible since different architectures have different possibilities, so I'm not asking for this...). All I really would like is to be able to specify the exact instruction sequence to emit there. I'd hoped that inline asm would be the way to do so, but LLVM doesn't appear to support asm output when using the JIT compiler. Is there any hope for inline asm being supported with the JIT anytime soon? Or is there an alternative suggested way of doing this? I'm using llvm.atomic.load.sub.i64.p0i64 for the moment, but that's both more expensive than I need as it has an unnecessary LOCK prefix, and is also theoretically incorrect. While it generates correct code currently on x86-64, LLVM doesn't actually *guarantee* that it generates a single instruction, that's just "luck". Additionally, I think there will be some situations where a particular ordering of memory operations is required. 
LLVM makes no guarantees about the order of stores, unless there's some way that you could tell the difference in a linear program. Unfortunately, I don't have a linear program, I have a program which can run signal handlers between arbitrary instructions. So, I think I'll need something like an llvm.memory.barrier of type "ss", except only affecting the codegen, not actually inserting a processor memory barrier. Is there already some way to insert a codegen-barrier with no additional runtime cost (beyond the opportunity-cost of not being able to reorder/delete stores across the barrier)? If not, can such a thing be added? On x86, this is a non-issue, since the processor already implicitly has inter-processor store-store barriers, so using: call void @llvm.memory.barrier(i1 0, i1 0, i1 0, i1 1, i1 0) is fine: it's a noop at runtime but ensures the correct sequence of stores...but I'm thinking ahead here to other architectures where that would actually require expensive instructions to be emitted. Thanks, James From resistor at mac.com Mon Jan 4 02:20:58 2010 From: resistor at mac.com (Owen Anderson) Date: Mon, 04 Jan 2010 00:20:58 -0800 Subject: [LLVMdev] ASM output with JIT / codegen barriers In-Reply-To: References: Message-ID: <38F1F80F-81B6-4353-A1A7-BC54C07F9E39@mac.com> On Jan 3, 2010, at 10:10 PM, James Y Knight wrote: > In working on an LLVM backend for SBCL (a lisp compiler), there are > certain sequences of code that must be atomic with regards to async > signals. So, for example, on x86, a single SUB on a memory location > should be used, not a load/sub/store sequence. LLVM's IR doesn't > currently have any way to express this kind of constraint (...and > really, that's essentially impossible since different architectures > have different possibilities, so I'm not asking for this...). Why do you want to do this? As far as I'm aware, there's no guarantee that a memory-memory SUB will be observed atomically across all processors. 
Remember that most processors are going to be breaking X86 instructions up into micro-ops, which might get reordered/interleaved in any number of different ways. > All I really would like is to be able to specify the exact instruction > sequence to emit there. I'd hoped that inline asm would be the way to > do so, but LLVM doesn't appear to support asm output when using the > JIT compiler. Is there any hope for inline asm being supported with > the JIT anytime soon? Or is there an alternative suggested way of > doing this? I'm using llvm.atomic.load.sub.i64.p0i64 for the moment, > but that's both more expensive than I need as it has an unnecessary > LOCK prefix, and is also theoretically incorrect. While it generates > correct code currently on x86-64, LLVM doesn't actually *guarantee* > that it generates a single instruction, that's just "luck". It's not luck. That's exactly what the atomic intrinsics guarantee: that no other processor can observe an intermediate state of the operation. What they don't guarantee per the LangRef is sequential consistency. If you care about that, you need to use explicit fencing. --Owen -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2620 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100104/3f64e2b4/attachment.bin From baldrick at free.fr Mon Jan 4 02:31:18 2010 From: baldrick at free.fr (Duncan Sands) Date: Mon, 04 Jan 2010 09:31:18 +0100 Subject: [LLVMdev] Help Required for LLVM In-Reply-To: <4B40E42C.5020500@cse.iitb.ac.in> References: <4B40E42C.5020500@cse.iitb.ac.in> Message-ID: <4B41A756.8070801@free.fr> Hi Ambika, > I am a MTech student at IIT Bombay, doing my thesis in compiler > optimization. > I am working on "Profile Based Pointer Analysis to Perform > Optimization", and would like to implement the optimization in llvm for > performing experiments. 
> > When I was trying to build and install the compiler after configuring > llvm-gcc I got the following error: > gcc: gengtype-lex.c: No such file or directory > > Where can I find this file. Or am I missing something else. > Please reply as soon as possible, I will be highly obliged. what target are you building for and what commands did you use to configure and try to build? Ciao, Duncan. From etherzhhb at gmail.com Mon Jan 4 02:36:15 2010 From: etherzhhb at gmail.com (ether) Date: Mon, 04 Jan 2010 16:36:15 +0800 Subject: [LLVMdev] How to bind a register variable with a given general purpose register? In-Reply-To: References: Message-ID: <4B41A87F.5080808@gmail.com> hi zhu, i think you should map the peripheral registers to data memory space instead of register file, then mark that memory address as "volatile". like said the adc register was mapped to address 0xc0000, and then the corresponding c source will goes like this: //define the register #define ADCREG (*((volatile unsigned int *)0xc0000)) //read the register a = ADCREG --ether On 2010-1-4 10:59, Heyu Zhu wrote: > Hi everyone, > There are 16 GPRs in my RISC, but in fact GPR13 is read-only and > connected to output of an A/D converter. > It would be very convenient if i could bind a register variable with > GPR13. > Because i am a newbie i don't know how my llvm backend can support that. > I plan to implement it as below. > A. first declare a global variable in c-code > int ADC asm("GPR13"); > B. If backend finds a variable is loaded from "GPR13" use GPR13 instead. > C. backend can't allocate GPR13 to other variable > Is it a foolish method? Is there a better one? 
Please give me some > guidance > Thanks > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From chandlerc at google.com Mon Jan 4 03:17:02 2010 From: chandlerc at google.com (Chandler Carruth) Date: Mon, 4 Jan 2010 01:17:02 -0800 Subject: [LLVMdev] ASM output with JIT / codegen barriers In-Reply-To: <38F1F80F-81B6-4353-A1A7-BC54C07F9E39@mac.com> References: <38F1F80F-81B6-4353-A1A7-BC54C07F9E39@mac.com> Message-ID: <74c447501001040117q5e95372dkc08ad69431b71a1@mail.gmail.com> On Mon, Jan 4, 2010 at 12:20 AM, Owen Anderson wrote: > > On Jan 3, 2010, at 10:10 PM, James Y Knight wrote: > >> In working on an LLVM backend for SBCL (a lisp compiler), there are >> certain sequences of code that must be atomic with regards to async >> signals. So, for example, on x86, a single SUB on a memory location >> should be used, not a load/sub/store sequence. LLVM's IR doesn't >> currently have any way to express this kind of constraint (...and >> really, that's essentially impossible since different architectures >> have different possibilities, so I'm not asking for this...). > > Why do you want to do this? As far as I'm aware, there's no guarantee that a memory-memory SUB will be observed atomically across all processors. Remember that most processors are going to be breaking X86 instructions up into micro-ops, which might get reordered/interleaved in any number of different ways. I'm assuming 'memory-memory' there is a typo, and we're just talking about a 'sub' instruction with a memory destination. In that case, I'll go further: the Intel IA-32 manual explicitly tells you that x86 processors are allowed to do the read and write halves of that single instruction interleaved with other writes to that memory location from other processors (See section 8.2.3.1 of [1]).
=[ I can tell you from bitter experience debugging code that assumed this, it does in fact happen. I have watched reference counters miss both increments and decrements from it on both Intel and AMD systems. >> All I really would like is to be able to specify the exact instruction >> sequence to emit there. I'd hoped that inline asm would be the way to >> do so, but LLVM doesn't appear to support asm output when using the >> JIT compiler. Is there any hope for inline asm being supported with >> the JIT anytime soon? Or is there an alternative suggested way of >> doing this? I'm using llvm.atomic.load.sub.i64.p0i64 for the moment, >> but that's both more expensive than I need as it has an unnecessary >> LOCK prefix, and is also theoretically incorrect. As I've mentioned above, I assure you the LOCK prefix matters. The strange thing is that you think this is inefficient. Modern processors don't lock the bus given this prefix to a 'sub' instruction; they just lock the cache and use the coherency model to resolve the issue. This is much cheaper than, say, an 'xchg' instruction on an x86 processor. What is the performance problem you are actually trying to solve here? > What they don't guarantee per the LangRef is sequential consistency. If you care about that, you need to use explicit fencing. Side note: I regret greatly that I didn't know enough of the sequential consistency concerns here to address them more fully when I was working on this. =/ Even explicit fencing has subtle problems with it as currently specified. Is this causing problems for people (other than jyasskin who clued me in on the whole matter)?
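[Editorial illustration, not part of the original thread.] The point made above, that a read-modify-write on shared memory must be a genuinely atomic operation or concurrent updates can be lost, can be sketched in C++. This is a minimal sketch under stated assumptions: it uses std::atomic from C++11, which postdates this thread and is not the llvm.atomic.* intrinsic discussed here; the function name run_atomic_decrements is hypothetical. The relaxed ordering is chosen to mirror the distinction drawn above: the operation is atomic, but no sequential consistency is requested.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: starts a counter at threads * per_thread and has each
// thread decrement it per_thread times. Because fetch_sub is an atomic
// read-modify-write, no decrement can be lost, so the result is exactly 0.
long run_atomic_decrements(int threads, int per_thread) {
    std::atomic<long> counter(static_cast<long>(threads) * per_thread);
    std::vector<std::thread> workers;
    workers.reserve(threads);
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                // Atomic, but relaxed: no ordering is imposed on other
                // memory accesses; only this update itself is indivisible.
                counter.fetch_sub(1, std::memory_order_relaxed);
            }
        });
    }
    for (std::thread& w : workers) {
        w.join();  // join() makes every thread's updates visible here
    }
    return counter.load();
}
```

Replacing the counter with a plain long and `--counter` would make this a data race, which is the situation described above where reference counts miss updates.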
From resistor at mac.com Mon Jan 4 03:32:30 2010 From: resistor at mac.com (Owen Anderson) Date: Mon, 04 Jan 2010 01:32:30 -0800 Subject: [LLVMdev] ASM output with JIT / codegen barriers In-Reply-To: <74c447501001040117q5e95372dkc08ad69431b71a1@mail.gmail.com> References: <38F1F80F-81B6-4353-A1A7-BC54C07F9E39@mac.com> <74c447501001040117q5e95372dkc08ad69431b71a1@mail.gmail.com> Message-ID: <948AAC50-3A9C-497A-9A0E-692FA869F18D@mac.com> On Jan 4, 2010, at 1:17 AM, Chandler Carruth wrote: > Side note: I regret greatly that I didn't know enough of the > sequential consistency concerns here to address them more fully when I > was working on this. =/ Even explicit fencing has subtle problems with > it as currently specified. Is this causing problems for people (other > than jyasskin who clued me in on the whole matter)? Talking about memory consistency is always painful. In particular, there's a disconnect between how consistency models think about reorderings, versus how the compiler and hardware actually perform them. There's a natural tension between sanity (make all atomic ops sequentially consistent) and performance (no consistency by default, frontend must supply it via fences). So far we've been pursuing the latter approach: C-level atomic intrinsics are emitted as fence-atomicop-fence. The X86 backend then has some knowledge (thanks to X86's comparatively strong memory model) of instances where fences can be folded away. --Owen -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 2620 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100104/7d1a45b4/attachment.bin From chandlerc at google.com Mon Jan 4 03:35:57 2010 From: chandlerc at google.com (Chandler Carruth) Date: Mon, 4 Jan 2010 01:35:57 -0800 Subject: [LLVMdev] ASM output with JIT / codegen barriers In-Reply-To: References: Message-ID: <74c447501001040135p25c84ae4xc7fc97dea43443dc@mail.gmail.com> Responding to the original email... On Sun, Jan 3, 2010 at 10:10 PM, James Y Knight wrote: > In working on an LLVM backend for SBCL (a lisp compiler), there are > certain sequences of code that must be atomic with regards to async > signals. Can you define exactly what 'atomic with regards to async signals' this entails? Your descriptions led me to think you may mean something other than the POSIX definition, but maybe I'm just misinterpreting it. Are these signals guaranteed to run in the same thread? On the same processor? Is there concurrent code running in the address space when they run? > Additionally, I think there will be some situations where a particular > ordering of memory operations is required. LLVM makes no guarantees > about the order of stores, unless there's some way that you could tell > the difference in a linear program. Unfortunately, I don't have a > linear program, I have a program which can run signal handlers between > arbitrary instructions. So, I think I'll need something like an > llvm.memory.barrier of type "ss", except only affecting the codegen, > not actually inserting a processor memory barrier. The processor can reorder memory operations as well (within limits). Consider that 'memset' to zero is often codegened to a non-temporal store to memory. This exempts it from all ordering considerations except for an explicit memory fence in the processor. 
If code were to execute between those two instructions, the contents of the memory could read "andthenumberofcountingshallbethree", or 'feedbeef', or '0000...' or '11111...'; there's just no telling. > Is there already some way to insert a codegen-barrier with no > additional runtime cost (beyond the opportunity-cost of not being able > to reorder/delete stores across the barrier)? If not, can such a thing > be added? On x86, this is a non-issue, since the processor already > implicitly has inter-processor store-store barriers, so using: > call void @llvm.memory.barrier(i1 0, i1 0, i1 0, i1 1, i1 0) > is fine: it's a noop at runtime but ensures the correct sequence of > stores...but I'm thinking ahead here to other architectures where that > would actually require expensive instructions to be emitted. But... if it *did* require expensive instructions, wouldn't you want them?!?! The reason we don't emit anything on x86 is its memory ordering guarantees. If it didn't have them, we would emit instructions to impose an ordering, because otherwise the wrong thing might happen. I think you should trust LLVM to only emit expensive instructions to achieve the ordering semantics you specify when they are necessary for the architecture, and file bugs if it ever fails. The only useful thing I can think of is if you happen to know that you execute on some "uniprocessor" with at most one thread of execution, and thus gain memory ordering constraints beyond those which can be assumed across an entire architecture (this is certainly true for x86). If it is useful to leverage this to optimize codegen, it should be at the target level, with some target options to specify that consistency assumptions should be greater than normal. The intrinsics and semantics should remain the same regardless.
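[Editorial illustration, not part of the original thread.] The "codegen-only barrier" asked about above, one that constrains compiler reordering relative to a signal handler on the same thread without requesting a processor fence, later got a standard spelling in C++11 as std::atomic_signal_fence. The sketch below assumes that facility; it is not something LLVM 2.6 offered, and the names publish, read_if_ready, on_signal and demo are hypothetical. The handler is triggered with std::raise so that it runs on the same thread, which is the only case this fence covers.

```cpp
#include <atomic>
#include <csignal>

int payload = 0;             // plain data written before the flag
std::atomic<int> ready(0);   // flag the signal handler checks
volatile std::sig_atomic_t seen = -1;  // what the handler observed

// Writer side: the signal fence keeps the compiler from moving the payload
// store below the flag store. It is intended to constrain code generation
// only; it does not order memory between different threads or processors.
void publish(int value) {
    payload = value;
    std::atomic_signal_fence(std::memory_order_release);
    ready.store(1, std::memory_order_relaxed);
}

// Handler side: returns -1 if nothing has been published yet.
int read_if_ready() {
    if (ready.load(std::memory_order_relaxed) == 0) {
        return -1;
    }
    std::atomic_signal_fence(std::memory_order_acquire);
    return payload;
}

extern "C" void on_signal(int) {
    seen = read_if_ready();
}

// Installs the handler, publishes a value, raises the signal on this thread
// and reports what the handler saw.
int demo() {
    std::signal(SIGINT, on_signal);
    publish(42);
    std::raise(SIGINT);            // handler runs before raise returns
    std::signal(SIGINT, SIG_DFL);  // restore default disposition
    return seen;
}
```

If another thread or processor could observe these stores, std::atomic_thread_fence (or a release store on the flag) would be needed instead, which is the case where the target may have to emit real fence instructions.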
From jon at ffconsultancy.com Mon Jan 4 05:03:37 2010 From: jon at ffconsultancy.com (Jon Harrop) Date: Mon, 4 Jan 2010 11:03:37 +0000 Subject: [LLVMdev] Tail Call Optimisation In-Reply-To: References: <4283E0E3-797D-4C7A-A885-6B8BD335C213@gmail.com> <201001040450.07699.jon@ffconsultancy.com> Message-ID: <201001041103.38122.jon@ffconsultancy.com> On Monday 04 January 2010 05:16:40 Jeffrey Yasskin wrote: > On Sun, Jan 3, 2010 at 10:50 PM, Jon Harrop wrote: > > LLVM's TCO already handles mutual recursion. > > Only for fastcc functions Yes. > compiled with -tailcallopt, right? If you use the compiler, yes. > http://llvm.org/docs/CodeGenerator.html#tailcallopt > > I believe gcc manages to support tail calls in many more cases, and > this restriction in llvm confuses lots of newcomers. It would be very > worthwhile if someone wanted to remove it. That's interesting. What tail calls can be supported without changing the calling convention and would it not simply be easier to switch to the fastcc convention between internal functions to achieve the same effect from outside LLVM? Conversely, if TCO is implemented for the cc convention, what will be the point of fastcc? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From russell.wallace at gmail.com Mon Jan 4 04:19:59 2010 From: russell.wallace at gmail.com (Russell Wallace) Date: Mon, 4 Jan 2010 10:19:59 +0000 Subject: [LLVMdev] Getting Kaleidoscope to compile Message-ID: <8d71341e1001040219v50c312ecp299d353a3ea6f64a@mail.gmail.com> Hi all, I've started work on a new programming language for which I am considering using LLVM as the backend, and trying to experiment with it using the Kaleidoscope demo compiler. 
Taking the full source listing from http://llvm.org/docs/tutorial/LangImpl3.html#code and trying to compile it with the provided instructions gives me the following errors: a at a-desktop:~$ g++ -g -O3 toy.cpp `llvm-config --cppflags --ldflags --libs core` -o toy toy.cpp:5:30: error: llvm/LLVMContext.h: No such file or directory toy.cpp:352: error: ‘getGlobalContext’ was not declared in this scope toy.cpp: In member function ‘virtual llvm::Value* NumberExprAST::Codegen()’: toy.cpp:358: error: ‘getGlobalContext’ was not declared in this scope toy.cpp: In member function ‘virtual llvm::Value* BinaryExprAST::Codegen()’: toy.cpp:379: error: ‘getDoubleTy’ is not a member of ‘llvm::Type’ toy.cpp:379: error: ‘getGlobalContext’ was not declared in this scope toy.cpp: In member function ‘llvm::Function* PrototypeAST::Codegen()’: toy.cpp:407: error: ‘getDoubleTy’ is not a member of ‘llvm::Type’ toy.cpp:407: error: ‘getGlobalContext’ was not declared in this scope toy.cpp:408: error: ‘getDoubleTy’ is not a member of ‘llvm::Type’ toy.cpp: In member function ‘llvm::Function* FunctionAST::Codegen()’: toy.cpp:454: error: ‘getGlobalContext’ was not declared in this scope toy.cpp: In function ‘int main()’: toy.cpp:543: error: ‘LLVMContext’ was not declared in this scope toy.cpp:543: error: ‘Context’ was not declared in this scope toy.cpp:543: error: ‘getGlobalContext’ was not declared in this scope Am I doing something wrong? Operating system: Ubuntu 9.04 LLVM was obtained with: apt-get install llvm From oleg77 at gmail.com Mon Jan 4 04:27:37 2010 From: oleg77 at gmail.com (Oleg Knut) Date: Mon, 4 Jan 2010 12:27:37 +0200 Subject: [LLVMdev] Getting Kaleidoscope to compile In-Reply-To: <8d71341e1001040219v50c312ecp299d353a3ea6f64a@mail.gmail.com> References: <8d71341e1001040219v50c312ecp299d353a3ea6f64a@mail.gmail.com> Message-ID: <278bcd901001040227m2d6977afhe55106e36805ecd8@mail.gmail.com> You probably forgot to install the llvm-dev package, which contains the headers for LLVM.
2010/1/4 Russell Wallace > Hi all, > > I've started work on a new programming language for which I am > considering using LLVM as the backend, and trying to experiment with > it using the Kaleidoscope demo compiler. > > Taking the full source listing from > http://llvm.org/docs/tutorial/LangImpl3.html#code and trying to > compile it with the provided instructions gives me the following > errors: > > a at a-desktop:~$ g++ -g -O3 toy.cpp `llvm-config --cppflags --ldflags > --libs core` -o toy > toy.cpp:5:30: error: llvm/LLVMContext.h: No such file or directory > toy.cpp:352: error: ‘getGlobalContext’ was not declared in this scope > toy.cpp: In member function ‘virtual llvm::Value* > NumberExprAST::Codegen()’: > toy.cpp:358: error: ‘getGlobalContext’ was not declared in this scope > toy.cpp: In member function ‘virtual llvm::Value* > BinaryExprAST::Codegen()’: > toy.cpp:379: error: ‘getDoubleTy’ is not a member of ‘llvm::Type’ > toy.cpp:379: error: ‘getGlobalContext’ was not declared in this scope > toy.cpp: In member function ‘llvm::Function* PrototypeAST::Codegen()’: > toy.cpp:407: error: ‘getDoubleTy’ is not a member of ‘llvm::Type’ > toy.cpp:407: error: ‘getGlobalContext’ was not declared in this scope > toy.cpp:408: error: ‘getDoubleTy’ is not a member of ‘llvm::Type’ > toy.cpp: In member function ‘llvm::Function* FunctionAST::Codegen()’: > toy.cpp:454: error: ‘getGlobalContext’ was not declared in this scope > toy.cpp: In function ‘int main()’: > toy.cpp:543: error: ‘LLVMContext’ was not declared in this scope > toy.cpp:543: error: ‘Context’ was not declared in this scope > toy.cpp:543: error: ‘getGlobalContext’ was not declared in this scope > > Am I doing something wrong?
> > Operating system: Ubuntu 9.04 > LLVM was obtained with: apt-get install llvm > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100104/dd098d78/attachment.html From nicolas.geoffray at gmail.com Mon Jan 4 04:28:06 2010 From: nicolas.geoffray at gmail.com (nicolas geoffray) Date: Mon, 4 Jan 2010 11:28:06 +0100 Subject: [LLVMdev] Help Required for LLVM In-Reply-To: <4B41A756.8070801@free.fr> References: <4B40E42C.5020500@cse.iitb.ac.in> <4B41A756.8070801@free.fr> Message-ID: Hi Ambika, Could you check if you have bison and flex installed on your machine? Nicolas On Mon, Jan 4, 2010 at 9:31 AM, Duncan Sands wrote: > Hi Ambika, > > > I am a MTech student at IIT Bombay, doing my thesis in compiler > > optimization. > > I am working on "Profile Based Pointer Analysis to Perform > > Optimization", and would like to implement the optimization in llvm for > > performing experiments. > > > > When I was trying to build and install the compiler after configuring > > llvm-gcc I got the following error: > > gcc: gengtype-lex.c: No such file or directory > > > > Where can I find this file. Or am I missing something else. > > Please reply as soon as possible, I will be highly obliged. > > what target are you building for and what commands did you use to configure > and try to build? > > Ciao, > > Duncan. > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100104/c4f9bc5e/attachment-0001.html From russell.wallace at gmail.com Mon Jan 4 04:37:11 2010 From: russell.wallace at gmail.com (Russell Wallace) Date: Mon, 4 Jan 2010 10:37:11 +0000 Subject: [LLVMdev] Getting Kaleidoscope to compile In-Reply-To: <278bcd901001040227m2d6977afhe55106e36805ecd8@mail.gmail.com> References: <8d71341e1001040219v50c312ecp299d353a3ea6f64a@mail.gmail.com> <278bcd901001040227m2d6977afhe55106e36805ecd8@mail.gmail.com> Message-ID: <8d71341e1001040237oea32fb4k4c1932904138fa36@mail.gmail.com> I tried apt-get install llvm-dev just now, and it says it was already installed, and when I again try compiling toy.cpp, it gives the same set of error messages. On Mon, Jan 4, 2010 at 10:27 AM, Oleg Knut wrote: > Probably you missed to install llvm-dev package with headers for llvm. > > 2010/1/4 Russell Wallace >> >> Hi all, >> >> I've started work on a new programming language for which I am >> considering using LLVM as the backend, and trying to experiment with >> it using the Kaleidoscope demo compiler. >> >> Taking the full source listing from >> http://llvm.org/docs/tutorial/LangImpl3.html#code and trying to >> compile it with the provided instructions gives me the following >> errors: >> >> a at a-desktop:~$ g++ -g -O3 toy.cpp `llvm-config --cppflags --ldflags >> --libs core` -o toy >> toy.cpp:5:30: error: llvm/LLVMContext.h: No such file or directory >> toy.cpp:352: error: ‘getGlobalContext’ was not declared in this scope >> toy.cpp: In member function ‘virtual llvm::Value* >> NumberExprAST::Codegen()’: >> toy.cpp:358: error: ‘getGlobalContext’ was not declared in this scope >> toy.cpp: In member function ‘virtual llvm::Value* >> BinaryExprAST::Codegen()’: >> toy.cpp:379: error: ‘getDoubleTy’ is not a member of ‘llvm::Type’ >> toy.cpp:379: error: ‘getGlobalContext’
was not declared in this scope >> toy.cpp: In member function ‘llvm::Function* PrototypeAST::Codegen()’: >> toy.cpp:407: error: ‘getDoubleTy’ is not a member of ‘llvm::Type’ >> toy.cpp:407: error: ‘getGlobalContext’ was not declared in this scope >> toy.cpp:408: error: ‘getDoubleTy’ is not a member of ‘llvm::Type’ >> toy.cpp: In member function ‘llvm::Function* FunctionAST::Codegen()’: >> toy.cpp:454: error: ‘getGlobalContext’ was not declared in this scope >> toy.cpp: In function ‘int main()’: >> toy.cpp:543: error: ‘LLVMContext’ was not declared in this scope >> toy.cpp:543: error: ‘Context’ was not declared in this scope >> toy.cpp:543: error: ‘getGlobalContext’ was not declared in this scope >> >> Am I doing something wrong? >> >> Operating system: Ubuntu 9.04 >> LLVM was obtained with: apt-get install llvm >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > From russell.wallace at gmail.com Mon Jan 4 04:43:51 2010 From: russell.wallace at gmail.com (Russell Wallace) Date: Mon, 4 Jan 2010 10:43:51 +0000 Subject: [LLVMdev] C library function declarations Message-ID: <8d71341e1001040243q78a6c0bbma232861ae4c69fc0@mail.gmail.com> When implementing a language using LLVM as the backend, it is necessary to give programs written in that language access to the C standard library functions. The Kaleidoscope tutorial shows how to do this for individual functions using extern declarations, but in general it would be necessary to have those predefined for the full standard library. Presumably these would contain exactly the information from the union of C header files. Is there a way, by parsing the C header files or otherwise, to obtain this information in the format LLVM expects? As an extension of this question, it will be necessary to provide access to other libraries written in C.
The same question applies there: Is there a way, by parsing C header files or otherwise, to obtain a list of functions that are defined in a given library, in the format LLVM expects? From anton at korobeynikov.info Mon Jan 4 04:45:01 2010 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Mon, 4 Jan 2010 13:45:01 +0300 Subject: [LLVMdev] Getting Kaleidoscope to compile In-Reply-To: <8d71341e1001040219v50c312ecp299d353a3ea6f64a@mail.gmail.com> References: <8d71341e1001040219v50c312ecp299d353a3ea6f64a@mail.gmail.com> Message-ID: Hello > Am I doing something wrong? Yes > Operating system: Ubuntu 9.04 > LLVM was obtained with: apt-get install llvm Consider checking out code from svn or at least use package for LLVM 2.6 release (as I can see, 9.04 has only 2.5) -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From jon at ffconsultancy.com Mon Jan 4 06:09:41 2010 From: jon at ffconsultancy.com (Jon Harrop) Date: Mon, 4 Jan 2010 12:09:41 +0000 Subject: [LLVMdev] Getting Kaleidoscope to compile In-Reply-To: <8d71341e1001040237oea32fb4k4c1932904138fa36@mail.gmail.com> References: <8d71341e1001040219v50c312ecp299d353a3ea6f64a@mail.gmail.com> <278bcd901001040227m2d6977afhe55106e36805ecd8@mail.gmail.com> <8d71341e1001040237oea32fb4k4c1932904138fa36@mail.gmail.com> Message-ID: <201001041209.41854.jon@ffconsultancy.com> On Monday 04 January 2010 10:37:11 Russell Wallace wrote: > I tried apt-get install llvm-dev just now, and it says it was already > installed, and when I again try compiling toy.cpp, it gives the same > set of error messages. The debs are old. Install LLVM from source. > >> toy.cpp:5:30: error: llvm/LLVMContext.h: No such file or directory > >> toy.cpp:352: error: ?getGlobalContext? was not declared in this scope This is new stuff in 2.6. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. 
http://www.ffconsultancy.com/?e

From anton at korobeynikov.info  Mon Jan  4 05:03:15 2010
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Mon, 4 Jan 2010 14:03:15 +0300
Subject: [LLVMdev] Getting Kaleidoscope to compile
In-Reply-To: <8d71341e1001040237oea32fb4k4c1932904138fa36@mail.gmail.com>
References: <8d71341e1001040219v50c312ecp299d353a3ea6f64a@mail.gmail.com> <278bcd901001040227m2d6977afhe55106e36805ecd8@mail.gmail.com> <8d71341e1001040237oea32fb4k4c1932904138fa36@mail.gmail.com>
Message-ID:

> I tried apt-get install llvm-dev just now, and it says it was already
> installed, and when I again try compiling toy.cpp, it gives the same
> set of error messages.
According to this site: http://packages.ubuntu.com/search?keywords=llvm
9.04 package is LLVM 2.5 which is too old and not compatible with the
code from the site. Checkout code from SVN or use 2.6 package.

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From etherzhhb at gmail.com  Mon Jan  4 05:29:10 2010
From: etherzhhb at gmail.com (ether)
Date: Mon, 04 Jan 2010 19:29:10 +0800
Subject: [LLVMdev] Re: [LLVMdev] How to bind a register variable with a given general purpose register?
In-Reply-To: <54D0ADF2C20D3C4DB6883644D1F95F810109EEA1@OVTEX-CLUSTER.ovt.com>
References: , <4B41A87F.5080808@gmail.com> <54D0ADF2C20D3C4DB6883644D1F95F810109EEA1@OVTEX-CLUSTER.ovt.com>
Message-ID: <4B41D106.7060900@gmail.com>

hi zhu,

i am not sure if your c frontend supports "int ADC asm("GPR13");"
i think you could:

1. add an attribute "GPR13" (or a more meaningful name like "adcreg") and
the corresponding handler, so your code "int ADC asm("GPR13");" becomes
"int ADC __attribute__((GPR13));"
2. add an intrinsic function like "llvm.zhu.readadcreg()".
3. map any read of the variable marked with __attribute__((GPR13)) to
llvm.zhu.readadcreg() instead of a "load" instruction
4.
assign gpr13 to a special register class like "gpr13class" instead of
gprclass, so gpr13 will not be allocated like the other GPRs.
5. lower the intrinsic function "llvm.zhu.readadcreg()" to register node
"gpr13" in your backend.

regards

--ether

On 2010-1-4 18:01, Demon(Xiangyang) Zhu wrote:
> Hi Ether,
>
> The hardware had been fixed now.
> If map it to memory space, it will cost another instruction cycle to execute 'a = ADCREG',
> Algorithm accesses the A/D convert very frequently. To get higher precision at lower frequency
> A/D output is connected to R13 directly. For the moment algorithm code is something like
> assemble and difficult to maintain. I want to setup a c environment for it, but i don't know
> how to bind a register variable with a given general purpose register.
>
> Thanks!
>
> ________________________________________
> From: llvmdev-bounces at cs.uiuc.edu [llvmdev-bounces at cs.uiuc.edu] on behalf of ether [etherzhhb at gmail.com]
> Sent: 2010-01-04 0:36
> To: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] How to bind a register variable with a given general purpose register?
>
> hi zhu,
>
> i think you should map the peripheral registers to data memory space
> instead of register file, then mark that memory address as "volatile".
> like said the adc register was mapped to address 0xc0000, and then the
> corresponding c source will goes like this:
> //define the register
> #define ADCREG (*((volatile unsigned int *)0xc0000))
> //read the register
> a = ADCREG
>
> --ether
>
> On 2010-1-4 10:59, Heyu Zhu wrote:
>
>> Hi everyone,
>> There are 16 GPRs in my RISC, but in fact GPR13 is read-only and
>> connected to output of an A/D converter.
>> It would be very convenient if i could bind a register variable with
>> GPR13.
>> Because i am a newbie i don't know how my llvm backend can support that.
>> I plan to implement it as below.
>> A. first declare a global variable in c-code
>> int ADC asm("GPR13");
>> B.
If backend finds a variable is loaded from "GPR13" use GPR13 instead. >> C. backend can't allocate GPR13 to other variable >> Is it a foolish method? Is there a better one? Please give me some >> guidance >> Thanks >> >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> >> > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From anton at korobeynikov.info Mon Jan 4 06:00:51 2010 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Mon, 4 Jan 2010 15:00:51 +0300 Subject: [LLVMdev] How to bind a register variable with a given general purpose register? In-Reply-To: References: Message-ID: Hello > Is it a foolish method? Is there a better one? Please give me some guidance Inline assembler is your friend. Also, do not forget to mark this register as unallocable, so codegen won't use / clobber it. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From etherzhhb at gmail.com Mon Jan 4 06:14:10 2010 From: etherzhhb at gmail.com (ether) Date: Mon, 04 Jan 2010 20:14:10 +0800 Subject: [LLVMdev] support for attach embedded metadata to function/argument/basicblock proposal Message-ID: <4B41DB92.6090206@gmail.com> hi all, As i know attach embedded metadata nodes to function/argument/basicblock is not support yet, but such feature maybe useful to us. for example, we got the conversation " > Sorry, forgot to post to list. > > For 2.7 I'm wondering if you could use custom metadata attached to the first instruction of a "special" block? You could register a unique kind (not sure how to guarantee uniqueness), and attach a metadata node via the context to the first instruction with this kind. Your pass would look for this. 
I have never tried this, so I don't know if predecessor passes that your pass would depend on would affect this metadata; if different threads with their own context would see metadata attached via a specific context; and what the resultant performance effect would be. > > Just a thought > > Garrison > > On Jan 2, 2010, at 10:52, Yannis Mantzouratos wrote: > > > > Hi, > > > > We 're working on an llvm interpreter. We perform some static analysis > > to detect some blocks with a specific property, and we need the > > interpreter to be able to recognise these blocks fast in time it > > reaches them. We thought of adding a new instruction in the LLVM > > instruction set and put it in the beginning of such blocks, so that > > the interpreter would be instantly alerted that the current block is > > 'special'. Is there an easier/quicker way to do this? > > > > Cheers, > > yannis > " if we could attach metadata to a basicblock, this problem will be easily done. so i am going to add support for attaching metadata to function/argument/basicblock. the syntax of attaching metadata to a basicblock will go like this: bbname, !mdname !md [, !othermdname !othermd ...]: for example: entry, !foo !bar: ; attach bar to entry block and entry, !foo !bar, !foo1 !bar1: ; attach bar and bar1 to entry block and we could add a function/argument attribute named "metadata" for attaching metadata to function or argument. define void @functionname() metadata(!mdname !md, [, !othermdname !othermd ...]) [other funtion attributes] { ... } declare i32 @functionname(i8 metadata(!mdname !md, [, !othermdname !othermd ...]) [other parameter attributes]) for example: define void @f() metadata(!foo !bar) { ... } ;attach bar to function f define void @f(i8 metadata(!foo !bar) a) { ... } ;attach bar to argument a so, i think i shoud: 1.change "typedef DenseMap MDStoreTy; in MetadataContextImpl" to "typedef DenseMap MDStoreTy;" and the corresponding method in MetadataContextImpl and MetadataContext? 
2. modify the code of the asm reader/writer
3. modify the bitcode reader/writer

any comment or advice is appreciated

best regards

--ether

From russell.wallace at gmail.com  Mon Jan  4 06:41:58 2010
From: russell.wallace at gmail.com (Russell Wallace)
Date: Mon, 4 Jan 2010 12:41:58 +0000
Subject: [LLVMdev] Getting Kaleidoscope to compile
In-Reply-To:
References: <8d71341e1001040219v50c312ecp299d353a3ea6f64a@mail.gmail.com>
Message-ID: <8d71341e1001040441h6cd72285gfe38fe4bf61d6d3@mail.gmail.com>

Right, it works with 2.6, thanks.

On Mon, Jan 4, 2010 at 10:45 AM, Anton Korobeynikov wrote:
> Hello
>
>> Am I doing something wrong?
> Yes
>
>> Operating system: Ubuntu 9.04
>> LLVM was obtained with: apt-get install llvm
> Consider checking out code from svn or at least use package for LLVM
> 2.6 release (as I can see, 9.04 has only 2.5)
>
> --
> With best regards, Anton Korobeynikov
> Faculty of Mathematics and Mechanics, Saint Petersburg State University
>

From etherzhhb at gmail.com  Mon Jan  4 07:37:31 2010
From: etherzhhb at gmail.com (ether)
Date: Mon, 04 Jan 2010 21:37:31 +0800
Subject: [LLVMdev] Re: Re: [LLVMdev] How to bind a register variable with a given general purpose register?
In-Reply-To: <54D0ADF2C20D3C4DB6883644D1F95F810109EEA3@OVTEX-CLUSTER.ovt.com>
References: , <4B41A87F.5080808@gmail.com> <54D0ADF2C20D3C4DB6883644D1F95F810109EEA1@OVTEX-CLUSTER.ovt.com>, <4B41D106.7060900@gmail.com> <54D0ADF2C20D3C4DB6883644D1F95F810109EEA3@OVTEX-CLUSTER.ovt.com>
Message-ID: <4B41EF1B.2000407@gmail.com>

or you can just add a built-in function "readadcreg()" in the c frontend
and emit it as llvm.zhu.readadcreg(), so you can just call the built-in
function to get the adc register value.

On 2010-1-4 21:27, Demon(Xiangyang) Zhu wrote:
> Hi Ether,
>
> Thank you very much. I will try it soon as your description.
>
> Regards
>
> ________________________________________
> From: ether [etherzhhb at gmail.com]
> Sent: 2010-01-04 3:29
> To: Demon(Xiangyang) Zhu
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: Re: [LLVMdev] How to bind a register variable with a given general purpose register?
>
> hi zhu,
>
> i am not sure if your c frontend support "int ADC asm("GPR13");"
> i think you could:
>
> 1. add a attribute "GPR13"(or a more meaningful name like "adcreg") and
> the corresponding handler, so you code "int ADC asm("GPR13");" became
> "int ADC __attribute__((GPR13));"
> 2. add a intrinsic functions like "llvm.zhu.readadcreg()".
> 3. map any read to the variable marked with __attribute__((GPR13)) to
> llvm.zhu.readadcreg() instead of "load" instruction
> 4. assign gpr13 to a special register class like "gpr13class" instead of
> gprclass, so gpr13 will not be allocated as other gpr.
> 5. lower instrinisic functions "llvm.zhu.readadcreg()" to register node
> "gpr13" in your backend.
>
> regards
>
> --ether
>
> On 2010-1-4 18:01, Demon(Xiangyang) Zhu wrote:
>
>> Hi Ether,
>>
>> The hardware had been fixed now.
>> If map it to memory space, it will cost another instruction cycle to execute 'a = ADCREG',
>> Algorithm accesses the A/D convert very frequently. To get higher precision at lower frequency
>> A/D output is connected to R13 directly. For the moment algorithm code is something like
>> assemble and difficult to maintain. I want to setup a c environment for it, but i don't know
>> how to bind a register variable with a given general purpose register.
>>
>> Thanks!
>>
>> ________________________________________
>> From: llvmdev-bounces at cs.uiuc.edu [llvmdev-bounces at cs.uiuc.edu] on behalf of ether [etherzhhb at gmail.com]
>> Sent: 2010-01-04 0:36
>> To: llvmdev at cs.uiuc.edu
>> Subject: Re: [LLVMdev] How to bind a register variable with a given general purpose register?
>> >> hi zhu, >> >> i think you should map the peripheral registers to data memory space >> instead of register file, then mark that memory address as "volatile". >> like said the adc register was mapped to address 0xc0000, and then the >> corresponding c source will goes like this: >> //define the register >> #define ADCREG (*((volatile unsigned int *)0xc0000)) >> //read the register >> a = ADCREG >> >> --ether >> >> On 2010-1-4 10:59, Heyu Zhu wrote: >> >> >>> Hi everyone, >>> There are 16 GPRs in my RISC, but in fact GPR13 is read-only and >>> connected to output of an A/D converter. >>> It would be very convenient if i could bind a register variable with >>> GPR13. >>> Because i am a newbie i don't know how my llvm backend can support that. >>> I plan to implement it as below. >>> A. first declare a global variable in c-code >>> int ADC asm("GPR13"); >>> B. If backend finds a variable is loaded from "GPR13" use GPR13 instead. >>> C. backend can't allocate GPR13 to other variable >>> Is it a foolish method? Is there a better one? 
Please give me some >>> guidance >>> Thanks >>> >>> >>> _______________________________________________ >>> LLVM Developers mailing list >>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>> >>> >>> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From kennethuil at gmail.com Mon Jan 4 07:41:35 2010 From: kennethuil at gmail.com (Kenneth Uildriks) Date: Mon, 4 Jan 2010 07:41:35 -0600 Subject: [LLVMdev] C library function declarations In-Reply-To: <8d71341e1001040243q78a6c0bbma232861ae4c69fc0@mail.gmail.com> References: <8d71341e1001040243q78a6c0bbma232861ae4c69fc0@mail.gmail.com> Message-ID: <400d33ea1001040541x2216fa2bo12c730904aa4f83b@mail.gmail.com> On Mon, Jan 4, 2010 at 4:43 AM, Russell Wallace wrote: > When implementing a language using LLVM as the backend, it is > necessary to give programs written in that language, access to the C > standard library functions. The Kaleidoscope tutorial shows how to do > this for individual functions using extern declarations, but in > general it would be necessary to have those predefined for the full > standard library. Presumably these would contain exactly the > information from the union of C header files. Is there a way, by > parsing the C header files or otherwise, to obtain this information in > the format LLVM expects? > > As an extension of this question, it will be necessary to provide > access to other libraries written in C. The same question applies > there: Is there a way, by parsing C header files or otherwise, to > obtain a list of functions that are defined in a given library, in the > format LLVM expects? > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu ? ? ? ? 
http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

Usually, this would be defined by the language. For instance, C# uses
PInvoke to declare functions that are implemented natively in some
other language, and the programmer usually manually writes PInvoke
declarations for whichever library functions he wants to call.

Many languages try to wrap the standard C libraries and expose
equivalent functionality through the language's own standard library.
This doesn't address third-party libraries, of course.

At any rate, if you want to consume native libraries in any language
without explicitly declaring each native function, you'll need to
parse the C header files. Native object files don't have information
about parameters, calling conventions, etc. that your compiler would
need.

From kennethuil at gmail.com  Mon Jan  4 07:44:55 2010
From: kennethuil at gmail.com (Kenneth Uildriks)
Date: Mon, 4 Jan 2010 07:44:55 -0600
Subject: [LLVMdev] How to bind a register variable with a given general purpose register?
In-Reply-To:
References:
Message-ID: <400d33ea1001040544i2aad9075ka2002bf6961bd9c9@mail.gmail.com>

On Sun, Jan 3, 2010 at 8:59 PM, Heyu Zhu wrote:
> Hi everyone,
>
> There are 16 GPRs in my RISC, but in fact GPR13 is read-only and connected
> to output of an A/D converter.
> It would be very convenient if i could bind a register variable with GPR13.
>
> Because i am a newbie i don't know how my llvm backend can support that.
>
> I plan to implement it as below.
>
> A.  first declare a global variable in c-code
>      int  ADC  asm("GPR13");
> B.  If backend finds a variable is loaded from "GPR13"  use GPR13 instead.
> C.  backend can't allocate GPR13 to other variable
>
> Is it a foolish method? Is there a better one? Please give me some guidance
>
> Thanks
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu
http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > If your end goal is to allow programs to use the output of the A/D converter, my recommendation would be to create a platform-specific intrinsic function that returns the output of the A/D converter, and have your back-end refuse to allocate GPR13 as a register. The intrinsic function would of course simply cough up the value of GPR13, which would then be available for use in the program. From mierle at gmail.com Mon Jan 4 08:40:58 2010 From: mierle at gmail.com (Keir Mierle) Date: Mon, 4 Jan 2010 06:40:58 -0800 Subject: [LLVMdev] C library function declarations In-Reply-To: <400d33ea1001040541x2216fa2bo12c730904aa4f83b@mail.gmail.com> References: <8d71341e1001040243q78a6c0bbma232861ae4c69fc0@mail.gmail.com> <400d33ea1001040541x2216fa2bo12c730904aa4f83b@mail.gmail.com> Message-ID: On Mon, Jan 4, 2010 at 5:41 AM, Kenneth Uildriks wrote: > On Mon, Jan 4, 2010 at 4:43 AM, Russell Wallace > wrote: > > When implementing a language using LLVM as the backend, it is > > necessary to give programs written in that language, access to the C > > standard library functions. The Kaleidoscope tutorial shows how to do > > this for individual functions using extern declarations, but in > > general it would be necessary to have those predefined for the full > > standard library. Presumably these would contain exactly the > > information from the union of C header files. Is there a way, by > > parsing the C header files or otherwise, to obtain this information in > > the format LLVM expects? > > > > As an extension of this question, it will be necessary to provide > > access to other libraries written in C. The same question applies > > there: Is there a way, by parsing C header files or otherwise, to > > obtain a list of functions that are defined in a given library, in the > > format LLVM expects? > > Usually, this would be defined by the language. 
For instance, C# uses > PInvoke to declare functions that are implemented natively in some > other language, and the programmer usually manually writes PInvoke > declarations for whichever library functions he wants to call. > > Many languages try to wrap the standard C libraries and expose > equivalent functionality through the language's own standard library. > This doesn't address third-party libraries, of course. > > At any rate, if you want to consume native libraries in any language > without explicitly declaring each native function, you'll need to > parse the C header files. Native object files don't have information > about parameters, calling conventions, etc. that your compiler would > need. You may want to consider leveraging Clang to parse existing headers. It's still not trivial to get everything working as you described, but by leveraging Clang at least you don't have to parse C yourself. Keir -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100104/ab9e988d/attachment.html From gregory.petrosyan at gmail.com Mon Jan 4 09:32:10 2010 From: gregory.petrosyan at gmail.com (Gregory Petrosyan) Date: Mon, 4 Jan 2010 18:32:10 +0300 Subject: [LLVMdev] [PATCH] test-suite/bullet: fix build in case $LLVM_SRC_ROOT != $LLVM_OBJ_ROOT Message-ID: <20100104153210.GA28233@gregory-laptop> Index: MultiSource/Benchmarks/Bullet/Makefile =================================================================== --- MultiSource/Benchmarks/Bullet/Makefile (revision 92478) +++ MultiSource/Benchmarks/Bullet/Makefile (working copy) @@ -1,6 +1,6 @@ LEVEL = ../../../ PROG = bullet -CPPFLAGS += -Iinclude -DNO_TIME +CPPFLAGS += -I$(PROJ_SRC_DIR)/include -DNO_TIME LDFLAGS = -lstdc++ include $(LEVEL)/Makefile.config From anton at korobeynikov.info Mon Jan 4 11:10:25 2010 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Mon, 4 Jan 2010 20:10:25 +0300 Subject: [LLVMdev] [PATCH] 
test-suite/bullet: fix build in case $LLVM_SRC_ROOT != $LLVM_OBJ_ROOT In-Reply-To: <20100104153210.GA28233@gregory-laptop> References: <20100104153210.GA28233@gregory-laptop> Message-ID: On Mon, Jan 4, 2010 at 18:32, Gregory Petrosyan wrote: > Index: MultiSource/Benchmarks/Bullet/Makefile Applied, thanks! -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From ambika at cse.iitb.ac.in Mon Jan 4 11:37:14 2010 From: ambika at cse.iitb.ac.in (ambika) Date: Mon, 04 Jan 2010 23:07:14 +0530 Subject: [LLVMdev] Help Required for LLVM In-Reply-To: References: <4B40E42C.5020500@cse.iitb.ac.in> <4B41A756.8070801@free.fr> Message-ID: <4B42274A.2040604@cse.iitb.ac.in> I got it figured out.. Thanx for the help !!! From devang.patel at gmail.com Mon Jan 4 12:00:57 2010 From: devang.patel at gmail.com (Devang Patel) Date: Mon, 4 Jan 2010 10:00:57 -0800 Subject: [LLVMdev] Automatic Vectorization In-Reply-To: References: Message-ID: <352a1fb21001041000v711fbe2aq9c3d1eafbdd569d3@mail.gmail.com> On Thu, Dec 17, 2009 at 7:09 AM, Renato Golin wrote: > > I believe that would be a FunctionPass and registered in the > LoopDependencyAnalysis "runOnLoop()", so it can run when such pass is > called by the PassManager. Or should it be a completely separate pass > (VectorizationPass?) so we can control it from a separate command-line > flag? > A separate VectorizationPass that requires dependence analysis is the way to go. 
- Devang From jyasskin at google.com Mon Jan 4 12:06:37 2010 From: jyasskin at google.com (Jeffrey Yasskin) Date: Mon, 4 Jan 2010 12:06:37 -0600 Subject: [LLVMdev] Tail Call Optimisation In-Reply-To: <201001041103.38122.jon@ffconsultancy.com> References: <4283E0E3-797D-4C7A-A885-6B8BD335C213@gmail.com> <201001040450.07699.jon@ffconsultancy.com> <201001041103.38122.jon@ffconsultancy.com> Message-ID: On Mon, Jan 4, 2010 at 5:03 AM, Jon Harrop wrote: > On Monday 04 January 2010 05:16:40 Jeffrey Yasskin wrote: >> On Sun, Jan 3, 2010 at 10:50 PM, Jon Harrop wrote: >> > LLVM's TCO already handles mutual recursion. >> >> Only for fastcc functions > > Yes. > >> compiled with -tailcallopt, right? > > If you use the compiler, yes. > >> http://llvm.org/docs/CodeGenerator.html#tailcallopt >> >> I believe gcc manages to support tail calls in many more cases, and >> this restriction in llvm confuses lots of newcomers. It would be very >> worthwhile if someone wanted to remove it. > > That's interesting. What tail calls can be supported without changing the > calling convention Simon's original example, for one. See below for the C and assembly at gcc -O3. Not all tail calls can be supported, and I can't find a recent authoritative list of the exact restrictions, but http://www.complang.tuwien.ac.at/schani/diplarb.ps has a list from 2001. > and would it not simply be easier to switch to the fastcc > convention between internal functions to achieve the same effect from outside > LLVM? Possibly, but not all functions can be internal. > Conversely, if TCO is implemented for the cc convention, what will be > the point of fastcc? I don't think we should be omitting optimizations because they might make a calling convention obsolete. (Although in this case, there are still some extra calls a different calling convention can make tail calls, and fastcc still improves the x86-32 call sequence.) 
$ cat test.c
#include <stdio.h>

char* a(int n) __attribute__((noinline));
char* b(int n) __attribute__((noinline));

char* a(int n) {
  if (n <= 0) {
    return "DONE";
  } else {
    return b(n - 1);
  }
}

char* b(int n) {
  if (n <= 0) {
    return "DONE";
  } else {
    return a(n - 1);
  }
}

int main() {
  puts(a(10000000));
  return 0;
}
$ gcc -v
Using built-in specs.
Target: i386-apple-darwin9
Configured with: ../gcc-4.4.1/configure --prefix=/opt/local --build=i386-apple-darwin9 --enable-languages=c,c++,objc,obj-c++,java,fortran --libdir=/opt/local/lib/gcc44 --includedir=/opt/local/include/gcc44 --infodir=/opt/local/share/info --mandir=/opt/local/share/man --with-local-prefix=/opt/local --with-system-zlib --disable-nls --program-suffix=-mp-4.4 --with-gxx-include-dir=/opt/local/include/gcc44/c++/ --with-gmp=/opt/local --with-mpfr=/opt/local
Thread model: posix
gcc version 4.4.1 (GCC)
$ gcc -Wall -O3 test.c -o test.s -S
$ cat test.s
	.cstring
LC0:
	.ascii "DONE\0"
	.text
	.align 4,0x90
.globl _b
_b:
	pushl	%ebp
	movl	%esp, %ebp
	subl	$8, %esp
	movl	8(%ebp), %eax
	call	___i686.get_pc_thunk.cx
"L00000000001$pb":
	testl	%eax, %eax
	jle	L6
	subl	$1, %eax
	movl	%eax, 8(%ebp)
	leave
	jmp	_a
	.align 4,0x90
L6:
	leal	LC0-"L00000000001$pb"(%ecx), %eax
	leave
	ret
	.align 4,0x90
.globl _a
_a:
	pushl	%ebp
	movl	%esp, %ebp
	subl	$8, %esp
	movl	8(%ebp), %eax
	call	___i686.get_pc_thunk.cx
"L00000000002$pb":
	testl	%eax, %eax
	jle	L11
	subl	$1, %eax
	movl	%eax, 8(%ebp)
	leave
	jmp	_b
	.align 4,0x90
L11:
	leal	LC0-"L00000000002$pb"(%ecx), %eax
	leave
	ret
	.align 4,0x90
.globl _main
_main:
	pushl	%ebp
	movl	%esp, %ebp
	subl	$24, %esp
	movl	$10000000, (%esp)
	call	_a
	movl	%eax, (%esp)
	call	L_puts$stub
	xorl	%eax, %eax
	leave
	ret
	.picsymbol_stub
L_puts$stub:
	.indirect_symbol _puts
	call	LPC$1
LPC$1:
	popl	%eax
	movl	L1$lz-LPC$1(%eax),%edx
	jmp	*%edx
L_puts$stub_binder:
	lea	L1$lz-LPC$1(%eax),%eax
	pushl	%eax
	jmp	dyld_stub_binding_helper
	.lazy_symbol_pointer
L1$lz:
	.indirect_symbol _puts
	.long	L_puts$stub_binder
	.subsections_via_symbols
	.section
__TEXT,__textcoal_nt,coalesced,pure_instructions .weak_definition ___i686.get_pc_thunk.cx .private_extern ___i686.get_pc_thunk.cx ___i686.get_pc_thunk.cx: movl (%esp), %ecx ret $ The results with -m64 are similar. From sramij at hotmail.com Mon Jan 4 12:50:05 2010 From: sramij at hotmail.com (rami jiossy) Date: Mon, 4 Jan 2010 18:50:05 +0000 Subject: [LLVMdev] change type allocoted register Message-ID: Hi; i am using llvm backend on x86 arch. My app ABI requires float2 (v2f32) to be passes as parameter and return in XMM0 register. Currently LLVM handles v2f32 using MMX register MM0. i wonder what changes do i need to do in LLVM to support that change; manipulating v2f32 (float2) using XMM and not MMX ? one place i identifies where a change needs to be done is X86CallingConv.td where it define CC and RetCC . Thanks _________________________________________________________________ Hotmail: Trusted email with powerful SPAM protection. http://clk.atdmt.com/GBL/go/177141665/direct/01/ -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100104/c22dadbf/attachment.html From gohman at apple.com Mon Jan 4 13:44:24 2010 From: gohman at apple.com (Dan Gohman) Date: Mon, 4 Jan 2010 11:44:24 -0800 Subject: [LLVMdev] "Graphite" for llvm In-Reply-To: <4B394C54.9060500@fim.uni-passau.de> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <26944_1261905486_4B37264E_26944_9213_1_4B372649.1010600@gmail.com> <4B394C54.9060500@fim.uni-passau.de> Message-ID: On Dec 28, 2009, at 4:24 PM, Tobias Grosser wrote: > > Probably. I think for single dimensional arrays it will not be too > difficult using scalar evolution to get the access functions. I think > multi dimensional arrays will get complicated. 
If you want to know how the address is calculated as a function of each
enclosing loop, ScalarEvolution should be quite usable for multiple
dimensions. This is represented with an add-recurrence with another
add-recurrence as its "start" operand. For example:

  for (i=0; i<m; ++i)      // loop X
    for (j=0; j<n; ++j)    // loop Y
      ... A[i][j] ...

  {{A,+,sizeof(double)*n}<X>,+,sizeof(double)}<Y>

This says that the value starts at A, steps by sizeof(double)*n
(address units) with the iteration of loop X, and steps by
sizeof(double) with the iteration of loop Y.

However, if you want to recognize this as a high-level array reference
on A with subscripts "i" and then "j", there are some missing pieces.

On a related note, analyzing getelementptr yourself directly is doable,
but there are several major complications. Using a helper library such
as ScalarEvolution can protect you from many low-level artifacts (though
not all).

Dan

From grosser at fim.uni-passau.de  Mon Jan  4 15:07:56 2010
From: grosser at fim.uni-passau.de (Tobias Grosser)
Date: Mon, 4 Jan 2010 22:07:56 +0100
Subject: [LLVMdev] [PATCH] Add InstCombine to CMake.
Message-ID: <1262639276-33940-1-git-send-email-grosser@fim.uni-passau.de>

Fixes build of bugpoint, llvm-ld and opt.

OK for commit?
---
 CMakeLists.txt                |    1 +
 tools/bugpoint/CMakeLists.txt |    2 +-
 tools/llvm-ld/CMakeLists.txt  |    2 +-
 tools/opt/CMakeLists.txt      |    2 +-
 4 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 9bce039..0edd509 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -295,6 +295,7 @@ add_subdirectory(lib/CodeGen/AsmPrinter)
 add_subdirectory(lib/Bitcode/Reader)
 add_subdirectory(lib/Bitcode/Writer)
 add_subdirectory(lib/Transforms/Utils)
+add_subdirectory(lib/Transforms/InstCombine)
 add_subdirectory(lib/Transforms/Instrumentation)
 add_subdirectory(lib/Transforms/Scalar)
 add_subdirectory(lib/Transforms/IPO)
diff --git a/tools/bugpoint/CMakeLists.txt b/tools/bugpoint/CMakeLists.txt
index 90f24ba..fd32a68 100644
--- a/tools/bugpoint/CMakeLists.txt
+++ b/tools/bugpoint/CMakeLists.txt
@@ -1,4 +1,4 @@
-set(LLVM_LINK_COMPONENTS asmparser instrumentation scalaropts ipo
+set(LLVM_LINK_COMPONENTS asmparser instcombine instrumentation scalaropts ipo
   linker bitreader bitwriter)
 set(LLVM_REQUIRES_EH 1)
diff --git a/tools/llvm-ld/CMakeLists.txt
b/tools/llvm-ld/CMakeLists.txt index 2ae4a1d..257d1e9 100644 --- a/tools/llvm-ld/CMakeLists.txt +++ b/tools/llvm-ld/CMakeLists.txt @@ -1,4 +1,4 @@ -set(LLVM_LINK_COMPONENTS ipo scalaropts linker archive bitwriter) +set(LLVM_LINK_COMPONENTS ipo scalaropts instcombine linker archive bitwriter) add_llvm_tool(llvm-ld Optimize.cpp diff --git a/tools/opt/CMakeLists.txt b/tools/opt/CMakeLists.txt index 0570d0e..912acfa 100644 --- a/tools/opt/CMakeLists.txt +++ b/tools/opt/CMakeLists.txt @@ -1,4 +1,4 @@ -set(LLVM_LINK_COMPONENTS bitreader asmparser bitwriter instrumentation scalaropts ipo) +set(LLVM_LINK_COMPONENTS bitreader asmparser bitwriter instcombine instrumentation scalaropts ipo) add_llvm_tool(opt AnalysisWrappers.cpp -- 1.6.5.3 From foom at fuhm.net Mon Jan 4 15:13:40 2010 From: foom at fuhm.net (James Y Knight) Date: Mon, 4 Jan 2010 16:13:40 -0500 Subject: [LLVMdev] ASM output with JIT / codegen barriers In-Reply-To: <74c447501001040135p25c84ae4xc7fc97dea43443dc@mail.gmail.com> References: <74c447501001040135p25c84ae4xc7fc97dea43443dc@mail.gmail.com> Message-ID: <3CBE5BD9-6ACD-45BB-9D30-B0A099AB6B60@fuhm.net> On Jan 4, 2010, at 4:35 AM, Chandler Carruth wrote: > Responding to the original email... > > On Sun, Jan 3, 2010 at 10:10 PM, James Y Knight wrote: >> In working on an LLVM backend for SBCL (a lisp compiler), there are >> certain sequences of code that must be atomic with regards to async >> signals. > > Can you define exactly what 'atomic with regards to async signals' > this entails? Your descriptions led me to think you may mean something > other than the POSIX definition, but maybe I'm just misinterpreting > it. Are these signals guaranteed to run in the same thread? On the > same processor? Is there concurrent code running in the address space > when they run? Hi, thanks everyone for all the comments. I think maybe I wasn't clear that I *only* care about atomicity w.r.t. a signal handler interruption in the same thread, *not* across threads. 
Therefore, many of the problems of cross-CPU atomicity are not
relevant. The signal handler gets invoked via pthread_kill, and is
thus necessarily running in the same thread as the code being
interrupted. The memory in question can be considered thread-local
here, so I'm not worried about other threads touching it at all.

I also realize I had (at least :) one error in my original email: of
course, the atomic operations llvm provides *ARE* guaranteed to do the
right thing w.r.t. atomicity against signal handlers...they in fact
just do more than I need, not less. I'm not sure why I thought they
were both more and less than I needed before, and sorry if it confused
you about what I'm trying to accomplish.

Here's a concrete example, in hopes it will clarify matters:

@pseudo_atomic = thread_local global i64 0
declare i64* @alloc(i64)
declare void @do_pending_interrupt()
declare i64 @llvm.atomic.load.sub.i64.p0i64(i64* nocapture, i64) nounwind
declare void @llvm.memory.barrier(i1, i1, i1, i1, i1)

define i64* @foo() {
  ;; Note that we're in an allocation section
  store i64 1, i64* @pseudo_atomic
  ;; Barrier only to ensure instruction ordering, not needed as a true
  ;; memory barrier
  call void @llvm.memory.barrier(i1 0, i1 0, i1 0, i1 1, i1 0)

  ;; Call might actually be inlined, so cannot depend upon unknown call
  ;; causing correct codegen effects.
  %obj = call i64* @alloc(i64 32)
  %obj_header = getelementptr i64* %obj, i64 0
  store i64 5, i64* %obj_header ;; store obj type (5) in header word
  %obj_len = getelementptr i64* %obj, i64 1
  store i64 2, i64* %obj_len ;; store obj length (2) in length slot
  ...etc...

  ;; Check if we were interrupted:
  %res = call i64 @llvm.atomic.load.sub.i64.p0i64(i64* @pseudo_atomic, i64 1)
  %was_interrupted = icmp eq i64 %res, 1
  br i1 %was_interrupted, label %do-interruption, label %continue

continue:
  ret i64* %obj

do-interruption:
  call void @do_pending_interrupt()
  br label %continue
}

A signal handler will check the thread-local @pseudo_atomic variable:
if it was already set it will just change the value to 2 and return,
waiting to be reinvoked by do_pending_interrupt at the end of the
pseudo-atomic section. This is because it may get confused by the
proto-object being built up in this code.

This sequence that SBCL does today with its internal codegen is
basically like:

  MOV , 1
  [[do allocation, fill in object, etc]]
  XOR , 1
  JEQ continue
  <>
continue:
  ...

The important things here are:
1) Stores cannot be migrated from within the MOV/XOR instructions to
outside by the codegen.
2) There's no way an interruption can be missed: the XOR is atomic
with regards to signals executing in the same thread, it's either
fully executed or not (both load+store). But I don't care whether it's
visible on other CPUs or not: it's a thread-local variable in any case.

Those are the two properties I'd like to get from LLVM, without
actually ever invoking superfluous processor synchronization.

> The processor can reorder memory operations as well (within limits).
> Consider that 'memset' to zero is often codegened to a non-temporal
> store to memory. This exempts it from all ordering considerations

My understanding is that processor reordering only affects what you
might see from another CPU: the processor will undo speculatively
executed operations if the sequence of instructions actually executed
is not the sequence it predicted, so within a single CPU you should
never be able to tell the difference.

But I must admit I don't know anything about non-temporal stores.
Within a single thread, if I do a non-temporal store, followed by a
load, am I not guaranteed to get back the value I stored?

James

From dag at cray.com Mon Jan 4 16:09:10 2010
From: dag at cray.com (David Greene)
Date: Mon, 4 Jan 2010 16:09:10 -0600
Subject: [LLVMdev] Assembly Printer
In-Reply-To:
References:
Message-ID: <201001041609.10596.dag@cray.com>

On Sunday 03 January 2010 01:00, Chris Lattner wrote:
> On Jan 1, 2010, at 12:51 PM, mmms1841 wrote:
> > I am trying to understand how LLVM does code generation and I have a
> > couple of questions. I am using LLVM 2.6.
> >
> > First,
> > if I want to change the name of an instruction, all I need to do is to
> > modify the XXXInstrInfo.td, right? Using Sparc as an example, if I
> > wanted to output "mysra" instead of "sra", in SparcInstrInfo.td, I would
> > write,
> >
> > defm SRA : F3_12<"mysra", 0b100111, sra>;
> >
> > Is this correct?
>
> Yes.

IMHO, this is a poor way to do this kind of thing. It eventually
leads to confusion where someone thinks SRA means "sra" and someone
else thinks it means "mysra." It gets worse as "mysra" acquires
subtly different semantics than "sra." Better to write a separate
pattern and use AddedComplexity to prefer it.

Just a nugget of wisdom from personal experience.
:)

-Dave

From grosser at fim.uni-passau.de Mon Jan 4 16:31:31 2010
From: grosser at fim.uni-passau.de (Tobias Grosser)
Date: Mon, 04 Jan 2010 23:31:31 +0100
Subject: [LLVMdev] "Graphite" for llvm
In-Reply-To: <2489_1262634262_4B424516_2489_6097_1_E715EC8F-EFC5-4290-9C15-7A9A9072B062@apple.com>
References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com>
 <4B36837A.2020707@fim.uni-passau.de>
 <26944_1261905486_4B37264E_26944_9213_1_4B372649.1010600@gmail.com>
 <4B394C54.9060500@fim.uni-passau.de>
 <2489_1262634262_4B424516_2489_6097_1_E715EC8F-EFC5-4290-9C15-7A9A9072B062@apple.com>
Message-ID: <4B426C43.7090401@fim.uni-passau.de>

On 01/04/10 20:44, Dan Gohman wrote:
>
> On Dec 28, 2009, at 4:24 PM, Tobias Grosser wrote:
>>
>> Probably. I think for single dimensional arrays it will not be too
>> difficult using scalar evolution to get the access functions. I think
>> multi dimensional arrays will get complicated.
>
> If you want to know how the address is calculated as a function of
> each enclosing loop, ScalarEvolution should be quite usable for
> multiple dimensions. This is represented with an add-recurrence
> with another add-recurrence as its "start" operand. For example:
>
>   for (i=0; i<...; ...)
>     for (j=0; j<...; ...)
>       A[i][j];
>
> The store address has this expression:
>
>   {{A,+,sizeof(double)*n}<X>,+,sizeof(double)}<Y>
>
> This says that the value starts at A, steps by sizeof(double)*n
> (address units) with the iteration of loop X, and steps by
> sizeof(double) with the iteration of loop Y.
>
> However, if you want to recognize this as a high-level array
> reference on A with subscripts "i" and then "j", there are some
> missing pieces.

You are right, starting with ScalarEvolution is the right approach.
However, I believe the high-level array analysis might be useful to
get simpler expressions. I have to think about this later on.
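
As a concrete reading of the nested add-recurrence quoted above, here
is a minimal sketch in plain C++ (it does not use the ScalarEvolution
API). It evaluates {{A,+,sizeof(double)*n},+,sizeof(double)} at outer
iteration i and inner iteration j and checks the result against the
address of A[i][j] for a row-major array with n columns. The names
addrecAddress and addrecMatchesArray and the array sizes are
assumptions chosen for illustration, and treating addresses as
integers assumes a flat address space.

```cpp
#include <cstddef>
#include <cstdint>

// Evaluate {{base,+,outerStep},+,innerStep} at outer iteration i and
// inner iteration j: start at base, add outerStep once per outer
// iteration and innerStep once per inner iteration.
std::uintptr_t addrecAddress(std::uintptr_t base, std::size_t outerStep,
                             std::size_t innerStep, std::size_t i,
                             std::size_t j) {
  return base + i * outerStep + j * innerStep;
}

// Compare the recurrence with the real element addresses of a small
// row-major array (sizes are hypothetical).
bool addrecMatchesArray() {
  const std::size_t m = 3, n = 4;
  static double A[m][n];
  const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(&A[0][0]);
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      const std::uintptr_t expected =
          reinterpret_cast<std::uintptr_t>(&A[i][j]);
      if (addrecAddress(base, sizeof(double) * n, sizeof(double), i, j) !=
          expected)
        return false;
    }
  }
  return true;
}
```

The outer recurrence supplies the row offset and the inner one the
column offset, which is why recovering the subscripts "i" and "j" from
the flat expression needs extra information such as the row length n.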
> On a related note, analyzing getelementptr yourself directly is > doable, but there are several major complications. Using a helper > library such as ScalarEvolution can protect you from many > low-level artifacts (though not all). Sure, it is a great tool. Tobias From anton at korobeynikov.info Mon Jan 4 16:49:22 2010 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Tue, 5 Jan 2010 01:49:22 +0300 Subject: [LLVMdev] [PATCH] Add InstCombine to CMake. In-Reply-To: <1262639276-33940-1-git-send-email-grosser@fim.uni-passau.de> References: <1262639276-33940-1-git-send-email-grosser@fim.uni-passau.de> Message-ID: > Fixes build of bugpoint, llvm-ld and opt. > > OK for commit? Looks good for me -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From dag at cray.com Mon Jan 4 17:14:27 2010 From: dag at cray.com (David Greene) Date: Mon, 4 Jan 2010 17:14:27 -0600 Subject: [LLVMdev] Metadata Message-ID: <201001041714.28067.dag@cray.com> Is there some documentation about metadata and how to use it somewhere? The doxygen-generated stuff obviously doesn't specify things like how to create custom metadata kinds and all that jazz. As a few of us discussed at the latest dev meeting, I'd like to use custom metadata to annotate load/store instructions with "nontemporal" semantics to allow generation of instructions like MOVNT on x86-based targets. Losing the "nontemporal" annotation won't affect correctness so it seems a prime candidate for metadata. One question: should "nontemporal" be "native" metadata or "external?" 
-Dave From clattner at apple.com Mon Jan 4 19:22:28 2010 From: clattner at apple.com (Chris Lattner) Date: Mon, 4 Jan 2010 17:22:28 -0800 Subject: [LLVMdev] Metadata In-Reply-To: <201001041714.28067.dag@cray.com> References: <201001041714.28067.dag@cray.com> Message-ID: <85C8544F-60AF-404C-904E-21AB71DFBCC2@apple.com> On Jan 4, 2010, at 3:14 PM, David Greene wrote: > Is there some documentation about metadata and how to use it > somewhere? > The doxygen-generated stuff obviously doesn't specify things like how > to create custom metadata kinds and all that jazz. The metadata design and APIs are still evolving. I wouldn't recommend building anything on it just yet, but I'm hoping it will finalize and stabilize this month. The high-level idea and proposal was here: http://nondot.org/sabre/LLVMNotes/ExtensibleMetadata.txt > As a few of us discussed at the latest dev meeting, I'd like to use > custom metadata to annotate load/store instructions with "nontemporal" > semantics to allow generation of instructions like MOVNT on x86-based > targets. Losing the "nontemporal" annotation won't affect correctness > so it seems a prime candidate for metadata. > > One question: should "nontemporal" be "native" metadata or "external?" This seems like a very reasonable use of metadata to me! -Chris From dag at cray.com Mon Jan 4 19:39:38 2010 From: dag at cray.com (David Greene) Date: Mon, 4 Jan 2010 19:39:38 -0600 Subject: [LLVMdev] Metadata In-Reply-To: <85C8544F-60AF-404C-904E-21AB71DFBCC2@apple.com> References: <201001041714.28067.dag@cray.com> <85C8544F-60AF-404C-904E-21AB71DFBCC2@apple.com> Message-ID: <201001041939.38839.dag@cray.com> On Monday 04 January 2010 19:22, Chris Lattner wrote: > On Jan 4, 2010, at 3:14 PM, David Greene wrote: > > Is there some documentation about metadata and how to use it > > somewhere? > > The doxygen-generated stuff obviously doesn't specify things like how > > to create custom metadata kinds and all that jazz. 
>
> The metadata design and APIs are still evolving. I wouldn't recommend
> building anything on it just yet, but I'm hoping it will finalize and
> stabilize this month. The high-level idea and proposal was here:
> http://nondot.org/sabre/LLVMNotes/ExtensibleMetadata.txt

Cool. Is the goal to have it ready for 2.7? The nontemporal stuff is
a rather large cause of merge conflicts for us and it would be nice to
get it into the public repository before 2.7 is out.

-Dave

From erwin.coumans at gmail.com Mon Jan 4 20:11:43 2010
From: erwin.coumans at gmail.com (Erwin Coumans)
Date: Mon, 4 Jan 2010 18:11:43 -0800
Subject: [LLVMdev] Help adding the Bullet physics sdk benchmark to the LLVM test suite?
In-Reply-To:
References: <419a36b40912151529i38ba5768p2925a0066bc33cf3@mail.gmail.com>
 <419a36b40912151647i510c5bb7vd7a8b853195cee6c@mail.gmail.com>
Message-ID: <419a36b41001041811n1c7272e9ud7a64d2c0069ded9@mail.gmail.com>

Hi Anton, and happy new year all,

>>One question though: is it possible to "verify" the results of all
>>the computations somehow?

Good point, and there is no automated way currently, but we can work on
that. Note that simulation suffers from the 'butterfly effect', so the
smallest change anywhere (cpu, compiler etc) diverges into totally
different results after a while.

There are a few ways of verification I can think of:

1) verifying by adding unit tests for all stages in the physics
pipeline (broadphase acceleration structures, closest point
computation, constraint solver). Given known input and output we can
check if the solution is within a certain tolerance.

2) using the benchmark simulation and verifying the results frame by
frame and checking for unusual behaviour

3) modifying the benchmark so that it is easier to test the end result,
even though it might be different. For example, we can drop a number
of boxes above a bowl, and after a while make sure all boxes are 'in'
the bowl in a resting pose.

What are your thoughts?
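
The tolerance check mentioned in option 1 could look roughly like the
following. This is only a sketch under stated assumptions: the
function names withinTolerance and allWithinTolerance and the combined
absolute/relative tolerance rule are illustrative choices, not part of
Bullet or the LLVM test suite.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// True when `actual` is close enough to `expected`. The allowed error
// is the larger of an absolute tolerance and a relative tolerance
// scaled by the magnitude of the values. NaN inputs compare as failing.
bool withinTolerance(double actual, double expected, double absTol,
                     double relTol) {
  const double diff = std::fabs(actual - expected);
  const double scale = std::max(std::fabs(actual), std::fabs(expected));
  return diff <= std::max(absTol, relTol * scale);
}

// Element-wise check of one pipeline stage's output against a stored
// reference result of the same length.
bool allWithinTolerance(const std::vector<double>& actual,
                        const std::vector<double>& expected, double absTol,
                        double relTol) {
  if (actual.size() != expected.size())
    return false;
  for (std::size_t k = 0; k < actual.size(); ++k) {
    if (!withinTolerance(actual[k], expected[k], absTol, relTol))
      return false;
  }
  return true;
}
```

A per-stage unit test would feed a known input to the stage, collect
its output into a vector, and compare it with a recorded reference
using tolerances chosen for that stage.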
Thanks,
Erwin

2009/12/19 Anton Korobeynikov

> Hello, Erwin
>
> > If you are interested, I think it is best to start with Bullet 2.75.
> > If it turns out that LLVM requires some modifications (due to current C++
> > limitations),
> > we can modify Bullet and go for an upcoming release such as Bullet 2.76
> > (planned around January 2010).
> I added bullet to LLVM testsuite. Basically I had to flatten source
> directories since this is a current requirement of the llvm testsuite
> harness.
> Some include paths tweaks were required due to this. Also, I disabled
> the time reports, since otherwise we cannot compare the outputs.
>
> bullet appeared to be ~20% slower for me compared to gcc 4.2.4, so,
> definitely something should be worked on :)
>
> One question though: is it possible to "verify" the results of all
> the computations somehow? We need to care not only about speed, but
> about correctness too :)
>
> --
> With best regards, Anton Korobeynikov
> Faculty of Mathematics and Mechanics, Saint Petersburg State University
>

From clattner at apple.com Mon Jan 4 20:15:16 2010
From: clattner at apple.com (Chris Lattner)
Date: Mon, 4 Jan 2010 18:15:16 -0800
Subject: [LLVMdev] Metadata
In-Reply-To: <201001041939.38839.dag@cray.com>
References: <201001041714.28067.dag@cray.com>
 <85C8544F-60AF-404C-904E-21AB71DFBCC2@apple.com>
 <201001041939.38839.dag@cray.com>
Message-ID:

On Jan 4, 2010, at 5:39 PM, David Greene wrote:
>> The metadata design and APIs are still evolving. I wouldn't
>> recommend
>> building anything on it just yet, but I'm hoping it will finalize and
>> stabilize this month. The high-level idea and proposal was here:
>> http://nondot.org/sabre/LLVMNotes/ExtensibleMetadata.txt
>
> Cool. Is the goal to have it ready for 2.7?
Yep,

> The nontemporal stuff is
> a rather large cause of merge conflicts for us and it would be nice to
> get it into the public repository before 2.7 is out.

Nice!

-Chris

From dag at cray.com Mon Jan 4 20:24:23 2010
From: dag at cray.com (David Greene)
Date: Mon, 4 Jan 2010 20:24:23 -0600
Subject: [LLVMdev] Help adding the Bullet physics sdk benchmark to the LLVM test suite?
In-Reply-To: <419a36b41001041811n1c7272e9ud7a64d2c0069ded9@mail.gmail.com>
References: <419a36b40912151529i38ba5768p2925a0066bc33cf3@mail.gmail.com>
 <419a36b41001041811n1c7272e9ud7a64d2c0069ded9@mail.gmail.com>
Message-ID: <201001042024.23451.dag@cray.com>

On Monday 04 January 2010 20:11, Erwin Coumans wrote:
> Hi Anton, and happy new year all,
>
> >>One question though: is it possible to "verify" the results of all
> >>the computations somehow?
>
> Good point, and there is no automated way currently, but we can work on
> that.
> Note that simulation suffers from the 'butterfly effect', so the smallest
> change anywhere,
> (cpu, compiler etc) diverges into totally different results after a while.

I haven't been following this thread, but this sounds like a typical
unstable algorithm problem. Are you always operating that close to
the tolerance level of the algorithm or are there some sets of inputs
that will behave reasonably?

If not, the code doesn't seem very useful to me. How could anyone rely
on the results, ever?

In the worst case, you could experiment with different optimization
levels and/or Pass combinations to find something that is reasonably
stable.

Perhaps LLVM needs a flag to disable sometimes undesirable
transformations. Like anything involving floating-point calculations.
Compiler changes should not affect codes so horribly unless the user
tells them to. :)

The Cray compiler provides various -Ofp (-Ofp0, -Ofp1, etc.) levels for
this very reason.
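
To illustrate why a floating-point transformation can change results
at all, here is a small sketch of reassociation, one of the rewrites
that is not value-preserving under IEEE-754 arithmetic. The function
names are assumptions made for this example, and it assumes the code
is compiled without value-changing options such as fast-math.

```cpp
// Sum three values in the order written in the source: (a + b) + c.
double sumLeftToRight(double a, double b, double c) { return (a + b) + c; }

// The same three values after reassociation: a + (b + c).
double sumReassociated(double a, double b, double c) { return a + (b + c); }
```

With a = 1e20, b = -1e20 and c = 1.0, sumLeftToRight returns 1.0
because the two large terms cancel first, while sumReassociated
returns 0.0 because 1.0 is lost when it is added to -1e20. In a
chaotic simulation a difference of this kind in one step is enough to
send later frames down a different path.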
> There are a few ways of verification I can think of: > > 1) verifying by adding unit tests for all stages in the physics pipeline > (broadphase acceleration structures, closest point computation, constraint > solver) > Given known input and output we can check if the solution is within a > certain tolerance. At each stage? That's reasonable. It could also help identify the parts of the pipeline that are unstable (if not already known). > 2) using the benchmark simulation and verifying the results frame by frame > and check for unusual behaviour Sounds expensive. > 3) modify the benchmark so that it is easier to test the end result, even > through it might be different. We really don't want to do this. Either LLVM needs to be fixed to respect floating-point evaluation in unstable cases or the benchmark and upstream code needs to be fixed to be more stable. -Dave From chandlerc at google.com Mon Jan 4 20:43:27 2010 From: chandlerc at google.com (Chandler Carruth) Date: Mon, 4 Jan 2010 18:43:27 -0800 Subject: [LLVMdev] ASM output with JIT / codegen barriers In-Reply-To: <3CBE5BD9-6ACD-45BB-9D30-B0A099AB6B60@fuhm.net> References: <74c447501001040135p25c84ae4xc7fc97dea43443dc@mail.gmail.com> <3CBE5BD9-6ACD-45BB-9D30-B0A099AB6B60@fuhm.net> Message-ID: <74c447501001041843x46950083x23d7b902fb165eb6@mail.gmail.com> On Mon, Jan 4, 2010 at 1:13 PM, James Y Knight wrote: > Hi, thanks everyone for all the comments. I think maybe I wasn't clear that > I *only* care about atomicity w.r.t. a signal handler interruption in the > same thread, *not* across threads. Therefore, many of the problems of > cross-CPU atomicity are not relevant. The signal handler gets invoked via > pthread_kill, and is thus necessarily running in the same thread as the code > being interrupted. The memory in question can be considered thread-local > here, so I'm not worried about other threads touching it at all. Ok, this helps make sense, but it still is confusing to phrase this as "single threaded". 
While the signal handler code may execute exclusively to any other code, it does not share the stack frame, etc. I'd describe this more as two threads of mutually exclusive execution or some such. I'm not familiar with what synchronization occurs as part of the interrupt process, but I'd verify it before making too many assumptions. > This sequence that SBCL does today with its internal codegen is basically > like: > MOV , 1 > [[do allocation, fill in object, etc]] > XOR , 1 > JEQ continue > <> > continue: > ... > > The important things here are: > 1) Stores cannot be migrated from within the MOV/XOR instructions to outside > by the codegen. Basically, this is merely the problem that x86 places a stricter requirement on memory ordering than LLVM. Where x86 requires that stores occur in program order, LLVM reserves the right to change that. I have no idea if it is worthwhile to support memory barriers solely within the flow of execution, but it seems highly suspicious. On at least some non-x86 architectures, I suspect you'll need a memory barrier here anyways, so it seems reasonable to place one anyways. I *highly* doubt these fences are an overriding performance concern on x86, do you have any benchmarks that indicate they are? > 2) There's no way an interruption can be missed: the XOR is atomic with > regards to signals executing in the same thread, it's either fully executed > or not (both load+store). But I don't care whether it's visible on other > CPUs or not: it's a thread-local variable in any case. > > Those are the two properties I'd like to get from LLVM, without actually > ever invoking superfluous processor synchronization. Before we start extending LLVM to support expressing the finest points of the x86 memory model in an optimal fashion given a single thread of execution, I'd really need to see some compelling benchmarks that it is a major performance problem. 
My understanding of the implementation of these aspects of the x86 architecture is that they shouldn't have a particularly high overhead. >> The processor can reorder memory operations as well (within limits). >> Consider that 'memset' to zero is often codegened to a non-temporal >> store to memory. This exempts it from all ordering considerations > > My understanding is that processor reordering only affects what you might > see from another CPU: the processor will undo speculatively executed > operations if the sequence of instructions actually executed is not the > sequence it predicted, so within a single CPU you should never be able tell > the difference. > > But I must admit I don't know anything about non-temporal stores. Within a > single thread, if I do a non-temporal store, followed by a load, am I not > guaranteed to get back the value I stored? If you read the *same address*, then the ordering is guaranteed, but the Intel documentation specifically exempts these instructions from the general rule that writes will not be reordered with other writes. This means that a non-temporal store might be reordered to occur after the "xor" to your atomic integer, even if the instruction came prior to the xor. > > James > From nicholas at mxc.ca Mon Jan 4 20:54:03 2010 From: nicholas at mxc.ca (Nick Lewycky) Date: Mon, 04 Jan 2010 21:54:03 -0500 Subject: [LLVMdev] C library function declarations In-Reply-To: References: <8d71341e1001040243q78a6c0bbma232861ae4c69fc0@mail.gmail.com> <400d33ea1001040541x2216fa2bo12c730904aa4f83b@mail.gmail.com> Message-ID: <4B42A9CB.3070700@mxc.ca> Keir Mierle wrote: > On Mon, Jan 4, 2010 at 5:41 AM, Kenneth Uildriks > wrote: > > On Mon, Jan 4, 2010 at 4:43 AM, Russell Wallace > > wrote: > > When implementing a language using LLVM as the backend, it is > > necessary to give programs written in that language, access to the C > > standard library functions. 
The Kaleidoscope tutorial shows how > to do > > this for individual functions using extern declarations, but in > > general it would be necessary to have those predefined for the full > > standard library. Presumably these would contain exactly the > > information from the union of C header files. Is there a way, by > > parsing the C header files or otherwise, to obtain this > information in > > the format LLVM expects? > > > > As an extension of this question, it will be necessary to provide > > access to other libraries written in C. The same question applies > > there: Is there a way, by parsing C header files or otherwise, to > > obtain a list of functions that are defined in a given library, > in the > > format LLVM expects? > > Usually, this would be defined by the language. For instance, C# uses > PInvoke to declare functions that are implemented natively in some > other language, and the programmer usually manually writes PInvoke > declarations for whichever library functions he wants to call. > > Many languages try to wrap the standard C libraries and expose > equivalent functionality through the language's own standard library. > This doesn't address third-party libraries, of course. > > At any rate, if you want to consume native libraries in any language > without explicitly declaring each native function, you'll need to > parse the C header files. Native object files don't have information > about parameters, calling conventions, etc. that your compiler would > need. > > > You may want to consider leveraging Clang to parse existing headers. > It's still not trivial to get everything working as you described, but > by leveraging Clang at least you don't have to parse C yourself. I can't test it right now, but I believe "clang -femit-all-decls" will do it. 
Nick From arplynn at gmail.com Mon Jan 4 21:25:21 2010 From: arplynn at gmail.com (Alastair Lynn) Date: Tue, 5 Jan 2010 03:25:21 +0000 Subject: [LLVMdev] [llvm-commits] [llvm] r92458 - in /llvm/trunk: lib/Target/README.txt lib/Transforms/Scalar/InstructionCombining.cpp test/Transforms/InstCombine/or.ll In-Reply-To: <2EF237CB-17EF-4CA7-8DDF-2F48708A454E@apple.com> References: <201001040604.o04641Gs003776@zion.cs.uiuc.edu> <2EF237CB-17EF-4CA7-8DDF-2F48708A454E@apple.com> Message-ID: <599FDDC8-A86D-4785-85DE-1D0C3A3965EF@gmail.com> Hi Bill- For what it's worth, a simple truth table proves Chris correct. Alastair On 5 Jan 2010, at 02:46, Bill Wendling wrote: > On Jan 3, 2010, at 10:04 PM, Chris Lattner wrote: > >> Author: lattner >> Date: Mon Jan 4 00:03:59 2010 >> New Revision: 92458 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=92458&view=rev >> Log: >> implement an instcombine xform needed by clang's codegen >> on the example in PR4216. This doesn't trigger in the testsuite, >> so I'd really appreciate someone scrutinizing the logic for >> correctness. >> >> --- llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp (original) >> +++ llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp Mon Jan 4 00:03:59 2010 >> @@ -5213,12 +5213,30 @@ >> return ReplaceInstUsesWith(I, B); >> } >> } >> - V1 = 0; V2 = 0; V3 = 0; >> + >> + // ((V | N) & C1) | (V & C2) --> (V|N) & (C1|C2) >> + // iff (C1&C2) == 0 and (N&~C1) == 0 >> + if ((C1->getValue() & C2->getValue()) == 0) { >> + if (match(A, m_Or(m_Value(V1), m_Value(V2))) && >> + ((V1 == B && MaskedValueIsZero(V2, ~C1->getValue())) || // (V|N) >> + (V2 == B && MaskedValueIsZero(V1, ~C1->getValue())))) // (N|V) >> + return BinaryOperator::CreateAnd(A, >> + ConstantInt::get(A->getContext(), >> + C1->getValue()|C2->getValue())); >> + // Or commutes, try both ways. 
>> + if (match(B, m_Or(m_Value(V1), m_Value(V2))) && >> + ((V1 == A && MaskedValueIsZero(V2, ~C2->getValue())) || // (V|N) >> + (V2 == A && MaskedValueIsZero(V1, ~C2->getValue())))) // (N|V) >> + return BinaryOperator::CreateAnd(B, >> + ConstantInt::get(B->getContext(), >> + C1->getValue()|C2->getValue())); >> + } >> } >> > Hi Chris, > > I'm having trouble verifying the logic here. I'm probably doing something wrong. First a comment, if C1 and C2 are both zero, then this is zero. It can also be simplified if either one is zero. I don't know if those situations are caught before it gets to this point. > > Okay. Here's my derivation of your transformation: > > [(V | N) & C1] | (V & C2) > = {[(V | N) & C1] | V} & {[(V | N) & C1] | C2} > = {[(V | N | V) & (C1 | V)]} & {[(V | N | C2) & (C1 | C2)]} > = (V | N) & (V | C1) & (V | N | C2) & (C1 | C2) > > Note that > > A & (A | B) = A > > So, (V | N) & [(V | N) | C2] = (V | N) > > Therefore, we have > > (V|N) & (V|C1) & (C1|C2) > > Here's where I get stuck. I can expand out the (V|C1) term, but it doesn't appear to get me closer to your result. I freely admit that I probably made an error. :-) > > Could you provide more insight into the result you got? > > -bw > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From mmms1841 at gmail.com Mon Jan 4 21:27:53 2010 From: mmms1841 at gmail.com (mmms1841) Date: Mon, 4 Jan 2010 19:27:53 -0800 Subject: [LLVMdev] Assembly Printer In-Reply-To: <201001041609.10596.dag@cray.com> References: <201001041609.10596.dag@cray.com> Message-ID: Hello. Thank you for your advice. I am still trying to understand how code generation works and see how the changes I made in the .td files affect the output. 
I have managed to see the names of the instructions change in the
output (it turns out it was a linkage problem), and right now I am
trying to figure out how function call lowering works. If I get to
the point where I can start implementing a real backend, I will
certainly change the definition and everything else.

On Mon, Jan 4, 2010 at 2:09 PM, David Greene wrote:

> On Sunday 03 January 2010 01:00, Chris Lattner wrote:
> > On Jan 1, 2010, at 12:51 PM, mmms1841 wrote:
> > > I am trying to understand how LLVM does code generation and I have a
> > > couple of questions. I am using LLVM 2.6.
> > >
> > > First,
> > > if I want to change the name of an instruction, all I need to do is to
> > > modify the XXXInstrInfo.td, right? Using Sparc as an example, if I
> > > wanted to output "mysra" instead of "sra", in SparcInstrInfo.td, I would
> > > write,
> > >
> > > defm SRA : F3_12<"mysra", 0b100111, sra>;
> > >
> > > Is this correct?
> >
> > Yes.
>
> IMHO, this is a poor way to do this kind of thing. It eventually
> leads to confusion where someone thinks SRA means "sra" and someone
> else thinks it means "mysra." It gets worse as "mysra" acquires
> subtly different semantics than "sra." Better to write a separate pattern
> and use AddedComplexity to prefer it.
>
> Just a nugget of wisdom from personal experience. :)
>
> -Dave
>

From gfursin at gmail.com Mon Jan 4 18:24:13 2010
From: gfursin at gmail.com (Grigori Fursin)
Date: Tue, 5 Jan 2010 01:24:13 +0100
Subject: [LLVMdev] "Graphite" for llvm
In-Reply-To: <4B39F21A.3050401@inria.fr>
References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com>
 <4B36837A.2020707@fim.uni-passau.de>
 <26944_1261905486_4B37264E_26944_9213_1_4B372649.1010600@gmail.com>
 <4B394C54.9060500@fim.uni-passau.de>
 <4B39F21A.3050401@inria.fr>
Message-ID: <005501ca8d9d$69175880$3b460980$@com>

Dear colleagues,

Happy New Year!

Since the cTuning community is interested in performance, code size,
and power tuning using empirical feedback-directed techniques with
different compilers, including GCC and LLVM, I just wanted to mention
that we would be interested in adding support for fine-grain
optimizations from GRAPHITE to the Interactive Compilation Interface
at some point at the beginning of this year, so that we can use the
cTuning optimization framework directly and share optimization cases
with the community.

Will keep in touch,
Grigori

> -----Original Message-----
> From: gcc-graphite at googlegroups.com [mailto:gcc-graphite at googlegroups.com] On Behalf Of Albert
> Cohen
> Sent: Tuesday, December 29, 2009 1:12 PM
> To: Tobias Grosser
> Cc: ether; LLVM Developers Mailing List; GCC GRAPHITE; loopo at infosun.fim.uni-passau.de
> Subject: Re: [LLVMdev] "Graphite" for llvm
>
> Tobias Grosser wrote:
> > The way to go is the scoplib format (probably extended by quantified
> > variables). This format could be extracted from graphite easily and
> > could also be created in LLVM.
> > What we need to get back into LLVM is only the new optimized schedule
> > described e.g. as cloog like scattering functions. These can be parsed
> > easily.
The real code generation could be done internally, so it is not > > necessary to parse the generated from external tools. > > By the way, Konrad Trifunovic is interested (for his own research) to > improve on the current read/write capabilities of Graphite to process > the full scoplib format instead. Dealing with iteration domain and > schedule/scattering is easy, but to process array subscripts will > require more work. We will keep you informed of the progresses on the > Graphite mailing list, but collaboration outside the Graphities is welcome. > > Albert From gyounghwakim at gmail.com Mon Jan 4 22:48:54 2010 From: gyounghwakim at gmail.com (Gyounghwa Kim) Date: Tue, 5 Jan 2010 13:48:54 +0900 Subject: [LLVMdev] [Please help] Is there any option to make static library files ( .a) to shared libraries (.so) ? Message-ID: <61ad7e1f1001042048w14d6b507oee9fb843abb115a5@mail.gmail.com> Dear experts, I am trying to learn and use llvm, and I built llvm 2.6 with gcc 4.3.2 on linux. I encountered an issue to resolve now. 1. Is there any option to build all the llvm libraries to shared library files with .so extension? Currently most of the library files come with .a extension which are static, and only two libLTO.so and libprofile_rt.so files are in .so ( shared ) forms. Why those two are built in .so files? What are the functions of those two files ( libLTO.so, libprofile_rt.so )? Is there any special reason for this? Is it possible to build the library files into .a files? I'd really appreciate your help on this issue. Thank you for your help in advance. 
Gyounghwa Kim

From jyasskin at google.com Mon Jan 4 22:51:30 2010
From: jyasskin at google.com (Jeffrey Yasskin)
Date: Mon, 4 Jan 2010 22:51:30 -0600
Subject: [LLVMdev] ASM output with JIT / codegen barriers
In-Reply-To: <74c447501001041843x46950083x23d7b902fb165eb6@mail.gmail.com>
References: <74c447501001040135p25c84ae4xc7fc97dea43443dc@mail.gmail.com> <3CBE5BD9-6ACD-45BB-9D30-B0A099AB6B60@fuhm.net> <74c447501001041843x46950083x23d7b902fb165eb6@mail.gmail.com>
Message-ID:

On Mon, Jan 4, 2010 at 8:43 PM, Chandler Carruth wrote:
> On Mon, Jan 4, 2010 at 1:13 PM, James Y Knight wrote:
>> Hi, thanks everyone for all the comments. I think maybe I wasn't clear that
>> I *only* care about atomicity w.r.t. a signal handler interruption in the
>> same thread, *not* across threads. Therefore, many of the problems of
>> cross-CPU atomicity are not relevant. The signal handler gets invoked via
>> pthread_kill, and is thus necessarily running in the same thread as the code
>> being interrupted. The memory in question can be considered thread-local
>> here, so I'm not worried about other threads touching it at all.
>
> Ok, this helps make sense, but it still is confusing to phrase this as
> "single threaded". While the signal handler code may execute
> exclusively to any other code, it does not share the stack frame, etc.
> I'd describe this more as two threads of mutually exclusive execution
> or some such.

I'm pretty sure James's way of describing it is accurate. It's a single
thread with an asynchronous signal, and C allows things in that situation
that it disallows for the multi-threaded case. In particular, global
objects of type "volatile sig_atomic_t" can be read and written between
signal handlers in a thread and that thread's main control flow without
locking. C++0x also defines an atomic_signal_fence(memory_order) that only
synchronizes with signal handlers, in addition to the
atomic_thread_fence(memory_order) that synchronizes to other threads.
See [atomics.fences]

> I'm not familiar with what synchronization occurs as
> part of the interrupt process, but I'd verify it before making too
> many assumptions.
>
>> This sequence that SBCL does today with its internal codegen is basically
>> like:
>> MOV , 1
>> [[do allocation, fill in object, etc]]
>> XOR , 1
>> JEQ continue
>> <>
>> continue:
>> ...
>>
>> The important things here are:
>> 1) Stores cannot be migrated from within the MOV/XOR instructions to outside
>> by the codegen.
>
> Basically, this is merely the problem that x86 places a stricter
> requirement on memory ordering than LLVM. Where x86 requires that
> stores occur in program order, LLVM reserves the right to change that.
> I have no idea if it is worthwhile to support memory barriers solely
> within the flow of execution, but it seems highly suspicious.

It's needed to support std::atomic_signal_fence. gcc will initially
implement that with
  asm volatile("":::"memory")
but as James points out, that kills the JIT, and probably will keep doing
so until llvm-mc is finished or someone implements a special case for it.

> On at
> least some non-x86 architectures, I suspect you'll need a memory
> barrier here anyways, so it seems reasonable to place one anyways. I
> *highly* doubt these fences are an overriding performance concern on
> x86, do you have any benchmarks that indicate they are?

Memory fences are as expensive as atomic operations on x86 (quite
expensive), but you're right that benchmarks are a good idea anyway.

>> 2) There's no way an interruption can be missed: the XOR is atomic with
>> regards to signals executing in the same thread, it's either fully executed
>> or not (both load+store). But I don't care whether it's visible on other
>> CPUs or not: it's a thread-local variable in any case.
>>
>> Those are the two properties I'd like to get from LLVM, without actually
>> ever invoking superfluous processor synchronization.
>
> Before we start extending LLVM to support expressing the finest points
> of the x86 memory model in an optimal fashion given a single thread of
> execution, I'd really need to see some compelling benchmarks that it
> is a major performance problem. My understanding of the implementation
> of these aspects of the x86 architecture is that they shouldn't have a
> particularly high overhead.
>
>>> The processor can reorder memory operations as well (within limits).
>>> Consider that 'memset' to zero is often codegened to a non-temporal
>>> store to memory. This exempts it from all ordering considerations
>>
>> My understanding is that processor reordering only affects what you might
>> see from another CPU: the processor will undo speculatively executed
>> operations if the sequence of instructions actually executed is not the
>> sequence it predicted, so within a single CPU you should never be able tell
>> the difference.
>>
>> But I must admit I don't know anything about non-temporal stores. Within a
>> single thread, if I do a non-temporal store, followed by a load, am I not
>> guaranteed to get back the value I stored?
>
> If you read the *same address*, then the ordering is guaranteed, but
> the Intel documentation specifically exempts these instructions from
> the general rule that writes will not be reordered with other writes.
> This means that a non-temporal store might be reordered to occur after
> the "xor" to your atomic integer, even if the instruction came prior
> to the xor.

It exempts these instructions from the cross-processor guarantees, but I
don't see anything saying that, for example, a temporal store in a single
processor's instruction stream after a non-temporal store may be
overwritten by the non-temporal store. Do you see something I'm missing?
If not, for single-thread signals, I think it's only compiler reordering
James has to worry about.
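
For readers following the thread, here is a minimal sketch of the same-thread pattern discussed above, written against the `std::atomic_signal_fence` interface as it was later standardized in C++11. It is not SBCL's MOV/XOR sequence and not anything LLVM emits: it uses two separate flags instead of a single XOR, and every name in it (`in_pseudo_atomic`, `interrupted`, `handled`, `fill_object`, `run_demo`) is hypothetical. The signal is raised synchronously with `std::raise` only to make the interruption reproducible.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// All names below are hypothetical; none come from SBCL or LLVM.
static volatile std::sig_atomic_t in_pseudo_atomic = 0;  // set while allocating
static volatile std::sig_atomic_t interrupted = 0;       // a signal was deferred
static volatile std::sig_atomic_t handled = 0;           // signals fully handled

extern "C" void on_signal(int sig) {
    std::signal(sig, on_signal);  // re-arm, in case the platform resets the handler
    if (in_pseudo_atomic) {
        interrupted = 1;          // defer: the interrupted code finishes first
    } else {
        handled = handled + 1;    // safe to handle immediately
    }
}

// Fill *slot inside a pseudo-atomic section. If raise_inside is true, a signal
// is raised in the middle to simulate an interruption of this thread.
void fill_object(int* slot, int value, bool raise_inside) {
    assert(slot != nullptr);
    in_pseudo_atomic = 1;
    // Compiler-level barrier between this thread and its own signal handlers:
    // keeps the store to *slot from being moved before the flag is set.
    // It typically emits no processor fence instruction.
    std::atomic_signal_fence(std::memory_order_seq_cst);
    *slot = value;
    if (raise_inside) std::raise(SIGINT);
    std::atomic_signal_fence(std::memory_order_seq_cst);
    in_pseudo_atomic = 0;
    if (interrupted) {            // run the deferred handling now
        interrupted = 0;
        handled = handled + 1;
    }
}

// Delivers one signal inside the section (deferred) and one outside
// (handled at once), then returns the number of handled signals.
int run_demo() {
    handled = 0;
    std::signal(SIGINT, on_signal);
    int slot = 0;
    fill_object(&slot, 42, true);
    std::raise(SIGINT);
    std::signal(SIGINT, SIG_DFL);
    return slot == 42 ? static_cast<int>(handled) : -1;
}
```

The flags are `volatile sig_atomic_t`, so the compiler keeps their accesses in order relative to each other; the fences are there for the ordinary store to `*slot`, which is the kind of store James wants kept between the MOV and the XOR.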
From lism03 at gmail.com Mon Jan 4 22:53:09 2010
From: lism03 at gmail.com (Li Shengmei)
Date: Tue, 5 Jan 2010 12:53:09 +0800
Subject: [LLVMdev] Clang "warning: cannot find entry symbol mit-llvm-bc"
Message-ID: <000e01ca8dc2$fd100da0$2b00030a@c8d07e44d243485>

Hi,

I am new to Clang. I get a warning when I use clang:

$ llvmc -clang test.c
./bin/ld: warning: cannot find entry symbol mit-llvm-bc; defaulting to 00000000004003c0
llc: bitcode didn't read correctly.

When I use lli to execute test.bc:

$ lli test.bc
lli: error loading program 'test.bc': Bitcode stream should be a multiple of 4 bytes in length

Can anyone help with this? Thanks in advance.

Shengmei

From baldrick at free.fr Mon Jan 4 23:46:35 2010
From: baldrick at free.fr (Duncan Sands)
Date: Tue, 05 Jan 2010 06:46:35 +0100
Subject: [LLVMdev] Problem running 2.6 test-suite on cygwin
In-Reply-To: <6306f97b0912120617p1bc578abo24222e399033676c@mail.gmail.com>
References: <6306f97b0912120617p1bc578abo24222e399033676c@mail.gmail.com>
Message-ID: <4B42D23B.60808@free.fr>

Hi Gregory,

> `/cygdrive/c/projects/thesis/llvm-suite-2.6/llvm-2.6/projects/test-suite/SingleSource/UnitTests/Vector/SSE'
> make[4]: *** No rule to make target `Output/sse.expandfft.linked.rbc',

this usually means that you don't have llvm-gcc installed, or, if you do
have it installed, that the configure script did not find it.

Ciao,

Duncan.

From ofv at wanadoo.es Mon Jan 4 23:57:02 2010
From: ofv at wanadoo.es (Óscar Fuentes)
Date: Tue, 05 Jan 2010 06:57:02 +0100
Subject: [LLVMdev] [PATCH] Add InstCombine to CMake.
References: <1262639276-33940-1-git-send-email-grosser@fim.uni-passau.de>
Message-ID: <87wrzxgklt.fsf@telefonica.net>

Tobias Grosser writes:

> Fixes build of bugpoint, llvm-ld and opt.

Douglas Gregor in r92519 introduced a fix that looks more correct to me.
Please check that it works for you (use a pristine build directory).

Generally speaking, LLVM_LINK_COMPONENTS shouldn't be used for linking a
library when that library is an implicit dependency of other libraries
listed on that macro.

--
Óscar

From ofv at wanadoo.es Tue Jan 5 00:02:16 2010
From: ofv at wanadoo.es (Óscar Fuentes)
Date: Tue, 05 Jan 2010 07:02:16 +0100
Subject: [LLVMdev] [Please help] Is there any option to make static library files ( .a) to shared libraries (.so) ?
References: <61ad7e1f1001042048w14d6b507oee9fb843abb115a5@mail.gmail.com>
Message-ID: <87r5q5gkd3.fsf@telefonica.net>

Gyounghwa Kim writes:

[snip]

> 1. Is there any option to build all the llvm libraries to shared
> library files with .so extension?

The build of shared libraries is controlled by BUILD_SHARED_LIBS on the
CMake build and --enable-shared on the autotools build. I don't know if
the latter works.

> Currently most of the library files come with .a extension which are
> static, and only two libLTO.so and libprofile_rt.so files are in .so (
> shared ) forms.
>
> Why those two are built in .so files?

Those are plugins; that is, they are intended to be optionally loaded at
runtime by some other application.

> What are the functions of those two files ( libLTO.so, libprofile_rt.so )?
> Is there any special reason for this?
> Is it possible to build the library files into .a files?

It is possible, but useless.

--
Óscar

From chandlerc at google.com Tue Jan 5 00:09:29 2010
From: chandlerc at google.com (Chandler Carruth)
Date: Mon, 4 Jan 2010 22:09:29 -0800
Subject: [LLVMdev] ASM output with JIT / codegen barriers
In-Reply-To:
References: <74c447501001040135p25c84ae4xc7fc97dea43443dc@mail.gmail.com> <3CBE5BD9-6ACD-45BB-9D30-B0A099AB6B60@fuhm.net> <74c447501001041843x46950083x23d7b902fb165eb6@mail.gmail.com>
Message-ID: <74c447501001042209u42281871gb9e9aa9ba7790467@mail.gmail.com>

On Mon, Jan 4, 2010 at 8:51 PM, Jeffrey Yasskin wrote:
> On Mon, Jan 4, 2010 at 8:43 PM, Chandler Carruth wrote:
>> On Mon, Jan 4, 2010 at 1:13 PM, James Y Knight wrote:
>>> Hi, thanks everyone for all the comments. I think maybe I wasn't clear that
>>> I *only* care about atomicity w.r.t. a signal handler interruption in the
>>> same thread, *not* across threads. Therefore, many of the problems of
>>> cross-CPU atomicity are not relevant. The signal handler gets invoked via
>>> pthread_kill, and is thus necessarily running in the same thread as the code
>>> being interrupted. The memory in question can be considered thread-local
>>> here, so I'm not worried about other threads touching it at all.
>>
>> Ok, this helps make sense, but it still is confusing to phrase this as
>> "single threaded". While the signal handler code may execute
>> exclusively to any other code, it does not share the stack frame, etc.
>> I'd describe this more as two threads of mutually exclusive execution
>> or some such.
>
> I'm pretty sure James's way of describing it is accurate. It's a
> single thread with an asynchronous signal, and C allows things in that
> situation that it disallows for the multi-threaded case. In
> particular, global objects of type "volatile sig_atomic_t" can be read
> and written between signal handlers in a thread and that thread's main
> control flow without locking.
> C++0x also defines an
> atomic_signal_fence(memory_order) that only synchronizes with signal
> handlers, in addition to the atomic_thread_fence(memory_order) that
> synchronizes to other threads. See [atomics.fences]

Very interesting, and thanks for the clarifications. I'm not particularly
familiar with either those parts of C or C++0x, although it's on the
list... =D

>> I'm not familiar with what synchronization occurs as
>> part of the interrupt process, but I'd verify it before making too
>> many assumptions.
>>
>>> This sequence that SBCL does today with its internal codegen is basically
>>> like:
>>> MOV , 1
>>> [[do allocation, fill in object, etc]]
>>> XOR , 1
>>> JEQ continue
>>> <>
>>> continue:
>>> ...
>>>
>>> The important things here are:
>>> 1) Stores cannot be migrated from within the MOV/XOR instructions to outside
>>> by the codegen.
>>
>> Basically, this is merely the problem that x86 places a stricter
>> requirement on memory ordering than LLVM. Where x86 requires that
>> stores occur in program order, LLVM reserves the right to change that.
>> I have no idea if it is worthwhile to support memory barriers solely
>> within the flow of execution, but it seems highly suspicious.
>
> It's needed to support std::atomic_signal_fence. gcc will initially
> implement that with
>   asm volatile("":::"memory")
> but as James points out, that kills the JIT, and probably will keep
> doing so until llvm-mc is finished or someone implements a special
> case for it.

Want to propose an extension to the current atomics of LLVM? Could we
potentially clarify your previous concern regarding the pairing of
barriers to operations, as it seems like they would involve related bits
of the lang ref? Happy to work with you on that sometime this Q if you're
interested; I'll certainly have more time. =]

>> On at
>> least some non-x86 architectures, I suspect you'll need a memory
>> barrier here anyways, so it seems reasonable to place one anyways.
>> I *highly* doubt these fences are an overriding performance concern on
>> x86, do you have any benchmarks that indicate they are?
>
> Memory fences are as expensive as atomic operations on x86 (quite
> expensive), but you're right that benchmarks are a good idea anyway.
>
>>> 2) There's no way an interruption can be missed: the XOR is atomic with
>>> regards to signals executing in the same thread, it's either fully executed
>>> or not (both load+store). But I don't care whether it's visible on other
>>> CPUs or not: it's a thread-local variable in any case.
>>>
>>> Those are the two properties I'd like to get from LLVM, without actually
>>> ever invoking superfluous processor synchronization.
>>
>> Before we start extending LLVM to support expressing the finest points
>> of the x86 memory model in an optimal fashion given a single thread of
>> execution, I'd really need to see some compelling benchmarks that it
>> is a major performance problem. My understanding of the implementation
>> of these aspects of the x86 architecture is that they shouldn't have a
>> particularly high overhead.
>>
>>>> The processor can reorder memory operations as well (within limits).
>>>> Consider that 'memset' to zero is often codegened to a non-temporal
>>>> store to memory. This exempts it from all ordering considerations
>>>
>>> My understanding is that processor reordering only affects what you might
>>> see from another CPU: the processor will undo speculatively executed
>>> operations if the sequence of instructions actually executed is not the
>>> sequence it predicted, so within a single CPU you should never be able tell
>>> the difference.
>>>
>>> But I must admit I don't know anything about non-temporal stores. Within a
>>> single thread, if I do a non-temporal store, followed by a load, am I not
>>> guaranteed to get back the value I stored?
>>
>> If you read the *same address*, then the ordering is guaranteed, but
>> the Intel documentation specifically exempts these instructions from
>> the general rule that writes will not be reordered with other writes.
>> This means that a non-temporal store might be reordered to occur after
>> the "xor" to your atomic integer, even if the instruction came prior
>> to the xor.
>
> It exempts these instructions from the cross-processor guarantees, but
> I don't see anything saying that, for example, a temporal store in a
> single processor's instruction stream after a non-temporal store may
> be overwritten by the non-temporal store. Do you see something I'm
> missing? If not, for single-thread signals, I think it's only compiler
> reordering James has to worry about.

The exemption I'm referring to (Section 8.2.2 of System Programming Guide
from Intel) is to the write-write ordering of the *single-processor*
model. Reading the referenced section on the non-temporal behavior for
these instructions (10.4.6 of volume 1 of the architecture manual) doesn't
entirely clarify the matter for me either. It specifically says that the
non-temporal writes may occur outside of program order, but doesn't seem
to clarify exactly what the result of overlapping temporal writes is
without fences within the same program thread. The only examples I'm
finding are for multiprocessor scenarios. =/

From spark727 at 163.com Tue Jan 5 00:34:47 2010
From: spark727 at 163.com (sparkle)
Date: Mon, 4 Jan 2010 22:34:47 -0800 (PST)
Subject: [LLVMdev] bug
Message-ID: <27024284.post@talk.nabble.com>

[spark at oxygen llvm]$ llvmc -clang ~/a.c
/home2/yjhuang/tools/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.4.2/../../../../x86_64-unknown-linux-gnu/bin/ld: warning: cannot find entry symbol mit-llvm-bc; defaulting to 00000000004003c0
llc: bitcode didn't read correctly.
Reason: Bitcode stream should be a multiple of 4 bytes in length
??error,?-v??????clang -x c -emit-llvm-bc /home2/spark/a.c -o /tmp/llvm_HKq01o/a.bc ???warning???????????.bc?????lli??????error?
[spark at oxygen llvm]$ llvmc -clang ~/a.c -v
clang -x c -emit-llvm-bc /home2/spark/a.c -o /tmp/llvm_HKq01o/a.bc
/home2/yjhuang/tools/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.4.2/../../../../x86_64-unknown-linux-gnu/bin/ld: warning: cannot find entry symbol mit-llvm-bc; defaulting to 00000000004003c0
llc -f /tmp/llvm_HKq01o/a.bc -o /tmp/llvm_HKq01o/a.s
llc: bitcode didn't read correctly.
Reason: Bitcode stream should be a multiple of 4 bytes in length
[spark at oxygen llvm]$ clang -x c -emit-llvm-bc /home2/spark/a.c -o a.bc
/home2/yjhuang/tools/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.4.2/../../../../x86_64-unknown-linux-gnu/bin/ld: warning: cannot find entry symbol mit-llvm-bc; defaulting to 00000000004003c0
[spark at oxygen llvm]$ lli a.bc
lli: error loading program 'a.bc': Bitcode stream should be a multiple of 4 bytes in length
[spark at oxygen llvm]$

From clattner at apple.com Tue Jan 5 01:19:26 2010
From: clattner at apple.com (Chris Lattner)
Date: Mon, 4 Jan 2010 23:19:26 -0800
Subject: [LLVMdev] bug
In-Reply-To: <27024284.post@talk.nabble.com>
References: <27024284.post@talk.nabble.com>
Message-ID:

On Jan 4, 2010, at 10:34 PM, sparkle wrote:

>
> [spark at oxygen llvm]$ llvmc -clang ~/a.c
> /home2/yjhuang/tools/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.4.2/../../../../x86_64-unknown-linux-gnu/bin/ld:
> warning: cannot find entry symbol mit-llvm-bc; defaulting to
> 00000000004003c0

Please use the clang driver directly, llvmc's clang support is
experimental.

-Chris

> llc: bitcode didn't read correctly.
> Reason: Bitcode stream should be a multiple of 4 bytes in length
> ??error,?-v??????clang -x c -emit-llvm-bc /home2/spark/a.c -o
> /tmp/llvm_HKq01o/a.bc ???warning???????????.bc?????lli??????error?
> [spark at oxygen llvm]$ llvmc -clang ~/a.c -v
> clang -x c -emit-llvm-bc /home2/spark/a.c -o /tmp/llvm_HKq01o/a.bc
> /home2/yjhuang/tools/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.4.2/../../../../x86_64-unknown-linux-gnu/bin/ld:
> warning: cannot find entry symbol mit-llvm-bc; defaulting to
> 00000000004003c0
> llc -f /tmp/llvm_HKq01o/a.bc -o /tmp/llvm_HKq01o/a.s
> llc: bitcode didn't read correctly.
> Reason: Bitcode stream should be a multiple of 4 bytes in length
> [spark at oxygen llvm]$ clang -x c -emit-llvm-bc /home2/spark/a.c -o a.bc
> /home2/yjhuang/tools/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.4.2/../../../../x86_64-unknown-linux-gnu/bin/ld:
> warning: cannot find entry symbol mit-llvm-bc; defaulting to
> 00000000004003c0
> [spark at oxygen llvm]$ lli a.bc
> lli: error loading program 'a.bc': Bitcode stream should be a multiple of 4
> bytes in length
> [spark at oxygen llvm]$

From baldrick at free.fr Tue Jan 5 01:22:22 2010
From: baldrick at free.fr (Duncan Sands)
Date: Tue, 05 Jan 2010 08:22:22 +0100
Subject: [LLVMdev] Any reason why fastcc on x86 shouldn't use ECX as a return register?
In-Reply-To: <400d33ea0912140701y4678c7d6w6faefbf39e7656df@mail.gmail.com>
References: <400d33ea0912140701y4678c7d6w6faefbf39e7656df@mail.gmail.com>
Message-ID: <4B42E8AE.1060104@free.fr>

Hi Kenneth,

> Now that we can safely return arbitrarily large structs on x86, it
> seems to me that fastcc, which doesn't have to conform to any
> preexisting ABI, should use ECX as well as EAX and EDX for returning
> {i32,i32,i32} rather than use sret-demotion.

the x86 trampoline lowering code would need tweaking to check that the
ECX register was available for it.

Ciao,

Duncan.

From gregory.petrosyan at gmail.com Tue Jan 5 02:30:20 2010
From: gregory.petrosyan at gmail.com (Gregory Petrosyan)
Date: Tue, 5 Jan 2010 11:30:20 +0300
Subject: [LLVMdev] Problem running 2.6 test-suite on cygwin
In-Reply-To: <4B42D23B.60808@free.fr>
References: <6306f97b0912120617p1bc578abo24222e399033676c@mail.gmail.com> <4B42D23B.60808@free.fr>
Message-ID: <6306f97b1001050030m37ae1398r6b60bf62f022d8f3@mail.gmail.com>

On Tue, Jan 5, 2010 at 8:46 AM, Duncan Sands wrote:
>> `/cygdrive/c/projects/thesis/llvm-suite-2.6/llvm-2.6/projects/test-suite/SingleSource/UnitTests/Vector/SSE'
>> make[4]: *** No rule to make target `Output/sse.expandfft.linked.rbc',
>
> this usually means that you don't have llvm-gcc installed, or, if
> you do have it installed, that the configure script did not find
> it.

Thanks, but it looks like it was not the case (more info here [1]).

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-December/027952.html

Gregory

From gregory.petrosyan at gmail.com Tue Jan 5 03:38:37 2010
From: gregory.petrosyan at gmail.com (Gregory Petrosyan)
Date: Tue, 5 Jan 2010 12:38:37 +0300
Subject: [LLVMdev] make fails to detect changes in case srcdir != objdir
Message-ID: <20100105093837.GA10132@gregory-laptop>

LLVM makefiles can't detect source changes in case objdir != srcdir, e.g.
I've managed to get my pass listed in 'opt -help' only after removing opt
subdir from objdir and running make again.

Re-configuring LLVM also does not trigger a rebuild when running make,
e.g. after an initial 'configure --enable-targets=x86' I've managed to get
the C backend only after removing objdir and re-configuring (was too lazy
to check if 'make clean' is sufficient).

Does anyone know what could be causing these problems?

Gregory

From gregory.petrosyan at gmail.com Tue Jan 5 04:11:07 2010
From: gregory.petrosyan at gmail.com (Gregory Petrosyan)
Date: Tue, 5 Jan 2010 13:11:07 +0300
Subject: [LLVMdev] [PATCH] test-suite/bullet: unbreak linking
Message-ID: <20100105101107.GA18004@gregory-laptop>

Eliminate undefined references to powf, sqrtf and friends.

Index: MultiSource/Benchmarks/Bullet/Makefile
===================================================================
--- MultiSource/Benchmarks/Bullet/Makefile (revision 92512)
+++ MultiSource/Benchmarks/Bullet/Makefile (working copy)
@@ -1,7 +1,7 @@
 LEVEL = ../../../
 PROG = bullet
 CPPFLAGS += -I$(PROJ_SRC_DIR)/include -DNO_TIME
-LDFLAGS = -lstdc++
+LDFLAGS = -lstdc++ -lm
 include $(LEVEL)/Makefile.config

From gyounghwakim at gmail.com Tue Jan 5 04:35:50 2010
From: gyounghwakim at gmail.com (Gyounghwa Kim)
Date: Tue, 5 Jan 2010 19:35:50 +0900
Subject: [LLVMdev] [Help] How can we call an object's virtual function inside IR?
Message-ID: <61ad7e1f1001050235t42d6a5boacce28e6bab72d00@mail.gmail.com>

Dear experts,

I am learning llvm by reading documents and have a question to ask.
The following is the example of code generation that I created.

[[a [10.00]] > [3.00]]

; ModuleID = 'ExprF'

define i1 @expr(double* %record) {
entry:
  %0 = getelementptr double* %record, i32 0 ; [#uses=1]
  %1 = load double* %0 ; [#uses=1]
  %2 = frem double %1, 1.000000e+01 ; [#uses=1]
  %3 = fcmp ogt double %2, 3.000000e+00 ; [#uses=1]
  ret i1 %3
}

Now, I would like to change the type of the argument from double * to a
pointer to an object (C++ class), like ClassA * %record, and inside the
function body I would like to call the virtual functions of the ClassA
%record object to get the value for the evaluation.

As far as I understand, we can get the members of a struct with the
getelementptr instruction, but is it possible to call a virtual function
of an object to get the value? Could you explain how to do this?

Thank you very much for your help in advance. :)

Best regards,
Gyounghwa Kim

From baldrick at free.fr Tue Jan 5 04:59:36 2010
From: baldrick at free.fr (Duncan Sands)
Date: Tue, 05 Jan 2010 11:59:36 +0100
Subject: [LLVMdev] [Help] How can we call an object's virtual function inside IR?
In-Reply-To: <61ad7e1f1001050235t42d6a5boacce28e6bab72d00@mail.gmail.com>
References: <61ad7e1f1001050235t42d6a5boacce28e6bab72d00@mail.gmail.com>
Message-ID: <4B431B98.70403@free.fr>

Hi Gyounghwa Kim,

try pasting C++ code into http://llvm.org/demo/ in order to see the LLVM
IR that llvm-g++ turns it into. That way you will see how this can be
done.

Best wishes,

Duncan.
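
As an illustration of the kind of C++ one might paste into the demo page, here is a minimal sketch. `Record`, `ArrayRecord` and `value` are hypothetical stand-ins for the ClassA in the question, not names from any LLVM API. The IR a C++ front end produces for `expr` should show the usual shape of a virtual call: load the vtable pointer from the object, load the function pointer from the vtable slot, then make an indirect call with the object as the first argument. The exact IR depends on the front end and ABI.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for "ClassA": a record whose fields are read
// through a virtual function instead of a raw double*.
struct Record {
    virtual ~Record() {}
    virtual double value(int index) const = 0;
};

// One possible implementation, backed by a plain array of doubles.
struct ArrayRecord : Record {
    explicit ArrayRecord(const double* d) : data(d) {}
    double value(int index) const override { return data[index]; }
    const double* data;
};

// Same predicate as the IR in the question: remainder by 10.0, then
// compare with 3.0. std::fmod matches the semantics of LLVM's frem.
bool expr(const Record* record) {
    assert(record != nullptr);
    return std::fmod(record->value(0), 10.0) > 3.0;
}
```

A caller would construct an `ArrayRecord` over its data and pass its address to `expr`; a record whose first field is 17.0 satisfies the predicate, while 12.0 does not.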

From jyasskin at google.com Tue Jan 5 07:32:06 2010
From: jyasskin at google.com (Jeffrey Yasskin)
Date: Tue, 5 Jan 2010 07:32:06 -0600
Subject: [LLVMdev] ASM output with JIT / codegen barriers
In-Reply-To: <74c447501001042209u42281871gb9e9aa9ba7790467@mail.gmail.com>
References: <74c447501001040135p25c84ae4xc7fc97dea43443dc@mail.gmail.com> <3CBE5BD9-6ACD-45BB-9D30-B0A099AB6B60@fuhm.net> <74c447501001041843x46950083x23d7b902fb165eb6@mail.gmail.com> <74c447501001042209u42281871gb9e9aa9ba7790467@mail.gmail.com>
Message-ID:

On Tue, Jan 5, 2010 at 12:09 AM, Chandler Carruth wrote:
> On Mon, Jan 4, 2010 at 8:51 PM, Jeffrey Yasskin wrote:
>> On Mon, Jan 4, 2010 at 8:43 PM, Chandler Carruth wrote:
>>> On Mon, Jan 4, 2010 at 1:13 PM, James Y Knight wrote:
>>>> The important things here are:
>>>> 1) Stores cannot be migrated from within the MOV/XOR instructions to outside
>>>> by the codegen.
>>>
>>> Basically, this is merely the problem that x86 places a stricter
>>> requirement on memory ordering than LLVM. Where x86 requires that
>>> stores occur in program order, LLVM reserves the right to change that.
>>> I have no idea if it is worthwhile to support memory barriers solely
>>> within the flow of execution, but it seems highly suspicious.
>>
>> It's needed to support std::atomic_signal_fence. gcc will initially
>> implement that with
>>   asm volatile("":::"memory")
>> but as James points out, that kills the JIT, and probably will keep
>> doing so until llvm-mc is finished or someone implements a special
>> case for it.
>
> Want to propose an extension to the current atomics of LLVM? Could we
> potentially clarify your previous concern regarding the pairing of
> barriers to operations, as it seems like they would involve related
> bits of the lang ref? Happy to work with you on that sometime this Q
> if you're interested; I'll certainly have more time. =]

I have some ideas for that, and will be happy to help.

>>>>> The processor can reorder memory operations as well (within limits).
>>>>> Consider that 'memset' to zero is often codegened to a non-temporal
>>>>> store to memory. This exempts it from all ordering considerations
>>>>
>>>> My understanding is that processor reordering only affects what you might
>>>> see from another CPU: the processor will undo speculatively executed
>>>> operations if the sequence of instructions actually executed is not the
>>>> sequence it predicted, so within a single CPU you should never be able tell
>>>> the difference.
>>>>
>>>> But I must admit I don't know anything about non-temporal stores. Within a
>>>> single thread, if I do a non-temporal store, followed by a load, am I not
>>>> guaranteed to get back the value I stored?
>>>
>>> If you read the *same address*, then the ordering is guaranteed, but
>>> the Intel documentation specifically exempts these instructions from
>>> the general rule that writes will not be reordered with other writes.
>>> This means that a non-temporal store might be reordered to occur after
>>> the "xor" to your atomic integer, even if the instruction came prior
>>> to the xor.
>>
>> It exempts these instructions from the cross-processor guarantees, but
>> I don't see anything saying that, for example, a temporal store in a
>> single processor's instruction stream after a non-temporal store may
>> be overwritten by the non-temporal store. Do you see something I'm
>> missing? If not, for single-thread signals, I think it's only compiler
>> reordering James has to worry about.
>
> The exemption I'm referring to (Section 8.2.2 of System Programming
> Guide from Intel) is to the write-write ordering of the
> *single-processor* model. Reading the referenced section on the
> non-temporal behavior for these instructions (10.4.6 of volume 1 of
> the architecture manual) doesn't entirely clarify the matter for me
> either.
> It specifically says that the non-temporal writes may occur
> outside of program order, but doesn't seem clarify exactly what the
> result is of overlapping temporal writes are without fences within the
> same program thread. The only examples I'm finding are for
> multiprocessor scenarios. =/

Yeah, it's not 100% clear. I'm pretty sure that x86 maintains the fiction
of a linear "instruction stream" within each processor, even in the
presence of interrupts (which underlie pthread_kill and OS-level thread
switching). For example, in 6.6, we have "The ability of a P6 family
processor to speculatively execute instructions does not affect the taking
of interrupts by the processor. Interrupts are taken at instruction
boundaries located during the retirement phase of instruction execution;
so they are always taken in the “in-order” instruction stream."

But I'm not an expert in non-temporal anything.

From gregory.petrosyan at gmail.com Tue Jan 5 07:43:33 2010
From: gregory.petrosyan at gmail.com (Gregory Petrosyan)
Date: Tue, 5 Jan 2010 16:43:33 +0300
Subject: [LLVMdev] libcalls test fails to run
Message-ID: <20100105134333.GA1195@gregory-laptop>

This is what I get while trying to run 'make TEST=libcalls' in the top
dir of test-suite:

make[1]: Entering directory `/home/gregory/thesis/llvm/projects/test-suite/SingleSource'
make[2]: Entering directory `/home/gregory/thesis/llvm/projects/test-suite/SingleSource/UnitTests'
make[3]: Entering directory `/home/gregory/thesis/llvm/projects/test-suite/SingleSource/UnitTests/Vector'
make[4]: Entering directory `/home/gregory/thesis/llvm/projects/test-suite/SingleSource/UnitTests/Vector/SSE'
make[4]: *** No rule to make target `@', needed by `Output/sse.expandfft.libcalls.report.txt'. Stop.
make[4]: Leaving directory `/home/gregory/thesis/llvm/projects/test-suite/SingleSource/UnitTests/Vector/SSE'
make[3]: *** [test] Error 1
make[3]: Leaving directory `/home/gregory/thesis/llvm/projects/test-suite/SingleSource/UnitTests/Vector'
make[2]: *** [test] Error 1
make[2]: Leaving directory `/home/gregory/thesis/llvm/projects/test-suite/SingleSource/UnitTests'
make[1]: *** [UnitTests/.maketest] Error 2
make[1]: Leaving directory `/home/gregory/thesis/llvm/projects/test-suite/SingleSource'
make: *** [SingleSource/.maketest] Error 2

'make TEST=example' works, 'make TEST=jit' and 'make' work too.

Any ideas about what is going wrong here?

Gregory

From etherzhhb at gmail.com Tue Jan 5 07:45:05 2010
From: etherzhhb at gmail.com (ether)
Date: Tue, 05 Jan 2010 21:45:05 +0800
Subject: [LLVMdev] "Graphite" for llvm
In-Reply-To: <4B36837A.2020707@fim.uni-passau.de>
References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de>
Message-ID: <4B434261.7050102@gmail.com>

hi Tobi,

I just added the Poly library
(http://wiki.llvm.org/Polyhedral_optimization_framework) to the llvm build
system; it only contains a toy pass, "Poly". I think we could add the
polyhedral optimization stuff to this library.

It was tested under cmake + visual studio 2009, and I also added the
library build rule to the Makefiles, but I am not sure if it works under
linux/cygwin/mac, sorry.

Hope this helps.

best regards
--ether

On 2009-12-27 5:43, Tobias Grosser wrote:
> Hi ether,
>
> On 12/26/09 13:06, ether zhhb wrote:
>> hi,
>>
>> is anyone going/planning to add something like
>> Graphite(http://gcc.gnu.org/wiki/Graphite) in gcc to llvm(or that
>> should be implemented at the level of clang?)?
>
> I already looked into implementing something like Graphite for LLVM.
> However just recently, so I have not released any code yet. As soon as
> some code is available I will post patches.
>
> Anybody who wants to work on the polyhedral model in LLVM is invited
> to the Graphite mailing list, so we can share ideas.
>
> Here is some information about Graphite-like optimizations in LLVM.
>
> A short introduction to the current state of GCC/Graphite:
> -----------------------------------------------------------------------
> Graphite is a project in GCC that uses a mathematical representation,
> the polytope model, to represent and transform loops and other control
> flow structures. Using an abstract representation it is possible to
> reason about transformations in a more general way and to use highly
> optimized linear programming libraries to figure out the optimal loop
> structures. These transformations can be used to do constant
> propagation through arrays, remove dead loop iterations, optimize
> loops for cache locality, optimize arrays, apply advanced automatic
> parallelization, or to drive vectorization.
>
> The current state of Graphite and the polyhedral model in general is
> at the moment in between research and production. Over the last 20
> years there has been a lot of research in the area of the polyhedral
> model, however it was never used in real world compilers (that I know of)
> until Sebastian Pop started to implement the required analysis and
> Graphite itself. Graphite has shown that it is possible to convert a
> low level imperative language into the polyhedral model and generate
> working code back from it with reasonable effort. Now several people
> from INRIA, IBM, AMD, the University of Passau and China are working
> on making the optimizations that have been found during the 20 years
> of research available to GCC. The latest news about Graphite will be
> presented at the GROW workshop 2010 in Pisa by Konrad Trifunovic.
>
> [Advertisement end] ;-)
> -----------------------------------------------------------------------
>
>
> A general plan to implement polyhedral transformations in LLVM:
>
> 1.
The identity transformation (LLVM->polyhedral->LLVM)
> ======================================================
>
> Create the polyhedral representation of the LLVM IR, do nothing with
> it, and generate LLVM IR from the polyhedral representation. (Enough
> to attach external optimizers)
>
> 1.1 Detect regions
> 1.2 Translate LLVM IR to polyhedral model
> -----------------------------------------
>
> The first step will be to analyze the LLVM intermediate language and
> extract control flow regions that can be analyzed using the polyhedral
> model. This is mainly based on the scalar evolution analysis and
> should be more or less like the detection in Graphite.
> One point I do not yet fully understand is how to get array access
> functions from the LLVM-IR. (Probably based on getElementPtr)
>
> Another question is which polyhedral library can be used inside LLVM.
> One option would be ISL from Sven Verdoolaege (LGPL), another the PPL
> from Roberto Bagnara (GPLv3).
>
> 1.3 Generate LLVM IR from polyhedral model
> -----------------------------------------
>
> For code generation the CLooG/isl library can be used. It is LGPL
> licensed. Sven will also work on a CLooG using the PPL, so this could
> also be an option.
>
> 2. Optimize on the polyhedral representation
> ============================================
>
> 2.1 Use external optimizers
> ---------------------------
>
> The polyhedral loop description is simple and not compiler dependent.
> Therefore external tools like LooPo (automatic parallelization), Pluto
> (optimization in general) or even Graphite might be used to optimize
> code. This could give a first impression of what to expect from the
> polyhedral model in LLVM.
>
> There are also efforts to establish an interchangeable polyhedral
> format (scoplib - Louis-Noel Pouchet) and to generate a polyhedral
> compilation package. These will allow optimizations to be shared and
> exchanged between different compilers and research tools.
>
> Furthermore an external interface will enable researchers to use LLVM
> for their work on polyhedral optimizations. This might be useful as
> there is no polyhedral compiler for any dynamic language yet. Also
> recent work on optimizing for GPUs has started in the polyhedral
> community, so the LLVM OpenCL implementation might be interesting too.
>
> 2.2 Implement optimizations in LLVM
> -----------------------------------
>
> Useful optimizations could be imported into / rewritten in LLVM. For
> optimizations that transform the LLVM-IR like vectorization or
> automatic parallelization at least some part of the optimizations has
> to be in LLVM anyway.
>
> This is just a rough overview. If anybody is interested in working on
> any of these topics, as mentioned above, I would be glad to help.
>
> Enjoy your holidays
>
> Tobi
>
> P.S.: I do not see any reason to implement this in Clang, the LLVM IR is
> the right place to do this.
>
>
>
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: polylib.patch
Url: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100105/b0705a20/attachment.pl

From kennethuil at gmail.com Tue Jan 5 08:01:46 2010
From: kennethuil at gmail.com (Kenneth Uildriks)
Date: Tue, 5 Jan 2010 08:01:46 -0600
Subject: [LLVMdev] Any reason why fastcc on x86 shouldn't use ECX as a return register?
In-Reply-To: <4B42E8AE.1060104@free.fr>
References: <400d33ea0912140701y4678c7d6w6faefbf39e7656df@mail.gmail.com> <4B42E8AE.1060104@free.fr>
Message-ID: <400d33ea1001050601q38986af7j41d4a08e8a2b5ca6@mail.gmail.com>

On Tue, Jan 5, 2010 at 1:22 AM, Duncan Sands wrote:
> Hi Kenneth,
>
>> Now that we can safely return arbitrarily large structs on x86, it
>> seems to me that fastcc, which doesn't have to conform to any
>> preexisting ABI, should use ECX as well as EAX and EDX for returning
>> {i32,i32,i32} rather than use sret-demotion.
>
> the x86 trampoline lowering code would need tweaking to check that the ECX
> register was available for it.
>
> Ciao,
>
> Duncan.
>

Doesn't ECX get used to pass parameters into, rather than out of, a trampoline?

From baldrick at free.fr Tue Jan 5 08:04:28 2010
From: baldrick at free.fr (Duncan Sands)
Date: Tue, 05 Jan 2010 15:04:28 +0100
Subject: [LLVMdev] Any reason why fastcc on x86 shouldn't use ECX as a return register?
In-Reply-To: <400d33ea1001050601q38986af7j41d4a08e8a2b5ca6@mail.gmail.com>
References: <400d33ea0912140701y4678c7d6w6faefbf39e7656df@mail.gmail.com> <4B42E8AE.1060104@free.fr> <400d33ea1001050601q38986af7j41d4a08e8a2b5ca6@mail.gmail.com>
Message-ID: <4B4346EC.5090803@free.fr>

Hi Kenneth,

>>> Now that we can safely return arbitrarily large structs on x86, it
>>> seems to me that fastcc, which doesn't have to conform to any
>>> preexisting ABI, should use ECX as well as EAX and EDX for returning
>>> {i32,i32,i32} rather than use sret-demotion.
>> the x86 trampoline lowering code would need tweaking to check that the ECX
>> register was available for it.
>
> Doesn't ECX get used to pass parameters into, rather than out of, a trampoline?

yes, I hadn't read your email carefully enough - I thought you were talking about passing parameters in. Sorry for the noise.

Ciao,

Duncan.

From gregory.petrosyan at gmail.com Tue Jan 5 08:22:42 2010
From: gregory.petrosyan at gmail.com (Gregory Petrosyan)
Date: Tue, 5 Jan 2010 17:22:42 +0300
Subject: [LLVMdev] [PATCH] test-suite/libcalls: unbreak build
In-Reply-To: <20100105134333.GA1195@gregory-laptop>
References: <20100105134333.GA1195@gregory-laptop>
Message-ID: <20100105142242.GA17711@gregory-laptop>

On Tue, Jan 05, 2010 at 04:43:33PM +0300, Gregory Petrosyan wrote:
> 'make TEST=example' works, 'make TEST=jit' and 'make' work too. Any ideas about what is going wrong here?

No idea why this stuff was there...
Index: TEST.libcalls.Makefile
===================================================================
--- TEST.libcalls.Makefile (revision 92512)
+++ TEST.libcalls.Makefile (working copy)
@@ -21,12 +21,11 @@
 	@cat $<
 
 $(PROGRAMS_TO_TEST:%=Output/%.$(TEST).report.txt): \
-Output/%.$(TEST).report.txt: Output/%.linked.rbc $(LOPT) \
+Output/%.$(TEST).report.txt: Output/%.linked.rbc $(LOPT)
 	$(VERB) $(RM) -f $@
 	@echo "---------------------------------------------------------------" >> $@
 	@echo ">>> ========= '$(RELDIR)/$*' Program" >> $@
 	@echo "---------------------------------------------------------------" >> $@
-	$(PROJ_SRC_ROOT)/TEST.libcalls.Makefile
 	@-$(LOPT) -simplify-libcalls -stats -debug-only=simplify-libcalls \
 	-time-passes -disable-output $< 2>>$@
 summary:

From robert.quill at imgtec.com Tue Jan 5 08:41:31 2010
From: robert.quill at imgtec.com (Robert Quill)
Date: Tue, 05 Jan 2010 14:41:31 +0000
Subject: [LLVMdev] Removing the constant pool
Message-ID: <1262702491.4903.12.camel@quill-linux.kl.imgtec.org>

Hi all,

I was wondering if it is possible to stop floating-point constants being converted to use the constant pool? As for our back-end we would like to be able to treat floating point constants the same way integer constants are treated instead of having to go via the constant pool.

Thanks for your help,
Rob

-
This message is subject to Imagination Technologies' e-mail terms: http://www.imgtec.com/e-mail.htm
Imagination Technologies Ltd is a limited company registered in England No: 1306335
Registered Office: Imagination House, Home Park Estate, Kings Langley, Hertfordshire, WD4 8LZ.
Email to and from the company may be monitored for compliance and other administrative purposes.
-

From anton at korobeynikov.info Tue Jan 5 10:38:07 2010
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Tue, 5 Jan 2010 19:38:07 +0300
Subject: [LLVMdev] Removing the constant pool
In-Reply-To: <1262702491.4903.12.camel@quill-linux.kl.imgtec.org>
References: <1262702491.4903.12.camel@quill-linux.kl.imgtec.org>
Message-ID:

Hello

> I was wondering if it is possible to stop floating-point constants being converted to use the constant pool? As for our back-end we would like to be able to treat floating point constants the same way integer constants are treated instead of having to go via the constant pool.

Yes, surely. Just make ISD::ConstantFP for given type legal and handle it during isel.

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From dag at cray.com Tue Jan 5 11:30:41 2010
From: dag at cray.com (David Greene)
Date: Tue, 5 Jan 2010 11:30:41 -0600
Subject: [LLVMdev] make fails to detect changes in case srcdir != objdir
In-Reply-To: <20100105093837.GA10132@gregory-laptop>
References: <20100105093837.GA10132@gregory-laptop>
Message-ID: <201001051130.41381.dag@cray.com>

On Tuesday 05 January 2010 03:38, Gregory Petrosyan wrote:
> LLVM makefiles can't detect source changes in case objdir != srcdir, e.g.

I haven't found that. I build with objdir != srcdir all the time.

> I've managed to get my pass listed in 'opt -help' only after removing opt
> subdir from objdir and running make again. Re-configuring LLVM also does

It sounds like the dependencies for your pass are not correct. Where did you put it in the LLVM tree and how did you change the Makefiles?

> not trigger rebuild when running make, e.g. after initial 'configure
> --enable-targets=x86' I've managed to get C backend only after removing
> objdir and re-configuring (was too lazy to check if 'make clean' is
> sufficient).

A non-build after reconfigure is not really a problem.
If nothing in the configuration has changed configure is smart enough not to update anything so make doesn't see any changes.

-Dave

From dag at cray.com Tue Jan 5 11:33:43 2010
From: dag at cray.com (David Greene)
Date: Tue, 5 Jan 2010 11:33:43 -0600
Subject: [LLVMdev] [Help] How can we call an object's virtual function inside IR?
In-Reply-To: <4B431B98.70403@free.fr>
References: <61ad7e1f1001050235t42d6a5boacce28e6bab72d00@mail.gmail.com> <4B431B98.70403@free.fr>
Message-ID: <201001051133.43839.dag@cray.com>

On Tuesday 05 January 2010 04:59, Duncan Sands wrote:
> Hi Gyounghwa Kim, try pasting C++ code into http://llvm.org/demo/
> in order to see the LLVM IR that llvm-g++ turns it into. That way
> you will see how this can be done.

LLVM has no direct support for class hierarchies, virtual functions, templates or anything of that sort. It is very low-level. The frontend is responsible for implementing the C++ object model in terms of LLVM constructs. Thus the frontend will have to generate vtables, indirect calls, etc. to implement virtual function calls.

This is all highly compiler-dependent. Everyone does it a little bit differently. Using the llvm-g++ online demo as Duncan suggests will show you how g++ implements this stuff.

-Dave

From foom at fuhm.net Tue Jan 5 11:53:46 2010
From: foom at fuhm.net (James Y Knight)
Date: Tue, 5 Jan 2010 12:53:46 -0500
Subject: [LLVMdev] Non-temporal moves in memset [Was: ASM output with JIT / codegen barriers]
In-Reply-To: <74c447501001042209u42281871gb9e9aa9ba7790467@mail.gmail.com>
References: <74c447501001040135p25c84ae4xc7fc97dea43443dc@mail.gmail.com> <3CBE5BD9-6ACD-45BB-9D30-B0A099AB6B60@fuhm.net> <74c447501001041843x46950083x23d7b902fb165eb6@mail.gmail.com> <74c447501001042209u42281871gb9e9aa9ba7790467@mail.gmail.com>
Message-ID:

On Jan 5, 2010, at 1:09 AM, Chandler Carruth wrote:
>>>>> Consider that 'memset' to zero is often codegened to a non-
>>>>> temporal
>>>>> store to memory.
This exempts it from all ordering considerations

Hm...off topic from my original email since I think this is only relevant for multithreaded code...

But from what I can tell, an implementation of memset that does not contain an sfence after using movnti is considered broken. Callers of memset would not (and should not need to) know that they must use an actual memory barrier (sfence) after the memset call to get the usual x86 store-store guarantee.

Thread describing that bug in glibc memset implementation:
http://sourceware.org/ml/libc-alpha/2007-11/msg00017.html

Redhat errata including that fix in a stable update:
http://rhn.redhat.com/errata/RHBA-2008-0083.html

Then there's a recent discussion on the topic of who is responsible for calling sfence on the gcc mailing list:
http://www.mail-archive.com/gcc at gcc.gnu.org/msg45939.html

Unfortunately, that thread didn't seem to have any firm conclusion, but ISTM that the current default assumption is (b): anything that uses movnti is assumed to surround such uses with memory fences so that other code doesn't need to.

James

From dag at cray.com Tue Jan 5 11:54:17 2010
From: dag at cray.com (David Greene)
Date: Tue, 5 Jan 2010 11:54:17 -0600
Subject: [LLVMdev] AVX Testcases
Message-ID: <201001051154.17715.dag@cray.com>

I should be sending up some AVX code this week. When I do this I'd like to generate some testcases to make sure we actually generate AVX code. Ideally we'd have a testcase for each AVX pattern but that's probably overkill. Still, we'd like a lot of tests, I think.

Should these tests go into CodeGen/X86 or should I create a new space, like CodeGen/X86/SIMD? I tend to favor the latter as it keeps things more compartmentalized and cleaner.
-Dave

From dag at cray.com Tue Jan 5 11:59:16 2010
From: dag at cray.com (David Greene)
Date: Tue, 5 Jan 2010 11:59:16 -0600
Subject: [LLVMdev] TableGen !eq() Operator Patch
Message-ID: <201001051159.17165.dag@cray.com>

Attached is a patch to implement an !eq() operator in TableGen. We use this for the AVX specification to allow the user to control what kind of pattern should be used for a particular instruction def. For example, we use it for reg-mem instructions to let the user choose between a built-in generic reg-mem pattern, the same pattern that was used for the reg-reg variant of the instruction or a custom pattern provided by the user.

It only operates on strings.

An alternative to this would be to implement named parameters in TableGen but that seems like overkill for this case. Named parameters could be useful in other areas to reduce template complexity but implementing it is non-trivial given the widespread assumption in the TableGen sources that template parameters exactly match up with template arguments.

Comments? Ok to commit?

-Dave
-------------- next part --------------
A non-text attachment was scrubbed...
Name: eq.patch
Type: text/x-diff
Size: 4569 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100105/01e01702/attachment.bin

From clattner at apple.com Tue Jan 5 12:05:21 2010
From: clattner at apple.com (Chris Lattner)
Date: Tue, 5 Jan 2010 10:05:21 -0800
Subject: [LLVMdev] [PATCH] test-suite/libcalls: unbreak build
In-Reply-To: <20100105142242.GA17711@gregory-laptop>
References: <20100105134333.GA1195@gregory-laptop> <20100105142242.GA17711@gregory-laptop>
Message-ID:

On Jan 5, 2010, at 6:22 AM, Gregory Petrosyan wrote:
> On Tue, Jan 05, 2010 at 04:43:33PM +0300, Gregory Petrosyan wrote:
>> 'make TEST=example' works, 'make TEST=jit' and 'make' work too. Any
>> ideas about what is going wrong here?
>
> No idea why this stuff was there...

looks like some lines got moved, fixed on mainline, thanks.
-Chris

>
> Index: TEST.libcalls.Makefile
> ===================================================================
> --- TEST.libcalls.Makefile (revision 92512)
> +++ TEST.libcalls.Makefile (working copy)
> @@ -21,12 +21,11 @@
>  	@cat $<
>
>  $(PROGRAMS_TO_TEST:%=Output/%.$(TEST).report.txt): \
> -Output/%.$(TEST).report.txt: Output/%.linked.rbc $(LOPT) \
> +Output/%.$(TEST).report.txt: Output/%.linked.rbc $(LOPT)
>  	$(VERB) $(RM) -f $@
>  	@echo
> "---------------------------------------------------------------" >>
> $@
>  	@echo ">>> ========= '$(RELDIR)/$*' Program" >> $@
>  	@echo
> "---------------------------------------------------------------" >>
> $@
> -	$(PROJ_SRC_ROOT)/TEST.libcalls.Makefile
>  	@-$(LOPT) -simplify-libcalls -stats -debug-only=simplify-libcalls \
>  	-time-passes -disable-output $< 2>>$@
>  summary:
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From ambika at cse.iitb.ac.in Tue Jan 5 12:07:52 2010
From: ambika at cse.iitb.ac.in (ambika)
Date: Tue, 05 Jan 2010 23:37:52 +0530
Subject: [LLVMdev] About LLVM
Message-ID: <4B437FF8.7070906@cse.iitb.ac.in>

Hi all,

I am a new user of LLVM and am trying to figure out whether it is what I need to complete my thesis work.

I want to add a pointer analysis phase to compiler optimization. This analysis will take profile information into account. My analysis needs to analyse a single statement at a time, and then I also want to perform some code duplication to finally perform the optimization. Finally I would like to compare the results with and without my pass on standard benchmarks.

I was just wondering if I can do all this with LLVM. I have a very short period for my thesis and I have to implement this and get the results. So I will be highly obliged if any of you can help me out.

Thanks in advance.
regards,
Ambika

From chandlerc at google.com Tue Jan 5 12:10:18 2010
From: chandlerc at google.com (Chandler Carruth)
Date: Tue, 5 Jan 2010 10:10:18 -0800
Subject: [LLVMdev] Non-temporal moves in memset [Was: ASM output with JIT / codegen barriers]
In-Reply-To:
References: <74c447501001040135p25c84ae4xc7fc97dea43443dc@mail.gmail.com> <3CBE5BD9-6ACD-45BB-9D30-B0A099AB6B60@fuhm.net> <74c447501001041843x46950083x23d7b902fb165eb6@mail.gmail.com> <74c447501001042209u42281871gb9e9aa9ba7790467@mail.gmail.com>
Message-ID: <74c447501001051010w4675fed8t45f552d8dba1fe5f@mail.gmail.com>

On Tue, Jan 5, 2010 at 9:53 AM, James Y Knight wrote:
>
> On Jan 5, 2010, at 1:09 AM, Chandler Carruth wrote:
>
>>>>>> Consider that 'memset' to zero is often codegened to a non-temporal
>>>>>> store to memory. This exempts it from all ordering considerations
>
>
> Hm...off topic from my original email since I think this is only relevant
> for multithreaded code...
>
> But from what I can tell, an implementation of memset that does not contain
> an sfence after using movnti is considered broken. Callers of memset would
> not (and should not need to) know that they must use an actual memory
> barrier (sfence) after the memset call to get the usual x86 store-store
> guarantee.
>
> Thread describing that bug in glibc memset implementation:
> http://sourceware.org/ml/libc-alpha/2007-11/msg00017.html
>
> Redhat errata including that fix in a stable update:
> http://rhn.redhat.com/errata/RHBA-2008-0083.html
>
> Then there's a recent discussion on the topic of who is responsible for
> calling sfence on the gcc mailing list:
> http://www.mail-archive.com/gcc at gcc.gnu.org/msg45939.html
>
> Unfortunately, that thread didn't seem to have any firm conclusion, but ISTM
> that the current default assumption is (b): anything that uses movnti is
> assumed to surround such uses with memory fences so that other code doesn't
> need to.
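
As a rough illustration of the point in the quoted text (a sketch under stated assumptions, not glibc's memset and not code from this thread): a routine that zeroes memory with non-temporal stores can issue the store fence itself, so that its callers keep the usual x86 store-store ordering without knowing how the stores were done. The name zero_ints_nt is hypothetical; _mm_stream_si32 (which compiles to movnti) and _mm_sfence are the SSE2 intrinsics from <emmintrin.h>, and where SSE2 is not available the sketch simply falls back to memset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si32 (movnti) and _mm_sfence */
#endif

/* Hypothetical helper, not glibc's memset: set n ints at dst to zero. */
static void zero_ints_nt(int *dst, size_t n)
{
    assert(dst != NULL || n == 0);
#if defined(__SSE2__)
    size_t i;
    for (i = 0; i < n; ++i)
        _mm_stream_si32(&dst[i], 0);  /* non-temporal, weakly ordered store */
    /* Fence inside the routine, so that stores the caller issues later
       cannot become globally visible before the zeroing stores above. */
    _mm_sfence();
#else
    /* No SSE2: ordinary stores, no extra fence needed for this sketch. */
    if (n != 0)
        memset(dst, 0, n * sizeof *dst);
#endif
}
```

With the fence placed inside the routine, a caller can write a "ready" flag right after zero_ints_nt(buf, n) and rely on ordinary x86 store ordering, which is assumption (b) from the quoted text.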
I didn't mean to imply that the fence was missing after the non-temporal store (yikes!!), rather that it was an example of a not uncommon situation where fencing (may be) required even in single-threaded x86 code.

That said, Jeffrey raised good points that it isn't entirely clear at all to what extent non-temporal stores deviate from the ordering constraints of typical x86 code. From the threads you cite, there is also dispute about the best way to manage those deviations from the ordering constraints. At least w.r.t. memset, I would agree with you and assume that it is providing the fencing needed.

From gregory.petrosyan at gmail.com Tue Jan 5 12:18:16 2010
From: gregory.petrosyan at gmail.com (Gregory Petrosyan)
Date: Tue, 5 Jan 2010 21:18:16 +0300
Subject: [LLVMdev] make fails to detect changes in case srcdir != objdir
In-Reply-To: <201001051130.41381.dag@cray.com>
References: <20100105093837.GA10132@gregory-laptop> <201001051130.41381.dag@cray.com>
Message-ID: <20100105181816.GA4168@gregory-laptop>

On Tue, Jan 05, 2010 at 11:30:41AM -0600, David Greene wrote:
> > I've managed to get my pass listed in 'opt -help' only after removing opt
> > subdir from objdir and running make again. Re-configuring LLVM also does
>
> It sounds like the dependencies for your pass are not correct. Where
> did you put it in the LLVM tree and how did you change the Makefiles?

One new .cpp file in lib/Transforms/IPO + RegisterPass<> + mention pass in LinkAllPasses.h; no changes in makefiles.

> > not trigger rebuild when running make, e.g. after initial 'configure
> > --enable-targets=x86' I've managed to get C backend only after removing
> > objdir and re-configuring (was too lazy to check if 'make clean' is
> > sufficient).
>
> A non-build after reconfigure is not really a problem. If nothing in
> the configuration has changed configure is smart enough not to update
> anything so make doesn't see any changes.

Yes, but in my case support for new targets should be built in.
It is entirely possible that I've screwed something up, although I've tried to follow LLVM docs as closely as possible. LLVM build system is really not the nicest part of LLVM :-)

Gregory

From clattner at apple.com Tue Jan 5 12:31:32 2010
From: clattner at apple.com (Chris Lattner)
Date: Tue, 5 Jan 2010 10:31:32 -0800
Subject: [LLVMdev] TableGen !eq() Operator Patch
In-Reply-To: <201001051159.17165.dag@cray.com>
References: <201001051159.17165.dag@cray.com>
Message-ID: <908650D4-31FF-4F6C-93DE-012E3091BF24@apple.com>

On Jan 5, 2010, at 9:59 AM, David Greene wrote:
> Attached is a patch to implement an !eq() operator in TableGen. We
> use this
> for the AVX specification to allow the user to control what kind of
> pattern
> should be used for a particular instruction def. For example, we
> use it for
> reg-mem instructions to let the user choose between a built-in
> generic reg-mem
> pattern, the same pattern that was used for the reg-reg variant of the
> instruction or a custom pattern provided by the user.
>
> It only operates on strings.
>
> An alternative to this would be to implement named parameters in
> TableGen but
> that seems like overkill for this case. Named parameters could be
> useful in
> other areas to reduce template complexity but implementing it is non-
> trivial
> given the widespread assumption in the TableGen sources that template
> parameters exactly match up with template arguments.

I can't say if this is really the best answer for AVX, but independently of that the patch looks great, please commit.
-Chris

From clattner at apple.com Tue Jan 5 12:32:51 2010
From: clattner at apple.com (Chris Lattner)
Date: Tue, 5 Jan 2010 10:32:51 -0800
Subject: [LLVMdev] [PATCH] test-suite/bullet: unbreak linking
In-Reply-To: <20100105101107.GA18004@gregory-laptop>
References: <20100105101107.GA18004@gregory-laptop>
Message-ID: <76FC0429-4B8D-49F3-96C9-7C5F2E111E49@apple.com>

On Jan 5, 2010, at 2:11 AM, Gregory Petrosyan wrote:
> Eliminate undefined references to powf, sqrtf and friends.

Thanks, applied in r92748,

-Chris

>
> Index: MultiSource/Benchmarks/Bullet/Makefile
> ===================================================================
> --- MultiSource/Benchmarks/Bullet/Makefile (revision 92512)
> +++ MultiSource/Benchmarks/Bullet/Makefile (working copy)
> @@ -1,7 +1,7 @@
>  LEVEL = ../../../
>  PROG = bullet
>  CPPFLAGS += -I$(PROJ_SRC_DIR)/include -DNO_TIME
> -LDFLAGS = -lstdc++
> +LDFLAGS = -lstdc++ -lm
>
>  include $(LEVEL)/Makefile.config
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From dag at cray.com Tue Jan 5 13:05:56 2010
From: dag at cray.com (David Greene)
Date: Tue, 5 Jan 2010 13:05:56 -0600
Subject: [LLVMdev] make fails to detect changes in case srcdir != objdir
In-Reply-To: <20100105181816.GA4168@gregory-laptop>
References: <20100105093837.GA10132@gregory-laptop> <201001051130.41381.dag@cray.com> <20100105181816.GA4168@gregory-laptop>
Message-ID: <201001051305.56726.dag@cray.com>

On Tuesday 05 January 2010 12:18, Gregory Petrosyan wrote:
> > It sounds like the dependencies for your pass are not correct. Where
> > did you put it in the LLVM tree and how did you change the Makefiles?
>
> One new .cpp file in lib/Transforms/IPO + RegisterPass<> + mention pass in
> LinkAllPasses.h; no changes in makefiles.

Hmm, that should certainly work. What file are you touching that make doesn't seem to pick up?
> > A non-build after reconfigure is not really a problem. If nothing in
> > the configuration has changed configure is smart enough not to update
> > anything so make doesn't see any changes.
>
> Yes, but in my case support for new targets should be built in.

What do you mean?

> It is entirely possible that I've screwed something up, although I've tried
> to follow LLVM docs as closely as possible. LLVM build system is really not
> the nicest part of LLVM :-)

That's true, but that's autoconf's fault, not LLVM's. :)

-Dave

From dag at cray.com Tue Jan 5 13:10:12 2010
From: dag at cray.com (David Greene)
Date: Tue, 5 Jan 2010 13:10:12 -0600
Subject: [LLVMdev] TableGen !eq() Operator Patch
In-Reply-To: <908650D4-31FF-4F6C-93DE-012E3091BF24@apple.com>
References: <201001051159.17165.dag@cray.com> <908650D4-31FF-4F6C-93DE-012E3091BF24@apple.com>
Message-ID: <201001051310.12703.dag@cray.com>

On Tuesday 05 January 2010 12:31, Chris Lattner wrote:
> I can't say if this is really the best answer for AVX, but
> independently of that the patch looks great, please commit.

Ok. I'm sure we'll iterate on the AVX stuff quite a bit.

-Dave

From gohman at apple.com Tue Jan 5 13:20:12 2010
From: gohman at apple.com (Dan Gohman)
Date: Tue, 5 Jan 2010 11:20:12 -0800
Subject: [LLVMdev] AVX Testcases
In-Reply-To: <201001051154.17715.dag@cray.com>
References: <201001051154.17715.dag@cray.com>
Message-ID:

On Jan 5, 2010, at 9:54 AM, David Greene wrote:
> I should be sending up some AVX code this week. When I do this
> I'd like to generate some testcases to make sure we actually
> generate AVX code. Ideally we'd have a testcase for each AVX
> pattern but that's probably overkill. Still, we'd like a lot
> of tests, I think.
>
> Should these tests go into CodeGen/X86 or should I create a new space,
> like CodeGen/X86/SIMD? I tend to favor the latter as it keeps things
> more compartmentalized and cleaner.

Adding a subdirectory here sounds good.
"SIMD" might be a bit too general though; how about test/CodeGen/X86/AVX?

Dan

From gregory.petrosyan at gmail.com Tue Jan 5 13:27:14 2010
From: gregory.petrosyan at gmail.com (Gregory Petrosyan)
Date: Tue, 5 Jan 2010 22:27:14 +0300
Subject: [LLVMdev] [PATCH] test-suite/libcalls: unbreak build
In-Reply-To:
References: <20100105134333.GA1195@gregory-laptop> <20100105142242.GA17711@gregory-laptop>
Message-ID: <20100105192714.GA12480@gregory-laptop>

On Tue, Jan 05, 2010 at 10:05:21AM -0800, Chris Lattner wrote:
> looks like some lines got moved, fixed on mainline, thanks.

Not really fixed :-) Please commit this:

Index: TEST.libcalls.Makefile
===================================================================
--- TEST.libcalls.Makefile (revision 92749)
+++ TEST.libcalls.Makefile (working copy)
@@ -23,10 +23,10 @@
 $(PROGRAMS_TO_TEST:%=Output/%.$(TEST).report.txt): \
 Output/%.$(TEST).report.txt: Output/%.linked.rbc $(LOPT) \
 	$(PROJ_SRC_ROOT)/TEST.libcalls.Makefile
+	$(VERB) $(RM) -f $@
 	@echo "---------------------------------------------------------------" >> $@
 	@echo ">>> ========= '$(RELDIR)/$*' Program" >> $@
 	@echo "---------------------------------------------------------------" >> $@
-	$(VERB) $(RM) -f $@
 	@-$(LOPT) -simplify-libcalls -stats -debug-only=simplify-libcalls \
 	-time-passes -disable-output $< 2>>$@
 summary:

From gregory.petrosyan at gmail.com Tue Jan 5 13:33:05 2010
From: gregory.petrosyan at gmail.com (Gregory Petrosyan)
Date: Tue, 5 Jan 2010 22:33:05 +0300
Subject: [LLVMdev] make fails to detect changes in case srcdir != objdir
In-Reply-To: <201001051305.56726.dag@cray.com>
References: <20100105093837.GA10132@gregory-laptop> <201001051130.41381.dag@cray.com> <20100105181816.GA4168@gregory-laptop> <201001051305.56726.dag@cray.com>
Message-ID: <20100105193305.GB12480@gregory-laptop>

On Tue, Jan 05, 2010 at 01:05:56PM -0600, David Greene wrote:
> On Tuesday 05 January 2010 12:18, Gregory Petrosyan wrote:
>
> > > It sounds like the dependencies for your pass
are not correct. Where
> > > did you put it in the LLVM tree and how did you change the Makefiles?
> >
> > One new .cpp file in lib/Transforms/IPO + RegisterPass<> + mention pass in
> > LinkAllPasses.h; no changes in makefiles.
>
> Hmm, that should certainly work. What file are you touching that make
> doesn't seem to pick up?

Sorry, can't tell you that now: I've switched to srcdir == objdir configuration.

> > > A non-build after reconfigure is not really a problem. If nothing in
> > > the configuration has changed configure is smart enough not to update
> > > anything so make doesn't see any changes.
> >
> > Yes, but in my case support for new targets should be built in.
>
> What do you mean?

I've done these:

1) configure --enable-targets=x86
2) make
3) configure --enable-targets=all
4) make

and after that I still did not have e.g. the C backend.

> > It is entirely possible that I've screwed something up, although I've tried
> > to follow LLVM docs as closely as possible. LLVM build system is really not
> > the nicest part of LLVM :-)
>
> That's true, but that's autoconf's fault, not LLVM's. :)

And what was the reason for picking autoconf?

Gregory

From clattner at apple.com Tue Jan 5 13:42:10 2010
From: clattner at apple.com (Chris Lattner)
Date: Tue, 5 Jan 2010 11:42:10 -0800
Subject: [LLVMdev] [PATCH] test-suite/libcalls: unbreak build
In-Reply-To: <20100105192714.GA12480@gregory-laptop>
References: <20100105134333.GA1195@gregory-laptop> <20100105142242.GA17711@gregory-laptop> <20100105192714.GA12480@gregory-laptop>
Message-ID: <288A7E14-03FE-497F-BDAE-A8CC31F96AFA@apple.com>

Doh, thanks, done.

-Chris

On Jan 5, 2010, at 11:27 AM, Gregory Petrosyan wrote:
> On Tue, Jan 05, 2010 at 10:05:21AM -0800, Chris Lattner wrote:
>> looks like some lines got moved, fixed on mainline, thanks.
> > Not really fixed :-) Please commit this: > > Index: TEST.libcalls.Makefile > =================================================================== > --- TEST.libcalls.Makefile (revision 92749) > +++ TEST.libcalls.Makefile (working copy) > @@ -23,10 +23,10 @@ > $(PROGRAMS_TO_TEST:%=Output/%.$(TEST).report.txt): \ > Output/%.$(TEST).report.txt: Output/%.linked.rbc $(LOPT) \ > $(PROJ_SRC_ROOT)/TEST.libcalls.Makefile > + $(VERB) $(RM) -f $@ > @echo > "---------------------------------------------------------------" >> > $@ > @echo ">>> ========= '$(RELDIR)/$*' Program" >> $@ > @echo > "---------------------------------------------------------------" >> > $@ > - $(VERB) $(RM) -f $@ > @-$(LOPT) -simplify-libcalls -stats -debug-only=simplify-libcalls \ > -time-passes -disable-output $< 2>>$@ > summary: From gregory.petrosyan at gmail.com Tue Jan 5 13:52:16 2010 From: gregory.petrosyan at gmail.com (Gregory Petrosyan) Date: Tue, 5 Jan 2010 22:52:16 +0300 Subject: [LLVMdev] [PATCH] test-suite/libcalls: unbreak build In-Reply-To: <288A7E14-03FE-497F-BDAE-A8CC31F96AFA@apple.com> References: <20100105134333.GA1195@gregory-laptop> <20100105142242.GA17711@gregory-laptop> <20100105192714.GA12480@gregory-laptop> <288A7E14-03FE-497F-BDAE-A8CC31F96AFA@apple.com> Message-ID: <20100105195216.GA21298@gregory-laptop> On Tue, Jan 05, 2010 at 11:42:10AM -0800, Chris Lattner wrote: > Doh, thanks, done. LOL. 
Next patch should be titled 'really really really fix this' :-)

Please apply the last part of the diff, too:

Index: TEST.libcalls.Makefile
===================================================================
--- TEST.libcalls.Makefile	(revision 92757)
+++ TEST.libcalls.Makefile	(working copy)
@@ -27,7 +27,6 @@
 	@echo "---------------------------------------------------------------" >> $@
 	@echo ">>> ========= '$(RELDIR)/$*' Program" >> $@
 	@echo "---------------------------------------------------------------" >> $@
-	$(VERB) $(RM) -f $@
 	@-$(LOPT) -simplify-libcalls -stats -debug-only=simplify-libcalls \
 	  -time-passes -disable-output $< 2>>$@
 summary:

From dag at cray.com Tue Jan 5 13:53:55 2010
From: dag at cray.com (David Greene)
Date: Tue, 5 Jan 2010 13:53:55 -0600
Subject: [LLVMdev] AVX Testcases
In-Reply-To:
References: <201001051154.17715.dag@cray.com>
Message-ID: <201001051353.56029.dag@cray.com>

On Tuesday 05 January 2010 13:20, Dan Gohman wrote:
> On Jan 5, 2010, at 9:54 AM, David Greene wrote:
> > I should be sending up some AVX code this week. When I do this
> > I'd like to generate some testcases to make sure we actually
> > generate AVX code. Ideally we'd have a testcase for each AVX
> > pattern but that's probably overkill. Still, we'd like a lot
> > of tests, I think.
> >
> > Should these tests go into CodeGen/X86 or should I create a new space,
> > like CodeGen/X86/SIMD? I tend to favor the latter as it keeps things
> > more compartmentalized and cleaner.
>
> Adding a subdirectory here sounds good. "SIMD" might be a bit too general
> though; how about test/CodeGen/X86/AVX?

This will be a rewrite of all the x86 SIMD stuff so the name seemed
appropriate. It'll be testing the whole infrastructure, not just AVX.
-Dave From dag at cray.com Tue Jan 5 14:21:53 2010 From: dag at cray.com (David Greene) Date: Tue, 5 Jan 2010 14:21:53 -0600 Subject: [LLVMdev] make fails to detect changes in case srcdir != objdir In-Reply-To: <20100105193305.GB12480@gregory-laptop> References: <20100105093837.GA10132@gregory-laptop> <201001051305.56726.dag@cray.com> <20100105193305.GB12480@gregory-laptop> Message-ID: <201001051421.53843.dag@cray.com> On Tuesday 05 January 2010 13:33, Gregory Petrosyan wrote: > > > > A non-build after reconfigure is not really a problem. If nothing in > > > > the configuration has changed configure is smart enough not to update > > > > anything so make doesn't see any changes. > > > > > > Yes, but in my case support for new targets should be built in. > > > > What do you mean? > > I've done these: > > 1) configure --enable-targets=x86 > 2) make > 3) configure --enable-targets=all > 4) make > > and after it I still did not had e.g. C backend. Ah. I actually don't know what configure does in that case. I suppose it depends on what .in files actually use the target list. This could be a real problem, I just don't know enough about the build system to be sure. > > > It is entirely possible that I've screwed something up, although I've > > > tried to follow LLVM docs as closely as possible. LLVM build system is > > > really not the nicest part of LLVM :-) > > > > That's true, but that's autoconf's fault, not LLVM's. :) > > And what was the reason for picking autoconf? Don't ask me, it's not what I would have done. :) But to be fair, at the time autoconf was really the only game in town. Even now, only CMake really competes in this space. Then again, neither one satisfies Joel Test #2: http://www.joelonsoftware.com/articles/fog0000000043.html I've wondered for a long time why software systems don't build a build system around a tool that's actually designed for it. Like make. In fact I wondered so much that I went and did it. 
Parallel configure/build/test is a really nifty thing. It's fun seeing regression tests running before the software build is complete. :) Make is not everyone's cup of tea but for those of us crazy enough to write in something akin to declarative LISP, it's a nice diversion from boring old C++ metaprogramming. :) -Dave From dag at cray.com Tue Jan 5 14:30:43 2010 From: dag at cray.com (David Greene) Date: Tue, 5 Jan 2010 14:30:43 -0600 Subject: [LLVMdev] About LLVM In-Reply-To: <4B437FF8.7070906@cse.iitb.ac.in> References: <4B437FF8.7070906@cse.iitb.ac.in> Message-ID: <201001051430.44024.dag@cray.com> On Tuesday 05 January 2010 12:07, ambika wrote: > I am a new user to LLVM and trying to figure out if this is what I > really want to complete my thesis work. Yes. :) > I want to add a pointer analysis phase to compiler optimization. This > analysis will take profile information into account. Ok. LLVM has profiling. What kind of profile information do you need? > My analysis requires to analyse a single statement at a time and then I > also want to perform some code duplication to finally perform the > optimization. Define "statement." Do you mean source-level statement? the LLVM IR works in terms of instructions. Groups of instructions can be specified in a DAG of sorts or they can be fissioned into individual instructions that each write to a temporary virtual register. There is no high-level notion of a statement or control structure. If you need that you'll have to build it. No other open source C/C++ compiler infrastructure provides that either, AFAIK. > Finally I would like to compare the results with and without my pass on > benchmark standards. This is trivial with LLVM. > I was just wondering if I can do all this with LLVM. I have very short > period for my thesis and I have to implement and get the results. So I > will be highly obliged if anyone of you can help me out. What's "very short?" 
Doing anything substantial with aliasing/pointer analysis is not "very short" by definition. -Dave From clattner at apple.com Tue Jan 5 14:37:53 2010 From: clattner at apple.com (Chris Lattner) Date: Tue, 5 Jan 2010 12:37:53 -0800 Subject: [LLVMdev] [PATCH] test-suite/libcalls: unbreak build In-Reply-To: <20100105195216.GA21298@gregory-laptop> References: <20100105134333.GA1195@gregory-laptop> <20100105142242.GA17711@gregory-laptop> <20100105192714.GA12480@gregory-laptop> <288A7E14-03FE-497F-BDAE-A8CC31F96AFA@apple.com> <20100105195216.GA21298@gregory-laptop> Message-ID: On Jan 5, 2010, at 11:52 AM, Gregory Petrosyan wrote: > On Tue, Jan 05, 2010 at 11:42:10AM -0800, Chris Lattner wrote: >> Doh, thanks, done. > > LOL. Next patch should be titled 'really really really fix this' :-) Heh, already did: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20100104/093682.html Not a good day for me I guess ;-) > > Please apply the last part of the diff, too: > > Index: TEST.libcalls.Makefile > =================================================================== > --- TEST.libcalls.Makefile (revision 92757) > +++ TEST.libcalls.Makefile (working copy) > @@ -27,7 +27,6 @@ > @echo > "---------------------------------------------------------------" >> > $@ > @echo ">>> ========= '$(RELDIR)/$*' Program" >> $@ > @echo > "---------------------------------------------------------------" >> > $@ > - $(VERB) $(RM) -f $@ > @-$(LOPT) -simplify-libcalls -stats -debug-only=simplify-libcalls \ > -time-passes -disable-output $< 2>>$@ > summary: From anton at korobeynikov.info Tue Jan 5 14:48:11 2010 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Tue, 5 Jan 2010 23:48:11 +0300 Subject: [LLVMdev] make fails to detect changes in case srcdir != objdir In-Reply-To: <201001051421.53843.dag@cray.com> References: <20100105093837.GA10132@gregory-laptop> <201001051305.56726.dag@cray.com> <20100105193305.GB12480@gregory-laptop> <201001051421.53843.dag@cray.com> Message-ID: \>> 
>> I've done these:
>>
>>       1) configure --enable-targets=x86
>>       2) make
>>       3) configure --enable-targets=all
>>       4) make
>>
>> and after it I still did not had e.g. C backend.
>
> Ah. I actually don't know what configure does in that case. I suppose
> it depends on what .in files actually use the target list. This could be
> a real problem, I just don't know enough about the build system to be sure.

I think in this situation the second configure uses the cached stuff
from the first run...

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From edwintorok at gmail.com Tue Jan 5 14:49:34 2010
From: edwintorok at gmail.com (Török Edwin)
Date: Tue, 05 Jan 2010 22:49:34 +0200
Subject: [LLVMdev] make fails to detect changes in case srcdir != objdir
In-Reply-To: <201001051421.53843.dag@cray.com>
References: <20100105093837.GA10132@gregory-laptop> <201001051305.56726.dag@cray.com> <20100105193305.GB12480@gregory-laptop> <201001051421.53843.dag@cray.com>
Message-ID: <4B43A5DE.2010709@gmail.com>

On 2010-01-05 22:21, David Greene wrote:
> On Tuesday 05 January 2010 13:33, Gregory Petrosyan wrote:
>
>
>>>>> A non-build after reconfigure is not really a problem. If nothing in
>>>>> the configuration has changed configure is smart enough not to update
>>>>> anything so make doesn't see any changes.
>>>>>
>>>> Yes, but in my case support for new targets should be built in.
>>>>
>>> What do you mean?
>>>
>> I've done these:
>>
>> 1) configure --enable-targets=x86
>> 2) make
>> 3) configure --enable-targets=all
>> 4) make
>>
>> and after it I still did not had e.g. C backend.
>>
>
> Ah. I actually don't know what configure does in that case. I suppose
> it depends on what .in files actually use the target list. This could be
> a real problem, I just don't know enough about the build system to be sure.
> > >>>> It is entirely possible that I've screwed something up, although I've >>>> tried to follow LLVM docs as closely as possible. LLVM build system is >>>> really not the nicest part of LLVM :-) >>>> >>> That's true, but that's autoconf's fault, not LLVM's. :) >>> >> And what was the reason for picking autoconf? >> > > Don't ask me, it's not what I would have done. :) > > But to be fair, at the time autoconf was really the only game in town. > Even now, only CMake really competes in this space. > > Then again, neither one satisfies Joel Test #2: > > http://www.joelonsoftware.com/articles/fog0000000043.html > > I've wondered for a long time why software systems don't build a build system > around a tool that's actually designed for it. Like make. In fact I wondered > so much that I went and did it. Parallel configure/build/test is a really > nifty thing. It's fun seeing regression tests running before the software > build is complete. :) > Slightly offtopic, I noticed this project which does something very similar to what you describe: http://code.google.com/p/quagmire/ I don't know what its current state is, but is something worth keeping an eye on IMHO. Best regards, --Edwin From edwintorok at gmail.com Tue Jan 5 14:52:26 2010 From: edwintorok at gmail.com (=?UTF-8?B?VMO2csO2ayBFZHdpbg==?=) Date: Tue, 05 Jan 2010 22:52:26 +0200 Subject: [LLVMdev] Clang "warning: cannot find entry symbol mit-llvm-bc" In-Reply-To: <000e01ca8dc2$fd100da0$2b00030a@c8d07e44d243485> References: <000e01ca8dc2$fd100da0$2b00030a@c8d07e44d243485> Message-ID: <4B43A68A.4090502@gmail.com> On 2010-01-05 06:53, Li Shengmei wrote: > > Hi, > > I am new to Clang. 
There is a warning when I use clang
>
> $llvmc -clang test.c
>
> ??/bin/ld: warning: cannot find entry symbol mit-llvm-bc; defaulting
> to 00000000004003c0
>

This looks like something has gone wrong during command-line parsing,
and it has interpreted -emit-llvm-bc as "entrypoint is mit-llvm-bc"
(that is, as the linker option -e mit-llvm-bc).

If you just want to create bitcode with clang, the simplest way is to run:
clang -emit-llvm -c test.c -o test.bc

Best regards,
--Edwin

From erwin.coumans at gmail.com Tue Jan 5 14:53:57 2010
From: erwin.coumans at gmail.com (Erwin Coumans)
Date: Tue, 5 Jan 2010 12:53:57 -0800
Subject: [LLVMdev] Help adding the Bullet physics sdk benchmark to the LLVM test suite?
Message-ID: <419a36b41001051253j6191dbcfpf771b6405f102f2b@mail.gmail.com>

How do other benchmarks deal with unstable algorithms or differences in
floating point results?

>> haven't been following this thread, but this sounds like a typical
>> unstable algorithm problem. Are you always operating that close to
>> the tolerance level of the algorithm or are there some sets of inputs
>> that will behave reasonably?

What do you mean by "reasonably" or "affect codes so horribly"?
The accumulation of algorithms in a physics pipeline is unstable and unless
the compiler/platform guarantees 100% identical floating point results, the
outcome will diverge.

Do you think LLVM can be forced to produce identical floating point results?
Even when using different optimization levels or even different CPUs?

Some CPUs use 80-bit FPU precision for intermediate results (on-chip in
registers), while variables in memory only use 32-bit or 64-bit precision.
In combination with cancellation and other re-ordering this can
give slightly different results.

>> If not, the code doesn't seem very useful to me. How could anyone rely
>> on the results, ever?

The code has proven to be useful for games and special effects in film,
but this particular benchmark might not suit LLVM testing indeed.
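
The re-ordering effect mentioned above is easy to reproduce in isolation.
What follows is only an illustrative sketch, not code from Bullet or LLVM:
it is written in Python (whose floats are IEEE 754 doubles on typical
platforms), and the helper name sum_in_order and the input values are made
up for the demonstration. It adds the same three numbers in two different
orders and ends up with two different totals, which is the kind of
discrepancy that reassociation or differing intermediate precision can
introduce.

```python
def sum_in_order(values):
    """Add the values strictly left to right, rounding after every step."""
    total = 0.0
    for v in values:
        total += v
    return total


big = 1.0e16  # at this magnitude neighbouring doubles are 2.0 apart

# (big + 1.0) rounds back to big, so the 1.0 is lost before big is cancelled.
lost = sum_in_order([big, 1.0, -big])

# Cancelling big first leaves an exact 0.0, so the 1.0 survives.
kept = sum_in_order([big, -big, 1.0])

print(lost, kept)
```

A simulation that feeds each frame's output into the next frame can amplify
a last-bit difference of this kind over many steps, which would be
consistent with the divergence described in this thread.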
I suggest working on a better benchmark that tests independent parts of the
pipeline, so we don't accumulate results (several frames) but we test a single
algorithm at a time, with known input and expected output. This avoids
instability and we can measure the error of the output.

Anton, are you interested in working together on such an improved benchmark?

Thanks,
Erwin

> Date: Mon, 4 Jan 2010 20:24:23 -0600
> From: David Greene
> Subject: Re: [LLVMdev] Help adding the Bullet physics sdk benchmark to
> the LLVM test suite?
> To: llvmdev at cs.uiuc.edu
> Message-ID: <201001042024.23451.dag at cray.com>
> Content-Type: text/plain; charset="iso-8859-15"
>
> On Monday 04 January 2010 20:11, Erwin Coumans wrote:
> > Hi Anton, and happy new year all,
> >
> > >>One questions though: is it possible to "verify" the results of all
> > >>the computations somehow?
> >
> > Good point, and there is no automated way currently, but we can work on
> > that.
> > Note that simulation suffers from the 'butterfly effect', so the smallest
> > change anywhere,
> > (cpu, compiler etc) diverges into totally different results after a
> while.
>
> I haven't been following this thread, but this sounds like a typical
> unstable algorithm problem. Are you always operating that close to
> the tolerance level of the algorithm or are there some sets of inputs
> that will behave reasonably?
>
> If not, the code doesn't seem very useful to me. How could anyone rely
> on the results, ever?
>
> In the worst case, you could experiment with different optimization levels
> and/or Pass combinations to find something that is reasonably stable.
>
> Perhaps LLVM needs a flag to disable sometimes undesireable
> transformations.
> Like anything involving floating-point calculations. Compiler changes
> should
> not affect codes so horribly unless the user tells them to. :) The Cray
> compiler provides various -Ofp (-Ofp0, -Ofp1, etc.) levels for this very
> reason.
> > > There are a few ways of verification I can think of: > > > > 1) verifying by adding unit tests for all stages in the physics pipeline > > (broadphase acceleration structures, closest point computation, > constraint > > solver) > > Given known input and output we can check if the solution is within a > > certain tolerance. > > At each stage? That's reasonable. It could also help identify the parts > of > the pipeline that are unstable (if not already known). > > > 2) using the benchmark simulation and verifying the results frame by > frame > > and check for unusual behaviour > > Sounds expensive. > > > 3) modify the benchmark so that it is easier to test the end result, even > > through it might be different. > > We really don't want to do this. Either LLVM needs to be fixed to respect > floating-point evaluation in unstable cases or the benchmark and upstream > code > needs to be fixed to be more stable. > > -Dave > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100105/1abdcc22/attachment.html From dag at cray.com Tue Jan 5 15:21:21 2010 From: dag at cray.com (David Greene) Date: Tue, 5 Jan 2010 15:21:21 -0600 Subject: [LLVMdev] make fails to detect changes in case srcdir != objdir In-Reply-To: <4B43A5DE.2010709@gmail.com> References: <20100105093837.GA10132@gregory-laptop> <201001051421.53843.dag@cray.com> <4B43A5DE.2010709@gmail.com> Message-ID: <201001051521.21968.dag@cray.com> On Tuesday 05 January 2010 14:49, T?r?k Edwin wrote: > Slightly offtopic, I noticed this project which does something very > similar to what you describe: > http://code.google.com/p/quagmire/ > > I don't know what its current state is, but is something worth keeping > an eye on IMHO. Very interesting. I'll have a look! 
-Dave From dag at cray.com Tue Jan 5 15:38:53 2010 From: dag at cray.com (David Greene) Date: Tue, 5 Jan 2010 15:38:53 -0600 Subject: [LLVMdev] Help adding the Bullet physics sdk benchmark to the LLVM test suite? In-Reply-To: <419a36b41001051253j6191dbcfpf771b6405f102f2b@mail.gmail.com> References: <419a36b41001051253j6191dbcfpf771b6405f102f2b@mail.gmail.com> Message-ID: <201001051538.53664.dag@cray.com> On Tuesday 05 January 2010 14:53, Erwin Coumans wrote: > How do other benchmarks deal with unstable algorithms or differences in > floating point results? > > >> haven't been following this thread, but this sounds like a typical > >> unstable algorithm problem. Are you always operating that close to > >> the tolerance level of the algorithm or are there some sets of inputs > >> that will behave reasonably? > > What do you mean by "reasonably" or "affect codes so horribly"? "Reasonably" means the numerics won't blow up due to small changes in floating-point results caused by compiler transformations like reassociation. "Affects code so horribly" means that the compiler is causing an unstable algorithm to blow up, generating useless results. This shouldn't happen unless the user allows it with an explicit compiler flag. AFAIK LLVM has no such flag yet. It has some flags to control changes in precision, which helps, but I don't think there's a flag that says "don't do anything risky, ever." For example, a gfortran-fronted LLVM should have a way to always respect ordering indicated by parentheses. I don't know if gfortran even has that, let alone LLVM proper. > The accumulation of algorithms in a physics pipeline is unstable and unless > the compiler/platform guarantees 100% identical floating point results, the > outcome will diverge. Yep. 100% reproducability is really important. LLVM should have a flag to guarantee it. > Do you think LLVM can be forced to produce identical floating point > results? 
Even when using different optimization levels or even different
> CPUs?

Not right now, but the support can certainly be added. It really *should*
be added. It will take a bit of work, however.

> Some CPUs use 80bit FPU precision for intermediate results (on-chip in
> registers), while variables in-memory only use 32-bit or 64bit precision.
> In combination with cancellation and other re-ordering this can
> give slightly different results.

Yep, which is why good compilers have ways to control this. llc, for
example, has the -disable-excess-fp-precision and -enable-unsafe-fp-math
options. I don't know if there's a way to control usage of the x87
stack, however.

> >> If not, the code doesn't seem very useful to me. How could anyone rely
> >> on the results, ever?
>
> The code has proven to be useful for games and special effects in film,
> but this particular benchmark might not suite LLVM testing indeed.

We can make it suit it. If it works for real world situations it must
work when compiled with LLVM. Otherwise it's an LLVM bug (assuming the
code is not doing undefined things).

> I suggest working on a better benchmark that tests independent parts of the
> pipeline,

That's useful in itself.

> so we don't accumulate results (several frames) but we test a single
> algorithm at a time,

No, we should be testing this accumulated stuff as well. As LLVM gets
used in more arenas, this type of problem will crop up, guaranteed. In
fact the only way we (Cray) get away with it is that we don't use very
many LLVM passes and we strictly target SSE only.

-Dave

From vadve at illinois.edu Tue Jan 5 15:39:05 2010
From: vadve at illinois.edu (Adve, Vikram Sadanand)
Date: Tue, 5 Jan 2010 15:39:05 -0600
Subject: [LLVMdev] Fwd: [TSG-Announce] network upgrades 6-8am: 3rd floor tomorrow, 4th floor Thursday
References: <20100105180526.GE13213@cs.illinois.edu>
Message-ID:

The machine hosting all llvm.org services will be down briefly tomorrow morning US Central time for network maintenance.
The maintenance is scheduled for 6-8am, but I am told our machine will be one of the first to come back up and could be up by 6:30am. --Vikram Associate Professor, Computer Science University of Illinois at Urbana-Champaign http://llvm.org/~vadve Begin forwarded message: > Hello, > > A quick reminder that we're doing network maintenance the next two > mornings. > > Tomorrow morning (Wednesday) we're replacing the 3rd floor networking > equipment. Systems on the third floor will be without networking > starting around 6am and back online by 8am. > > Thursday morning we're replacing the 4th floor networking equipment. > Systems on the fourth floor will be without networking starting at 6am > and back online by 8am. > > These outages will be localized to just the 3rd and 4th floors, but > may > impact other systems that depend on servers on those floors. The > computer rooms outages will be brief to minimize this impact. > > The earlier announcement (with more details) is below, or you can read > it online at https://agora.cs.illinois.edu/x/AxywAQ > > Cheers, > Dave From gohman at apple.com Tue Jan 5 15:57:28 2010 From: gohman at apple.com (Dan Gohman) Date: Tue, 5 Jan 2010 13:57:28 -0800 Subject: [LLVMdev] Help adding the Bullet physics sdk benchmark to the LLVM test suite? In-Reply-To: <201001051538.53664.dag@cray.com> References: <419a36b41001051253j6191dbcfpf771b6405f102f2b@mail.gmail.com> <201001051538.53664.dag@cray.com> Message-ID: <9F8979EE-99B4-4C36-8014-8C55014036F6@apple.com> On Jan 5, 2010, at 1:38 PM, David Greene wrote: > I don't think there's a flag that says "don't do anything risky, > ever." "Don't do anything risky with floating-point" is the default mode. If you're aware of any unsafe floating-point optimizations being done by default, please file a bug. > For example, a gfortran-fronted LLVM should have a way to always respect > ordering indicated by parentheses. I don't know if gfortran even has that, > let alone LLVM proper. 
LLVM does not currently re-associate floating-point values, so this hasn't been an issue. Dan From gohman at apple.com Tue Jan 5 16:01:33 2010 From: gohman at apple.com (Dan Gohman) Date: Tue, 5 Jan 2010 14:01:33 -0800 Subject: [LLVMdev] AVX Testcases In-Reply-To: <201001051353.56029.dag@cray.com> References: <201001051154.17715.dag@cray.com> <201001051353.56029.dag@cray.com> Message-ID: On Jan 5, 2010, at 11:53 AM, David Greene wrote: > On Tuesday 05 January 2010 13:20, Dan Gohman wrote: >> On Jan 5, 2010, at 9:54 AM, David Greene wrote: >>> I should be sending up some AVX code this week. When I do this >>> I'd like to generate some testcases to make sure we actually >>> generate AVX code. Ideally we'd have a testcase for each AVX >>> pattern but that's probably overkill. Still, we'd like a lot >>> of tests, I think. >>> >>> Should these tests go into CodeGen/X86 or should I create a new space, >>> like CodeGen/X86/SIMD? I tend to favor the latter as it keeps things >>> more compartmentalized and cleaner. >> >> Adding a subdirectory here sounds good. "SIMD" might be a bit too general >> though; how about test/CodeGen/X86/AVX? > > This will be a rewrite of all the x86 SIMD stuff so the name seemed > appropriate. It'll be testing the whole infrastructure, not just AVX. Ok, SIMD sounds fine. Dan From anton at korobeynikov.info Tue Jan 5 16:13:08 2010 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Wed, 6 Jan 2010 01:13:08 +0300 Subject: [LLVMdev] Help adding the Bullet physics sdk benchmark to the LLVM test suite? In-Reply-To: <419a36b41001051253j6191dbcfpf771b6405f102f2b@mail.gmail.com> References: <419a36b41001051253j6191dbcfpf771b6405f102f2b@mail.gmail.com> Message-ID: Hello, Erwin > I suggest working on a better benchmark that tests independent parts of the > pipeline, > so we don't accumulate results (several frames) but we test a single > algorithm at a time, > with known input and expected output. 
This avoid unstability and we can
> measure the error of the output.
> Anton, are you interested in working together on such improved benchmark?

This is a pretty interesting approach. However, for now I'm more
concerned about code speed: I'm seeing that llvm-generated code is
slower than the gcc-generated one on at least two platforms (20% on x86-64
and even more on arm), so I suspect there is an optimization deficiency
somewhere...

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From dag at cray.com Tue Jan 5 16:28:37 2010
From: dag at cray.com (David Greene)
Date: Tue, 5 Jan 2010 16:28:37 -0600
Subject: [LLVMdev] Help adding the Bullet physics sdk benchmark to the LLVM test suite?
In-Reply-To: <9F8979EE-99B4-4C36-8014-8C55014036F6@apple.com>
References: <419a36b41001051253j6191dbcfpf771b6405f102f2b@mail.gmail.com> <201001051538.53664.dag@cray.com> <9F8979EE-99B4-4C36-8014-8C55014036F6@apple.com>
Message-ID: <201001051628.37415.dag@cray.com>

On Tuesday 05 January 2010 15:57, Dan Gohman wrote:
> On Jan 5, 2010, at 1:38 PM, David Greene wrote:
> > I don't think there's a flag that says "don't do anything risky,
> > ever."
>
> "Don't do anything risky with floating-point" is the default mode. If
> you're aware of any unsafe floating-point optimizations being done by
> default, please file a bug.

Ok. It seems that something is causing a problem if Bullet is failing.

-Dave

From dag at cray.com Tue Jan 5 16:29:28 2010
From: dag at cray.com (David Greene)
Date: Tue, 5 Jan 2010 16:29:28 -0600
Subject: [LLVMdev] Help adding the Bullet physics sdk benchmark to the LLVM test suite?
In-Reply-To: References: <419a36b41001051253j6191dbcfpf771b6405f102f2b@mail.gmail.com> Message-ID: <201001051629.29042.dag@cray.com> On Tuesday 05 January 2010 16:13, Anton Korobeynikov wrote: > Hello, Erwin > > > I suggest working on a better benchmark that tests independent parts of > > the pipeline, > > so we don't accumulate results (several frames) but we test a single > > algorithm at a time, > > with known input and expected output. This avoid unstability and we can > > measure the error of the output. > > Anton, are you interested in working together on such improved benchmark? > > This is pretty interesting approach. However, for now I'm more > concerned about code speed, I'm seeing that llvm-generated code is > slower that gcc-generated one on at least two platforms (20% on x86-64 > & even more on arm), so, I suspect an optimization deficiency is > somewhere... But keep in mind that fast+incorrect is no good. Are we sure the gcc code is correct? -Dave From erwin.coumans at gmail.com Tue Jan 5 16:46:05 2010 From: erwin.coumans at gmail.com (Erwin Coumans) Date: Tue, 5 Jan 2010 14:46:05 -0800 Subject: [LLVMdev] Help adding the Bullet physics sdk benchmark to the LLVM test suite? In-Reply-To: <201001051628.37415.dag@cray.com> References: <419a36b41001051253j6191dbcfpf771b6405f102f2b@mail.gmail.com> <201001051538.53664.dag@cray.com> <9F8979EE-99B4-4C36-8014-8C55014036F6@apple.com> <201001051628.37415.dag@cray.com> Message-ID: <419a36b41001051446k400522b6sa7c1f6862d7f74ec@mail.gmail.com> We haven't determined what 'failing' means or what the 'correct' behaviour is. Imagine a ball at the top of a rounded hill. If the ball is not exactly at the top but a tiny amount on the left it will roll left, but a tiny amount on the right it will roll right. The difference in initial position can be negligible but the final result is miles away. Is there a irc channel or perhaps google wave for a quick chat? 
Or do people prefer to keep all communication in the mailing list so everyone
can participate?

Thanks,
Erwin

2010/1/5 David Greene

> On Tuesday 05 January 2010 15:57, Dan Gohman wrote:
> > On Jan 5, 2010, at 1:38 PM, David Greene wrote:
> > > I don't think there's a flag that says "don't do anything risky,
> > > ever."
> >
> > "Don't do anything risky with floating-point" is the default mode. If
> > you're aware of any unsafe floating-point optimizations being done by
> > default, please file a bug.
>
> Ok. It seems that something is causing a problem if Bullet is failing.
>
> -Dave
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

From junk at giantblob.com Tue Jan 5 17:34:43 2010
From: junk at giantblob.com (James Williams)
Date: Tue, 5 Jan 2010 23:34:43 +0000
Subject: [LLVMdev] LLVM C bindings and Boehm GC
Message-ID:

Hi,

I want to use LLVM as a replacement code generator for an existing
self-hosting compiler. I hope to replace the existing BURS code generator
with LLVM in order to take advantage of LLVM's JIT, optimizations and wider
range of targets. I'm planning on ditching my existing IR completely and
using my language's native call mechanism to call the LLVM C bindings.

I've got a couple of questions I'd be grateful if anyone can answer:

My language in general and the compiler in particular rely heavily on Boehm
GC. I'm assuming that LLVM is OK with being linked into a process that's
using libgc? I really don't want to write a garbage collector!

My existing IR has no structured type information.
All structures are laid out exactly for the target machine before
intermediate code is generated and array and intermediate code for
class/struct field accesses are all pointer operations. I'm assuming this
approach will work with LLVM provided sizes match the target machine and I
cast everything correctly?

-- James

From junk at giantblob.com Tue Jan 5 17:38:52 2010
From: junk at giantblob.com (James Williams)
Date: Tue, 5 Jan 2010 23:38:52 +0000
Subject: [LLVMdev] LLVM C bindings and Boehm GC
In-Reply-To:
References:
Message-ID:

2010/1/5 James Williams

> Hi,
>
> I want to use LLVM as replacement code generator for an existing self
> hosting compiler. I hope to replace the existing BURS code generator with
> LLVM in order to take advantage of LLVM's JIT, optimizations and wider range
> of targets. I'm planning on ditching my existing IR completely and using my
> language's native call mechanism to call the LLVM C bindings.
>
> I've got a couple of questions I'd be grateful if anyone can answer:
>
> My language in general and the compiler in particular rely heavily on Boehm
> GC. I'm assuming that LLVM is OK with being linked into a process that's
> using libgc? I really don't want to write a garbage collector!
>
> My existing IR has no structured type information. All structures are layed
> out exactly for the target machine before intermediate code is generated and
> array and intermediate code for class/struct field accesses are all pointer
> operations. I'm assuming this approach will work with LLVM provided sizes
> match the target machine and I cast everything correctly?
>

Oops, mistyped this. I intended to say:

My existing IR has no structured type information.
All structures are laid out exactly for the target machine before intermediate code is generated *and intermediate code for array and class/struct field *accesses are all pointer operations. I'm assuming this approach will work with LLVM provided sizes match the target machine and I cast everything correctly?
>
> -- James
>

From anton at korobeynikov.info Tue Jan 5 17:41:15 2010
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Wed, 6 Jan 2010 02:41:15 +0300
Subject: [LLVMdev] Help adding the Bullet physics sdk benchmark to the LLVM test suite?
In-Reply-To: <419a36b41001051446k400522b6sa7c1f6862d7f74ec@mail.gmail.com>
References: <419a36b41001051253j6191dbcfpf771b6405f102f2b@mail.gmail.com> <201001051538.53664.dag@cray.com> <9F8979EE-99B4-4C36-8014-8C55014036F6@apple.com> <201001051628.37415.dag@cray.com> <419a36b41001051446k400522b6sa7c1f6862d7f74ec@mail.gmail.com>
Message-ID:

Hello, Erwin

> Is there a irc channel or perhaps google wave for a quick chat? Or do people
> prefer to keep all communication in the mailing list so everyone can
> participate?

There is #llvm IRC channel on OFTC network. ML audience is definitely larger :)

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From vargaz at gmail.com Tue Jan 5 18:24:26 2010
From: vargaz at gmail.com (Zoltan Varga)
Date: Wed, 6 Jan 2010 01:24:26 +0100
Subject: [LLVMdev] LLVM C bindings and Boehm GC
In-Reply-To:
References:
Message-ID: <295e750a1001051624w5c3ab3a5ydf8d622260677a3c@mail.gmail.com>

Hi,

I've got a couple of questions I'd be grateful if anyone can answer:
>
> My language in general and the compiler in particular rely heavily on Boehm
> GC. I'm assuming that LLVM is OK with being linked into a process that's
> using libgc?
I really don't want to write a garbage collector!
>
>
mono uses llvm and Boehm GC, and the two seem to coexist fine.

Zoltan

From glguida at gmail.com Tue Jan 5 18:58:10 2010
From: glguida at gmail.com (Gianluca Guida)
Date: Wed, 6 Jan 2010 01:58:10 +0100
Subject: [LLVMdev] [PATCH] Add simple cross-block DSE.
Message-ID:

Hello,

This patch implements cross-block dead store elimination for a simple scenario -- which was somehow important in my case -- i.e. when a store has only one memory dependence in a function.

This patch is a bit narrow-minded (e.g., only store instructions are checked for memory dependencies), but I can always make it more generic, if you give me pointers *and* my code is correct.

Cheers,
Gianluca

--
It was a type of people I did not know, I found them very strange and they did not inspire confidence at all. Later I learned that I had been introduced to electronic engineers.
E. W. Dijkstra

-------------- next part --------------
A non-text attachment was scrubbed...
Name: simple-cross-bb-dse.patch Type: application/octet-stream Size: 1114 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100106/2d3f0528/attachment.obj From grosser at fim.uni-passau.de Tue Jan 5 19:29:06 2010 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Wed, 06 Jan 2010 02:29:06 +0100 Subject: [LLVMdev] "Graphite" for llvm [building infrastructure] In-Reply-To: <13951_1262699107_4B434263_13951_326_1_4B434261.7050102@gmail.com> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <13951_1262699107_4B434263_13951_326_1_4B434261.7050102@gmail.com> Message-ID: <4B43E762.6030601@fim.uni-passau.de> On 01/05/10 14:45, ether wrote: > hi Tobi, > > i just added the Poly > library(http://wiki.llvm.org/Polyhedral_optimization_framework) to llvm > build system, which only contain a toy pass "Poly". > i think we could add the polyhedral optimization stuff in to this library. > > it was test under cmake+visual studio 2009, and i also add the library > build rule to MAKEFILEs, but not sure if it work under linux/cygwin/mac, > sorry. > hope this help > > best regards > > --ether hi ether, I pushed your work to our git repository at http://repo.or.cz/w/llvm-complete/pofl.git So we can work on first version that could be committed to the LLVM svn repository. I just had some discussions on the LLVM IRC channel concerning the integration of Poly. 
From grosser at fim.uni-passau.de Tue Jan 5 19:39:54 2010 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Wed, 06 Jan 2010 02:39:54 +0100 Subject: [LLVMdev] "Graphite" for llvm [building infrastructure] In-Reply-To: <13951_1262699107_4B434263_13951_326_1_4B434261.7050102@gmail.com> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <13951_1262699107_4B434263_13951_326_1_4B434261.7050102@gmail.com> Message-ID: <4B43E9EA.9070201@fim.uni-passau.de> On 01/05/10 14:45, ether wrote: > hi Tobi, > > i just added the Poly > library(http://wiki.llvm.org/Polyhedral_optimization_framework) to llvm > build system, which only contain a toy pass "Poly". > i think we could add the polyhedral optimization stuff in to this library. > > it was test under cmake+visual studio 2009, and i also add the library > build rule to MAKEFILEs, but not sure if it work under linux/cygwin/mac, > sorry. > hope this help > > best regards > > --ether [the complete mail] hi ether, great start. I pushed your work to our git repository at http://repo.or.cz/w/llvm-complete/pofl.git This repository is ment to track work on a first version of LLVM Poly that could be proposed for integration to LLVM. Feel free to register as a user and to commit to the repository. I just had some discussions on the LLVM IRC channel concerning the integration of Poly. The preferred way to implement this seemed to be like the tools/clang integration in llvm. A separated repository (that might also be hosted on the llvm svn server), that is only build if available. So user can decide if they want to try LLVMPoly and check it out on demand. What we need to achieve before proposing a patchset: 1. Integrate ClooG/isl 2. 
A working pass Frontend/Backend (very limited, no optimizations, but able to handle real world code) Thanks for working on this Tobi From etherzhhb at gmail.com Tue Jan 5 20:17:20 2010 From: etherzhhb at gmail.com (ether) Date: Wed, 06 Jan 2010 10:17:20 +0800 Subject: [LLVMdev] "Graphite" for llvm [building infrastructure] In-Reply-To: <4B43E9EA.9070201@fim.uni-passau.de> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <13951_1262699107_4B434263_13951_326_1_4B434261.7050102@gmail.com> <4B43E9EA.9070201@fim.uni-passau.de> Message-ID: <4B43F2B0.7060009@gmail.com> hi Tobi, On 2010-1-6 9:39, Tobias Grosser wrote: > On 01/05/10 14:45, ether wrote: >> hi Tobi, >> >> i just added the Poly >> library(http://wiki.llvm.org/Polyhedral_optimization_framework) to llvm >> build system, which only contain a toy pass "Poly". >> i think we could add the polyhedral optimization stuff in to this >> library. >> >> it was test under cmake+visual studio 2009, and i also add the library >> build rule to MAKEFILEs, but not sure if it work under linux/cygwin/mac, >> sorry. >> hope this help >> >> best regards >> >> --ether > > [the complete mail] > > hi ether, > > great start. > > I pushed your work to our git repository at > http://repo.or.cz/w/llvm-complete/pofl.git > > This repository is ment to track work on a first version of LLVM Poly > that could be proposed for integration to LLVM. Feel free to register > as a user and to commit to the repository. ok :) i just learning how to use git yesterday > > I just had some discussions on the LLVM IRC channel concerning the > integration of Poly. > > The preferred way to implement this seemed to be like the tools/clang > integration in llvm. A separated repository (that might also be hosted > on the llvm svn server), that is only build if available. So user can > decide if they want to try LLVMPoly and check it out on demand. 
got it, i think we could move it out of opt after we finished the first implement. right now we could play with this dirty library first. > > What we need to achieve before proposing a patchset: > > 1. Integrate ClooG/isl > 2. A working pass Frontend/Backend (very limited, no optimizations, > but able to handle real world code) > > Thanks for working on this you are welcome. > > Tobi > > best regards --ether From gyounghwakim at gmail.com Wed Jan 6 03:17:37 2010 From: gyounghwakim at gmail.com (Gyounghwa Kim) Date: Wed, 6 Jan 2010 18:17:37 +0900 Subject: [LLVMdev] [Help] calling a native C function from inside LLVM IR Message-ID: <61ad7e1f1001060117n516e9065qbab783fce6c91f1c@mail.gmail.com> Dear experts, Is there any way to call a native C/C++ functions from inside LLVM IR? I appreciate your help in advance. Thanks. From jon at ffconsultancy.com Wed Jan 6 05:19:00 2010 From: jon at ffconsultancy.com (Jon Harrop) Date: Wed, 6 Jan 2010 11:19:00 +0000 Subject: [LLVMdev] [Help] calling a native C function from inside LLVM IR In-Reply-To: <61ad7e1f1001060117n516e9065qbab783fce6c91f1c@mail.gmail.com> References: <61ad7e1f1001060117n516e9065qbab783fce6c91f1c@mail.gmail.com> Message-ID: <201001061119.00753.jon@ffconsultancy.com> On Wednesday 06 January 2010 09:17:37 Gyounghwa Kim wrote: > Dear experts, > > Is there any way to call a native C/C++ functions from inside LLVM IR? > I appreciate your help in advance. Provided you have the type information, C is easy: just declare the external function and call it from the LLVM IR using the C calling convention. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From baldrick at free.fr Wed Jan 6 04:39:49 2010 From: baldrick at free.fr (Duncan Sands) Date: Wed, 06 Jan 2010 11:39:49 +0100 Subject: [LLVMdev] [Help] How can we call an object's virtual function inside IR? 
In-Reply-To: <61ad7e1f1001060210u6c047722s2ed072adbf616ef4@mail.gmail.com> References: <61ad7e1f1001050235t42d6a5boacce28e6bab72d00@mail.gmail.com> <4B431B98.70403@free.fr> <61ad7e1f1001060210u6c047722s2ed072adbf616ef4@mail.gmail.com> Message-ID: <4B446875.7050402@free.fr> Hi Gyounghwa Kim, > First of all, thank you very much for your answer. > I tried your sugestion and found out that it is not what I wanted. > What I have to do is call a native C function from inside this > generated function. > Is there any way that we can find and call native C functions not > created by LLVM IR? You can insert a declaration of the function into the IR, then call it. Of course, for this to work you need to link in the native function when running. If you are building a standalone application then this is no problem. If you are running the IR using the JIT then it is also possible, hopefully someone else will explain how. > I am asking this question because I want to fix this example to get a > class member variable (ClassA * %record) and call the member function > of it from inside LLVM IR. You are allowed to call a pointer, i.e. a function declaration is not required. So you can just load the pointer out of %record, bitcast it to the right function type, and call it. Ciao, Duncan. PS: Please reply to the list and not to me personally. That way, others may answer, and the discussion is archived which may help in the future if someone else has the same question. If LLVM IR cannot access the member > function of a class. If it is not supported, we can change class > member functions like a c function. For example, ClassA->funca () can > be created as funcb(&ClassA ) -a C style function. Then we need to > call funcb from inside LLVM IR. > > Will that be possible? > I tried to search web and documents, but really couldn't find it. 
> > [[a [10.00]] > [3.00]] > ; ModuleID = 'ExprF' > > define i1 @expr(double* %record) { > entry: > %0 = getelementptr double* %record, i32 0 ; > [#uses=1] > %1 = load double* %0 ; [#uses=1] > %2 = frem double %1, 1.000000e+01 ; [#uses=1] > %3 = fcmp ogt double %2, 3.000000e+00 ; [#uses=1] > ret i1 %3 > } > > On Tue, Jan 5, 2010 at 7:59 PM, Duncan Sands wrote: >> Hi Gyounghwa Kim, try pasting C++ code into http://llvm.org/demo/ >> in order to see the LLVM IR that llvm-g++ turns it into. That way >> you will see how this can be done. >> >> Best wishes, >> >> Duncan. >> From kennethuil at gmail.com Wed Jan 6 06:38:55 2010 From: kennethuil at gmail.com (Kenneth Uildriks) Date: Wed, 6 Jan 2010 06:38:55 -0600 Subject: [LLVMdev] [Help] How can we call an object's virtual function inside IR? In-Reply-To: <4B446875.7050402@free.fr> References: <61ad7e1f1001050235t42d6a5boacce28e6bab72d00@mail.gmail.com> <4B431B98.70403@free.fr> <61ad7e1f1001060210u6c047722s2ed072adbf616ef4@mail.gmail.com> <4B446875.7050402@free.fr> Message-ID: <400d33ea1001060438t765b2b4buba5e034b8fb9fd3c@mail.gmail.com> On Wed, Jan 6, 2010 at 4:39 AM, Duncan Sands wrote: > Hi Gyounghwa Kim, > >> First of all, thank you very much for your answer. >> I tried your sugestion and found out that it is not what I wanted. >> What I have to do is call a native C function from inside this >> generated function. >> Is there any way that we can find and call native C functions not >> created by LLVM IR? > > You can insert a declaration of the function into the IR, then call > it. ?Of course, for this to work you need to link in the native function > when running. ?If you are building a standalone application then this > is no problem. ?If you are running the IR using the JIT then it is also > possible, hopefully someone else will explain how. If you are running the IR using the JIT: 1. 
If the function is exported from the executable itself, or if it is in a static library linked with the executable using the -rdynamic flag, then you can insert a declaration of the function into the IR and then call it.

2. Otherwise, you can insert a declaration of the function into the IR, call ExecutionEngine::addGlobalMapping with the llvm::Function object and a native function pointer to link the declaration to the actual function, and then call it.

From junk at giantblob.com Wed Jan 6 07:20:13 2010
From: junk at giantblob.com (James Williams)
Date: Wed, 6 Jan 2010 13:20:13 +0000
Subject: [LLVMdev] Correct way to resolve recursive type information?
Message-ID:

Hi,

I've followed the instructions on constructing recursive types (http://llvm.org/docs/ProgrammersManual.html#BuildRecType) and I can successfully create simple recursive types using the C bindings (e.g. struct Test { struct Test *t };). I want to generalize this to get type information from my language into generated LLVM code. My language allows arbitrary forward type declarations that I resolve using two passes - first all type names are entered into the symbol table in turn and then all type structures are built in turn with references to other types being resolved from the symbol table.

To make this work with LLVM I plan to:

- in the first type resolution pass, for every structured type in the compiled source create a type handle referencing an opaque type with LLVMCreateTypeHandle(LLVMCreateOpaqueType()) and store it in the type's symbol table entry
- in the second type resolution pass, create an LLVM structured type for every structured type in the program. The element types for any referenced types will be those types' opaque types
- in a third pass, for every structured type in the program, resolve its opaque type to its structured type with LLVMRefineType

Will I have a problem with TypeRefs becoming invalid underneath me as I repeatedly call LLVMRefineType in the third pass?
If so how can I construct a web of mutually recursive types - is there some kind of atomic LLVMRefineType alternative that can refine the whole lot in one go?

I'd be grateful for any advice,
-- James Williams

From jay.foad at gmail.com Wed Jan 6 08:27:44 2010
From: jay.foad at gmail.com (Jay Foad)
Date: Wed, 6 Jan 2010 14:27:44 +0000
Subject: [LLVMdev] ipsccp vs getelementptr
Message-ID:

Hi,

This code:

@str = private constant [1 x i8] zeroinitializer

define internal i8* @f(i8* %p) nounwind readnone {
  %q = getelementptr inbounds i8* %p, i32 1
  ret i8* %q
}

define i32 @main() nounwind {
  %p = call i8* @f(i8* getelementptr ([1 x i8]* @str, i32 0, i32 0)) nounwind
  %c = icmp eq i8* %p, getelementptr ([1 x i8]* @str, i32 1, i32 0)
  br i1 %c, label %pass, label %fail

fail:
  tail call void @abort() noreturn nounwind
  unreachable

pass:
  ret i32 0
}

declare void @abort()

appears to be mis-optimised by "opt -ipsccp" into this:

define internal i8* @f(i8* %p) nounwind readnone {
  ret i8* undef
}

define i32 @main() nounwind {
  %p = call i8* @f(i8* getelementptr ([1 x i8]* @str, i32 0, i32 0)) nounwind
  br label %fail

fail:
  tail call void @abort() noreturn nounwind
  unreachable
}

From looking at the debug output, IPSCCP works out that the result of the call to @f will be:

  i8* getelementptr ([1 x i8]* @str, i32 0, i32 1)

which is compared with:

  i8* getelementptr ([1 x i8]* @str, i32 1, i32 0)

and it thinks that these two expressions are different, so the comparison will return false.

I can see that these two getelementptr expressions don't have exactly the same indexes in the same place, but surely they both evaluate to the same thing, namely one byte after the start of @str, so they should be considered equal?

I'm using an LLVM 2.6-based tree - I haven't checked the behaviour with current svn trunk.

Thanks,
Jay.
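To make the claimed equivalence concrete, here is a minimal sketch that models how a getelementptr constant expression over an array type turns its index list into a byte offset. It is a toy model in Python, not LLVM code: the helper names `size_of` and `gep_byte_offset` and the tuple encoding of types are hypothetical, and it assumes i8 occupies one byte and arrays carry no padding.

```python
# Toy model of getelementptr address arithmetic; not LLVM's implementation.
# A type is either ("int", size_in_bytes) or ("array", count, element_type).

def size_of(ty):
    """Return the size in bytes of a modelled type."""
    if ty[0] == "int":
        return ty[1]
    if ty[0] == "array":
        return ty[1] * size_of(ty[2])
    raise ValueError("unknown type: %r" % (ty,))


def gep_byte_offset(pointee, indices):
    """Byte offset of `getelementptr (pointee* base, indices...)` from base."""
    # The first index steps over whole pointee-sized objects.
    offset = indices[0] * size_of(pointee)
    ty = pointee
    # Each later index steps into the element type of the current array.
    for idx in indices[1:]:
        if ty[0] != "array":
            raise ValueError("cannot index into a non-array type")
        ty = ty[2]
        offset += idx * size_of(ty)
    return offset


I8 = ("int", 1)
STR_TY = ("array", 1, I8)  # [1 x i8], the type of @str

# getelementptr ([1 x i8]* @str, i32 0, i32 1)
a = gep_byte_offset(STR_TY, [0, 1])
# getelementptr ([1 x i8]* @str, i32 1, i32 0)
b = gep_byte_offset(STR_TY, [1, 0])
print(a, b)
```

In this model both index lists resolve to the same offset from @str, so a comparison that looks only at the index lists, rather than at the address they denote, would wrongly treat the two pointers as different.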
From jay.foad at gmail.com Wed Jan 6 08:41:22 2010 From: jay.foad at gmail.com (Jay Foad) Date: Wed, 6 Jan 2010 14:41:22 +0000 Subject: [LLVMdev] ipsccp vs getelementptr In-Reply-To: References: Message-ID: > From looking at the debug output, IPSCCP works out that the result of > the call to @f will be: > > i8* getelementptr ([1 x i8]* @str, i32 0, i32 1) > > which is compared with: > > i8* getelementptr ([1 x i8]* @str, i32 1, i32 0) > > and it thinks that these two expressions are different, so the > comparison will return false. > > I can see that these two getelementptr expressions don't have exactly > the same indexes in the same place, but surely they both evaluate to > the same thing, namely one byte after the start of @str, so they > should be considered equal? I see that this has already been fixed here: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090907/086867.html Sorry for the noise! Thanks, Jay. From viridia at gmail.com Wed Jan 6 10:56:13 2010 From: viridia at gmail.com (Talin) Date: Wed, 6 Jan 2010 08:56:13 -0800 Subject: [LLVMdev] Correct way to resolve recursive type information? In-Reply-To: References: Message-ID: In my compiler, the multiple passes for resolving type names is done first, using high-level type classes, before any LLVM IR is involved. The high level classes are needed anyway since the LLVM type model does not capture concepts such as const, up-casting, and other aspects of high-level types. (Actually, I don't have discrete passes, what I have is a work-queue of symbols to be processed and analysis tasks to be performed on those symbols, with the ability to perform certain analysis tasks "just in time" if needed by other tasks. My friend jokingly refers to this as the "breadth-first compiler". However, that level of complexity is only required if you are doing things like type inference.) Only when the types are fully resolved do I convert to LLVM types. 
This only requires a single pass:

for each type:
   if the type has not yet been constructed:
      set the 'under construction' bit for that type.
      create an opaque type as a placeholder
      for each member type (recursively) construct the type
         (using the member type's placeholder if the member type is still under construction.)
      create the type from the member types.
      clear the 'under construction' bit.
      replace the placeholder with the constructed type.

On Wed, Jan 6, 2010 at 5:20 AM, James Williams wrote:

> Hi,
>
> I've followed the instructions on constructing recursive types (
> http://llvm.org/docs/ProgrammersManual.html#BuildRecType) and I can
> successfully create simple recursive types using the C bindings (e.g. struct
> Test { struct Test *t };). I want to generalize this to get type information
> from my language into generated LLVM code. My language allows arbitrary
> forward type declarations that I resolve using two passes - first all type
> names are entered into the symbol table in turn and then all type structures
> are built in turn with references to other types being resolved from the
> symbol table.
>
> To make this work with LLVM I plan to:
>
> - in the first type resolution pass, for every structured type in the
> compiled source create a type handle referencing an opaque type with
> LLVMCreateTypeHandle(LLVMCreateOpaqueType()) and store it in the type's
> symbol table entry
> - in the second type resolution pass, create an LLVM structured type for
> every structured type in the program. The element types for any referenced
> types will be those types' opaque types
> - in a third pass, for every structured type in the program, resolve its
> opaque type to its structured type with LLVMRefineType
>
> Will I have a problem with TypeRefs becoming invalid underneath me as I
> repeatedly call LLVMRefineType in the third pass?
If so how can I construct > a web of mutually recursive types - is there some kind of atomic > LLVMRefineType alternative that can refine the whole lot in one go? > > I'd be grateful for any advice, > -- James Williams > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100106/b7c23192/attachment.html From gvenn.cfe.dev at gmail.com Wed Jan 6 11:24:19 2010 From: gvenn.cfe.dev at gmail.com (Garrison Venn) Date: Wed, 6 Jan 2010 12:24:19 -0500 Subject: [LLVMdev] License query for demo code Message-ID: <556C3417-1292-4ADA-AFD5-086DF8DE58C5@gmail.com> I've developed example code, and am currently developing the accompanying documentation, which treats an example of a JIT based exception implementation based on the current LLVM 2.7 implementation, both of which I would like to put into the LLVM wiki. Given that I want to give this code a University Illinois license (the one included with LLVM), and want to correctly reference the LLVM compiler-rt project, and the LLVM Kaleidoscope example--both of which my code was extrapolated from, what do I put in the Developed by section of the license? I want to make sure that I'm blamed for the code, but simultaneously don't want to imply that this code was created from scratch, and that the two projects, portions of whose code was copied verbatim before being modified are correctly attributed. Thanks in advance Garrison From junk at giantblob.com Wed Jan 6 14:41:18 2010 From: junk at giantblob.com (James Williams) Date: Wed, 6 Jan 2010 20:41:18 +0000 Subject: [LLVMdev] Fwd: Correct way to resolve recursive type information? 
In-Reply-To: References: Message-ID:

2010/1/6 Talin

> In my compiler, the multiple passes for resolving type names is done first,
> using high-level type classes, before any LLVM IR is involved. The high
> level classes are needed anyway since the LLVM type model does not capture
> concepts such as const, up-casting, and other aspects of high-level types.
> (Actually, I don't have discrete passes, what I have is a work-queue of
> symbols to be processed and analysis tasks to be performed on those symbols,
> with the ability to perform certain analysis tasks "just in time" if needed
> by other tasks. My friend jokingly refers to this as the "breadth-first
> compiler". However, that level of complexity is only required if you are
> doing things like type inference.)
>
> Only when the types are fully resolved do I convert to LLVM types. This
> only requires a single pass:
>
> for each type:
>    if the type has not yet been constructed:
>       set the 'under construction' bit for that type.
>       create an opaque type as a placeholder
>       for each member type (recursively) construct the type
>          (using the member type's placeholder if the member type is
>          still under construction.)
>       create the type from the member types.
>       clear the 'under construction' bit.
>       replace the placeholder with the constructed type.
>

That's much cleaner than what I'm doing (actually one pass over the syntax tree to declare all namespaces and type names, then recursively specialize templates and infer types for variable definitions, then a pass to construct structured types including template specializations). However, I want to get LLVM in quickly and with the minimum upheaval so I don't want to start a major refactoring of this code if I don't need to.

Having read the manual I think I'll be OK so long as I use type handles for any type I want to hold references to until all refineAbstractTypeTo() calls are done.
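The placeholder scheme quoted above can be sketched roughly as follows. This is a language-neutral model in Python, not LLVM API code: `StructType`, `Placeholder`, and `build_all` are hypothetical names, and patching the recorded uses of a placeholder by hand only stands in for what refining an opaque type does inside LLVM. It also glosses over the fact that real recursive references go through pointer types, and it treats any name without a declaration as a primitive leaf.

```python
class StructType:
    """A finished (or partly built) structured type in the toy model."""
    def __init__(self, name):
        self.name = name
        self.members = []  # member types, appended during construction


class Placeholder:
    """Stands in for a type that is still under construction."""
    def __init__(self, name):
        self.name = name
        self.uses = []  # (member_list, index) slots that refer to this placeholder


def build_all(decls):
    """decls maps a type name to the list of type names of its members."""
    done = {}         # name -> constructed StructType
    in_progress = {}  # name -> Placeholder; plays the 'under construction' bit

    def build(name):
        if name not in decls:
            return name  # assumption: undeclared names are primitives
        if name in done:
            return done[name]
        if name in in_progress:
            # Recursive reference: hand back the placeholder for now.
            return in_progress[name]
        placeholder = Placeholder(name)
        in_progress[name] = placeholder
        ty = StructType(name)
        for member_name in decls[name]:
            member = build(member_name)
            if isinstance(member, Placeholder):
                # Remember the slot so it can be patched once the type exists.
                member.uses.append((ty.members, len(ty.members)))
            ty.members.append(member)
        del in_progress[name]
        # Replace the placeholder with the constructed type everywhere it was used.
        for members, index in placeholder.uses:
            members[index] = ty
        done[name] = ty
        return ty

    for name in decls:
        build(name)
    return done


types = build_all({"A": ["B", "i32"], "B": ["A"], "Test": ["Test"]})
print(types["A"].members[0] is types["B"])
```

The per-placeholder use list is a design choice made for this sketch; it lets each placeholder be replaced as soon as its type is complete, so no separate fix-up pass over all types is needed at the end.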
-- James On Wed, Jan 6, 2010 at 5:20 AM, James Williams wrote: > >> Hi, >> >> I've followed the instructions on constructing recursive types ( >> http://llvm.org/docs/ProgrammersManual.html#BuildRecType) and I can >> succesfully create simple recursive types using the C bindings (e.g. struct >> Test { struct Test *t };). I want to generalize this to get type information >> from my language into generated LLVM code. My language allows arbitrary >> forward type declarations that I resolve using two passes - first all type >> names are entered into the symbol table in turn and then all type structures >> are built in turn with references to other types being resolved from the >> symbol table. >> >> To make this work with LLVM I plan to: >> >> - in the first type resolution pass, for every structured type in the >> compiled source create a type handle referencing an opaque type with >> LLVMCreateTypeHandle(LLVMCreateOpaqueType() and store it in the type's >> symbol table entry >> - in the second type resolution pass, create an LLVM structured type for >> every structured type in the program. The element types for any referenced >> types will be those types' opaque types >> - in a third pass, for every structured type in the program, resolve its >> opaque type to its structured type with LLVMRefineType >> >> Will I have a problem with TypeRefs becoming invalid underneath me as I >> repeatedly call LLVMRefineType in the third pass? If so how can I construct >> a web of mutually recursive types - is there some kind of atomic >> LLVMRefineType alternative that can refine the whole lot in one go? >> >> I'd be grateful for any advice, >> -- James Williams >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> >> > > > -- > -- Talin > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100106/db008072/attachment.html From viridia at gmail.com Wed Jan 6 14:45:02 2010 From: viridia at gmail.com (Talin) Date: Wed, 6 Jan 2010 12:45:02 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 Message-ID: This patch adds a UnionType to DerivedTypes.h. It also adds code to the bitcode reader / writer and the assembly parser for the new type, as well as a tiny .ll test file in test/Assembler. It does not contain any code related to code generation or type layout - I wanted to see if this much was acceptable before I proceeded any further. Unlike my previous patch, in which the Union type was implemented as a packing option to type Struct (thereby re-using the machinery of Struct), this patch defines Union as a completely separate type from Struct. I was a little uncertain as to how to write the tests. I'd particularly like to write tests for the bitcode reader/writer stuff, but I am not sure how to do that. -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100106/00a6d6fd/attachment.html -------------- next part -------------- A non-text attachment was scrubbed... Name: uniontype.patch Type: application/octet-stream Size: 23154 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100106/00a6d6fd/attachment.obj From fimarn at yahoo.com Wed Jan 6 15:12:16 2010 From: fimarn at yahoo.com (fima rabin) Date: Wed, 6 Jan 2010 13:12:16 -0800 (PST) Subject: [LLVMdev] something wrong with .ll file? Message-ID: <399879.56615.qm@web50501.mail.re2.yahoo.com> I am trying to compile a little intrinsic function for my machine. 
Here is a dump from clang-cc with --emit-llvm option: ===================== ; ModuleID = 'foo.c' target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32" target triple = "i386-pc-linux-gnu" @main.i = internal global i32 0 ; [#uses=0] @main.x = internal global [10 x float] zeroinitializer ; <[10 x float]*> [#uses=0] @main.y = internal global [10 x float] zeroinitializer ; <[10 x float]*> [#uses=0] define i32 @main() nounwind { entry: %retval = alloca i32 ; [#uses=2] %m1 = alloca <2 x double>, align 16 ; <<2 x double>*> [#uses=0] %m2 = alloca <2 x double>, align 16 ; <<2 x double>*> [#uses=0] %j = alloca i32, align 4 ; [#uses=0] store i32 0, i32* %retval call void @llvm.mymachine.su.route(i32 5, i32 4) %0 = load i32* %retval ; [#uses=1] ret i32 %0 } declare void @llvm.mymachine.su.route(i32, i32) nounwind readnone =========================================== As you can see, the intrinsic function takes two integer arguments and does not return anything. For some reason I am getting into trouble when I use llc to process my .bc file. In visitTargetIntrinsic() the second argument to function ComputeValueVTs() - I.getType() == llvm::Type::VoidTyID. This leads me to the exception: "Cannot have nodes without results!". Is there is something wrong with my byte code or I messed up somewhere in llc code? Are there any other dumps that I can use while processing .bc file? Thanks. -- Fima From wendling at apple.com Wed Jan 6 15:54:16 2010 From: wendling at apple.com (Bill Wendling) Date: Wed, 6 Jan 2010 13:54:16 -0800 Subject: [LLVMdev] something wrong with .ll file? In-Reply-To: <399879.56615.qm@web50501.mail.re2.yahoo.com> References: <399879.56615.qm@web50501.mail.re2.yahoo.com> Message-ID: <31753BB5-6EF0-4E58-B363-21F04D8E23E4@apple.com> On Jan 6, 2010, at 1:12 PM, fima rabin wrote: > I am trying to compile a little intrinsic function for my machine. 
Here is a dump from clang-cc with --emit-llvm option: > ===================== > > ; ModuleID = 'foo.c' > target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32" > target triple = "i386-pc-linux-gnu" > > @main.i = internal global i32 0 ; [#uses=0] > @main.x = internal global [10 x float] zeroinitializer ; <[10 x float]*> [#uses=0] > @main.y = internal global [10 x float] zeroinitializer ; <[10 x float]*> [#uses=0] > > define i32 @main() nounwind { > entry: > %retval = alloca i32 ; [#uses=2] > %m1 = alloca <2 x double>, align 16 ; <<2 x double>*> [#uses=0] > %m2 = alloca <2 x double>, align 16 ; <<2 x double>*> [#uses=0] > %j = alloca i32, align 4 ; [#uses=0] > store i32 0, i32* %retval > call void @llvm.mymachine.su.route(i32 5, i32 4) > %0 = load i32* %retval ; [#uses=1] > ret i32 %0 > } > > declare void @llvm.mymachine.su.route(i32, i32) nounwind readnone > > =========================================== > > As you can see, the intrinsic function takes two integer arguments and does not > return anything. > > For some reason I am getting into trouble when I use llc to process my .bc file. In visitTargetIntrinsic() > the second argument to function ComputeValueVTs() - I.getType() == llvm::Type::VoidTyID. This leads me to > the exception: "Cannot have nodes without results!". > > Is there is something wrong with my byte code or I messed up somewhere in llc code? > > Are there any other dumps that I can use while processing .bc file? > What's the TD definition of your intrinsic? -bw From fimarn at yahoo.com Wed Jan 6 16:11:09 2010 From: fimarn at yahoo.com (fima rabin) Date: Wed, 6 Jan 2010 14:11:09 -0800 (PST) Subject: [LLVMdev] something wrong with .ll file? 
In-Reply-To: <31753BB5-6EF0-4E58-B363-21F04D8E23E4@apple.com> References: <399879.56615.qm@web50501.mail.re2.yahoo.com> <31753BB5-6EF0-4E58-B363-21F04D8E23E4@apple.com> Message-ID: <103432.50522.qm@web50507.mail.re2.yahoo.com> Here is my .td definition in IntrinsicsMymachine.td let TargetPrefix = "mymachine" in { // def int_mymachine_su_route : Intrinsic<[llvm_void_ty], [llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>; } -- fima ----- Original Message ---- From: Bill Wendling To: fima rabin Cc: llvmdev at cs.uiuc.edu Sent: Wed, January 6, 2010 4:54:16 PM Subject: Re: [LLVMdev] something wrong with .ll file? On Jan 6, 2010, at 1:12 PM, fima rabin wrote: > I am trying to compile a little intrinsic function for my machine. Here is a dump from clang-cc with --emit-llvm option: > ===================== > > ; ModuleID = 'foo.c' > target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32" > target triple = "i386-pc-linux-gnu" > > @main.i = internal global i32 0 ; [#uses=0] > @main.x = internal global [10 x float] zeroinitializer ; <[10 x float]*> [#uses=0] > @main.y = internal global [10 x float] zeroinitializer ; <[10 x float]*> [#uses=0] > > define i32 @main() nounwind { > entry: > %retval = alloca i32 ; [#uses=2] > %m1 = alloca <2 x double>, align 16 ; <<2 x double>*> [#uses=0] > %m2 = alloca <2 x double>, align 16 ; <<2 x double>*> [#uses=0] > %j = alloca i32, align 4 ; [#uses=0] > store i32 0, i32* %retval > call void @llvm.mymachine.su.route(i32 5, i32 4) > %0 = load i32* %retval ; [#uses=1] > ret i32 %0 > } > > declare void @llvm.mymachine.su.route(i32, i32) nounwind readnone > > =========================================== > > As you can see, the intrinsic function takes two integer arguments and does not > return anything. > > For some reason I am getting into trouble when I use llc to process my .bc file. 
In visitTargetIntrinsic() > the second argument to function ComputeValueVTs() - I.getType() == llvm::Type::VoidTyID. This leads me to > the exception: "Cannot have nodes without results!". > > Is there is something wrong with my byte code or I messed up somewhere in llc code? > > Are there any other dumps that I can use while processing .bc file? > What's the TD definition of your intrinsic? -bw From jay.foad at gmail.com Thu Jan 7 07:19:44 2010 From: jay.foad at gmail.com (Jay Foad) Date: Thu, 7 Jan 2010 13:19:44 +0000 Subject: [LLVMdev] configuring llvm-gcc 2.6 for mips Message-ID: Hi, If I configure llvm-gcc 2.6 with --target=mips or --target=mips-elf, I get: c++ -c -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -W -Wall -Wwrite-strings -pedantic -Wno-long-long -Wno-variadic-macros -Wmissing-format-attribute -fno-common -DHAVE_CONFIG_H -Wno-unused -DTARGET_NAME=\"mips-elf\" -frandom-seed=0 -I. -I. -I/home/foad/toolchain/llvm/llvm-gcc/gcc -I/home/foad/toolchain/llvm/llvm-gcc/gcc/. -I/home/foad/toolchain/llvm/llvm-gcc/gcc/../include -I/home/foad/toolchain/llvm/llvm-gcc/gcc/../libcpp/include -I/home/foad/toolchain/llvm/llvm-gcc/gcc/../libdecnumber -I../libdecnumber -I/home/foad/toolchain/obj/llvm-obj/include -I/home/foad/toolchain/llvm/llvm/include -DENABLE_LLVM -I/home/foad/toolchain/llvm/llvm/include -I/home/foad/toolchain/obj/llvm-obj/include -D_DEBUG -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -I. -I. -I/home/foad/toolchain/llvm/llvm-gcc/gcc -I/home/foad/toolchain/llvm/llvm-gcc/gcc/. 
-I/home/foad/toolchain/llvm/llvm-gcc/gcc/../include -I/home/foad/toolchain/llvm/llvm-gcc/gcc/../libcpp/include -I/home/foad/toolchain/llvm/llvm-gcc/gcc/../libdecnumber -I../libdecnumber -I/home/foad/toolchain/obj/llvm-obj/include -I/home/foad/toolchain/llvm/llvm/include /home/foad/toolchain/llvm/llvm-gcc/gcc/llvm-backend.cpp -o llvm-backend.o /home/foad/toolchain/llvm/llvm-gcc/gcc/llvm-backend.cpp:341:2: error: #error LLVM_TARGET_NAME macro not specified by GCC backend make[2]: *** [llvm-backend.o] Error 1 make[2]: Leaving directory `/home/foad/toolchain/obj/gcc-mips-obj/gcc' Am I doing something wrong? Thanks, Jay. From jon at ffconsultancy.com Thu Jan 7 09:06:07 2010 From: jon at ffconsultancy.com (Jon Harrop) Date: Thu, 7 Jan 2010 15:06:07 +0000 Subject: [LLVMdev] sqrt Message-ID: <201001071506.07186.jon@ffconsultancy.com> What is the state of sqrt in LLVM? It was an intrinsic but there are no OCaml bindings for it and, last I looked, it generated inefficient code on Linux due to this bug: http://www.llvm.org/PR3219 Is the intrinsic deprecated? Am I losing a lot of performance by calling sqrt from libm instead of using the intrinsic? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From jay.foad at gmail.com Thu Jan 7 08:30:45 2010 From: jay.foad at gmail.com (Jay Foad) Date: Thu, 7 Jan 2010 14:30:45 +0000 Subject: [LLVMdev] "Value has wrong type!" on Bool:4 bitfield Message-ID: I've built a debug build of llvm 2.6, and llvm-gcc 2.6 for arm-elf with --enable-checking=yes. 
On the attached test case (which is g++.dg/expr/bitfield4.C from the GCC 4.2 testsuite) I get: $ cc1plus bitfield4.ii -emit-llvm-bc -o bitfield4.o -quiet cc1plus: /home/foad/svn/antix/toolchain/branches/w/foad/2757llvm26/toolchain/llvm/llvm-gcc/gcc/llvm-convert.cpp:999: llvm::Value* TreeToLLVM::Emit(tree_node*, const MemRef*): Assertion `(Result == 0 || (((enum tree_code) (((exp)->common.type))->common.code) == VOID_TYPE) || isa(ConvertType(((exp)->common.type))) || Result->getType() == ConvertType(((exp)->common.type))) && "Value has wrong type!"' failed. At this point Result is "i8 1". ConvertType(TREE_TYPE(exp)) is i4. And the code generated for the basic block so far is: entry: %retval = alloca i32 ; [#uses=0] %0 = alloca i4 ; [#uses=1] %"alloca point" = bitcast i32 0 to i32 ; [#uses=0] %1 = load i8* getelementptr inbounds (%struct.S* @s, i32 0, i32 0), align 1 ; [#uses=1] %2 = shl i8 %1, 4 ; [#uses=1] %3 = lshr i8 %2, 4 ; [#uses=1] %4 = trunc i8 %3 to i4 ; [#uses=1] store i4 %4, i4* %0, align 1 %5 = load i8* getelementptr inbounds (%struct.S* @s, i32 0, i32 0), align 1 ; [#uses=1] %6 = and i8 %5, -16 ; [#uses=1] %7 = or i8 %6, 1 ; [#uses=1] store i8 %7, i8* getelementptr inbounds (%struct.S* @s, i32 0, i32 0), align 1 Any idea what's going wrong? Thanks, Jay. -------------- next part -------------- A non-text attachment was scrubbed... Name: bitfield4.ii Type: application/octet-stream Size: 284 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100107/c8c2a60d/attachment.obj From jay.foad at gmail.com Thu Jan 7 08:40:42 2010 From: jay.foad at gmail.com (Jay Foad) Date: Thu, 7 Jan 2010 14:40:42 +0000 Subject: [LLVMdev] "Value has wrong type!" 
on Bool:4 bitfield In-Reply-To: References: Message-ID: > cc1plus: /home/foad/svn/antix/toolchain/branches/w/foad/2757llvm26/toolchain/llvm/llvm-gcc/gcc/llvm-convert.cpp:999: > llvm::Value* TreeToLLVM::Emit(tree_node*, const MemRef*): Assertion > `(Result == 0 || (((enum tree_code) > (((exp)->common.type))->common.code) == VOID_TYPE) || > isa(ConvertType(((exp)->common.type))) || > Result->getType() == ConvertType(((exp)->common.type))) && "Value has > wrong type!"' failed. I see that this was addressed here: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090831/086426.html Sorry for the noise again! Jay. From bruno.cardoso at gmail.com Thu Jan 7 09:43:47 2010 From: bruno.cardoso at gmail.com (Bruno Cardoso Lopes) Date: Thu, 7 Jan 2010 13:43:47 -0200 Subject: [LLVMdev] configuring llvm-gcc 2.6 for mips In-Reply-To: References: Message-ID: <275e64e41001070743gbb81ac1h1e843329963735ac@mail.gmail.com> Hi Jay, > /home/foad/toolchain/llvm/llvm-gcc/gcc/llvm-backend.cpp:341:2: error: > #error LLVM_TARGET_NAME macro not specified by GCC backend > make[2]: *** [llvm-backend.o] Error 1 > make[2]: Leaving directory `/home/foad/toolchain/obj/gcc-mips-obj/gcc' > > Am I doing something wrong? This was fixed on trunk: http://llvm.org/viewvc/llvm-project?rev=88860&view=rev Best regards, -- Bruno Cardoso Lopes http://www.brunocardoso.cc From clattner at apple.com Thu Jan 7 11:48:03 2010 From: clattner at apple.com (Chris Lattner) Date: Thu, 7 Jan 2010 09:48:03 -0800 Subject: [LLVMdev] sqrt In-Reply-To: <201001071506.07186.jon@ffconsultancy.com> References: <201001071506.07186.jon@ffconsultancy.com> Message-ID: <871D53C7-949B-4943-9059-CB176115DD4B@apple.com> On Jan 7, 2010, at 7:06 AM, Jon Harrop wrote: > > What is the state of sqrt in LLVM? > > It was an intrinsic but there are no OCaml bindings for it and, last > I looked, > it generated inefficient code on Linux due to this bug: > > http://www.llvm.org/PR3219 > > Is the intrinsic deprecated? 
Am I losing a lot of performance by > calling sqrt > from libm instead of using the intrinsic? There is a fundamental difference between sqrt() and llvm.sqrt: the former is defined on negative values and sets errno (on linux). The latter is undefined. Both work well for their stated purpose, llvm.sqrt should not be slower than sqrt even on linux. Both llvm.sqrt and sqrt could be much better on linux, but no one seems compelled to do the work. -Chris From deeppatel1987 at gmail.com Thu Jan 7 14:28:57 2010 From: deeppatel1987 at gmail.com (Sandeep Patel) Date: Thu, 7 Jan 2010 20:28:57 +0000 Subject: [LLVMdev] sqrt In-Reply-To: <871D53C7-949B-4943-9059-CB176115DD4B@apple.com> References: <201001071506.07186.jon@ffconsultancy.com> <871D53C7-949B-4943-9059-CB176115DD4B@apple.com> Message-ID: <305d6f61001071228r7c6288f2o1c81fe4ee5ad5669@mail.gmail.com> On Thu, Jan 7, 2010 at 5:48 PM, Chris Lattner wrote: > > On Jan 7, 2010, at 7:06 AM, Jon Harrop wrote: > >> >> What is the state of sqrt in LLVM? >> >> It was an intrinsic but there are no OCaml bindings for it and, last >> I looked, >> it generated inefficient code on Linux due to this bug: >> >> http://www.llvm.org/PR3219 >> >> Is the intrinsic deprecated? Am I losing a lot of performance by >> calling sqrt >> from libm instead of using the intrinsic? > > There is a fundamental difference between sqrt() and llvm.sqrt: the > former is defined on negative values and sets errno (on linux). The > latter is undefined. Both work well for their stated purpose, > llvm.sqrt should not be slower than sqrt even on linux. Both > llvm.sqrt and sqrt could be much better on linux, but no one seems > compelled to do the work. Many platforms could also benefit from recognizing 1.0/sqrt() as rsqrt().
deep From dllaurence at dslextreme.com Thu Jan 7 15:28:21 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Thu, 07 Jan 2010 13:28:21 -0800 Subject: [LLVMdev] First-class aggregate semantics Message-ID: <4B4651F5.80108@laurences.net> I think I'm missing something basic about the semantics of returning an aggregate type (in my case, a structure) from a function. Returning a structure containing only compile-time constants is simple enough. But I don't quite get how this works with a struct composed at run-time. If I constructed it on the stack with alloca, would I be letting a stack variable escape to a context where it doesn't exist if I return it? Or does the return semantics guarantee it will be copied (or space allocated in the caller) appropriately? Otherwise I should abandon the idea of returning such a struct and simply pass in a pointer to space allocated in the caller. I think my confusion stems from thinking in terms of high-level languages and not having done nearly enough assembly work to know what LLVM really wants to do, and I'd be grateful for a clue about the idiomatic way to do this. Dustin From dag at cray.com Thu Jan 7 15:38:15 2010 From: dag at cray.com (David Greene) Date: Thu, 7 Jan 2010 15:38:15 -0600 Subject: [LLVMdev] First-class aggregate semantics In-Reply-To: <4B4651F5.80108@laurences.net> References: <4B4651F5.80108@laurences.net> Message-ID: <201001071538.15322.dag@cray.com> On Thursday 07 January 2010 15:28, Dustin Laurence wrote: > I think I'm missing something basic about the semantics of returning an > aggregate type (in my case, a structure) from a function. Returning a > structure containing only compile-time constants is simple enough. But > I don't quite get how this works with a struct composed at run-time. If > I constructed it on the stack with alloca, would I be letting a stack > variable escape to a context where it doesn't exist if I return it?
> Or does the return semantics guarantee it will be copied (or space > allocated in the caller) appropriately? Otherwise I should abandon the > idea of returning such a struct and simply pass in a pointer to space > allocated in the caller. > > I think my confusion stems from thinking in terms of high-level > languages and not having done nearly enough assembly work to know what > LLVM really wants to do, and I'd be grateful for a clue about the > idiomatic way to do this. The way this works on many targets is that the caller allocates stack space in its frame for the returned struct and passes a pointer to it as a first "hidden" argument to the callee. The callee then copies that data into the space pointed to by the address. This is all specified by the ABI so it varies by processor and OS. The target-dependent lowering pieces of LLVM should take care of it. Long-term, first-class status means that returns of structs should "just work" and you don't need to worry about getting a pointer to invalid memory. I believe right now, however, only structs up to a certain size are supported, perhaps because under some ABIs, small structs can be returned in registers and one doesn't need to worry about generating the hidden argument. Someone working directly on this can answer with more authority. -Dave From dllaurence at dslextreme.com Thu Jan 7 15:56:11 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Thu, 07 Jan 2010 13:56:11 -0800 Subject: [LLVMdev] First-class aggregate semantics In-Reply-To: <201001071538.15322.dag@cray.com> References: <4B4651F5.80108@laurences.net> <201001071538.15322.dag@cray.com> Message-ID: <4B46587B.2070808@laurences.net> On 01/07/2010 01:38 PM, David Greene wrote: > The way this works on many targets is that the caller allocates stack > space in its frame for the returned struct and passes a pointer to it > as a first "hidden" argument to the callee. The callee then copies > that data into the space pointed to by the address. 
> Long-term, first-class status means that returns of structs should > "just work" and you don't need to worry about getting a pointer to > invalid memory. OK, so my thought of constructing the object on the stack was correct? What I originally wanted to do was roughly %Token = type {%c_int, %i8*} define %Token @foo() { ... ret %Token {%c_int %token, %i8* %value} } but the compiler complains about the invalid usage of a local name. So I decided the problem was that I was thinking in terms of languages that would create a temporary implicitly, and in IR I need to do it explicitly. So it occurred to me to create the struct on the stack, as I mentioned. What bothers me about that is the explicit specification with alloca that the space is reserved in the callee's frame. Do I just trust the optimizer to eliminate that and turn the reference to alloca'd memory into a reference to the space reserved by the caller? Or is that going to create an unnecessary copy from the alloca'd memory to that reserved by the caller? From what you said my guess is the former (optimizer eliminates the pointless temporary), but us premature optimizers like to be reassured we haven't given up an all-important microsecond. :-) > ...I believe right now, however, only structs up to a > certain size are supported, perhaps because under some ABIs, small > structs can be returned in registers and one doesn't need to worry > about generating the hidden argument. In the case that prompted the question the struct isn't going to be bigger than two of whatever the architecture regards as a word, which surely should be fine, but in principle shouldn't LLVM and not the front-end programmer be making the decision about whether the struct is big enough to spill into memory? 
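For readers following the thread, here is a minimal C sketch of the two conventions being discussed. It assumes a hypothetical Token struct modelled on the %Token type above; the function names make_token and make_token_out are made up for illustration and are not part of LLVM. The first function returns the struct by value and leaves the register-versus-memory decision to the compiler and ABI; the second writes the caller-allocated out-pointer explicitly, which is roughly what a "hidden argument" lowering looks like on targets that return structs in memory.

```c
/* Hypothetical C analogue of the %Token type from the message above. */
typedef struct {
    int kind;
    const char *value;
} Token;

/* Return by value: the compiler and ABI decide whether the result travels
   in registers or through a hidden pointer supplied by the caller. */
static Token make_token(int kind, const char *value)
{
    Token t;
    t.kind = kind;
    t.value = value;
    return t; /* the value is copied out; no pointer to the local escapes */
}

/* The same operation with the out-pointer written by hand: the caller
   owns the storage and the callee only fills it in. */
static void make_token_out(Token *out, int kind, const char *value)
{
    out->kind = kind;
    out->value = value;
}
```

Either form is safe to call; what is not safe is returning the address of a local such as `t`, since that storage belongs to the callee's frame.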
Dustin From kennethuil at gmail.com Thu Jan 7 15:57:44 2010 From: kennethuil at gmail.com (Kenneth Uildriks) Date: Thu, 7 Jan 2010 15:57:44 -0600 Subject: [LLVMdev] First-class aggregate semantics In-Reply-To: <201001071538.15322.dag@cray.com> References: <4B4651F5.80108@laurences.net> <201001071538.15322.dag@cray.com> Message-ID: <400d33ea1001071357r3c3c65e9y119ef62cfeffbc58@mail.gmail.com> On Thu, Jan 7, 2010 at 3:38 PM, David Greene wrote: > On Thursday 07 January 2010 15:28, Dustin Laurence wrote: >> I think I'm missing something basic about the semantics of returning an >> aggregate type (in my case, a structure) from a function. Returning a >> structure containing only compile-time constants is simple enough. But >> I don't quite get how this works with a struct composed at run-time. If >> I constructed it on the stack with alloca, would I be letting a stack >> variable escape to a context where it doesn't exist if I return it? >> Or does the return semantics guarantee it will be copied (or space >> allocated in the caller) appropriately? Otherwise I should abandon the >> idea of returning such a struct and simply pass in a pointer to space >> allocated in the caller. >> >> I think my confusion stems from thinking in terms of high-level >> languages and not having done nearly enough assembly work to know what >> LLVM really wants to do, and I'd be grateful for a clue about the >> idiomatic way to do this. > > The way this works on many targets is that the caller allocates stack > space in its frame for the returned struct and passes a pointer to it > as a first "hidden" argument to the callee. The callee then copies > that data into the space pointed to by the address. > > This is all specified by the ABI so it varies by processor and OS. > The target-dependent lowering pieces of LLVM should take care of it.
> > Long-term, first-class status means that returns of structs should > "just work" and you don't need to worry about getting a pointer to > invalid memory. I believe right now, however, only structs up to a > certain size are supported, perhaps because under some ABIs, small > structs can be returned in registers and one doesn't need to worry > about generating the hidden argument. > > Someone working directly on this can answer with more authority. On x86, the hidden argument is generated automatically at codegen time if it's needed. As far as I know, other platforms don't yet have that support. From jay.foad at gmail.com Thu Jan 7 17:21:23 2010 From: jay.foad at gmail.com (Jay Foad) Date: Thu, 7 Jan 2010 23:21:23 +0000 Subject: [LLVMdev] configuring llvm-gcc 2.6 for mips In-Reply-To: <275e64e41001070743gbb81ac1h1e843329963735ac@mail.gmail.com> References: <275e64e41001070743gbb81ac1h1e843329963735ac@mail.gmail.com> Message-ID: >> /home/foad/toolchain/llvm/llvm-gcc/gcc/llvm-backend.cpp:341:2: error: >> #error LLVM_TARGET_NAME macro not specified by GCC backend >> make[2]: *** [llvm-backend.o] Error 1 >> make[2]: Leaving directory `/home/foad/toolchain/obj/gcc-mips-obj/gcc' >> >> Am I doing something wrong? > > This was fixed on trunk: > http://llvm.org/viewvc/llvm-project?rev=88860&view=rev OK, thanks for the pointer! Jay. From viridia at gmail.com Thu Jan 7 17:52:04 2010 From: viridia at gmail.com (Talin) Date: Thu, 7 Jan 2010 15:52:04 -0800 Subject: [LLVMdev] Two suggestions for improving LLVM error reporting Message-ID: I realize that LLVM assertion results aren't intended to be seen by end-users ever - they are there to tell front-end developers that they screwed up. Even so, I can think of two things that would make it easier for front-end developers to track down what they did wrong without having to jump into the debugger: 1) Be able to set a "default module" for dump().
The version of dump() in llvm::Type takes a module parameter, which allows the type to be printed out using the type names registered in that module. However, the various asserts which call dump() or str() don't have a module pointer handy, which causes the type to be printed out as unrecognizable gobbledygook. If there was some way that we could associate either a default module with the LLVMContext - or perhaps register type names with the LLVMContext directly - and have dump() with no arguments use these name mappings - it would make the output more readable. (I kind of like the idea of having a type name dictionary in the LLVM context which is used whenever there is no module pointer handy. For one thing, I could fill it with more readable names than the ones I use in the module's name table.) 2) In the case where type names are available, never print out type names in the form of "%26". I realize that not all types have names - but at least always expand one level, so we can tell whether it's a struct or a pointer or whatever. "{ %32 }" or "%32*" tells me a lot more than just a number by itself. Got a CreateLoad() call that failed? Well if it prints out "{ %32 }" for the assertion argument you instantly know why, even if you don't know what "%32" resolves to. In fact, what would be even cooler would be to set a threshold for the number of levels that were always expanded. I hate trying to read 400-character-long type names, but expanding two or three levels would be quite useful. -- Talin From robert.a.zeh at gmail.com Thu Jan 7 17:53:11 2010 From: robert.a.zeh at gmail.com (Robert A.
Zeh) Date: Thu, 7 Jan 2010 17:53:11 -0600 Subject: [LLVMdev] sqrt In-Reply-To: <871D53C7-949B-4943-9059-CB176115DD4B@apple.com> References: <201001071506.07186.jon@ffconsultancy.com> <871D53C7-949B-4943-9059-CB176115DD4B@apple.com> Message-ID: <2FAA3426-D7B0-4CEA-BFDC-2B3DAB9A3267@gmail.com> On Jan 7, 2010, at 11:48 AM, Chris Lattner wrote: > There is a fundamental difference between sqrt() and llvm.sqrt: the > former is defined on negative values and sets errno (on linux). The > latter is undefined. Both work well for their stated purpose, > llvm.sqrt should not be slower than sqrt even on linux. Both > llvm.sqrt and sqrt could be much better on linux, but no one seems > compelled to do the work. > > -Chris What exactly is the work for llvm.sqrt on Linux? From clattner at apple.com Thu Jan 7 18:04:36 2010 From: clattner at apple.com (Chris Lattner) Date: Thu, 7 Jan 2010 16:04:36 -0800 Subject: [LLVMdev] sqrt In-Reply-To: <2FAA3426-D7B0-4CEA-BFDC-2B3DAB9A3267@gmail.com> References: <201001071506.07186.jon@ffconsultancy.com> <871D53C7-949B-4943-9059-CB176115DD4B@apple.com> <2FAA3426-D7B0-4CEA-BFDC-2B3DAB9A3267@gmail.com> Message-ID: On Jan 7, 2010, at 3:53 PM, Robert A. Zeh wrote: > > On Jan 7, 2010, at 11:48 AM, Chris Lattner wrote: > >> There is a fundamental difference between sqrt() and llvm.sqrt: the >> former is defined on negative values and sets errno (on linux). The >> latter is undefined. Both work well for their stated purpose, >> llvm.sqrt should not be slower than sqrt even on linux. Both >> llvm.sqrt and sqrt could be much better on linux, but no one seems >> compelled to do the work. >> >> -Chris > > What exactly is the work for llvm.sqrt on Linux? Ah sorry, llvm.sqrt works fine on linux. The issue is that a raw call to sqrt() in a C program doesn't typically compile to llvm.sqrt on linux, because it sets errno. This can be controlled with -fmath-errno.
-Chris From dalej at apple.com Thu Jan 7 18:14:32 2010 From: dalej at apple.com (Dale Johannesen) Date: Thu, 7 Jan 2010 16:14:32 -0800 Subject: [LLVMdev] sqrt In-Reply-To: References: <201001071506.07186.jon@ffconsultancy.com> <871D53C7-949B-4943-9059-CB176115DD4B@apple.com> <2FAA3426-D7B0-4CEA-BFDC-2B3DAB9A3267@gmail.com> Message-ID: <4367C926-465F-4FF3-B53A-AB4AA17F3C25@apple.com> On Jan 7, 2010, at 4:04 PM PST, Chris Lattner wrote: > On Jan 7, 2010, at 3:53 PM, Robert A. Zeh wrote: > On Jan 7, 2010, at 11:48 AM, Chris Lattner wrote: >> >>> There is a fundamental difference between sqrt() and llvm.sqrt: the >>> former is defined on negative values and sets errno (on linux). The >>> latter is undefined. Both work well for their stated purpose, >>> llvm.sqrt should not be slower than sqrt even on linux. Both >>> llvm.sqrt and sqrt could be much better on linux, but no one seems >>> compelled to do the work. >>> >>> -Chris >> >> What exactly is the work for llvm.sqrt on Linux? > > Ah sorry, llvm.sqrt works fine on linux. The issue is that a raw > call to sqrt() in a C program doesn't typically compile to llvm.sqrt > on linux, because it sets errno. This can be controlled with > -fmath-errno. It is more than errno. sqrt() shouldn't compile to llvm.sqrt on any platform that uses IEEE754 math, because IEEE sqrt() is well defined on negative arguments and llvm.sqrt isn't. There are currently no command line arguments to override this. From evan.cheng at apple.com Thu Jan 7 18:47:02 2010 From: evan.cheng at apple.com (Evan Cheng) Date: Thu, 7 Jan 2010 16:47:02 -0800 Subject: [LLVMdev] Removing the constant pool In-Reply-To: References: <1262702491.4903.12.camel@quill-linux.kl.imgtec.org> Message-ID: <24BF6893-1705-4418-BB18-38C325070721@apple.com> Is that really sufficient? See X86ISelLowering.cpp, look for addLegalFPImmediate. Usually targets have to tell the legalizer what fp immediates are legal.
Evan On Jan 5, 2010, at 8:38 AM, Anton Korobeynikov wrote: > Hello > >> I was wondering if it is possible to stop floating-point constants being converted to use the constant pool? As for our back-end we would like to be able to treat floating point constants the same way integer constants are treated instead of having to go via the constant pool. > Yes, surely. Just make ISD::ConstantFP for the given type legal and handle > it during isel. > > -- > With best regards, Anton Korobeynikov > Faculty of Mathematics and Mechanics, Saint Petersburg State University From anton at korobeynikov.info Thu Jan 7 18:49:11 2010 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Fri, 8 Jan 2010 03:49:11 +0300 Subject: [LLVMdev] Removing the constant pool In-Reply-To: <24BF6893-1705-4418-BB18-38C325070721@apple.com> References: <1262702491.4903.12.camel@quill-linux.kl.imgtec.org> <24BF6893-1705-4418-BB18-38C325070721@apple.com> Message-ID: Hello, Evan > Is that really sufficient? See X86ISelLowering.cpp, look for addLegalFPImmediate. Usually targets have to tell the legalizer what fp immediates are legal. Right, because ConstantFPs are normally "Expand". This tells the codegen that "only some are legal".
At least this was how I deduced from the code :) -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From jon at ffconsultancy.com Thu Jan 7 20:40:51 2010 From: jon at ffconsultancy.com (Jon Harrop) Date: Fri, 8 Jan 2010 02:40:51 +0000 Subject: [LLVMdev] First-class aggregate semantics In-Reply-To: <400d33ea1001071357r3c3c65e9y119ef62cfeffbc58@mail.gmail.com> References: <4B4651F5.80108@laurences.net> <201001071538.15322.dag@cray.com> <400d33ea1001071357r3c3c65e9y119ef62cfeffbc58@mail.gmail.com> Message-ID: <201001080240.51469.jon@ffconsultancy.com> On Thursday 07 January 2010 21:57:44 Kenneth Uildriks wrote: > On x86, the hidden argument is generated automatically at codegen time > if it's needed. As far as I know, other platforms don't yet have that > support. You mean to return a struct in sret form? I thought support was recently added to return structs in registers? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From matti.niemenmaa+news at iki.fi Thu Jan 7 06:16:58 2010 From: matti.niemenmaa+news at iki.fi (Matti Niemenmaa) Date: Thu, 07 Jan 2010 14:16:58 +0200 Subject: [LLVMdev] LLVM{Add,Remove}FunctionAttr totally broken In-Reply-To: <13363F03-8968-451A-982A-D6A72B848EFF@fuhm.net> References: <13363F03-8968-451A-982A-D6A72B848EFF@fuhm.net> Message-ID: <4B45D0BA.30506@iki.fi> On 2009-12-30 01:16, James Y Knight wrote: > The LLVMAddFunctionAttr and LLVMRemoveFunctionAttr are busted: they > actually set the return value's attributes, not the function's > attributes. There seems to be no C API for actually setting the > function attributes. > > LLVMGetFunctionAttr, however, does correctly return the function > attributes, not the return value's attributes. There is no C API for > getting the return value attributes. (And if the above functions are > fixed, there would also be no way to set return value attributes). 
> > I'd like to propose that LLVM{Add,Remove}FunctionAttr be fixed to > actually set the function attributes. And that a new API, > LLVM{Add,Remove,Get}RetAttr be added, like: > > void LLVMAddRetAttr(LLVMValueRef Fn, LLVMAttribute PA); > void LLVMRemoveRetAttr(LLVMValueRef Fn, LLVMAttribute PA); > LLVMAttribute LLVMGetRetAttr(LLVMValueRef Fn); > > which will do the associated actions on the return value. > > If this is acceptable, I can submit a trivial patch that implements it. Given that they are indeed completely broken, I suggest filing a bug in Bugzilla to make sure this gets fixed. Remember to attach your patch! From jon at ffconsultancy.com Thu Jan 7 20:52:00 2010 From: jon at ffconsultancy.com (Jon Harrop) Date: Fri, 8 Jan 2010 02:52:00 +0000 Subject: [LLVMdev] First-class aggregate semantics In-Reply-To: <4B46587B.2070808@laurences.net> References: <4B4651F5.80108@laurences.net> <201001071538.15322.dag@cray.com> <4B46587B.2070808@laurences.net> Message-ID: <201001080252.00201.jon@ffconsultancy.com> On Thursday 07 January 2010 21:56:11 Dustin Laurence wrote: > On 01/07/2010 01:38 PM, David Greene wrote: > > The way this works on many targets is that the caller allocates stack > > space in its frame for the returned struct and passes a pointer to it > > as a first "hidden" argument to the callee. The callee then copies > > that data into the space pointed to by the address. > > > > > Long-term, first-class status means that returns of structs should > > "just work" and you don't need to worry about getting a pointer to > > invalid memory. > > OK, so my thought of constructing the object on the stack was correct? No. The idea is that you pass the structs around as values and not that you alloca them and pass by reference/pointer. > What bothers me about that is the explicit specification with alloca > that the space is reserved in the callee's frame. Yes. Don't do that.
> Do I just trust the > optimizer to eliminate that and turn the reference to alloca'd memory > into a reference to the space reserved by the caller? No. LLVM is trusting you not to return pointers to locals. > Or is that going > to create an unnecessary copy from the alloca'd memory to that reserved > by the caller? From what you said my guess is the former (optimizer > eliminates the pointless temporary), but us premature optimizers like to > be reassured we haven't given up an all-important microsecond. :-) I have had great success with my HLVM project by passing around large numbers of large structs by hand. LLVM has not only survived but actually generated decent code that beats most languages according to my benchmarks. In particular, HLVM uses "fat" quadword references (where word = sizeof(void*)) that are passed everywhere by value except when a struct is returned and HLVM gets the caller to alloca and passes that space by pointer to the callee for it to fill in. > > ...I believe right now, however, only structs up to a > > certain size are supported, perhaps because under some ABIs, small > > structs can be returned in registers and one doesn't need to worry > > about generating the hidden argument. > > In the case that prompted the question the struct isn't going to be > bigger than two of whatever the architecture regards as a word, which > surely should be fine, but in principle shouldn't LLVM and not the > front-end programmer be making the decision about whether the struct is > big enough to spill into memory? Good question. There was a very interesting discussion about this here a while ago and everyone coming to LLVM says the same thing: why doesn't LLVM just handle this for me automatically? The answer is that LLVM cannot make that decision because it depends upon the ABI. C99 apparently returns user-defined structs of two doubles by reference but complex numbers in registers. 
So the ABI requires knowledge of the front-end and, therefore, LLVM cannot fully automate this. Something LLVM could do is spill safely when it knows you don't care about the foreign ABI (e.g. with fastcc) and that work is underway. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From arplynn at gmail.com Thu Jan 7 20:03:32 2010 From: arplynn at gmail.com (Alastair Lynn) Date: Fri, 8 Jan 2010 02:03:32 +0000 Subject: [LLVMdev] First-class aggregate semantics In-Reply-To: <4B46587B.2070808@laurences.net> References: <4B4651F5.80108@laurences.net> <201001071538.15322.dag@cray.com> <4B46587B.2070808@laurences.net> Message-ID: <3F8857D1-2996-4268-AB77-A9B1C83311C2@gmail.com> Hi Dustin- You'll probably need to use insertvalue to construct your return value. Alastair On 7 Jan 2010, at 21:56, Dustin Laurence wrote: > define %Token @foo() > { > ... > > ret %Token {%c_int %token, %i8* %value} > } From dllaurence at dslextreme.com Thu Jan 7 21:48:24 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Thu, 07 Jan 2010 19:48:24 -0800 Subject: [LLVMdev] First-class aggregate semantics In-Reply-To: <3F8857D1-2996-4268-AB77-A9B1C83311C2@gmail.com> References: <4B4651F5.80108@laurences.net> <201001071538.15322.dag@cray.com> <4B46587B.2070808@laurences.net> <3F8857D1-2996-4268-AB77-A9B1C83311C2@gmail.com> Message-ID: <4B46AB08.7050206@laurences.net> On 01/07/2010 06:03 PM, Alastair Lynn wrote: > > You'll probably need to use insertvalue to construct your return value. Ah ha! The fact is I didn't really understand the significance of this part when I read it, and so didn't remember it when I needed it. OK, so I have tested it and I can now build up a struct like this %s1 = insertvalue {i32, i32} {i32 0, i32 0}, i32 1, 0 ; s1 = {1,0} %s2 = insertvalue {i32, i32} %s1, i32 2, 1 ; %s2 == {1,2} which reminds me of another thing I never understood. 
I can't make my code (slightly) more readable by changing that to something like %s0 = {i32 0, i32 0} %s1 = insertvalue {i32, i32} %s0, i32 1, 0 ; s1 = {1,0} %s2 = insertvalue {i32, i32} %s1, i32 2, 1 ; %s2 == {1,2} because LLVM will complain that it "expected instruction opcode" at the assignment to %s0. If there is a general way to give names to constants in that way I didn't find it. In fact, I think I tended not to use temporaries like I would variables precisely because when I tried the second alternative as the natural way to hand-code it and it didn't work, I didn't think how to phrase it so only the results of operations get named. Help me understand the underlying logic--why can one only name the results of operations? I realize that the local temporaries are notionally register variables for a machine with an infinite number of registers, but my very dim memory of real assembly was that I not only could load constants into registers but had to do so. What part of the picture am I missing here? You need an IR tutorial. Or, to speak correctly, *I* need a tutorial. :-) But I'm learning.... Dustin From dllaurence at dslextreme.com Thu Jan 7 21:57:17 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Thu, 07 Jan 2010 19:57:17 -0800 Subject: [LLVMdev] First-class aggregate semantics In-Reply-To: <201001080252.00201.jon@ffconsultancy.com> References: <4B4651F5.80108@laurences.net> <201001071538.15322.dag@cray.com> <4B46587B.2070808@laurences.net> <201001080252.00201.jon@ffconsultancy.com> Message-ID: <4B46AD1D.8080002@laurences.net> On 01/07/2010 06:52 PM, Jon Harrop wrote: > No. The idea is that you pass the structs around as values and not that you > alloca them and pass by reference/pointer. OK, then I need to learn more syntax (which Alistair Lynn got me started on, it appears :-). > No. LLVM is trusting you not to return pointers to locals. How naive. 
:-) > I have had great success with my HLVM project by passing around large numbers > of large structs by hand. LLVM has not only survived but actually generated > decent code that beats most languages according to my benchmarks. That's good to know, because I prefer the style of returning structs rather than passing around pointers or using static data. I'll be happy to convert my lexer over to returning structs instead of pulling lex-style tricks. > ...LLVM cannot make that > decision because it depends upon the ABI. C99 apparently returns user-defined > structs of two doubles by reference but complex numbers in registers. So the > ABI requires knowledge of the front-end and, therefore, LLVM cannot fully > automate this. Huh. I'd never have guessed (and would have been quite annoyed if I had, since numerical code is often at the edge of whatever the computing budget is (meaning the problem was the largest one the researcher could afford to solve, not the one he wished he was solving). > Something LLVM could do is spill safely when it knows you don't care about the > foreign ABI (e.g. with fastcc) and that work is underway. Dustin From baldrick at free.fr Fri Jan 8 00:24:53 2010 From: baldrick at free.fr (Duncan Sands) Date: Fri, 08 Jan 2010 07:24:53 +0100 Subject: [LLVMdev] First-class aggregate semantics In-Reply-To: <4B4651F5.80108@laurences.net> References: <4B4651F5.80108@laurences.net> Message-ID: <4B46CFB5.10505@free.fr> Hi Dustin, > I think I'm missing something basic about the semantics of returning an > aggregate type (in my case, a structure) from a function. Returning a > structure containing only compile-time constants is simple enough. But > I don't quite get how this works with a struct composed at run-time. If > I constructed it on the stack with alloca, would I be letting a stack > variable escape to to a context where it doesn't exist if I return it? 
first class aggregates are basically implemented by sticking the value of each struct field in a register. For example, returning a struct with two i32 fields amounts to placing each of the fields in a machine register (eg: EAX, ECX) then returning from the function. The caller gets hold of the values by reading EAX and ECX. Note that it doesn't return a pointer to the struct, it returns the value of the struct. Suppose you have stored the struct in an alloca. Then to get it as a first class aggregate, you first need to do a load of the alloca (this results in an LLVM register of aggregate type, equivalent to two machine registers of type i32), then return the loaded value. At the machine code level, this corresponds to loading the first field from the stack into EAX, loading the second field from the stack into EDX then returning. > Or does the return semantics guarantee it will be copied (or space > allocated in the caller) appropriately? Otherwise I should abandon the > idea of returning such a struct and simply pass in a pointer to space > allocated in the caller. If the struct is so big that there aren't enough machine registers available to return it, then the code generator will automagically allocate some stack space in the caller, pass a pointer to it into the called function, and have the callee return the struct by copying it into the passed stack space [*]. Ciao, Duncan. [*] This functionality was implemented after LLVM 2.6 was released. In LLVM 2.6 the code generator will crash if it runs out of registers. In this case you should pass in a pointer by hand. 
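As a rough illustration of the two styles discussed in this thread (returning the aggregate by value versus having the caller allocate space and pass a pointer), here is a minimal C++ sketch. The Token struct and the lex_by_value / lex_into names are hypothetical, chosen only to mirror the %Token example quoted earlier; how the by-value return is actually lowered (registers or a hidden pointer) depends on the target ABI and is not something this code controls.

```cpp
// Hypothetical two-field token, mirroring the {int, pointer} %Token struct
// from the thread. Nothing here is LLVM API.
struct Token {
    int kind;          // assumed meaning: 0 = end of input, 1 = something to lex
    const char *text;  // points into the caller's buffer, never at a local
};

// Style 1: return the aggregate by value. No pointer to a local escapes;
// the caller receives a copy of both fields.
Token lex_by_value(const char *input) {
    Token t;
    t.kind = (input[0] != '\0') ? 1 : 0;
    t.text = input;
    return t;
}

// Style 2: the caller owns the storage and the callee fills it in.
// This is the "pass in a pointer by hand" fallback mentioned above.
void lex_into(const char *input, Token *out) {
    out->kind = (input[0] != '\0') ? 1 : 0;
    out->text = input;
}
```

Both functions produce the same Token; the difference is only who provides the storage. The pattern to avoid is a third one not shown here: returning the address of a Token that lives in the callee's own frame.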
From baldrick at free.fr Fri Jan 8 00:46:05 2010 From: baldrick at free.fr (Duncan Sands) Date: Fri, 08 Jan 2010 07:46:05 +0100 Subject: [LLVMdev] sqrt In-Reply-To: <4367C926-465F-4FF3-B53A-AB4AA17F3C25@apple.com> References: <201001071506.07186.jon@ffconsultancy.com> <871D53C7-949B-4943-9059-CB176115DD4B@apple.com> <2FAA3426-D7B0-4CEA-BFDC-2B3DAB9A3267@gmail.com> <4367C926-465F-4FF3-B53A-AB4AA17F3C25@apple.com> Message-ID: <4B46D4AD.5000504@free.fr> Hi Dale, > It is more than errno. sqrt() should't compile to llvm.sqrt on any > platform that uses IEEE754 math, because IEEE sqrt() is well defined > on negative arguments and llvm.sqrt isn't. There are currently no > command line arguments to override this. -funsafe-math-optimizations? Ciao, Duncan. From baldrick at free.fr Fri Jan 8 00:55:03 2010 From: baldrick at free.fr (Duncan Sands) Date: Fri, 08 Jan 2010 07:55:03 +0100 Subject: [LLVMdev] First-class aggregate semantics In-Reply-To: <201001080252.00201.jon@ffconsultancy.com> References: <4B4651F5.80108@laurences.net> <201001071538.15322.dag@cray.com> <4B46587B.2070808@laurences.net> <201001080252.00201.jon@ffconsultancy.com> Message-ID: <4B46D6C7.2090700@free.fr> Hi Jon, >> In the case that prompted the question the struct isn't going to be >> bigger than two of whatever the architecture regards as a word, which >> surely should be fine, but in principle shouldn't LLVM and not the >> front-end programmer be making the decision about whether the struct is >> big enough to spill into memory? > > Good question. There was a very interesting discussion about this here a while > ago and everyone coming to LLVM says the same thing: why doesn't LLVM just > handle this for me automatically? The answer is that LLVM cannot make that > decision because it depends upon the ABI. actually LLVM does now handle this for you automatically: if there aren't enough registers to return the first class aggregate in registers, then it is automagically returned on the stack. 
If this is not ABI conformant then it is up to the front-end to not generate IR that returns such large structs. In practice front-ends only generate functions returning first class aggregates when the ABI says the aggregate should be entirely returned in registers. Thus by definition it is sure not to require more registers than the machine has! This is why the enhancement to automagically use the stack if there aren't enough registers has no impact on ABI conformance. Ciao, Duncan. From christian.plessl at uni-paderborn.de Fri Jan 8 02:56:55 2010 From: christian.plessl at uni-paderborn.de (Christian Plessl) Date: Fri, 8 Jan 2010 09:56:55 +0100 Subject: [LLVMdev] lli segfaults when using JIT in LLVM 2.6 on OS X 10.6/x86 Message-ID: Hi all I'm currently porting some code to LLVM 2.6 and I stumbled over a weird problem with using JIT compilation in lli on Mac OS X 10.6.2. When running lli without any specific command line options it crashes with a segfault. When I specify "-force-interpreter" or "-march=x86-64", everything works as expected. I can reproduce the problem as follows: // hello.c int main(int argc, char *argv[]){ return 42; } llvm-gcc -emit-llvm -c hello.c -o hello.bc lli hello.bc => segfaults lli -march=x86-64 hello.bc => works lli -force-interpreter hello.bc => works I'm running Mac OS X 10.6.2 on a MacBook Pro (Intel Core2 Duo). I have built LLVM from the released sources of version 2.6 using the CMake build system with the following configuration: cmake .. -DCMAKE_BUILD_TYPE:STRING=Debug -DCMAKE_INSTALL_PREFIX:PATH=$HOME/opt/llvm Does anybody have an idea what causes this problem and how to solve it? 
Best regards, Christian From etherzhhb at gmail.com Fri Jan 8 07:16:34 2010 From: etherzhhb at gmail.com (ether) Date: Fri, 08 Jan 2010 21:16:34 +0800 Subject: [LLVMdev] integrate LLVM Poly into existing LLVM infrastructure In-Reply-To: <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> Message-ID: <4B473032.5080803@gmail.com> hi all, On 2010-1-7 0:11, John Mosby wrote: > In LLVM we could add support for generalized CFG regions and > RegionPasses. A region is a part of the CFG. The only information we > have is, that it has one entry and one exit, this it can be optimized > separately. > I think this is the best way to add region analysis. I must admit > this approach > helps me on another, similar project I'm working on in parallel (no > pun intended). > Tobias, is this how you are architecting your region analysis already? > > John > i just implementing the skeleton of Region/RegionInfo like LoopBase and LoopInfoBase[1] in the llvm existing codes, and found that theres lots of common between "Region" and "Loop": 1. both of them are consist of several BasicBlocks 2. both of them have some kind of nested structures, so both a loop and a region could have parent or childrens 3. both of them have a BasicBlocks(header of a loop and "entry" of a region) that dominates all others and the Region class will have the most stuffs very similar in LoopBase, like: ParentRegion, SubRegions, Blocks, getRegionDepth(), getExitBlock(), getExitingBlock() ...... so, could us just treat "Loop" as some kind of general "Region" of BasicBlocks, and make Loop and Region inherit from "RegionBase"? 
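The overlap listed above could be captured in a shared base class. The following is only a sketch in plain C++ under stated assumptions: Block is a stand-in for llvm::BasicBlock, and RegionBase, Region, Loop and their members are hypothetical names, not the actual LoopBase/LoopInfoBase interface.

```cpp
#include <vector>

// Hypothetical stand-in for llvm::BasicBlock; only identity matters here.
struct Block {
    int id;
};

// Hypothetical common base: a set of blocks with one dominating entry block,
// arranged in a tree of parents and children.
class RegionBase {
public:
    explicit RegionBase(Block *entry) : Entry(entry), Parent(nullptr) {
        Blocks.push_back(entry);
    }
    virtual ~RegionBase() {}

    Block *getEntry() const { return Entry; }
    RegionBase *getParent() const { return Parent; }
    const std::vector<Block *> &getBlocks() const { return Blocks; }
    const std::vector<RegionBase *> &getSubRegions() const { return SubRegions; }

    void addBlock(Block *b) { Blocks.push_back(b); }

    // Nest child inside this region and record the parent link.
    void addSubRegion(RegionBase *child) {
        child->Parent = this;
        SubRegions.push_back(child);
    }

    // Nesting depth; an outermost region has depth 1.
    unsigned getDepth() const {
        unsigned depth = 1;
        for (const RegionBase *p = Parent; p != nullptr; p = p->Parent)
            ++depth;
        return depth;
    }

private:
    Block *Entry;
    RegionBase *Parent;
    std::vector<Block *> Blocks;
    std::vector<RegionBase *> SubRegions;
};

// A general single-entry region adds nothing in this sketch.
class Region : public RegionBase {
public:
    explicit Region(Block *entry) : RegionBase(entry) {}
};

// A loop is treated as a region whose entry block is the loop header.
class Loop : public RegionBase {
public:
    explicit Loop(Block *header) : RegionBase(header) {}
    Block *getHeader() const { return getEntry(); }
};
```

Whether a loop really satisfies every property a region analysis needs (for example a single exit) is exactly the open question in this thread; the sketch only shows that the bookkeeping can be shared.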
[1] http://llvm.org/doxygen/LoopInfo_8h-source.html best regards --ether From etherzhhb at gmail.com Fri Jan 8 07:20:24 2010 From: etherzhhb at gmail.com (ether) Date: Fri, 08 Jan 2010 21:20:24 +0800 Subject: [LLVMdev] Make LoopBase inherit from "RegionBase"? In-Reply-To: <4B473032.5080803@gmail.com> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> <4B473032.5080803@gmail.com> Message-ID: <4B473118.4090104@gmail.com> sorry that i forgot to change the subjuect hi all, On 2010-1-7 0:11, John Mosby wrote: > In LLVM we could add support for generalized CFG regions and > RegionPasses. A region is a part of the CFG. The only information we > have is, that it has one entry and one exit, this it can be optimized > separately. > I think this is the best way to add region analysis. I must admit this > approach > helps me on another, similar project I'm working on in parallel (no > pun intended). > Tobias, is this how you are architecting your region analysis already? > > John > i just implementing the skeleton of Region/RegionInfo like LoopBase and LoopInfoBase[1] in the llvm existing codes, and found that theres lots of common between "Region" and "Loop": 1. both of them are consist of several BasicBlocks 2. both of them have some kind of nested structures, so both a loop and a region could have parent or childrens 3. both of them have a BasicBlocks(header of a loop and "entry" of a region) that dominates all others and the Region class will have the most stuffs very similar in LoopBase, like: ParentRegion, SubRegions, Blocks, getRegionDepth(), getExitBlock(), getExitingBlock() ...... 
so, could us just treat "Loop" as some kind of general "Region" of BasicBlocks, and make Loop and Region inherit from "RegionBase"? [1] http://llvm.org/doxygen/LoopInfo_8h-source.html best regards --ether From gvenn.cfe.dev at gmail.com Fri Jan 8 13:12:16 2010 From: gvenn.cfe.dev at gmail.com (Garrison Venn) Date: Fri, 8 Jan 2010 14:12:16 -0500 Subject: [LLVMdev] Exception Implementation Example added to Wiki Message-ID: <7F239E64-D3F4-4113-B670-489DC29FEFC6@gmail.com> I just added an exception example to the wiki intended to be run in a JIT environment. Although this information is heavily date dependent, as the LLVM exception subsystem will be modified as time goes on, I could have used such an example when I was looking into this; hence the submission. Having said this, I'm not an LLVM expert, and even though the code works, I'm sure there are omissions and inaccuracies, so if the experts have the time ... Hopefully some members of the community will find this beneficial Garrison PS: Is there a better to upload source to the wiki than merely pasting it in? Only image file types seemed to be allowed for upload. From grosser at fim.uni-passau.de Fri Jan 8 13:27:58 2010 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Fri, 08 Jan 2010 20:27:58 +0100 Subject: [LLVMdev] Make LoopBase inherit from "RegionBase"? In-Reply-To: <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> <4B473032.5080803@gmail.com> <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> Message-ID: <4B47873E.6000409@fim.uni-passau.de> On 01/08/10 14:20, ether wrote: > sorry that i forgot to change the subjuect Hi ether, sounds interesting. 
Actually, a loop is (or may be) some kind of region. If you want, you can have a look at the analysis that I wrote. It is not yet finished, not completely documented, and still a work in progress. However, the first big comment might be interesting for you, as might the results of opt -regions -analyze. The git repo is here: http://repo.or.cz/w/llvm-complete/tobias-sandbox.git/shortlog/refs/heads/region I will think about this and maybe reply again. Tobi > hi all, > > On 2010-1-7 0:11, John Mosby wrote: >> In LLVM we could add support for generalized CFG regions and >> RegionPasses. A region is a part of the CFG. The only information we >> have is, that it has one entry and one exit, this it can be optimized >> separately. >> I think this is the best way to add region analysis. I must admit this >> approach >> helps me on another, similar project I'm working on in parallel (no >> pun intended). >> Tobias, is this how you are architecting your region analysis already? >> >> John >> > > i just implementing the skeleton of Region/RegionInfo like LoopBase and > LoopInfoBase[1] in the llvm existing codes, and found that theres lots > of common between "Region" and "Loop": > > 1. both of them are consist of several BasicBlocks > 2. both of them have some kind of nested structures, so both a loop and > a region could have parent or childrens > 3. both of them have a BasicBlocks(header of a loop and "entry" of a > region) that dominates all others > > and the Region class will have the most stuffs very similar in LoopBase, > like: ParentRegion, SubRegions, Blocks, getRegionDepth(), > getExitBlock(), getExitingBlock() ...... > > so, could us just treat "Loop" as some kind of general "Region" of > BasicBlocks, and make Loop and Region inherit from "RegionBase"?
> > > [1] http://llvm.org/doxygen/LoopInfo_8h-source.html > > best regards > --ether From dllaurence at dslextreme.com Fri Jan 8 15:52:45 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Fri, 08 Jan 2010 13:52:45 -0800 Subject: [LLVMdev] Inlining Message-ID: <4B47A92D.4020403@laurences.net> OK, I wanted to understand function inlining in LLVM but had avoided going to the effort of finding out if the inlining was really happening. The advice I got to "use the assembly source, Luke" suggested I go ahead and investigate inlining for a bit of practice, since (so I figured) even a monkey with really weak x86-fu could tell whether a function call was happening or not. If this monkey can tell, it isn't happening. :-) I'll try to provide all useful information. For my null test, I attempted to specify no inlining in a little program that computes a Very Important Number :-) : --- define fastcc i32 @foo(i32 %arg) noinline { %result = mul i32 %arg, 7 ret i32 %result } define i32 @main(i32 %argc, i8 **%argv) { %retVal = call fastcc i32 @foo(i32 6) noinline ret i32 %retVal } --- and after my Makefile executed the following commands: gemini:~/Projects/Nil/nil(0)$ make testInline.s testInline llvm-as testInline.ll llc -O0 -f testInline.bc cc testInline.s -o testInline rm testInline.bc gemini:~/Projects/Nil/nil(0)$ we can compute that Very Important Number gemini:~/Projects/Nil/nil(0)$ ./testInline ; echo $? 42 gemini:~/Projects/Nil/nil(0)$ and the generated assembly (with much red tape snipped for now): --- .file "testInline.bc" .text .align 16 .globl foo .type foo,@function foo: # @foo .Leh_func_begin1: .LBB1_0: imull $7, %edi, %eax ret .size foo, .-foo .Leh_func_end1: .align 16 .globl main .type main,@function main: # @main .Leh_func_begin2: .LBB2_0: subq $8, %rsp .Llabel1: movl $6, %edi call foo addq $8, %rsp ret .size main, .-main .Leh_func_end2: --- Even this monkey (thinks he) can see the constant 6 being passed to foo in %edi. So far so good.
Now I tried to get it to inline, without much luck. Putting together everything I tried into one test, I changed 'noinline' to 'alwaysinline' (and changing the linkage, as I gather that would be appropriate for multiple files) --- ; testInline.ll -- test code for inlining. define linkonce fastcc i32 @foo(i32 %arg) alwaysinline { %result = mul i32 %arg, 7 ret i32 %result } define i32 @main(i32 %argc, i8 **%argv) { %retVal = call fastcc i32 @foo(i32 6) alwaysinline ret i32 %retVal } --- and bumped up the optimization level to O3: rm -f nil c_defs c_defs.llh *.bc *.s *.o testInline # *.ll gemini:~/Projects/Nil/nil(0)$ make testInline.s testInline llvm-as testInline.ll llc -O3 -f testInline.bc cc testInline.s -o testInline rm testInline.bc gemini:~/Projects/Nil/nil(0)$ which generates --- .file "testInline.bc" .section .gnu.linkonce.t.foo,"ax",@progbits .align 16 .weak foo .type foo,@function foo: # @foo .Leh_func_begin1: .LBB1_0: imull $7, %edi, %eax ret .size foo, .-foo .Leh_func_end1: .text .align 16 .globl main .type main,@function main: # @main .Leh_func_begin2: .LBB2_0: subq $8, %rsp .Llabel1: movl $6, %edi call foo addq $8, %rsp ret .size main, .-main .Leh_func_end2: --- Which only differs in putting foo in a linkonce section instead of in .text and in specifying (what I think is) .weak linkage instead of .globl, so apparently the multiplication was not inlined. There are no other differences in the snipped red-tape, I checked with diff. What did monkey do wrong?
Dustin From kennethuil at gmail.com Fri Jan 8 16:05:21 2010 From: kennethuil at gmail.com (Kenneth Uildriks) Date: Fri, 8 Jan 2010 16:05:21 -0600 Subject: [LLVMdev] First-class aggregate semantics In-Reply-To: <4B46CFB5.10505@free.fr> References: <4B4651F5.80108@laurences.net> <4B46CFB5.10505@free.fr> Message-ID: <400d33ea1001081405j14c63d06o513ed420d8939880@mail.gmail.com> On Fri, Jan 8, 2010 at 12:24 AM, Duncan Sands wrote: > Hi Dustin, > >> I think I'm missing something basic about the semantics of returning an >> aggregate type (in my case, a structure) from a function. Returning a >> structure containing only compile-time constants is simple enough. But >> I don't quite get how this works with a struct composed at run-time. If >> I constructed it on the stack with alloca, would I be letting a stack >> variable escape to to a context where it doesn't exist if I return it? > > first class aggregates are basically implemented by sticking the value of each > struct field in a register. For example, returning a struct with two i32 > fields amounts to placing each of the fields in a machine register (eg: EAX, > ECX) then returning from the function. The caller gets hold of the values > by reading EAX and ECX. Note that it doesn't return a pointer to the struct, > it returns the value of the struct. Suppose you have stored the struct in > an alloca. Then to get it as a first class aggregate, you first need to do > a load of the alloca (this results in an LLVM register of aggregate type, > equivalent to two machine registers of type i32), then return the loaded value. > At the machine code level, this corresponds to loading the first field from > the stack into EAX, loading the second field from the stack into EDX then > returning. > >> Or does the return semantics guarantee it will be copied (or space >> allocated in the caller) appropriately?
Otherwise I should abandon the >> idea of returning such a struct and simply pass in a pointer to space >> allocated in the caller. > > If the struct is so big that there aren't enough machine registers available > to return it, then the code generator will automagically allocate some stack > space in the caller, pass a pointer to it into the called function, and have > the callee return the struct by copying it into the passed stack space [*]. There are small target hooks that need to be implemented for each target to get this to work. As far as I know, the only hook implemented was for x86. From rjmccall at apple.com Fri Jan 8 16:10:28 2010 From: rjmccall at apple.com (John McCall) Date: Fri, 8 Jan 2010 14:10:28 -0800 Subject: [LLVMdev] Inlining In-Reply-To: <4B47A92D.4020403@laurences.net> References: <4B47A92D.4020403@laurences.net> Message-ID: <6C4B3E2E-E084-4BB3-A0E1-27CDB3DE72BE@apple.com> On Jan 8, 2010, at 1:52 PM, Dustin Laurence wrote: > gemini:~/Projects/Nil/nil(0)$ make testInline.s testInline > llvm-as testInline.ll > llc -O3 -f testInline.bc 'llc' is an IR-to-assembly compiler; at -O3 it does some pretty neat machine-code and object-file optimizations, but it does not apply high-level optimizations like CSE or inlining. 'opt' is the tool which does IR-to-IR optimization. John. From viridia at gmail.com Fri Jan 8 16:14:26 2010 From: viridia at gmail.com (Talin) Date: Fri, 8 Jan 2010 14:14:26 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: Message-ID: Anyone want to take a look at this and tell me if I am on a reasonable path? On Wed, Jan 6, 2010 at 12:45 PM, Talin wrote: > This patch adds a UnionType to DerivedTypes.h. It also adds code to the > bitcode reader / writer and the assembly parser for the new type, as well as > a tiny .ll test file in test/Assembler. It does not contain any code related > to code generation or type layout - I wanted to see if this much was > acceptable before I proceeded any further.
> > Unlike my previous patch, in which the Union type was implemented as a > packing option to type Struct (thereby re-using the machinery of Struct), > this patch defines Union as a completely separate type from Struct. > > I was a little uncertain as to how to write the tests. I'd particularly > like to write tests for the bitcode reader/writer stuff, but I am not sure > how to do that. > > -- > -- Talin > -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100108/bbd61984/attachment.html From dllaurence at dslextreme.com Fri Jan 8 17:08:42 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Fri, 08 Jan 2010 15:08:42 -0800 Subject: [LLVMdev] Inlining In-Reply-To: <6C4B3E2E-E084-4BB3-A0E1-27CDB3DE72BE@apple.com> References: <4B47A92D.4020403@laurences.net> <6C4B3E2E-E084-4BB3-A0E1-27CDB3DE72BE@apple.com> Message-ID: <4B47BAFA.50605@laurences.net> On 01/08/2010 02:10 PM, John McCall wrote: > 'llc' is an IR-to-assembly compiler; at -O3 it does some pretty neat > machine-code and object-file optimizations, but it does not apply > high-level optimizations like CSE or inlining. 'opt' is the tool > which does IR-to-IR optimization. A vital clue, but I'm still not getting it: --- gemini:~/Projects/Nil/nil(0)$ make testInline.optdis.ll llvm-as testInline.ll opt -always-inline testInline.bc -o testInline.optbc llvm-dis -f testInline.optbc -o testInline.optdis.ll rm testInline.bc testInline.optbc gemini:~/Projects/Nil/nil(0)$ cat testInline.optdis.ll ; ModuleID = 'testInline.optbc' define linkonce fastcc i32 @foo(i32 %arg) alwaysinline { %result = mul i32 %arg, 7 ; [#uses=1] ret i32 %result } define i32 @main(i32 %argc, i8** %argv) { %retVal = call fastcc i32 @foo(i32 6) alwaysinline ; [#uses=1] ret i32 %retVal } gemini:~/Projects/Nil/nil(0)$ --- Perhaps the -always-inline pass has a prerequisite pass? 
I also tried it with "-O3 -always-inline", which got halfway there: --- ; ModuleID = 'testInline.optbc' define linkonce fastcc i32 @foo(i32 %arg) alwaysinline { %result = mul i32 %arg, 7 ; [#uses=1] ret i32 %result } define i32 @main(i32 %argc, i8** nocapture %argv) { %retVal = tail call fastcc i32 @foo(i32 6) alwaysinline ; [#uses=1] ret i32 %retVal } --- I'm pleased to get the tailcall optimization, but in this case was looking for the 'no call at all' optimization. :-) Dustin From aaronngray.lists at googlemail.com Fri Jan 8 17:10:57 2010 From: aaronngray.lists at googlemail.com (Aaron Gray) Date: Fri, 8 Jan 2010 23:10:57 +0000 Subject: [LLVMdev] Cygwin llvm-gcc regression Message-ID: <9719867c1001081510t75df73aek2b4b4e14829150fe@mail.gmail.com> I am getting an assertion firing while building TOT llvm-gcc on Cygwin in libgcc2.c :-
/home/ang/build/llvm-gcc/./gcc/xgcc -B/home/ang/build/llvm-gcc/./gcc/ -B/home/ang/llvm-gcc/i686-pc-cygwin/bin/ -B/home/ang/llvm-gcc/i686-pc-cygwin/lib/ -isystem /home/ang/llvm-gcc/i686-pc-cygwin/include -isystem /home/ang/llvm-gcc/i686-pc-cygwin/sys-include -O2 -I/home/ang/svn/llvm-gcc/gcc/../winsup/w32api/include -O2 -g -O2 -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -g -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -I. -I. -I/home/ang/svn/llvm-gcc/gcc -I/home/ang/svn/llvm-gcc/gcc/. -I/home/ang/svn/llvm-gcc/gcc/../include -I/home/ang/svn/llvm-gcc/gcc/../libcpp/include -I/home/ang/svn/llvm-gcc/gcc/../libdecnumber -I../libdecnumber -I/home/ang/build/llvm/include -I/home/ang/svn/llvm/include -DL_powixf2 -c /home/ang/svn/llvm-gcc/gcc/libgcc2.c -o libgcc/./_powixf2.o assertion "(!TYPE_SIZE(Tr) || !Ty->isSized() || !isInt64(TYPE_SIZE(Tr), true) || getInt64(TYPE_SIZE(Tr), true) == getTargetData().getTypeAllocSizeInBits(Ty)) && "LLVM type size doesn't match GCC type size!"" failed: file "/home/ang/svn/llvm-gcc/gcc/llvm-types.cpp", line 83 This assertion has not changed from the past and am not sure what is causing this. If someone more familiar with the code could have a look at this please. Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100108/9faffc34/attachment.html From dag at cray.com Fri Jan 8 18:02:30 2010 From: dag at cray.com (David Greene) Date: Fri, 8 Jan 2010 18:02:30 -0600 Subject: [LLVMdev] First-class aggregate semantics In-Reply-To: <201001080252.00201.jon@ffconsultancy.com> References: <4B4651F5.80108@laurences.net> <4B46587B.2070808@laurences.net> <201001080252.00201.jon@ffconsultancy.com> Message-ID: <201001081802.30548.dag@cray.com> On Thursday 07 January 2010 20:52, Jon Harrop wrote: > Good question. There was a very interesting discussion about this here a > while ago and everyone coming to LLVM says the same thing: why doesn't LLVM > just handle this for me automatically? The answer is that LLVM cannot make > that decision because it depends upon the ABI. C99 apparently returns > user-defined structs of two doubles by reference but complex numbers in > registers. So the ABI requires knowledge of the front-end and, therefore, > LLVM cannot fully automate this. It's not a C99 thing, but an ABI thing. And for the x86-64 ABI, complex double and a struct of two doubles is returned in exactly the same way.
That may not be true for other ABIs. I'm not as familiar with them. On some targets it certainly should be possible to do the right thing. -Dave From clattner at apple.com Fri Jan 8 18:36:47 2010 From: clattner at apple.com (Chris Lattner) Date: Fri, 8 Jan 2010 16:36:47 -0800 Subject: [LLVMdev] Inlining In-Reply-To: <4B47BAFA.50605@laurences.net> References: <4B47A92D.4020403@laurences.net> <6C4B3E2E-E084-4BB3-A0E1-27CDB3DE72BE@apple.com> <4B47BAFA.50605@laurences.net> Message-ID: On Jan 8, 2010, at 3:08 PM, Dustin Laurence wrote: > On 01/08/2010 02:10 PM, John McCall wrote: > >> 'llc' is an IR-to-assembly compiler; at -O3 it does some pretty neat >> machine-code and object-file optimizations, but it does not apply >> high-level optimizations like CSE or inlining. 'opt' is the tool >> which does IR-to-IR optimization. > > A vital clue, but I'm still not getting it: Try opt -O3. -Chris From jlerouge at apple.com Fri Jan 8 19:01:21 2010 From: jlerouge at apple.com (Julien Lerouge) Date: Fri, 8 Jan 2010 17:01:21 -0800 Subject: [LLVMdev] [PATCH] Fix nondeterministic behaviour in the CodeExtractor Message-ID: <20100109010120.GB6338@pom.apple.com> Hello, The CodeExtractor contains a std::set to keep track of the blocks to extract. Iterators on this set are not deterministic, and so the functions that are generated are not (the order of the inputs/outputs can change). The attached patch uses a SetVector instead. Ok to apply ? 
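To illustrate the ordering issue described above: a std::set of pointers iterates in address order, which can differ from run to run, whereas a container that remembers insertion order does not depend on addresses. The OrderedSet below is a hypothetical, simplified stand-in written for this note; it is not llvm::SetVector and only mirrors the insert/remove/iterate operations the patch uses.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical insertion-ordered set: a std::set answers "have we seen this?"
// and a std::vector records the order in which elements first arrived.
template <typename T>
class OrderedSet {
public:
    typedef typename std::vector<T>::const_iterator const_iterator;

    // Returns true if v was newly inserted, false if it was already present.
    bool insert(const T &v) {
        if (!Seen.insert(v).second)
            return false;
        Order.push_back(v);
        return true;
    }

    // Returns true if v was present and has been removed.
    bool remove(const T &v) {
        if (Seen.erase(v) == 0)
            return false;
        Order.erase(std::find(Order.begin(), Order.end(), v));
        return true;
    }

    // Iteration follows insertion order, not the ordering of T.
    const_iterator begin() const { return Order.begin(); }
    const_iterator end() const { return Order.end(); }
    std::size_t size() const { return Order.size(); }

private:
    std::set<T> Seen;
    std::vector<T> Order;
};
```

With pointer elements, iterating an OrderedSet visits them in the order they were inserted regardless of where the allocator placed them, which is the property the patch relies on. The linear-time remove is a simplification that is acceptable for a sketch.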
Thanks,
Julien

--
Julien Lerouge
PGP Key Id: 0xB1964A62
PGP Fingerprint: 392D 4BAD DB8B CE7F 4E5F FA3C 62DB 4AA7 B196 4A62
PGP Public Key from: keyserver.pgp.com
-------------- next part --------------
Index: lib/Transforms/Utils/CodeExtractor.cpp
===================================================================
--- lib/Transforms/Utils/CodeExtractor.cpp	(revision 93030)
+++ lib/Transforms/Utils/CodeExtractor.cpp	(working copy)
@@ -29,6 +29,7 @@
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/raw_ostream.h"
+#include "llvm/ADT/SetVector.h"
 #include "llvm/ADT/StringExtras.h"
 #include <algorithm>
 #include <set>
@@ -45,7 +46,7 @@
 namespace {
   class CodeExtractor {
     typedef std::vector<Value*> Values;
-    std::set<BasicBlock*> BlocksToExtract;
+    SetVector<BasicBlock*> BlocksToExtract;
     DominatorTree* DT;
     bool AggregateArgs;
     unsigned NumExitBlocks;
@@ -135,7 +136,7 @@
   // We only want to code extract the second block now, and it becomes the new
   // header of the region.
   BasicBlock *OldPred = Header;
-  BlocksToExtract.erase(OldPred);
+  BlocksToExtract.remove(OldPred);
   BlocksToExtract.insert(NewBB);
   Header = NewBB;
 
@@ -180,7 +181,7 @@
 }
 
 void CodeExtractor::splitReturnBlocks() {
-  for (std::set<BasicBlock*>::iterator I = BlocksToExtract.begin(),
+  for (SetVector<BasicBlock*>::iterator I = BlocksToExtract.begin(),
          E = BlocksToExtract.end(); I != E; ++I)
     if (ReturnInst *RI = dyn_cast<ReturnInst>((*I)->getTerminator())) {
       BasicBlock *New = (*I)->splitBasicBlock(RI, (*I)->getName()+".ret");
@@ -206,7 +207,7 @@
 //
 void CodeExtractor::findInputsOutputs(Values &inputs, Values &outputs) {
   std::set<BasicBlock*> ExitBlocks;
-  for (std::set<BasicBlock*>::const_iterator ci = BlocksToExtract.begin(),
+  for (SetVector<BasicBlock*>::const_iterator ci = BlocksToExtract.begin(),
        ce = BlocksToExtract.end(); ci != ce; ++ci) {
     BasicBlock *BB = *ci;
 
@@ -482,7 +483,7 @@
   std::map<BasicBlock*, BasicBlock*> ExitBlockMap;
 
   unsigned switchVal = 0;
-  for (std::set<BasicBlock*>::const_iterator i = BlocksToExtract.begin(),
+  for (SetVector<BasicBlock*>::const_iterator i = BlocksToExtract.begin(),
          e = BlocksToExtract.end(); i != e; ++i) {
     TerminatorInst *TI = (*i)->getTerminator();
     for (unsigned i = 0, e = TI->getNumSuccessors(); i != e; ++i)
@@ -633,7 +634,7 @@
   Function::BasicBlockListType &oldBlocks = oldFunc->getBasicBlockList();
   Function::BasicBlockListType &newBlocks = newFunction->getBasicBlockList();
 
-  for (std::set<BasicBlock*>::const_iterator i = BlocksToExtract.begin(),
+  for (SetVector<BasicBlock*>::const_iterator i = BlocksToExtract.begin(),
          e = BlocksToExtract.end(); i != e; ++i) {
     // Delete the basic block from the old function, and the list of blocks
     oldBlocks.remove(*i);

From mmuller at enduden.com  Fri Jan  8 18:49:12 2010
From: mmuller at enduden.com (Michael Muller)
Date: Fri, 08 Jan 2010 19:49:12 -0500
Subject: [LLVMdev] Using a function from another module
Message-ID: <16528.1262998152.232706.1794962677@succubus>

Hi all,

I'm trying to use a function defined in one LLVM module from another module
(in the JIT) but for some reason it's not working out. My sequence of
activity is roughly like this:

1) Create moduleA
2) Create moduleB with "func()"
3) execEng = ExecutionEngine::create(
       new ExistingModuleProvider(moduleB));
4) execute "func()" (this works fine)
4) add "func()" to moduleA as a declaration (no code blocks) with External
   linkage.
5) execEng->addModuleProvider(new ExistingModuleProvider(moduleA));
6) run a function in moduleA that calls "func()"

I get:
LLVM ERROR: Program used external function 'func' which could not be resolved!

I'm guessing I'm either going about this wrong or missing something. Can
anyone offer me some insight?

============================================================================= michaelMuller = mmuller at enduden.com | http://www.mindhog.net/~mmuller ----------------------------------------------------------------------------- We are the music-makers, and we are the dreamers of dreams - Arthur O'Shaughnessy ============================================================================= From clattner at apple.com Fri Jan 8 19:04:17 2010 From: clattner at apple.com (Chris Lattner) Date: Fri, 8 Jan 2010 17:04:17 -0800 Subject: [LLVMdev] [PATCH] Fix nondeterministic behaviour in the CodeExtractor In-Reply-To: <20100109010120.GB6338@pom.apple.com> References: <20100109010120.GB6338@pom.apple.com> Message-ID: On Jan 8, 2010, at 5:01 PM, Julien Lerouge wrote: > Hello, > > The CodeExtractor contains a std::set to keep track of > the > blocks to extract. Iterators on this set are not deterministic, and so > the functions that are generated are not (the order of the > inputs/outputs can change). > > The attached patch uses a SetVector instead. Ok to apply ? 
Nice catch, please apply, -Chris > > Thanks, > Julien > > > -- > Julien Lerouge > PGP Key Id: 0xB1964A62 > PGP Fingerprint: 392D 4BAD DB8B CE7F 4E5F FA3C 62DB 4AA7 B196 4A62 > PGP Public Key from: keyserver.pgp.com > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From overminddl1 at gmail.com Fri Jan 8 19:06:36 2010 From: overminddl1 at gmail.com (OvermindDL1) Date: Fri, 8 Jan 2010 18:06:36 -0700 Subject: [LLVMdev] Exception Implementation Example added to Wiki In-Reply-To: <7F239E64-D3F4-4113-B670-489DC29FEFC6@gmail.com> References: <7F239E64-D3F4-4113-B670-489DC29FEFC6@gmail.com> Message-ID: <3f49a9f41001081706of82d965q2079a733614c9e73@mail.gmail.com> On Fri, Jan 8, 2010 at 12:12 PM, Garrison Venn wrote: > I just added an exception example to the wiki intended to be run in a JIT environment. Although this information is heavily date dependent, > as the LLVM exception subsystem will be modified as time goes on, I could have used such an example when I was looking into this; hence the submission. > Having said this, I'm not an LLVM expert, and even though the code works, I'm sure there are omissions and inaccuracies, so if the experts have > the time ... > > Hopefully some members of the community will find this beneficial > > Garrison > > PS: Is there a better to upload source to the wiki than merely pasting it in? Only image file types seemed to be allowed for upload. Oh I could definitely use an example like that, would be very nice if there was such an example as an included project in LLVM itself like the other examples and tutorials, would force it to stay updated too. EDIT: Er, helps if I send to the LLVM list... 
From dag at cray.com Fri Jan 8 19:20:51 2010 From: dag at cray.com (David Greene) Date: Fri, 8 Jan 2010 19:20:51 -0600 Subject: [LLVMdev] Unaligned SSE Memop Support Patch Message-ID: <201001081920.51607.dag@cray.com> This patch adds a feature to allow SSE memops to be unaligned on supported architectures. Mostly I want to see if the naming is reasonable. Supporting unaligned memops requires a bit twiddle on 10h processors. This patch makes the assumption that the OS sets the bit correctly. Comments? -Dave -------------- next part -------------- A non-text attachment was scrubbed... Name: uamem.patch Type: text/x-diff Size: 3585 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100108/515ece53/attachment.bin From gohman at apple.com Fri Jan 8 20:43:12 2010 From: gohman at apple.com (Dan Gohman) Date: Fri, 8 Jan 2010 18:43:12 -0800 Subject: [LLVMdev] Unaligned SSE Memop Support Patch In-Reply-To: <201001081920.51607.dag@cray.com> References: <201001081920.51607.dag@cray.com> Message-ID: Please mention "SIMD" or "vector" in the description string in the SubtargetFeature definition, to avoid confusion. Also, it would be good to mention "for example, the AMD 10h" in a comment somewhere. Otherwise, looks good to me. Dan On Jan 8, 2010, at 5:20 PM, David Greene wrote: > This patch adds a feature to allow SSE memops to be unaligned > on supported architectures. > > Mostly I want to see if the naming is reasonable. > > Supporting unaligned memops requires a bit twiddle on 10h > processors. This patch makes the assumption that the OS > sets the bit correctly. > > Comments? 
> > -Dave > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From etherzhhb at gmail.com Fri Jan 8 20:43:02 2010 From: etherzhhb at gmail.com (ether) Date: Sat, 09 Jan 2010 10:43:02 +0800 Subject: [LLVMdev] Make LoopBase inherit from "RegionBase"? In-Reply-To: <4B47873E.6000409@fim.uni-passau.de> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> <4B473032.5080803@gmail.com> <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> <4B47873E.6000409@fim.uni-passau.de> Message-ID: <4B47ED36.5080103@gmail.com> hi Tobi On 2010-1-9 3:27, Tobias Grosser wrote: > On 01/08/10 14:20, ether wrote: >> sorry that i forgot to change the subjuect > > Hi ether, > > sounds interesting. Actually is/may be some kind of region. If you > want you can have a look at the analysis, that I wrote. It is not yet > finished, not completely documented and work in progress. However the > first big comment might be interesting for you. 
> Or seeing the results of
> opt -regions -analyze
>
> The git repo to see it is here:
> http://repo.or.cz/w/llvm-complete/tobias-sandbox.git/shortlog/refs/heads/region
>

That makes sense to me, and if you make your Region class a subclass of
LoopBase, code like "addChildLoop" and "getLoopDepth()" from LoopBase
may help you a lot to manipulate regions in the later optimization passes
(of course, we should give it a more meaningful name like
"addChildRegion") :)

And I think if we ignore the "goto" statement and the "return" statement
in loops (I remember there's a pass in LLVM that makes a function return
in only one basic block), loops will also have only one exit block, so we
can treat a loop as a special region that has a back edge. We can say that
a loop must be a region, but a region is not necessarily a loop.

We can read something about this in <>, 9.7 region-based analysis.

best regards
--ether

>
> I will think about this and maybe reply again.
>
> Tobi
>
>
>> hi all,
>>
>> On 2010-1-7 0:11, John Mosby wrote:
>>> In LLVM we could add support for generalized CFG regions and
>>> RegionPasses. A region is a part of the CFG. The only information we
>>> have is, that it has one entry and one exit, this it can be optimized
>>> separately.
>>> I think this is the best way to add region analysis. I must admit this
>>> approach
>>> helps me on another, similar project I'm working on in parallel (no
>>> pun intended).
>>> Tobias, is this how you are architecting your region analysis already?
>>>
>>> John
>>>
>>
>> i just implementing the skeleton of Region/RegionInfo like LoopBase and
>> LoopInfoBase[1] in the llvm existing codes, and found that theres lots
>> of common between "Region" and "Loop":
>>
>> 1. both of them are consist of several BasicBlocks
>> 2. both of them have some kind of nested structures, so both a loop and
>> a region could have parent or childrens
>> 3.
both of them have a BasicBlocks(header of a loop and "entry" of a >> region) that dominates all others >> >> and the Region class will have the most stuffs very similar in LoopBase, >> like: ParentRegion, SubRegions, Blocks, getRegionDepth(), >> getExitBlock(), getExitingBlock() ...... >> >> so, could us just treat "Loop" as some kind of general "Region" of >> BasicBlocks, and make Loop and Region inherit from "RegionBase"? >> >> >> [1] http://llvm.org/doxygen/LoopInfo_8h-source.html >> >> best regards >> --ether > > From pazzodalegare at gmail.com Fri Jan 8 19:55:45 2010 From: pazzodalegare at gmail.com (Pazzo Da Legare) Date: Sat, 9 Jan 2010 02:55:45 +0100 Subject: [LLVMdev] building a llvm-arm-elf crosscompiler on OSX 10.5 Message-ID: <1a26b4921001081755y17c35002hf1cc50c951a178cc@mail.gmail.com> Dear ML, I'm trying to understand how to build a llvm (2.6) cross compiler for arm-elf target using the gcc frontend with newlib. Could you please indicate, if possible steps I should follow? 
I try to configure and build llvm with ../llvm-2.6/configure --prefix=/usr/local/cross-llvm-gcc-arm-elf-4.2-2.6 --enable-optimized --disable-threads --enable-targets=cbe,cpp,arm and LLVM-GCC frontend with ../llvm-gcc4.2-2.6.source/configure --prefix=/usr/local/cross-llvm-gcc-arm-elf-4.2-2.6 --program-prefix=llvm- --enable-llvm=/Users/dummy/Develop/llvm/llvm-2.6 --enable-languages=c,c++ --host=i686-apple-darwin9 --build=i686-apple-darwin9 --target=arm-elf --with-gxx-include-dir=/usr/include/c++/4.0.0 --enable-interwork --with-newlib --with-header=../newlib-1.18.0/newlib/libc/include But I got the followings errors: /var/folders/7f/7fiRIEm-FruFfT7mGbk3uk+++TI/-Tmp-//ccDFjySd.s: Assembler messages: /var/folders/7f/7fiRIEm-FruFfT7mGbk3uk+++TI/-Tmp-//ccDFjySd.s:96: Error: selected processor does not support `sxtb r5,r5' /var/folders/7f/7fiRIEm-FruFfT7mGbk3uk+++TI/-Tmp-//ccDFjySd.s:537: Error: selected processor does not support `sxtb r6,r6' /var/folders/7f/7fiRIEm-FruFfT7mGbk3uk+++TI/-Tmp-//ccDFjySd.s:705: Error: selected processor does not support `sxtb r1,r1' /var/folders/7f/7fiRIEm-FruFfT7mGbk3uk+++TI/-Tmp-//ccDFjySd.s:711: Error: selected processor does not support `sxtb r1,r1' Thank you for your help, pz From nicholas at mxc.ca Fri Jan 8 23:17:58 2010 From: nicholas at mxc.ca (Nick Lewycky) Date: Sat, 09 Jan 2010 00:17:58 -0500 Subject: [LLVMdev] Inlining In-Reply-To: <4B47BAFA.50605@laurences.net> References: <4B47A92D.4020403@laurences.net> <6C4B3E2E-E084-4BB3-A0E1-27CDB3DE72BE@apple.com> <4B47BAFA.50605@laurences.net> Message-ID: <4B481186.7050907@mxc.ca> Dustin Laurence wrote: > On 01/08/2010 02:10 PM, John McCall wrote: > > >> 'llc' is an IR-to-assembly compiler; at -O3 it does some pretty neat >> machine-code and object-file optimizations, but it does not apply >> high-level optimizations like CSE or inlining. 'opt' is the tool >> which does IR-to-IR optimization. 
>> > > A vital clue, but I'm still not getting it: > > --- > gemini:~/Projects/Nil/nil(0)$ make testInline.optdis.ll > llvm-as testInline.ll > opt -always-inline testInline.bc -o testInline.optbc > llvm-dis -f testInline.optbc -o testInline.optdis.ll > rm testInline.bc testInline.optbc > gemini:~/Projects/Nil/nil(0)$ cat testInline.optdis.ll > ; ModuleID = 'testInline.optbc' > > define linkonce fastcc i32 @foo(i32 %arg) alwaysinline { > Try using 'internal' linkage instead of 'linkonce'. If you're sure you really want linkonce then you'd need to use linkonce_odr to get inlining here. Also, drop the alwaysinline attribute and '-always-inline' flag. The normal inliner (aka. "opt -inline" which is run as part of "opt -O3") should inline it. Nick > %result = mul i32 %arg, 7 ; [#uses=1] > ret i32 %result > } > > define i32 @main(i32 %argc, i8** %argv) { > %retVal = call fastcc i32 @foo(i32 6) alwaysinline ; [#uses=1] > ret i32 %retVal > } > gemini:~/Projects/Nil/nil(0)$ > --- > > Perhaps the -always-inline pass has a prerequisite pass? I also tried > it with "-O3 -always-inline", which got halfway there: > > --- > ; ModuleID = 'testInline.optbc' > > define linkonce fastcc i32 @foo(i32 %arg) alwaysinline { > %result = mul i32 %arg, 7 ; [#uses=1] > ret i32 %result > } > > define i32 @main(i32 %argc, i8** nocapture %argv) { > %retVal = tail call fastcc i32 @foo(i32 6) alwaysinline ; [#uses=1] > ret i32 %retVal > } > --- > > I'm pleased to get the tailcall optimization, but in this case was > looking for the 'no call at all' optimization. 
:-) > > Dustin > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From nicholas at mxc.ca Fri Jan 8 23:32:22 2010 From: nicholas at mxc.ca (Nick Lewycky) Date: Sat, 09 Jan 2010 00:32:22 -0500 Subject: [LLVMdev] First-class aggregate semantics In-Reply-To: <4B46AB08.7050206@laurences.net> References: <4B4651F5.80108@laurences.net> <201001071538.15322.dag@cray.com> <4B46587B.2070808@laurences.net> <3F8857D1-2996-4268-AB77-A9B1C83311C2@gmail.com> <4B46AB08.7050206@laurences.net> Message-ID: <4B4814E6.2020800@mxc.ca> Dustin Laurence wrote: > On 01/07/2010 06:03 PM, Alastair Lynn wrote: > >> You'll probably need to use insertvalue to construct your return value. >> > > Ah ha! > > The fact is I didn't really understand the significance of this part > when I read it, and so didn't remember it when I needed it. OK, so I > have tested it and I can now build up a struct like this > > %s1 = insertvalue {i32, i32} {i32 0, i32 0}, i32 1, 0 ; s1 = {1,0} > %s2 = insertvalue {i32, i32} %s1, i32 2, 1 ; %s2 == {1,2} > As a small refinement, I recommend: %s1 = insertvalue {i32, i32} undef, i32 1, 0 %s2 = insertvalue {i32, i32} %s1, i32 2, 1 > which reminds me of another thing I never understood. I can't make my > code (slightly) more readable by changing that to something like > > %s0 = {i32 0, i32 0} > %s1 = insertvalue {i32, i32} %s0, i32 1, 0 ; s1 = {1,0} > %s2 = insertvalue {i32, i32} %s1, i32 2, 1 ; %s2 == {1,2} > No, there is no copy or move instruction in LLVM. Recall that the text format is 1:1 with the in-memory model of the program. A copy instruction in the IR would literally mean "go look at my operand instead", leading to logic in every optimization that checks for a CopyInst and chases the pointer. The astute reader will note that I'm lying again, but it's for your own good. 
;-) "%x = bitcast i32 %y to i32" is a legal way to copy, but the intention behind a BitcastInst is that it is used to change the type. Nick > because LLVM will complain that it "expected instruction opcode" at the > assignment to %s0. If there is a general way to give names to constants > in that way I didn't find it. In fact, I think I tended not to use > temporaries like I would variables precisely because when I tried the > second alternative as the natural way to hand-code it and it didn't > work, I didn't think how to phrase it so only the results of operations > get named. > > Help me understand the underlying logic--why can one only name the > results of operations? I realize that the local temporaries are > notionally register variables for a machine with an infinite number of > registers, but my very dim memory of real assembly was that I not only > could load constants into registers but had to do so. What part of the > picture am I missing here? > > You need an IR tutorial. Or, to speak correctly, *I* need a tutorial. > :-) But I'm learning.... > > Dustin > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From clattner at apple.com Sat Jan 9 00:11:03 2010 From: clattner at apple.com (Chris Lattner) Date: Fri, 8 Jan 2010 22:11:03 -0800 Subject: [LLVMdev] Two suggestions for improving LLVM error reporting In-Reply-To: References: Message-ID: <71057A37-D1E7-4C57-ADE2-AFBD3EF30338@apple.com> On Jan 7, 2010, at 3:52 PM, Talin wrote: > I realize that LLVM assertion results aren't intended to be seen by end-users ever - they are there to tell front-end developers that they screwed up. Even so, I can think of two things that would make it easier for front-end developers to track down what they did wrong without having to jump into the debugger: Ok. > 1) Be able to set a "default module" for dump(). 
The version of dump() in llvm::Type takes a module parameter, which allows the type to be printed out using the type names registered in that module. However, the various asserts which call dump() or str() don't have a module pointer handy, which causes the type to be printed out as unrecognizable gobbledygook. Yes, this is lame. > If there was some way that we could associate either a default module with the LLVMContext - or perhaps register type names with the LLVMContext directly - and have dump() with no arguments use these name mappings - it would make the output more readable. (I kind of like the idea of having a type name dictionary in the LLVM context which is used whenever there is no module pointer handy. For one thing, I could fill it with more readable names than the ones I use in the module's name table.) My eventual goal (perhaps for LLVM 3.0) is to eliminate our current structural type system altogether. The benefits are blown away by the costs, eventually we should just fix this design mistake. In the meantime, I'm not opposed to adding a Module* to VMCore that type dumping defaults to if non-null. > 2) In the case where type names are available, never print out type names in the form of "%26". I realize that not all types have names - but at least always expand one level, so we can tell whether it's a struct or a pointer or whatever. "{ %32 }" or "%32*" tells me a lot more than just a number by itself. Got a CreateLoad() call that failed? Well if it prints out "{ %32 }" for the assertion argument you instantly know why, even if you don't know what "%32" resolves to. Sure. I think the asmwriter already has this capability, it uses it when actually printing out the %42 = type { blah} lines. It wouldn't do to print out "%42 = type %42" after all :) > In fact, what would be even cooler would be to set a threshold for the number of levels that were always expanded. 
I hate trying to read 400-character-long type names, but expanding two or three levels would be quite useful. :) -Chris From jon at ffconsultancy.com Sat Jan 9 02:07:09 2010 From: jon at ffconsultancy.com (Jon Harrop) Date: Sat, 9 Jan 2010 08:07:09 +0000 Subject: [LLVMdev] Two suggestions for improving LLVM error reporting In-Reply-To: <71057A37-D1E7-4C57-ADE2-AFBD3EF30338@apple.com> References: <71057A37-D1E7-4C57-ADE2-AFBD3EF30338@apple.com> Message-ID: <201001090807.09538.jon@ffconsultancy.com> On Saturday 09 January 2010 06:11:03 Chris Lattner wrote: > My eventual goal (perhaps for LLVM 3.0) is to eliminate our current > structural type system altogether. The benefits are blown away by the > costs, eventually we should just fix this design mistake. In the meantime, > I'm not opposed to adding a Module* to VMCore that type dumping defaults to > if non-null. Can you elaborate on this? I'm loving LLVM's structural types and they're making my work a lot easier. Having to come up with names for all of the structural types in my language just to satisfy LLVM's new type system would suck... -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e From clattner at apple.com Sat Jan 9 01:00:25 2010 From: clattner at apple.com (Chris Lattner) Date: Fri, 8 Jan 2010 23:00:25 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: Message-ID: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> On Jan 6, 2010, at 12:45 PM, Talin wrote: > This patch adds a UnionType to DerivedTypes.h. Cool. When proposing an IR extension, it is usually best to start with a LangRef.html patch so that we can discuss the semantics of the extension. Please do write this before you get much farther. I assume that you want unions usable in the same situations as a struct. However, how do "constant unions" work? How do I initialize a global variable whose type is "union of float and i32" for example? 
> It also adds code to the bitcode reader / writer and the assembly parser for the new type, as well as a tiny .ll test file in test/Assembler. It does not contain any code related to code generation or type layout - I wanted to see if this much was acceptable before I proceeded any further. The .ll file isn't included in your patch, but I see that you chose a syntax of 'union { i32, float }' which seems very reasonable. > Unlike my previous patch, in which the Union type was implemented as a packing option to type Struct (thereby re-using the machinery of Struct), this patch defines Union as a completely separate type from Struct. I think this approach makes sense. It means that things like TargetData StructLayout won't work for unions, but you don't need them since all elements are at offset 0. > I was a little uncertain as to how to write the tests. I'd particularly like to write tests for the bitcode reader/writer stuff, but I am not sure how to do that. A reasonable example are tests like test/Feature/newcasts.ll Here are some thoughts on your patch: +class UnionType : public CompositeType { ... + /// UnionType::get - Create an empty union type. + /// + static UnionType *get(LLVMContext &Context) { + return get(Context, std::vector()); + } I don't think that an empty union is going to be important enough to add a special accessor for it. Is an empty union ever a useful thing to do? If you completely disallow them from IR, it would end up simplifying some things. We don't allow empty vectors, and it seems that an empty union has exactly the same semantics as an empty struct, so having both empty structs and empty unions doesn't seem necessary. + static UnionType *get(LLVMContext &Context, + const std::vector &Params); Since this is new code, please have the constructor method take a 'const Type*const* Elements, unsigned NumElements' instead of requiring the caller to make an std::vector. This allows use of SmallVector etc. 
It is desirable to do this for all the other type classes in DerivedTypes.h, but we haven't gotten around to doing it yet. + /// UnionType::get - This static method is a convenience method for + /// creating union types by specifying the elements as arguments. + /// Note that this method always returns a non-packed struct. To get + /// an empty struct, pass NULL, NULL. + static UnionType *get(LLVMContext &Context, + const Type *type, ...) END_WITH_NULL; Please update the comments. Also, if you disallow empty unions here, you don't need to pass a context. +++ include/llvm/Type.h (working copy) @@ -86,6 +86,7 @@ PointerTyID, ///< 12: Pointers OpaqueTyID, ///< 13: Opaque: type with unknown structure VectorTyID, ///< 14: SIMD 'packed' format, or other vector type + UnionTyID, ///< 15: Unions Please put this up next to Struct for simplicity, the numbering here doesn't need to be stable. The numbering in llvm-c/Core.h does need to be stable though. +bool UnionType::indexValid(const Value *V) const { + // Union indexes require 32-bit integer constants. + if (V->getType() == Type::getInt32Ty(V->getContext())) Please use V->getType()->isInteger(32) which is probably new since you started your patch. +UnionType::UnionType(LLVMContext &C, const std::vector &Types) + : CompositeType(C, UnionTyID) { + ContainedTys = reinterpret_cast(this + 1); + NumContainedTys = Types.size(); + bool isAbstract = false; + for (unsigned i = 0; i < Types.size(); ++i) { No need to evaluate Types.size() every time through the loop. +bool LLParser::ParseUnionType(PATypeHolder &Result) { ... 
+ if (!EatIfPresent(lltok::lbrace)) { + return Error(EltTyLoc, "'{' expected after 'union'"); + } Please use: if (ParseToken(lltok::lbrace, "'{' expected after 'union'")) return true; + EltTyLoc = Lex.getLoc(); + if (ParseTypeRec(Result)) return true; + ParamsList.push_back(Result); + + if (Result->isVoidTy()) + return Error(EltTyLoc, "union element can not have void type"); + if (!UnionType::isValidElementType(Result)) + return Error(EltTyLoc, "invalid element type for union"); + + while (EatIfPresent(lltok::comma)) { + EltTyLoc = Lex.getLoc(); + if (ParseTypeRec(Result)) return true; + + if (Result->isVoidTy()) + return Error(EltTyLoc, "union element can not have void type"); + if (!UnionType::isValidElementType(Result)) + return Error(EltTyLoc, "invalid element type for union"); + + ParamsList.push_back(Result); + } This can be turned into a: do { ... } while (EatIfPresent(lltok::comma)); loop to avoid the duplication of code. +++ lib/Bitcode/Writer/BitcodeWriter.cpp (working copy) ... + case Type::UnionTyID: { + const UnionType *ST = cast(T); + // UNION: [eltty x N] + Code = bitc::TYPE_CODE_UNION; + // Output all of the element types. + for (StructType::element_iterator I = ST->element_begin(), + E = ST->element_end(); I != E; ++I) + TypeVals.push_back(VE.getTypeID(*I)); + AbbrevToUse = UnionAbbrev; + break; + } Please rename ST -> UT and use the right iterator type. I didn't look closely at the C bindings. If you eliminate empty unions they should get a bit simpler. Otherwise the patch looks like a fine start. Lets please get the LangRef spec ironed out, then you can start committing subsystems to support this. My biggest concern about this extension is updating all the places in the optimizer to know about it. To get adequate testing coverage on this, we should probably switch llvm-gcc or clang to start using the union type in at least some common case, which will allow us to get coverage on it through the optimizer. Thanks for working on this Talin! 
-Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100108/0abf33b1/attachment-0001.html From clattner at apple.com Sat Jan 9 01:09:19 2010 From: clattner at apple.com (Chris Lattner) Date: Fri, 8 Jan 2010 23:09:19 -0800 Subject: [LLVMdev] Two suggestions for improving LLVM error reporting In-Reply-To: <201001090807.09538.jon@ffconsultancy.com> References: <71057A37-D1E7-4C57-ADE2-AFBD3EF30338@apple.com> <201001090807.09538.jon@ffconsultancy.com> Message-ID: <954D7902-92EC-4055-AA99-9256E9CC264C@apple.com> On Jan 9, 2010, at 12:07 AM, Jon Harrop wrote: > On Saturday 09 January 2010 06:11:03 Chris Lattner wrote: >> My eventual goal (perhaps for LLVM 3.0) is to eliminate our current >> structural type system altogether. The benefits are blown away by the >> costs, eventually we should just fix this design mistake. In the meantime, >> I'm not opposed to adding a Module* to VMCore that type dumping defaults to >> if non-null. > > Can you elaborate on this? I'm loving LLVM's structural types and they're > making my work a lot easier. > > Having to come up with names for all of the structural types in my language > just to satisfy LLVM's new type system would suck... There are two things I don't like about our current system: 1. Type resolution is really slow in some cases, because it has to incrementally detect when mutating type graphs become isomorphic and zap them. This is seen during bc/ll loading and particularly during module linking. This is one of the biggest sources of LTO slowness that I'm aware of. 2. Our implementation of type resolution uses union find to lazily update Type*'s in values etc. This means that Value::getType() is not a trivial accessor that returns a pointer - it has to check to see if the pointer is forwarded, and if so, forward the pointer in the type. 
I'm not advocating elimination of pointer equality tests, but I am advocating the elimination of pointer equality tests for type *structure*. I want to introduce a first class "named type" type, and allow only *them* to be abstract, instead of having our current Opaque type. This means that if you have a "%foo***" and %foo gets resolved to "i32", that "%foo***" would not get implicitly unioned with "i32***". This would also eliminate the current complexity forming circular types, make many frontends simpler etc. It would mean that a cyclic type would be *required* to go through a named type though. It would also allow elimination of upreferences, PATypeHolder, PATypeHandle, etc. It would also eliminate the frequent confusion around "my function should take a %foo*, it the IR dump shows it as %bar*" (because they have the same type structure). OTOH, it would mean that we couldn't have the totally awesome and oh-so-useful \1* type ;-) -Chris From baldrick at free.fr Sat Jan 9 01:59:50 2010 From: baldrick at free.fr (Duncan Sands) Date: Sat, 09 Jan 2010 08:59:50 +0100 Subject: [LLVMdev] Cygwin llvm-gcc regression In-Reply-To: <9719867c1001081510t75df73aek2b4b4e14829150fe@mail.gmail.com> References: <9719867c1001081510t75df73aek2b4b4e14829150fe@mail.gmail.com> Message-ID: <4B483776.5010904@free.fr> Hi Aaron, > assertion "(!TYPE_SIZE(Tr) || !Ty->isSized() || !isInt64(TYPE_SIZE(Tr), > true) || > getInt64(TYPE_SIZE(Tr), true) == > getTargetData().getTypeAllocSizeInBits(Ty)) && > "LLVM type size doesn't match GCC type size!"" failed: file > "/home/ang/svn/llvm > -gcc/gcc/llvm-types.cpp", line 83 > > This assertion has not changed from the past and am not sure what is > causing this. this means that gcc and llvm disagree about how big a type is. It is probably caused by llvm thinking long double has size 12 (or 16), while cygwin gcc thinks it is 16 (or 12). Ciao, Duncan. 
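
Duncan's diagnosis above (gcc and llvm disagreeing on whether long double
occupies 12 or 16 bytes) can be checked from the gcc side with a small C
sketch like the one below. It is only an illustration: the helper names are
made up and are not part of llvm-gcc, and it reports what the compiler that
builds it believes, which is just one half of the comparison the assertion
makes.

```c
#include <float.h>
#include <limits.h>
#include <stddef.h>

/* Storage size of long double in bits, as seen by the compiler that builds
   this file. Hypothetical helper, named for illustration only. */
size_t long_double_storage_bits(void) {
    return sizeof(long double) * CHAR_BIT;
}

/* Number of mantissa digits (in base FLT_RADIX) of long double. The storage
   size can be larger than the value itself needs because of padding, which
   is how two compilers can agree on the format and still disagree on the
   size. Hypothetical helper, named for illustration only. */
int long_double_mantissa_digits(void) {
    return LDBL_MANT_DIG;
}
```

Printing both values from a main() built with the Cygwin gcc, and comparing
the storage size with what LLVM's target data reports for the same target,
would show which side holds the unexpected value.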
From baldrick at free.fr Sat Jan 9 02:15:35 2010 From: baldrick at free.fr (Duncan Sands) Date: Sat, 09 Jan 2010 09:15:35 +0100 Subject: [LLVMdev] Inlining In-Reply-To: <4B47A92D.4020403@laurences.net> References: <4B47A92D.4020403@laurences.net> Message-ID: <4B483B27.9030005@free.fr> Hi Dustin, > define linkonce fastcc i32 @foo(i32 %arg) alwaysinline linkonce implies that the function body may change at link time. Thus it would be wrong to inline it, since the code being inlined would not be the final code. Use linkonce_odr to tell the compiler that the function body can be replaced only by an equivalent function body. Ciao, Duncan. From dllaurence at dslextreme.com Sat Jan 9 03:31:24 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Sat, 09 Jan 2010 01:31:24 -0800 Subject: [LLVMdev] Inlining In-Reply-To: References: <4B47A92D.4020403@laurences.net> <6C4B3E2E-E084-4BB3-A0E1-27CDB3DE72BE@apple.com> <4B47BAFA.50605@laurences.net> Message-ID: <4B484CEC.6050801@laurences.net> On 01/08/2010 04:36 PM, Chris Lattner wrote: > > On Jan 8, 2010, at 3:08 PM, Dustin Laurence wrote: >> A vital clue, but I'm still not getting it: > > Try opt -O3. I actually had, but as Nick Lewycky noticed the 'linkonce' linkage specification was preventing the inlining. Dustin From dllaurence at dslextreme.com Sat Jan 9 03:46:29 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Sat, 09 Jan 2010 01:46:29 -0800 Subject: [LLVMdev] Inlining In-Reply-To: <4B481186.7050907@mxc.ca> References: <4B47A92D.4020403@laurences.net> <6C4B3E2E-E084-4BB3-A0E1-27CDB3DE72BE@apple.com> <4B47BAFA.50605@laurences.net> <4B481186.7050907@mxc.ca> Message-ID: <4B485075.9070705@laurences.net> On 01/08/2010 09:17 PM, Nick Lewycky wrote: > Try using 'internal' linkage instead of 'linkonce'. That did it, thanks. 
--- gemini:~/Projects/LLVM/Tests/Inline(0)$ cat testInline.optdis.ll ; ModuleID = 'testInline.optbc' define i32 @main(i32 %argc, i8** nocapture %argv) nounwind readnone { ret i32 42 } gemini:~/Projects/LLVM/Tests/Inline(0)$ --- > If you're sure you > really want linkonce then you'd need to use linkonce_odr to get inlining > here. I'm sure of nothing. I only used it because the IR Reference says linkonce: Globals with "linkonce" linkage are merged with other globals of the same name when linkage occurs. This is typically used to implement inline functions, templates, or other code which must be generated in each translation unit that uses it. Unreferenced linkonce globals are allowed to be discarded. So I thought it was telling me to use linkonce for inline functions. A bit more experimentation shows that it still inlines with the default linkage, the only difference being it retains a non-inlined version as well. With 'internal' it omits the non-inlined version. I think I see why that's true. > Also, drop the alwaysinline attribute and '-always-inline' flag. The > normal inliner (aka. "opt -inline" which is run as part of "opt -O3") > should inline it. Yes, it still did after I removed them. Since I'm clearly not guessing well here, when would one want to use "alwaysinline"? Dustin From dllaurence at dslextreme.com Sat Jan 9 04:11:55 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Sat, 09 Jan 2010 02:11:55 -0800 Subject: [LLVMdev] First-class aggregate semantics In-Reply-To: <4B4814E6.2020800@mxc.ca> References: <4B4651F5.80108@laurences.net> <201001071538.15322.dag@cray.com> <4B46587B.2070808@laurences.net> <3F8857D1-2996-4268-AB77-A9B1C83311C2@gmail.com> <4B46AB08.7050206@laurences.net> <4B4814E6.2020800@mxc.ca> Message-ID: <4B48566B.9060509@laurences.net> On 01/08/2010 09:32 PM, Nick Lewycky wrote: > As a small refinement, I recommend: > > %s1 = insertvalue {i32, i32} undef, i32 1, 0 > %s2 = insertvalue {i32, i32} %s1, i32 2, 1 Ah, excellent.
I hadn't found a use for 'undef' yet, but I like that. I don't like substituting values into a literal with arbitrary values because, well, it's unlovely. Saying the values don't matter is much better. I don't think there is a place in my lexer that happens to need exactly that as one or the other members of the token structure always seems to be a literal, but your example did lead me to use undef for individual members being replaced in an insertvalue instruction. > No, there is no copy or move instruction in LLVM. Recall that the text > format is 1:1 with the in-memory model of the program. A copy > instruction in the IR would literally mean "go look at my operand > instead", leading to logic in every optimization that checks for a > CopyInst and chases the pointer. My desire to do such things is a direct consequence of my apparently odd choice to learn the IR by approaching it as a programming language. While it clearly wasn't intended for it, my first computer had three registers, only one of which was a mighty sixteen bits wide, so I've hurt worse before. :-) That said, the need to parameterize code is too great not to do something, and since LLVM isn't designed to do it I have been using CPP. Doing without even the modest abilities of CPP is simply not to be thought of. If I did enough hand-coding it would be well worth writing a custom preprocessor. > The astute reader will note that I'm lying again, but it's for your own > good. ;-) "%x = bitcast i32 %y to i32" is a legal way to copy, but the > intention behind a BitcastInst is that it is used to change the type. If I was concerned about my own good I probably wouldn't be hand-coding. 
:-) Dustin From gvenn.cfe.dev at gmail.com Sat Jan 9 06:54:45 2010 From: gvenn.cfe.dev at gmail.com (Garrison Venn) Date: Sat, 9 Jan 2010 07:54:45 -0500 Subject: [LLVMdev] Exception Implementation Example added to Wiki In-Reply-To: <3f49a9f41001081706of82d965q2079a733614c9e73@mail.gmail.com> References: <7F239E64-D3F4-4113-B670-489DC29FEFC6@gmail.com> <3f49a9f41001081706of82d965q2079a733614c9e73@mail.gmail.com> Message-ID: <002425D9-68CC-44C9-B904-B53CE6A02F95@gmail.com> If the powers that be want this, I could easily transform the source to the LLVM coding standards, and add the necessary portable UNIX support--someone else would have to add non-UNIX support, although the System library probably helps with this. However, I'm guessing the LLVM release flux of the exception system, along with a lack of universal platform DWARF JIT support, might be a hindrance in such an endeavor. I don't know what the current platform boundaries are for either JIT or JIT with DWARF emission, but I do know that the LLVM exception design is being reconsidered for future releases (possibly 2.7?). Also, as noted in the wiki, please see: http://code.google.com/p/tart/ and http://www.incasoftware.de/~kamm/projects/index.php/2008/08/19/exception-handling-in-llvmdc-using-llvm/ for real world implementations. There are many others, as can be seen on the LLVM project page. Garrison On Jan 8, 2010, at 20:06, OvermindDL1 wrote: > On Fri, Jan 8, 2010 at 12:12 PM, Garrison Venn wrote: >> I just added an exception example to the wiki intended to be run in a JIT environment. Although this information is heavily date dependent, >> as the LLVM exception subsystem will be modified as time goes on, I could have used such an example when I was looking into this; hence the submission. >> Having said this, I'm not an LLVM expert, and even though the code works, I'm sure there are omissions and inaccuracies, so if the experts have >> the time ...
>> >> Hopefully some members of the community will find this beneficial >> >> Garrison >> >> PS: Is there a better to upload source to the wiki than merely pasting it in? Only image file types seemed to be allowed for upload. > > > Oh I could definitely use an example like that, would be very nice if > there was such an example as an included project in LLVM itself like > the other examples and tutorials, would force it to stay updated too. > > EDIT: Er, helps if I send to the LLVM list... > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From overminddl1 at gmail.com Sat Jan 9 07:07:28 2010 From: overminddl1 at gmail.com (OvermindDL1) Date: Sat, 9 Jan 2010 06:07:28 -0700 Subject: [LLVMdev] Exception Implementation Example added to Wiki In-Reply-To: <002425D9-68CC-44C9-B904-B53CE6A02F95@gmail.com> References: <7F239E64-D3F4-4113-B670-489DC29FEFC6@gmail.com> <3f49a9f41001081706of82d965q2079a733614c9e73@mail.gmail.com> <002425D9-68CC-44C9-B904-B53CE6A02F95@gmail.com> Message-ID: <3f49a9f41001090507m333594fdtcc87e50f7d476349@mail.gmail.com> On Sat, Jan 9, 2010 at 5:54 AM, Garrison Venn wrote: > If the powers at be want this, I could easily transform the source to the LLVM coding standards, and add > the necessary portable UNIX support--someone else would have to add non-UNIX support although the > System library probably helps with this. However I'm guessing the LLVM release flux of the exception system, > along with a lack of universal platform, dwarf JIT support might be a hinderance in such an endeavor. I don't > know what the current platform boundaries are for either JIT or JIT with dwarf emission, but I do know that > the LLVM exception design is being reconsidered for future releases (possibly 2.7?). 
> > Also, as noted in the wiki, please see: > > http://code.google.com/p/tart/ and > http://www.incasoftware.de/~kamm/projects/index.php/2008/08/19/exception-handling-in-llvmdc-using-llvm/ > > for real world implementations. There are many others, as can be seen in the LLVM project page. I am one such non-unix platform, so if the example does not work for me, it is still as worthless as bad documentation for note... From gvenn.cfe.dev at gmail.com Sat Jan 9 07:39:15 2010 From: gvenn.cfe.dev at gmail.com (Garrison Venn) Date: Sat, 9 Jan 2010 08:39:15 -0500 Subject: [LLVMdev] Exception Implementation Example added to Wiki In-Reply-To: <3f49a9f41001090507m333594fdtcc87e50f7d476349@mail.gmail.com> References: <7F239E64-D3F4-4113-B670-489DC29FEFC6@gmail.com> <3f49a9f41001081706of82d965q2079a733614c9e73@mail.gmail.com> <002425D9-68CC-44C9-B904-B53CE6A02F95@gmail.com> <3f49a9f41001090507m333594fdtcc87e50f7d476349@mail.gmail.com> Message-ID: Understood. Sorry for my lack of background on these platforms. Do you know if llvm.eh.selector and dwarf emission for a JIT execution environment works on your platform of choice? If so, and if the unwind system conforms to http://refspecs.freestandards.org/abi-eh-1.21.html, the port will not be too bad. Beyond use of fprintf and strtoul, and an include of unwind.h, I don't believe there is much else that is specific to UNIX. These may even exist in your platform's headers. The structures/APIs defined in unwind.h are fully defined and useable as is in http://refspecs.freestandards.org/abi-eh-1.21.html. Garrison On Jan 9, 2010, at 8:07, OvermindDL1 wrote: > On Sat, Jan 9, 2010 at 5:54 AM, Garrison Venn wrote: >> If the powers at be want this, I could easily transform the source to the LLVM coding standards, and add >> the necessary portable UNIX support--someone else would have to add non-UNIX support although the >> System library probably helps with this. 
However I'm guessing the LLVM release flux of the exception system, >> along with a lack of universal platform, dwarf JIT support might be a hinderance in such an endeavor. I don't >> know what the current platform boundaries are for either JIT or JIT with dwarf emission, but I do know that >> the LLVM exception design is being reconsidered for future releases (possibly 2.7?). >> >> Also, as noted in the wiki, please see: >> >> http://code.google.com/p/tart/ and >> http://www.incasoftware.de/~kamm/projects/index.php/2008/08/19/exception-handling-in-llvmdc-using-llvm/ >> >> for real world implementations. There are many others, as can be seen in the LLVM project page. > > I am one such non-unix platform, so if the example does not work for > me, it is still as worthless as bad documentation for note... > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From anton at korobeynikov.info Sat Jan 9 09:15:29 2010 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Sat, 9 Jan 2010 18:15:29 +0300 Subject: [LLVMdev] building a llvm-arm-elf crosscompiler on OSX 10.5 In-Reply-To: <1a26b4921001081755y17c35002hf1cc50c951a178cc@mail.gmail.com> References: <1a26b4921001081755y17c35002hf1cc50c951a178cc@mail.gmail.com> Message-ID: Hello > But I got the followings errors: > > /var/folders/7f/7fiRIEm-FruFfT7mGbk3uk+++TI/-Tmp-//ccDFjySd.s: > Assembler messages: > /var/folders/7f/7fiRIEm-FruFfT7mGbk3uk+++TI/-Tmp-//ccDFjySd.s:96: Correct. You haven't specified any ARM-specific options during llvm-gcc configure (cpu type, fpu type, floating point abi, etc). This means that the defaults will be used. LLVM defaults to something like ARMv5, while binutils defaults to ARMv4. So, you just need to configure llvm-gcc properly.
Keep in mind that binutils for ARM is known to be buggy; you need to use a binutils CVS snapshot (and even that is buggy: some bugs are not yet fixed there). -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From kennethuil at gmail.com Sat Jan 9 09:28:18 2010 From: kennethuil at gmail.com (Kenneth Uildriks) Date: Sat, 9 Jan 2010 09:28:18 -0600 Subject: [LLVMdev] Two suggestions for improving LLVM error reporting In-Reply-To: <400d33ea1001090727g535a56e7p73a642503c6ba262@mail.gmail.com> References: <71057A37-D1E7-4C57-ADE2-AFBD3EF30338@apple.com> <201001090807.09538.jon@ffconsultancy.com> <954D7902-92EC-4055-AA99-9256E9CC264C@apple.com> <400d33ea1001090727g535a56e7p73a642503c6ba262@mail.gmail.com> Message-ID: <400d33ea1001090728h71bca673x5c3945df18f0d9a@mail.gmail.com> On Sat, Jan 9, 2010 at 9:27 AM, Kenneth Uildriks wrote: > On Sat, Jan 9, 2010 at 1:09 AM, Chris Lattner wrote: >> >> On Jan 9, 2010, at 12:07 AM, Jon Harrop wrote: >> >>> On Saturday 09 January 2010 06:11:03 Chris Lattner wrote: >>>> My eventual goal (perhaps for LLVM 3.0) is to eliminate our current >>>> structural type system altogether. The benefits are blown away by the >>>> costs, eventually we should just fix this design mistake. In the meantime, >>>> I'm not opposed to adding a Module* to VMCore that type dumping defaults to >>>> if non-null. >>> >>> Can you elaborate on this? I'm loving LLVM's structural types and they're >>> making my work a lot easier. >>> >>> Having to come up with names for all of the structural types in my language >>> just to satisfy LLVM's new type system would suck... >> >> There are two things I don't like about our current system: >> >> 1. Type resolution is really slow in some cases, because it has to incrementally detect when mutating type graphs become isomorphic and zap them. This is seen during bc/ll loading and particularly during module linking.
This is one of the biggest sources of LTO slowness that I'm aware of. >> >> 2. Our implementation of type resolution uses union find to lazily update Type*'s in values etc. This means that Value::getType() is not a trivial accessor that returns a pointer - it has to check to see if the pointer is forwarded, and if so, forward the pointer in the type. >> >> I'm not advocating elimination of pointer equality tests, but I am advocating the elimination of pointer equality tests for type *structure*. I want to introduce a first class "named type" type, and allow only *them* to be abstract, instead of having our current Opaque type. >> >> This means that if you have a "%foo***" and %foo gets resolved to "i32", that "%foo***" would not get implicitly unioned with "i32***". This would also eliminate the current complexity forming circular types, make many frontends simpler etc. It would mean that a cyclic type would be *required* to go through a named type though. >> >> It would also allow elimination of upreferences, PATypeHolder, PATypeHandle, etc. It would also eliminate the frequent confusion around "my function should take a %foo*, it the IR dump shows it as %bar*" (because they have the same type structure). >> >> OTOH, it would mean that we couldn't have the totally awesome and oh-so-useful \1* type ;-) Sweet! Could named types also be concrete in this scheme? From arplynn at gmail.com Sat Jan 9 10:35:29 2010 From: arplynn at gmail.com (Alastair Lynn) Date: Sat, 9 Jan 2010 16:35:29 +0000 Subject: [LLVMdev] Inlining In-Reply-To: <4B483B27.9030005@free.fr> References: <4B47A92D.4020403@laurences.net> <4B483B27.9030005@free.fr> Message-ID: Hi Duncan- Forgive my confusion, but I can't help notice that LangRef states: Globals with "linkonce" linkage are merged with other globals of the same name when linkage occurs. This is typically used to implement inline functions, templates, or other code which must be generated in each translation unit that uses it.
Unreferenced linkonce globals are allowed to be discarded. Why would linkonce be used to implement inline functions if it's not safe to inline linkonce functions? Alastair On 9 Jan 2010, at 08:15, Duncan Sands wrote: > Hi Dustin, > >> define linkonce fastcc i32 @foo(i32 %arg) alwaysinline > > linkonce implies that the function body may change at link time. Thus it would > be wrong to inline it, since the code being inlined would not be the final code. > Use linkonce_odr to tell the compiler that the function body can be replaced > only by an equivalent function body. > > Ciao, > > Duncan. > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From baldrick at free.fr Sat Jan 9 10:41:26 2010 From: baldrick at free.fr (Duncan Sands) Date: Sat, 09 Jan 2010 17:41:26 +0100 Subject: [LLVMdev] Inlining In-Reply-To: References: <4B47A92D.4020403@laurences.net> <4B483B27.9030005@free.fr> Message-ID: <4B48B1B6.3010705@free.fr> Hi Alastair, > Forgive my confusion, but I can't help notice that LangRef states: > > Globals with "linkonce" linkage are merged with other globals of the same name when linkage occurs. This is typically used to implement inline functions, templates, or other code which must be generated in each translation unit that uses it. Unreferenced linkonce globals are allowed to be discarded. > > Why would linkonce be used to implement inline functions if it's not safe to inline linkonce functions? I was wrong - linkonce is an exception to the general rule that a "weak" linkage type prevents inlining unless of the "_odr" form. Ciao, Duncan. 
From rengolin at systemcall.org Sat Jan 9 11:00:50 2010 From: rengolin at systemcall.org (Renato Golin) Date: Sat, 9 Jan 2010 17:00:50 +0000 Subject: [LLVMdev] Automatic Vectorization In-Reply-To: <352a1fb21001041000v711fbe2aq9c3d1eafbdd569d3@mail.gmail.com> References: <352a1fb21001041000v711fbe2aq9c3d1eafbdd569d3@mail.gmail.com> Message-ID: 2010/1/4 Devang Patel : > A separate VectorizationPass that requires dependence analysis is the > way to go. Hi Devang, With all the holiday break stuff I forgot about this thread. I got to a dead end... The docs [1] explain how to write a function pass and mention a loop pass as a type of function pass, but the registration doesn't work the same way; I had to do some hacking and never got it to show up in the 'opt' list. I also got to a loop in the call (the code is quite extensive and I still have limited time to dig in). I got that by writing a loop pass and putting it on the pass vector, it'd be called by the loop pass, but I couldn't figure out how to add it to the loop vector. If I understood what you are saying, I should create a separate function pass (registered the way the docs say or by adding it to LinkAllPasses, I don't know), depending on LoopDependenceAnalysis (somewhere there was a way to determine dependency, I have to check it out again), that would do the same as LoopPass (find all loops, run a vectorization loop pass on each). If that's so, why not extend LoopPass and register it directly on the vector? I imagine the dependency wouldn't be as easy but at least there would be less duplicated code... or I just didn't understand much of it... which is way more likely...
;) -- cheers, --renato [1] http://llvm.org/docs/WritingAnLLVMPass.html http://systemcall.org/ Reclaim your digital rights, eliminate DRM, learn more at http://www.defectivebydesign.org/what_is_drm From dllaurence at dslextreme.com Sat Jan 9 11:36:08 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Sat, 09 Jan 2010 09:36:08 -0800 Subject: [LLVMdev] Inlining In-Reply-To: <4B48B1B6.3010705@free.fr> References: <4B47A92D.4020403@laurences.net> <4B483B27.9030005@free.fr> <4B48B1B6.3010705@free.fr> Message-ID: <4B48BE88.5070402@laurences.net> On 01/09/2010 08:41 AM, Duncan Sands wrote: > I was wrong - linkonce is an exception to the general rule that a "weak" linkage > type prevents inlining unless of the "_odr" form. Except it really did prevent inlining in my test. If I follow, and I probably don't, what you said matched the behavior of LLVM and the docs didn't. Dustin From samuraileumas at yahoo.com Sat Jan 9 12:00:43 2010 From: samuraileumas at yahoo.com (Samuel Crow) Date: Sat, 9 Jan 2010 10:00:43 -0800 (PST) Subject: [LLVMdev] Inlining In-Reply-To: <4B485075.9070705@laurences.net> References: <4B47A92D.4020403@laurences.net> <6C4B3E2E-E084-4BB3-A0E1-27CDB3DE72BE@apple.com> <4B47BAFA.50605@laurences.net> <4B481186.7050907@mxc.ca> <4B485075.9070705@laurences.net> Message-ID: <248127.39332.qm@web62007.mail.re1.yahoo.com> Hello Dustin, Always inline is the closest to a preprocessor macro you can get in LLVM Assembly since it doesn't have a preprocessor at all. LLVM does aggressive inlining for functions used only once so those instances don't require specification as alwaysinline. --Sam ----- Original Message ---- > From: Dustin Laurence > Cc: llvmdev at cs.uiuc.edu > Sent: Sat, January 9, 2010 3:46:29 AM > Subject: Re: [LLVMdev] Inlining > > > Also, drop the alwaysinline attribute and '-always-inline' flag. The > > normal inliner (aka. "opt -inline" which is run as part of "opt -O3") > > should inline it. > > Yes, it still did after I removed them. 
Since I'm clearly not guessing > well here, when would one want to use "alwaysinline"? > > Dustin From dllaurence at dslextreme.com Sat Jan 9 12:35:33 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Sat, 09 Jan 2010 10:35:33 -0800 Subject: [LLVMdev] Inlining In-Reply-To: <248127.39332.qm@web62007.mail.re1.yahoo.com> References: <4B47A92D.4020403@laurences.net> <6C4B3E2E-E084-4BB3-A0E1-27CDB3DE72BE@apple.com> <4B47BAFA.50605@laurences.net> <4B481186.7050907@mxc.ca> <4B485075.9070705@laurences.net> <248127.39332.qm@web62007.mail.re1.yahoo.com> Message-ID: <4B48CC75.5060407@laurences.net> On 01/09/2010 10:00 AM, Samuel Crow wrote: > > Always inline is the closest to a preprocessor macro you can get in > LLVM Assembly since it doesn't have a preprocessor at all. Mine does. :-) > ...LLVM does > aggressive inlining for functions used only once so those instances > don't require specification as alwaysinline. What I'm trying to do is understand the practical use cases. Concrete example: I have some little type accessor and conversion functions that are typically two or three instructions long because all they really do is manipulate tag data in the low-order bits of pointers. (I'm not exactly innovative, am I?) While small, they are called all over the place for boxing and unboxing language-level objects. In C they would be explicitly inline. What is the LLVM equivalent? My guess is the optimizer will always inline such tiny functions no matter what as it's probably both a space and a time win, so maybe I need a different example. Suppose they were typically five, or ten, or twenty, or forty instructions long? Who is responsible for deciding on the advisability of inlining? The front-end (which in this case is actually me?)? That would be the equivalent of the C99/C++ 'inline' compiler hint. Or in LLVM is it better not to give manual compiler hints about inlining in most cases and let the optimizers decide? 
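For concreteness, the kind of tiny boxing and unboxing helpers described here might look like the following C++ sketch. It assumes a made-up scheme in which the low two bits of a word carry the tag; the names Boxed, Tag, boxFixnum, unboxFixnum, boxPointer and unboxPointer are hypothetical and are not taken from the interpreter under discussion.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tagging scheme: the low 2 bits of a word hold the tag.
typedef std::uintptr_t Boxed;
const Boxed TagMask = 0x3;
enum Tag { TagPointer = 0, TagFixnum = 1 };

// Read the tag out of the low-order bits.
inline Tag tagOf(Boxed B) { return static_cast<Tag>(B & TagMask); }

// Box a small integer by shifting it past the tag bits.
inline Boxed boxFixnum(std::intptr_t N) {
  return (static_cast<Boxed>(N) << 2) | TagFixnum;
}

// Unbox a small integer. Negative values rely on the compiler using an
// arithmetic right shift, which this sketch assumes.
inline std::intptr_t unboxFixnum(Boxed B) {
  return static_cast<std::intptr_t>(B) >> 2;
}

// Box a pointer whose low 2 bits are already zero (4-byte alignment assumed).
inline Boxed boxPointer(void *P) {
  Boxed B = reinterpret_cast<Boxed>(P);
  assert((B & TagMask) == 0 && "pointer must be at least 4-byte aligned");
  return B | TagPointer;
}

// Strip the tag bits to recover the original pointer.
inline void *unboxPointer(Boxed B) {
  return reinterpret_cast<void *>(B & ~TagMask);
}
```

Each helper is a handful of bit operations, which is why the question of who decides on inlining them comes up at all. In C++ the `inline` keyword plays the hint role the message asks about.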
I suppose it's a fuzzy question because I'm fishing for intended usage, not just semantics. Dustin From clattner at apple.com Sat Jan 9 13:15:51 2010 From: clattner at apple.com (Chris Lattner) Date: Sat, 9 Jan 2010 11:15:51 -0800 Subject: [LLVMdev] Inlining In-Reply-To: <4B48B1B6.3010705@free.fr> References: <4B47A92D.4020403@laurences.net> <4B483B27.9030005@free.fr> <4B48B1B6.3010705@free.fr> Message-ID: <38A1E1DC-63A2-4BCA-903F-560C64FFF93F@apple.com> On Jan 9, 2010, at 8:41 AM, Duncan Sands wrote: > Hi Alastair, > >> Forgive my confusion, but I can't help notice that LangRef states: >> >> Globals with "linkonce" linkage are merged with other globals of the same name when linkage occurs. This is typically used to implement inline functions, templates, or other code which must be generated in each translation unit that uses it. Unreferenced linkonce globals are allowed to be discarded. >> >> Why would linkonce be used to implement inline functions if it's not safe to inline linkonce functions? > > I was wrong - linkonce is an exception to the general rule that a "weak" linkage > type prevents inlining unless of the "_odr" form. Actually, the inliner doesn't inline linkonce either, because we have: InlineCost InlineCostAnalyzer::getInlineCost(CallSite CS, SmallPtrSet &NeverInline) { ... // Don't inline functions which can be redefined at link-time to mean // something else. Don't inline functions marked noinline. if (Callee->mayBeOverridden() || Callee->hasFnAttr(Attribute::NoInline) || NeverInline.count(Callee)) return llvm::InlineCost::getNever(); I improved the langref description of linkonce in r93066. 
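The check quoted above can be modeled in a few lines. This is a toy sketch only: ToyLinkage, ToyFunction and neverInline are hypothetical names, and the set of linkages treated as overridable (plain linkonce and weak, but not their _odr forms, internal, or the default external linkage) is an assumption drawn from the behaviour reported in this thread, not a copy of LLVM's implementation.

```cpp
#include <cassert>

// Hypothetical subset of linkage kinds discussed in this thread.
enum ToyLinkage { External, Internal, LinkOnceAny, LinkOnceODR, WeakAny, WeakODR };

struct ToyFunction {
  ToyLinkage Linkage;
  bool NoInline; // stands in for the noinline attribute
};

// Assumption: only the non-ODR linkonce and weak forms may be replaced at
// link time by a body that means something else.
inline bool mayBeOverridden(const ToyFunction &F) {
  return F.Linkage == LinkOnceAny || F.Linkage == WeakAny;
}

// Mirrors the shape of the quoted test: refuse to inline anything that may be
// redefined at link time or that is marked noinline.
inline bool neverInline(const ToyFunction &Callee) {
  return mayBeOverridden(Callee) || Callee.NoInline;
}
```

Under these assumptions a plain linkonce callee is rejected before any cost computation happens, while linkonce_odr and internal callees go on to the normal cost analysis.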
-Chris From clattner at apple.com Sat Jan 9 13:17:41 2010 From: clattner at apple.com (Chris Lattner) Date: Sat, 9 Jan 2010 11:17:41 -0800 Subject: [LLVMdev] Two suggestions for improving LLVM error reporting In-Reply-To: <400d33ea1001090728h71bca673x5c3945df18f0d9a@mail.gmail.com> References: <71057A37-D1E7-4C57-ADE2-AFBD3EF30338@apple.com> <201001090807.09538.jon@ffconsultancy.com> <954D7902-92EC-4055-AA99-9256E9CC264C@apple.com> <400d33ea1001090727g535a56e7p73a642503c6ba262@mail.gmail.com> <400d33ea1001090728h71bca673x5c3945df18f0d9a@mail.gmail.com> Message-ID: <4EE80D6F-C48D-4589-BF1D-42ACAAEFE325@apple.com> On Jan 9, 2010, at 7:28 AM, Kenneth Uildriks wrote: >>> >>> This means that if you have a "%foo***" and %foo gets resolved to "i32", that "%foo***" would not get implicitly unioned with "i32***". This would also eliminate the current complexity forming circular types, make many frontends simpler etc. It would mean that a cyclic type would be *required* to go through a named type though. >>> >>> It would also allow elimination of upreferences, PATypeHolder, PATypeHandle, etc. It would also eliminate the frequent confusion around "my function should take a %foo*, it the IR dump shows it as %bar*" (because they have the same type structure). >>> >>> OTOH, it would mean that we couldn't have the totally awesome and oh-so-useful \1* type ;-) > > Sweet! Could named types also be concrete in this scheme? 
Yep, -Chris From samuraileumas at yahoo.com Sat Jan 9 13:21:30 2010 From: samuraileumas at yahoo.com (Samuel Crow) Date: Sat, 9 Jan 2010 11:21:30 -0800 (PST) Subject: [LLVMdev] Inlining In-Reply-To: <4B48CC75.5060407@laurences.net> References: <4B47A92D.4020403@laurences.net> <6C4B3E2E-E084-4BB3-A0E1-27CDB3DE72BE@apple.com> <4B47BAFA.50605@laurences.net> <4B481186.7050907@mxc.ca> <4B485075.9070705@laurences.net> <248127.39332.qm@web62007.mail.re1.yahoo.com> <4B48CC75.5060407@laurences.net> Message-ID: <331965.31642.qm@web62005.mail.re1.yahoo.com> Hello Dustin, Alwaysinline is not a hint. It forces something inline that wouldn't have otherwise been as long as the linkage type permits it. (You just ran into a situation where linkage did not permit it.) Personally, I don't see the need for a preprocessor in most circumstances. If you need to do type substitution you can use an opaque type. The only reason for conditional compilation is if you'd need to be able to generate inline assembly for the host (which shouldn't ever be absolutely necessary in LLVM except for legacy code). One thing I wanted to do with the language we're developing is to do a custom template-like function involving always-inlines containing opaque types. It would rest heavily on the type system remaining as it is (assuming it works the way I think it works) and it seems that Chris Lattner wants to change that. Maybe it's a good thing our project is as far behind schedule as it is. I'd better do some experimenting sometime with opaque types and inlines together to see if they work as expected for producing easy macros. Anyway, sorry for drifting off-topic, --Sam ----- Original Message ---- > From: Dustin Laurence > Cc: llvmdev at cs.uiuc.edu > Sent: Sat, January 9, 2010 12:35:33 PM > Subject: Re: [LLVMdev] Inlining > > On 01/09/2010 10:00 AM, Samuel Crow wrote: > > > > Always inline is the closest to a preprocessor macro you can get in > > LLVM Assembly since it doesn't have a preprocessor at all. 
> > Mine does. :-) > > > ...LLVM does > > aggressive inlining for functions used only once so those instances > > don't require specification as alwaysinline. > > What I'm trying to do is understand the practical use cases. Concrete > example: I have some little type accessor and conversion functions that > are typically two or three instructions long because all they really do > is manipulate tag data in the low-order bits of pointers. (I'm not > exactly innovative, am I?) While small, they are called all over the > place for boxing and unboxing language-level objects. In C they would > be explicitly inline. What is the LLVM equivalent? > > My guess is the optimizer will always inline such tiny functions no > matter what as it's probably both a space and a time win, so maybe I > need a different example. Suppose they were typically five, or ten, or > twenty, or forty instructions long? Who is responsible for deciding on > the advisability of inlining? The front-end (which in this case is > actually me?)? That would be the equivalent of the C99/C++ 'inline' > compiler hint. Or in LLVM is it better not to give manual compiler > hints about inlining in most cases and let the optimizers decide? > > I suppose it's a fuzzy question because I'm fishing for intended usage, > not just semantics. 
> > Dustin > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From dllaurence at dslextreme.com Sat Jan 9 13:50:07 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Sat, 09 Jan 2010 11:50:07 -0800 Subject: [LLVMdev] Inlining In-Reply-To: <331965.31642.qm@web62005.mail.re1.yahoo.com> References: <4B47A92D.4020403@laurences.net> <6C4B3E2E-E084-4BB3-A0E1-27CDB3DE72BE@apple.com> <4B47BAFA.50605@laurences.net> <4B481186.7050907@mxc.ca> <4B485075.9070705@laurences.net> <248127.39332.qm@web62007.mail.re1.yahoo.com> <4B48CC75.5060407@laurences.net> <331965.31642.qm@web62005.mail.re1.yahoo.com> Message-ID: <4B48DDEF.4000104@laurences.net> On 01/09/2010 11:21 AM, Samuel Crow wrote: > Alwaysinline is not a hint. It forces something inline that wouldn't > have otherwise been as long as the linkage type permits it. (You > just ran into a situation where linkage did not permit it.) Understood. I am just wondering if one should generally trust the optimizer, or if it's better to manually insist on inlining functions that should obviously be inlined. My guess is for normal usage you trust the optimizer, and use alwaysinline for unusual things you know need inlining but the optimizer can't figure it out (say inlining an over-large function into a tight inner loop in your star formation hydrodynamics code)? > Personally, I don't see the need for a preprocessor in most > circumstances. I suspect that's because in spite of my funny questions your brain refused to believe that I am doing something as deranged as writing a non-trivial interpreter for a "real" language in raw IR with a text editor, and so you assumed I was doing something sane instead. 
:-) I challenge you to write LLVM IR with only Stone Knives, Bearskins, a text editor, and llvm-as (and make or anything else you like as long as it doesn't manipulate the source unless you build the tool starting with the Stone Knives) as well engineered as I can with preprocessor help (in this case, simple, brain-dead CPP because it's available and m4 is simply not to be contemplated). Seriously--I don't think it can be done, so if you do it then I'd learn a lot. In essence, I am the manual front-end. I'm not as consistent and predictable as a normal one, but I like to think I make better dinner conversation. :-) > ...If you need to do type substitution you can use an > opaque type. The only reason for conditional compilation is if you'd > need to be able to generate inline assembly for the host (which > shouldn't ever be absolutely necessary in LLVM except for legacy > code). Um...*everything* for me is the equivalent of inline IR for you. Note well that I make absolutely no claims that it is *necessary* in any way, however. :-) I admit I don't really understand opaque types yet, but they won't do what I need down here with my primitive Stone Tools. The single biggest wins were the simple ability to #include (same reason as in C: I can have separate source modules whose interfaces are type-checked) and to define constants to parametrize the code with certain choices (it's awful nice not to have things like the tagged representation of nil hard-coded a hundred places in the code, as I found out when I realized I'd made the wrong choice). Conditional compilation and parametrized macros are OK but not as vital. > One thing I wanted to do with the language we're developing is to do > a custom template-like function involving always-inlines containing > opaque types. And will that have the Turing-completeness of C++ templates? :-D My advice is not to use angle brackets.... 
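A minimal C sketch of the "define it once" point made above about constants such as the tagged representation of nil. Everything in it is an assumption for illustration: the two-bit tag layout and the names `TAG_BITS`, `TAG_MASK`, `TAG_NIL`, `NIL`, `tag_of` and `is_nil` are hypothetical and do not come from the interpreter being discussed.

```c
#include <stdint.h>

/* Hypothetical layout: the low two bits of a word hold the type tag. */
#define TAG_BITS 2u
#define TAG_MASK ((uintptr_t)((1u << TAG_BITS) - 1u))
#define TAG_NIL  ((uintptr_t)0x3)

/* The tagged representation of nil, written down in exactly one place.
   Changing the choice later means editing this line only. */
#define NIL TAG_NIL

/* Tiny accessor of the kind that would be explicitly inline in C. */
static inline uintptr_t tag_of(uintptr_t word) {
    return word & TAG_MASK;
}

/* Test against the single definition instead of a hard-coded literal. */
static inline int is_nil(uintptr_t word) {
    return word == NIL;
}
```

With the constant in a shared header, every source file that tests for nil picks up a changed representation on the next rebuild.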
Dustin From dllaurence at dslextreme.com Sat Jan 9 14:57:25 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Sat, 09 Jan 2010 12:57:25 -0800 Subject: [LLVMdev] Variable declarations vs. definitions Message-ID: <4B48EDB5.20705@laurences.net> I have yet another question that I believe also stems from deep ignorance of the linkage types. How do you declare a global variable without defining it? The IR ref. clearly indicates that you can do this, but it looks like one of the many "too obvious to mention" things that I struggle with. It's easy with functions, of course: "declare @foo" in the header and "define @foo" in the module just like in C. But it turns out I have avoided having to learn to do it with variables until now, when I decided to play with invoke/unwind and see if I could make a primitive exception mechanism to unwind the stack on my recursive parser when an error is encountered. In fact I could avoid it now, but the purpose is to learn as much of the IR as possible, not use the subset of the language I already understand. To be clear: remember I'm using Stone Knives here. :-) I have to figure out how to make a global variable defined in one translation unit visible in another (with a header, just like in C). I could easily make an exception module with accessor functions, and that is likely the better software engineering solution, but again my real goal is to learn as much as possible (I don't need exceptions in the parser, either, but I want to understand them). Dustin From eli.friedman at gmail.com Sat Jan 9 15:11:49 2010 From: eli.friedman at gmail.com (Eli Friedman) Date: Sat, 9 Jan 2010 13:11:49 -0800 Subject: [LLVMdev] Variable declarations vs. definitions In-Reply-To: <4B48EDB5.20705@laurences.net> References: <4B48EDB5.20705@laurences.net> Message-ID: On Sat, Jan 9, 2010 at 12:57 PM, Dustin Laurence wrote: > I have yet another question that I believe also stems from deep > ignorance of the linkage types. 
How do you declare a global variable
> without defining it?  The IR ref. clearly indicates that you can do
> this, but it looks like one of the many "too obvious to mention" things
> that I struggle with.  It's easy with functions, of course: "declare
> @foo" in the header and "define @foo" in the module just like in C.  But
> it turns out I have avoided having to learn to do it with variables
> until now, when I decided to play with invoke/unwind and see if I could
> make a primitive exception mechanism to unwind the stack on my recursive
> parser when an error is encountered.

The syntax isn't entirely obvious... usually, when you're wondering how to write something in IR, the easiest thing to do is write the equivalent C code, then use http://llvm.org/demo/index.cgi to see what it looks like in IR. In this case, try plugging the following snippet in:

extern int x;
int *y = &x;

-Eli

From clattner at apple.com Sat Jan 9 15:12:03 2010
From: clattner at apple.com (Chris Lattner)
Date: Sat, 9 Jan 2010 13:12:03 -0800
Subject: [LLVMdev] Variable declarations vs. definitions
In-Reply-To: <4B48EDB5.20705@laurences.net>
References: <4B48EDB5.20705@laurences.net>
Message-ID:

On Jan 9, 2010, at 12:57 PM, Dustin Laurence wrote:
> I have yet another question that I believe also stems from deep
> ignorance of the linkage types. How do you declare a global variable
> without defining it?

The equivalent of "extern int G;" is:

@G = external global i32

-Chris

From anton at korobeynikov.info Sat Jan 9 15:20:04 2010
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Sun, 10 Jan 2010 00:20:04 +0300
Subject: [LLVMdev] Variable declarations vs. definitions
In-Reply-To: <4B48EDB5.20705@laurences.net>
References: <4B48EDB5.20705@laurences.net>
Message-ID:

Hello, Dustin

> To be clear: remember I'm using Stone Knives here. :-)
In some cases it's better to realize that it's year 2010 now.
:) Just write a small snippet in C and use llvm-gcc -emit-llvm to emit the LLVM IR corresponding to it. You can use e.g. http://llvm.org/demo/ for this.

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From pazzodalegare at gmail.com Sat Jan 9 17:22:17 2010
From: pazzodalegare at gmail.com (Pazzo Da Legare)
Date: Sun, 10 Jan 2010 00:22:17 +0100
Subject: [LLVMdev] building a llvm-arm-elf crosscompiler on OSX 10.5
In-Reply-To:
References: <1a26b4921001081755y17c35002hf1cc50c951a178cc@mail.gmail.com>
Message-ID: <1a26b4921001091522k78640a2ahc212af3417df6f1a@mail.gmail.com>

Dear Anton,

Thanks for your help! I had a look at llvm (2.6) configure options but I couldn't find any way to specify cpu type, fpu, etc. Could you please give me any indication and/or example? I want to try llvm with Atmel's AT91SAM7X256 (core is ARM7TDMI).

Thank you again,
pz

2010/1/9 Anton Korobeynikov :
> Hello
>
>> But I got the followings errors:
>>
>> /var/folders/7f/7fiRIEm-FruFfT7mGbk3uk+++TI/-Tmp-//ccDFjySd.s:
>> Assembler messages:
>> /var/folders/7f/7fiRIEm-FruFfT7mGbk3uk+++TI/-Tmp-//ccDFjySd.s:96:
> Correct. You haven't specified any ARM specific stuff during llvm-gcc
> configure (cpu type, fpu type, floating point abi, etc). This means
> that default will be used. LLVM defaults to something like ARMv5,
> binutils - to ARMv4. So, you just need to configure llvm-gcc properly.
>
> Keep in mind, that binutils for ARM are known to be buggy, you need to
> use binutils CVS snapshot (and even this is buggy - some bugs are not
> yet fixed there).
>
> --
> With best regards, Anton Korobeynikov
> Faculty of Mathematics and Mechanics, Saint Petersburg State University
>

From jlerouge at apple.com Sat Jan 9 18:11:14 2010
From: jlerouge at apple.com (Julien Lerouge)
Date: Sat, 9 Jan 2010 16:11:14 -0800
Subject: [LLVMdev] [PATCH] Fix nondeterministic behaviour in the CodeExtractor
In-Reply-To:
References: <20100109010120.GB6338@pom.apple.com>
Message-ID: <20100110001114.GA22469@pom.apple.com>

On Fri, Jan 08, 2010 at 05:04:17PM -0800, Chris Lattner wrote:
> On Jan 8, 2010, at 5:01 PM, Julien Lerouge wrote:
> >Hello,
> >
> >The CodeExtractor contains a std::set to keep track
> >of the
> >blocks to extract. Iterators on this set are not deterministic, and so
> >the functions that are generated are not (the order of the
> >inputs/outputs can change).
> >
> >The attached patch uses a SetVector instead. Ok to apply ?
> Nice catch, please apply,
> -Chris

Thanks for the quick review. There is actually more; is it OK to apply this one as well (avoid std::vector being sorted)?

Thanks,
Julien

--
Julien Lerouge
PGP Key Id: 0xB1964A62
PGP Fingerprint: 392D 4BAD DB8B CE7F 4E5F FA3C 62DB 4AA7 B196 4A62
PGP Public Key from: keyserver.pgp.com
-------------- next part --------------
Index: lib/Transforms/Utils/CodeExtractor.cpp
===================================================================
--- lib/Transforms/Utils/CodeExtractor.cpp (revision 93080)
+++ lib/Transforms/Utils/CodeExtractor.cpp (working copy)
@@ -%ld,%ld +%ld,%ld @@
 namespace {
   class CodeExtractor {
-    typedef std::vector Values;
+    typedef SetVector Values;
     SetVector BlocksToExtract;
     DominatorTree* DT;
     bool AggregateArgs;
@@ -%ld,%ld +%ld,%ld @@
       // instruction is used outside the region, it's an output.
       for (User::op_iterator O = I->op_begin(), E = I->op_end(); O != E; ++O)
         if (definedInCaller(*O))
-          inputs.push_back(*O);
+          inputs.insert(*O);

       // Consider uses of this instruction (outputs).
       for (Value::use_iterator UI = I->use_begin(), E = I->use_end();
            UI != E; ++UI)
         if (!definedInRegion(*UI)) {
-          outputs.push_back(I);
+          outputs.insert(I);
           break;
         }
     } // for: insts
@@ -%ld,%ld +%ld,%ld @@
   } // for: basic blocks

   NumExitBlocks = ExitBlocks.size();
-
-  // Eliminate duplicates.
-  std::sort(inputs.begin(), inputs.end());
-  inputs.erase(std::unique(inputs.begin(), inputs.end()), inputs.end());
-  std::sort(outputs.begin(), outputs.end());
-  outputs.erase(std::unique(outputs.begin(), outputs.end()), outputs.end());
 }

 /// constructFunction - make a function based on inputs and outputs, as follows:

From anton at korobeynikov.info Sat Jan 9 18:43:30 2010
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Sun, 10 Jan 2010 03:43:30 +0300
Subject: [LLVMdev] building a llvm-arm-elf crosscompiler on OSX 10.5
In-Reply-To: <1a26b4921001091522k78640a2ahc212af3417df6f1a@mail.gmail.com>
References: <1a26b4921001081755y17c35002hf1cc50c951a178cc@mail.gmail.com> <1a26b4921001091522k78640a2ahc212af3417df6f1a@mail.gmail.com>
Message-ID:

Hello, Pazzo

> I had a look at llvm (2.6) configure options but I couldn't find any
> way to specify cpu type, fpu, etc.
These are not llvm configure options, but gcc's. Basically, you should configure llvm-gcc the same way you configure gcc for your platform.

> Could you please give me any
> indication and/or example?
I want to try llvm with Atmel's
> AT91SAM7X256 (core is ARM7TDMI)
I think adding --with-cpu=arm7tdmi to llvm-gcc configure should work

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From clattner at apple.com Sat Jan 9 19:05:41 2010
From: clattner at apple.com (Chris Lattner)
Date: Sat, 9 Jan 2010 17:05:41 -0800
Subject: [LLVMdev] [PATCH] Fix nondeterministic behaviour in the CodeExtractor
In-Reply-To: <20100110001114.GA22469@pom.apple.com>
References: <20100109010120.GB6338@pom.apple.com> <20100110001114.GA22469@pom.apple.com>
Message-ID: <9E3DE1C4-76BF-4BA9-81D5-076FAB4529D4@apple.com>

On Jan 9, 2010, at 4:11 PM, Julien Lerouge wrote:
> On Fri, Jan 08, 2010 at 05:04:17PM -0800, Chris Lattner wrote:
>
>> On Jan 8, 2010, at 5:01 PM, Julien Lerouge wrote:
>
>>> Hello,
>>>
>>> The CodeExtractor contains a std::set to keep track
>>> of the
>>> blocks to extract. Iterators on this set are not deterministic, and so
>>> the functions that are generated are not (the order of the
>>> inputs/outputs can change).
>>>
>>> The attached patch uses a SetVector instead. Ok to apply ?
>
>> Nice catch, please apply,
>
>> -Chris
>
> Thanks for the quick review. There is actually more, is it ok to apply
> this one as well (avoid std::vector being sorted) ?

Works for me, please apply.

-Chris

From dllaurence at dslextreme.com Sat Jan 9 19:48:02 2010
From: dllaurence at dslextreme.com (Dustin Laurence)
Date: Sat, 09 Jan 2010 17:48:02 -0800
Subject: [LLVMdev] Variable declarations vs. definitions
In-Reply-To:
References: <4B48EDB5.20705@laurences.net>
Message-ID: <4B4931D2.1030208@laurences.net>

On 01/09/2010 01:11 PM, Eli Friedman wrote:
> The syntax isn't entirely obvious...

Thanks, if it doesn't seem as stupid as I feel, I'll grovel less.
:-)

> ...usually, when you're wondering
> how to write something in IR, the easiest thing to do is write the
> equivalent C code, then use http://llvm.org/demo/index.cgi to see what
> it looks like in IR.

So *that's* why there is a web demo. :-) I think I've actually blocked out the C rules because I've been religiously hiding all variables behind C++ accessors for so long now. I think the last time I made a practice of declaring variables in .h files ANSI C was too newfangled to depend on compiler support. :-) In fact I wouldn't now except my goal is knowledge and I'm willing to bend design rules a bit to make sure I try new things.

That is part of the motivation behind my plan to use invoke/unwind in the parser, in fact. It'll make for nicer code, but I'm so used to propagating errors back up the stack that I'm used to the pain. If I do that well enough, anytime I want to remember how to do something I can go back and see how I did it on this project.

Dustin

From dllaurence at dslextreme.com Sat Jan 9 19:53:34 2010
From: dllaurence at dslextreme.com (Dustin Laurence)
Date: Sat, 09 Jan 2010 17:53:34 -0800
Subject: [LLVMdev] Variable declarations vs. definitions
In-Reply-To:
References: <4B48EDB5.20705@laurences.net>
Message-ID: <4B49331E.6080502@laurences.net>

On 01/09/2010 01:12 PM, Chris Lattner wrote:
> The equivalent of "extern int G;" is:
>
> @G = external global i32

OK, then I want to whine a little bit about how that is more obscurely hinted at than discussed. Whine, whine.... :-) Even knowing the word to search on, the only explicit application of the keyword to data is incidental to an example about structures. I think I feel less bad about having bounced that to the list.

I'm amazed that the list of linkage types doesn't mention it somewhere. I tried 'common' among other things, but I admit it was just a desperate shot in the dark before I gave up and asked.
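
For reference, a minimal C sketch of the declaration/definition split that the `extern int G;` line refers to. The accessor `read_G` is a hypothetical name added only to show the variable in use, and the initial value 5 is an arbitrary choice. In C the declaration and the definition may legally appear in the same translation unit, which is why a declaration in a shared header can be included by the defining file too.

```c
/* Declaration only: states that an int named G exists somewhere.
   This is the line a shared header would normally carry. */
extern int G;

/* Definition: allocates the storage and gives the initial value.
   C accepts it after the declaration in the same translation unit. */
int G = 5;

/* Hypothetical accessor, here only to show that G is usable. */
int read_G(void) {
    return G;
}
```

A file that contains only the `extern` line compiles on its own but needs some other object file to supply the definition at link time.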
Dustin From dllaurence at dslextreme.com Sat Jan 9 20:04:00 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Sat, 09 Jan 2010 18:04:00 -0800 Subject: [LLVMdev] Variable declarations vs. definitions In-Reply-To: References: <4B48EDB5.20705@laurences.net> Message-ID: <4B493590.80009@laurences.net> On 01/09/2010 01:20 PM, Anton Korobeynikov wrote: > Hello, Dustin > >> To be clear: remember I'm using Stone Knives here. :-) > In some cases it's better to realize that it's year 2010 now. :) What do you have against digital primitive living? :-) Actually, there is sort of some truth to that joking phrase. I am sort of treating this as a conceptual analog of the old days where your first task with your new 8-bit "personal computer" was to fire up the assembler and start writing your system. If people could do that I should be man enough to do something simpler like write a minimal lisp or forth in the far more congenial LLVM IR, right? > Just write small snippet in C and use llvm-gcc -emit-llvm to emit LLVM > IR corresponding to it. You can use e.g. http://llvm.org/demo/ for > this. I'll make a note to try that before giving up next time. Actually my self-imposed rules don't restrict me using any tool for understanding. It's not a cheat if I learn it well enough to apply it by hand. Kind of like writing a paper--you can (and should) steal ideas and style from Shakespeare, Isaiah, and Abraham Lincoln, you just can't steal their words. :-) In theory, I'll understand LLVM's machine model pretty well when I'm done. Dustin From clattner at apple.com Sat Jan 9 20:53:53 2010 From: clattner at apple.com (Chris Lattner) Date: Sat, 9 Jan 2010 18:53:53 -0800 Subject: [LLVMdev] Variable declarations vs. 
definitions In-Reply-To: <4B49331E.6080502@laurences.net> References: <4B48EDB5.20705@laurences.net> <4B49331E.6080502@laurences.net> Message-ID: <4CA74EA9-2CFE-4AF3-B40B-AD193862A5D0@apple.com> On Jan 9, 2010, at 5:53 PM, Dustin Laurence wrote: > On 01/09/2010 01:12 PM, Chris Lattner wrote: > >> The equivalent of "extern int G;" is: >> >> @G = external global i32 > > OK, then I want to whine a little bit about how that is more obscurely > hinted at than discussed. Whine, whine.... :-) Patches welcome! -Chris From dllaurence at dslextreme.com Sat Jan 9 21:13:06 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Sat, 09 Jan 2010 19:13:06 -0800 Subject: [LLVMdev] Variable declarations vs. definitions In-Reply-To: <4CA74EA9-2CFE-4AF3-B40B-AD193862A5D0@apple.com> References: <4B48EDB5.20705@laurences.net> <4B49331E.6080502@laurences.net> <4CA74EA9-2CFE-4AF3-B40B-AD193862A5D0@apple.com> Message-ID: <4B4945C2.4020606@laurences.net> On 01/09/2010 06:53 PM, Chris Lattner wrote: > Patches welcome! Well, the time I'm motivated to write about something is when I'm learning and it's relevant to me. But the IR Reference is a reference, and should be rigorously correct. I can't be that, most of all on this linkage stuff. For example, the 'external' keyword almost doesn't appear in the manual, but how and where precisely it should be described I probably don't know enough to say. Does it count as a linkage specification? The single thing that would make the IR reference more useful would be more examples, since I often couldn't quite reverse engineer the details from the description and there was no example, or the given example didn't cover my use case. If you are willing to devote more space to expanding the example snippets, perhaps that's doable for an amateur. There certainly are plenty in my source tree. If I'm the only guy on the planet writing extensive IR by hand, perhaps the issues I want to resolve are uncommon ones. 
Who else worries about how to declare global variables in preprocessed header files? :-) If you're machine-generating source, I imagine you don't need header files. You just emit the declarations as many times as you need them. As I am not a machine, I need consistency and "once and only once" definitions.

Dustin

From dllaurence at dslextreme.com Sat Jan 9 23:31:12 2010
From: dllaurence at dslextreme.com (Dustin Laurence)
Date: Sat, 09 Jan 2010 21:31:12 -0800
Subject: [LLVMdev] Variable declarations vs. definitions
In-Reply-To:
References: <4B48EDB5.20705@laurences.net>
Message-ID: <4B496620.1080006@laurences.net>

On 01/09/2010 01:12 PM, Chris Lattner wrote:
> The equivalent of "extern int G;" is:
>
> @G = external global i32

Hmm. Is it really? This

@foo = external global i32
@foo = global i32 5

define i32 @main(i32 %argc, i8 **%argv) {
    %fooVal = load i32* @foo
    ret i32 %fooVal
}

produces a "redefinition of global '@foo'" error. But this

extern int x;
int x = 5;
int *y = &x;

compiles to

@x = global i32 5 ; [#uses=1]
@y = global i32* @x ; [#uses=0]

The difference is crucial, because I want to put

"@foo = external global i32"

in a header file that is then #included in every module where it is used, *including the defining module* (for a consistency check, and because otherwise I'd have to create extra headers that are only #included by the outside world but not the defining module). It appears that the front end is supposed to decide between external and global. In my case I actually could maintain all declarations by hand for one global word used to return exception information, but that wouldn't work for a more involved case.

Is there no way to get the same effect as with define/declare for functions? There I have no problem #including a declaration into the same file as a definition.

The alternative appears to be asking the linker to do it.
The docs for "linkonce" say "This is typically used to implement inline functions, templates, or other code which must be generated in each translation unit that uses it." That sounds like my case, and it compiles, but I don't know if that's going to get me into trouble.

Dustin

From clattner at apple.com Sun Jan 10 00:46:14 2010
From: clattner at apple.com (Chris Lattner)
Date: Sat, 9 Jan 2010 22:46:14 -0800
Subject: [LLVMdev] Variable declarations vs. definitions
In-Reply-To: <4B496620.1080006@laurences.net>
References: <4B48EDB5.20705@laurences.net> <4B496620.1080006@laurences.net>
Message-ID:

On Jan 9, 2010, at 9:31 PM, Dustin Laurence wrote:
> On 01/09/2010 01:12 PM, Chris Lattner wrote:
>
>> The equivalent of "extern int G;" is:
>>
>> @G = external global i32
>
> Hmm. Is it really?

Yes.

> But this
> extern int x;
>
> int x = 5;

The equivalent of that is:

@x = global i32 5

I made no claim that 'external' in LLVM has the same semantics as 'extern' in C.

> The difference is crucial, because I want to put
>
> "@foo = external global i32"
>
> in a header file that is then #included in every module where it is
> used, *including the defining module* (for a consistency check, and

LLVM IR is not C, and it is not designed for #includes or other related horrible C constructs.

-Chris

From gvenn.cfe.dev at gmail.com Sun Jan 10 07:09:22 2010
From: gvenn.cfe.dev at gmail.com (Garrison Venn)
Date: Sun, 10 Jan 2010 08:09:22 -0500
Subject: [LLVMdev] From OS X to LINUX
Message-ID:

For close to the last decade or so, I've been developing on OS X and then porting to LINUX. I know there are those who object to this approach, but it works for me. However I noticed that when porting my exception example to LINUX, which involved adding a whopping -rdynamic to the build line, there were technologies missing from my gcc and LINUX installation as compared to what is on OS X 10.6.2.
For example, when building LLVM on LINUX, I noticed that ffi and atomic builtins were missing from LINUX and gcc respectively. My question is: What are the minimal packages that are recommended that would bring my LINUX distribution as close as possible to a 32 bit par version of what is on a clean OS X 10.6.2 development environment when developing with LLVM? I'm currently using a CentOS dist., but I'm up for another if that is preferred. On LINUX uname -srvmpio gives: Linux 2.6.18-164.10.1.el5 #1 SMP Thu Jan 7 20:00:41 EST 2010 i686 i686 i386 GNU/Linux, while gcc -v gives: gcc version 4.1.2 20080704 (Red Hat 4.1.2-46) I'm assuming the LLVM build problem with: GCC 4.1.2 20071124 (Red Hat 4.1.2-42), (from the LLVM getting started guide) has disappeared with my release of gcc. Thanks in advance Garrison From mmuller at enduden.com Sun Jan 10 08:58:29 2010 From: mmuller at enduden.com (Michael Muller) Date: Sun, 10 Jan 2010 09:58:29 -0500 Subject: [LLVMdev] Using a function from another module References: <16528.1262998152.232706.1794962677@succubus> Message-ID: <16528.1263135509.629890.1828059212@succubus> Michael Muller wrote: > > Hi all, > > I'm trying to use a function defined in one LLVM module from another module > (in the JIT) but for some reason it's not working out. My sequence of > activity is roughly like this: > > 1) Create moduleA > 2) Create moduleB with "func()" > 3) execEng = ExecutionEngine::create( > new ExistingModuleProvider(moduleB)); > 4) execute "func()" (this works fine) > 4) add "func()" to moduleA as a declaration (no code blocks) with External > linkage. > 5) execEng->addModuleProvider(new ExistingModuleProvider(moduleA)); > 6) run a function in moduleA that calls "func()" > > I get: > LLVM ERROR: Program used external function 'func' which could not be resolved! > > I'm guessing I'm either going about this wrong or missing something. Can > anyone offer me some insight? I've played around with this some more. 
It looks like the only way that I can get this to work is to do an ExecutionEngine::addGlobalMapping() on the function declaration in moduleA to map it to the function pointer in moduleB.

This seems awkward, is there a better way to do this?

=============================================================================
michaelMuller = mmuller at enduden.com | http://www.mindhog.net/~mmuller
-----------------------------------------------------------------------------
you and I are only different in our minds, the universe makes no such distinction
=============================================================================

From kennethuil at gmail.com Sun Jan 10 09:31:56 2010
From: kennethuil at gmail.com (Kenneth Uildriks)
Date: Sun, 10 Jan 2010 09:31:56 -0600
Subject: [LLVMdev] Using a function from another module
In-Reply-To: <16528.1263135509.629890.1828059212@succubus>
References: <16528.1262998152.232706.1794962677@succubus> <16528.1263135509.629890.1828059212@succubus>
Message-ID: <400d33ea1001100731s4b61e6bv4bffacbdb34ec926@mail.gmail.com>

On Sun, Jan 10, 2010 at 8:58 AM, Michael Muller wrote:
>
> Michael Muller wrote:
>>
>> Hi all,
>>
>> I'm trying to use a function defined in one LLVM module from another module
>> (in the JIT) but for some reason it's not working out.  My sequence of
>> activity is roughly like this:
>>
>>   1) Create moduleA
>>   2) Create moduleB with "func()"
>>   3) execEng = ExecutionEngine::create(
>>          new ExistingModuleProvider(moduleB));
>>   4) execute "func()" (this works fine)
>>   4) add "func()" to moduleA as a declaration (no code blocks) with External
>>      linkage.
>>   5) execEng->addModuleProvider(new ExistingModuleProvider(moduleA));
>>   6) run a function in moduleA that calls "func()"
>>
>> I get:
>>   LLVM ERROR: Program used external function 'func' which could not be resolved!
>>
>> I'm guessing I'm either going about this wrong or missing something.  Can
>> anyone offer me some insight?
>
> I've played around with this some more.
>
> It looks like the only way that I can get this to work is to do an
> ExecutionEngine::addGlobalMapping() on the function declaration in moduleA to
> map it to the function pointer in moduleB.
>
> This seems awkward, is there a better way to do this?
>

I'm doing the same thing, and had to do it in the same way. Just because the JIT loads two modules doesn't mean that they're automatically linked together within the JIT... one module cannot call functions in the other unless the external functions are declared and explicitly mapped using addGlobalMapping.

I'm guessing it's meant to be that way.

From dllaurence at dslextreme.com Sun Jan 10 10:55:01 2010
From: dllaurence at dslextreme.com (Dustin Laurence)
Date: Sun, 10 Jan 2010 08:55:01 -0800
Subject: [LLVMdev] Variable declarations vs. definitions
In-Reply-To:
References: <4B48EDB5.20705@laurences.net> <4B496620.1080006@laurences.net>
Message-ID: <4B4A0665.708@laurences.net>

On 01/09/2010 10:46 PM, Chris Lattner wrote:
>> Hmm. Is it really?
>
> Yes.

I guess we have different ideas of what "equivalent" means. The behavior suggests that "external" is the LLVM construct a front-end would use to implement C-type "extern," which is precisely the kind of knowledge I am after but not how I would use "equivalent."
But too many years of thinking about stuff with names like "diffeomorphism" may have somewhat altered my usage of "equivalent" from the mainstream and toward the mathematical. :-)

> I made no claim that 'external' in LLVM has the same semantics as
> 'extern' in C.

And the semantics available in LLVM is the sort of knowledge I seek. Apparently, the program is working. :-)

> LLVM IR is not C, and it is not designed for #includes or other
> related horrible C constructs.

Well, there is no question about the horribleness of having to maintain parallel declarations in a different file from your definitions, or any of the other hideous consequences of pure textual manipulation by a tool without syntactic knowledge. But the problem from the standpoint of the problem I chose to solve is that there is *no* solution within LLVM. As bad as CPP is, not having it is far worse unless you do the Right Thing and extend C to make it unnecessary. The analogous situation is that I work with what LLVM itself provides, and therefore as bad as #include is, the alternative would be to have hand-maintained parallel declarations of a function in every file that calls it.

Note that this is *not* intended as a criticism of LLVM. I am perfectly aware that it was not designed for hand-coding and expects the front-end to implement whatever tools are necessary. A front-end could generate such declarations easily, for example.

So I seem to have achieved what I sought--knowledge. I did not know before where the semantics of C's "extern" were implemented. Now I do--they are implemented by the front-end, which is responsible for deciding in which module to emit the definition instead of the declaration.

In fact, much of the fun in life comes from abusing tools for uses they were never intended for. :-) So far, I still think I am learning more about LLVM IR by doing this than I would by any other means. I think I understand by reading, but very often I do *not* until I actually try to use it.
As in this case. But I did suggest one preprocessor-free alternative for this particular case: just stick the definition in every source file that needs it and rely on the linker to merge all references. The docs suggest that is what "linkonce" is for, but I have already learned I know nothing about linkage. I can't tell the practical usage difference between linkonce and any of the other linkages that merge definitions. Which, therefore, is another opportunity to learn. Which, if any, is most appropriate for this sort of thing? The goal is learning the idiomatic means, not just finding a workable kludge. Dustin From edwintorok at gmail.com Sun Jan 10 11:02:56 2010 From: edwintorok at gmail.com (=?ISO-8859-1?Q?T=F6r=F6k_Edwin?=) Date: Sun, 10 Jan 2010 19:02:56 +0200 Subject: [LLVMdev] From OS X to LINUX In-Reply-To: References: Message-ID: <4B4A0840.8040208@gmail.com> On 01/10/2010 03:09 PM, Garrison Venn wrote: > For close to the last decade or so, I've been developing on OS X and then porting to LINUX. I know there are those who object > to this approach, but it works for me. However I noticed that when porting my exception example to LINUX, which involved > adding a whopping -rdynamic to the build line, there were technologies missing from my gcc and LINUX installation as compared > to what is on OS X 10.6.2. For example, when building LLVM on LINUX, I noticed that ffi and atomic builtins were missing from LINUX and gcc > respectively. > > My question is: What are the minimal packages that are recommended that would bring my LINUX distribution as close as possible > to a 32 bit par version of what is on a clean OS X 10.6.2 development environment when developing with LLVM? I'm currently using a > CentOS dist., but I'm up for another if that is preferred. 
>
> On LINUX uname -srvmpio gives: Linux 2.6.18-164.10.1.el5 #1 SMP Thu Jan 7 20:00:41 EST 2010 i686 i686 i386 GNU/Linux,
> while gcc -v gives: gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)
>

That compiler is rather old. I think there are gcc 4.4 packages for centos.

If you want a recent compiler with a recent userland then centos is not a good choice, they are server releases, and stay with old versions of packages for a long time. Something like Debian unstable, or Fedora might be a better choice.

Best regards,
--Edwin

From gvenn.cfe.dev at gmail.com Sun Jan 10 11:19:16 2010
From: gvenn.cfe.dev at gmail.com (Garrison Venn)
Date: Sun, 10 Jan 2010 12:19:16 -0500
Subject: [LLVMdev] From OS X to LINUX
In-Reply-To: <4B4A0840.8040208@gmail.com>
References: <4B4A0840.8040208@gmail.com>
Message-ID:

Yeah, ok, that explains why I am on CentOS. Thanks Edwin.

Garrison

On Jan 10, 2010, at 12:02, Török Edwin wrote:
> On 01/10/2010 03:09 PM, Garrison Venn wrote:
>> For close to the last decade or so, I've been developing on OS X and then porting to LINUX. I know there are those who object
>> to this approach, but it works for me. However I noticed that when porting my exception example to LINUX, which involved
>> adding a whopping -rdynamic to the build line, there were technologies missing from my gcc and LINUX installation as compared
>> to what is on OS X 10.6.2. For example, when building LLVM on LINUX, I noticed that ffi and atomic builtins were missing from LINUX and gcc
>> respectively.
>>
>> My question is: What are the minimal packages that are recommended that would bring my LINUX distribution as close as possible
>> to a 32 bit par version of what is on a clean OS X 10.6.2 development environment when developing with LLVM? I'm currently using a
>> CentOS dist., but I'm up for another if that is preferred.
>> >> On LINUX uname -srvmpio gives: Linux 2.6.18-164.10.1.el5 #1 SMP Thu Jan 7 20:00:41 EST 2010 i686 i686 i386 GNU/Linux, >> while gcc -v gives: gcc version 4.1.2 20080704 (Red Hat 4.1.2-46) >> > > That compiler is rather old. I think there are gcc 4.4 packages for centos. > > If you want a recent compiler with a recent userland then centos is not > a good choice, > they are server releases, and stay with old versions of packages for a > long time. > Something like Debian unstable, or Fedora might be a better choice. > > Best regards, > --Edwin > From aaronngray.lists at googlemail.com Sun Jan 10 12:00:34 2010 From: aaronngray.lists at googlemail.com (Aaron Gray) Date: Sun, 10 Jan 2010 18:00:34 +0000 Subject: [LLVMdev] Cygwin llvm-gcc regression In-Reply-To: <9719867c1001100848i2b47236arb9d2f4123d215547@mail.gmail.com> References: <9719867c1001081510t75df73aek2b4b4e14829150fe@mail.gmail.com> <4B483776.5010904@free.fr> <9719867c1001090748v489013e6mda5a14fe7e28f53a@mail.gmail.com> <4B48B596.9000405@free.fr> <9719867c1001091011q2fa55428m1d1ac05828956964@mail.gmail.com> <4B49AB2E.30809@free.fr> <9719867c1001100848i2b47236arb9d2f4123d215547@mail.gmail.com> Message-ID: <9719867c1001101000s266a2f16j7f28e0c469d77b06@mail.gmail.com> 2010/1/10 Aaron Gray > 2010/1/10 Duncan Sands > > Hi Aaron, >> >> >> Thanks, okay heres the results :- >>> >>> LLVM type size doesn't match GCC type size! >>> >>> >> 0x7ff010e0 type constant invariant >>> 96> >>> unit size >> unsigned int> constant invariant 12> >>> align 32 symtab 0 alias set -1 precision 80 >>> pointer_to_this > >>> >> >> as I thought, it's a problem with long double. GCC thinks it is 12 bytes >> long, LLVM presumably thinks it is 16 bytes long [in reality long double >> is 10 bytes long, but here sizes include alignment, and different OS's >> choose different alignments for it]. >> >> Does the following fix it for you? 
>> >> Index: X86Subtarget.h >> =================================================================== >> --- X86Subtarget.h (revision 93111) >> +++ X86Subtarget.h (working copy) >> @@ -169,7 +169,7 @@ >> p = "e-p:64:64-s:64-f64:64:64-i64:64:64-f80:128:128-n8:16:32:64"; >> else if (isTargetDarwin()) >> p = "e-p:32:32-f64:32:64-i64:32:64-f80:128:128-n8:16:32"; >> - else if (isTargetCygMing() || isTargetWindows()) >> + else if (isTargetWindows()) >> p = "e-p:32:32-f64:64:64-i64:64:64-f80:128:128-n8:16:32"; >> else >> p = "e-p:32:32-f64:32:64-i64:32:64-f80:32:32-n8:16:32"; >> > > Yep ! Thanks a lot Duncan, obviously not doing my job properly. > > http://llvm.org/viewvc/llvm-project?view=rev&revision=91745 > * > * > "Bump alignment requirements for windows targets to achieve compartibility > with vcpp. Based on patch by Michael Beck!" > > Looks like there is an issue to resolve here. As Cygwin will never be VC++ > compatible its probably not a good idea to change Cygwin's DataLayout as > well as Ming's. > Now, I am getting another regression in stage 2 configure :- "checking whether the C compiler works... configure: error: cannot run C compiled programs." This one is strange, I have had it before but cannot remember what the cause is as it was along time ago. Its odd as xgcc was used to compile libgcc, and prev-gcc/xgcc compiles and executes a test c program, yet stage 2's configure is failing. Aaron -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100110/75ee0ef9/attachment.html From gvenn.cfe.dev at gmail.com Sun Jan 10 12:38:14 2010 From: gvenn.cfe.dev at gmail.com (Garrison Venn) Date: Sun, 10 Jan 2010 13:38:14 -0500 Subject: [LLVMdev] Using a function from another module In-Reply-To: <400d33ea1001100731s4b61e6bv4bffacbdb34ec926@mail.gmail.com> References: <16528.1262998152.232706.1794962677@succubus> <16528.1263135509.629890.1828059212@succubus> <400d33ea1001100731s4b61e6bv4bffacbdb34ec926@mail.gmail.com> Message-ID: <4455A7C8-6212-49F9-8F50-5178371199D1@gmail.com> Won't passing llvm::Function* around vs strings (function names), also work, at code generation time, without the need for a module A dec to module B impl. mapping? Garrison On Jan 10, 2010, at 10:31, Kenneth Uildriks wrote: > On Sun, Jan 10, 2010 at 8:58 AM, Michael Muller wrote: >> >> Michael Muller wrote: >>> >>> Hi all, >>> >>> I'm trying to use a function defined in one LLVM module from another module >>> (in the JIT) but for some reason it's not working out. My sequence of >>> activity is roughly like this: >>> >>> 1) Create moduleA >>> 2) Create moduleB with "func()" >>> 3) execEng = ExecutionEngine::create( >>> new ExistingModuleProvider(moduleB)); >>> 4) execute "func()" (this works fine) >>> 4) add "func()" to moduleA as a declaration (no code blocks) with External >>> linkage. >>> 5) execEng->addModuleProvider(new ExistingModuleProvider(moduleA)); >>> 6) run a function in moduleA that calls "func()" >>> >>> I get: >>> LLVM ERROR: Program used external function 'func' which could not be resolved! >>> >>> I'm guessing I'm either going about this wrong or missing something. Can >>> anyone offer me some insight? >> >> I've played around with this some more. >> >> It looks like the only way that I can get this to work is to do an >> ExecutionEngine::addGlobalMapping() on the function declaration in moduleA to >> map it to the function pointer in moduleB. 
>> >> This seems awkward, is there a better way to do this? >> > > I'm doing the same thing, and had to do it in the same way. > > Just because the JIT loads two modules doesn't mean that they're > automatically linked together within the JIT... one module cannot call > functions in the other unless the external functions are declared and > explicitly mapped using addGlobalMapping. I'm guessing it's meant to > be that way. > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From kennethuil at gmail.com Sun Jan 10 13:02:00 2010 From: kennethuil at gmail.com (Kenneth Uildriks) Date: Sun, 10 Jan 2010 13:02:00 -0600 Subject: [LLVMdev] Using a function from another module In-Reply-To: <4455A7C8-6212-49F9-8F50-5178371199D1@gmail.com> References: <16528.1262998152.232706.1794962677@succubus> <16528.1263135509.629890.1828059212@succubus> <400d33ea1001100731s4b61e6bv4bffacbdb34ec926@mail.gmail.com> <4455A7C8-6212-49F9-8F50-5178371199D1@gmail.com> Message-ID: <400d33ea1001101102w2d411f9ay5adc2ef36c82b6f0@mail.gmail.com> On Sun, Jan 10, 2010 at 12:38 PM, Garrison Venn wrote: > Won't passing llvm::Function* around vs strings (function names), also work, at code generation time, > without the need for a module A dec to module B impl. mapping? > > Garrison Nope. You cannot place a call instruction into one module whose callee is a Function from another module. You have to put a declaration into the same module, and have your call instruction call that. And then they need to be linked together, either by llvm-link or (if JITting) by addGlobalMapping. 
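The point made above, that a declaration in one module is not tied to a definition in another until someone maps it explicitly, can be pictured with a small self-contained model. This is only a sketch under loud assumptions: ToyEngine, mapDeclaration, resolve, funcFromModuleB and callFromModuleA are hypothetical names invented for illustration, not LLVM classes or functions. mapDeclaration merely plays the role that the thread assigns to ExecutionEngine::addGlobalMapping, and the failure path loosely mirrors the "could not be resolved" error quoted earlier.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Toy model only: none of these names are LLVM classes or functions.
using NativeFn = int (*)();

// Stands in for the compiled body of "func()" that lives in module B.
int funcFromModuleB() { return 42; }

class ToyEngine {
public:
    // Plays the role the thread assigns to addGlobalMapping:
    // tie a declaration (identified here by name) to a concrete address.
    void mapDeclaration(const std::string &declName, NativeFn addr) {
        assert(addr != nullptr && "a mapping needs a real address");
        mappings_[declName] = addr;
    }

    // Look up the address for a declaration; fail loudly if nobody mapped it.
    NativeFn resolve(const std::string &declName) const {
        auto it = mappings_.find(declName);
        if (it == mappings_.end())
            throw std::runtime_error("external function '" + declName +
                                     "' could not be resolved");
        return it->second;
    }

private:
    std::map<std::string, NativeFn> mappings_;
};

// Module A's call site: it only knows the declaration's name.
int callFromModuleA(const ToyEngine &engine) {
    return engine.resolve("func")();
}
```

In this model, calling callFromModuleA before mapDeclaration("func", &funcFromModuleB) throws, and calling it afterwards reaches the module B body. Loading both modules into the same engine is not enough on its own; the explicit mapping step is what connects them.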
From mmuller at enduden.com Sun Jan 10 13:23:48 2010 From: mmuller at enduden.com (Michael Muller) Date: Sun, 10 Jan 2010 14:23:48 -0500 Subject: [LLVMdev] Using a function from another module References: <400d33ea1001101102w2d411f9ay5adc2ef36c82b6f0@mail.gmail.com> Message-ID: <16528.1263151428.646196.74068491@succubus> Kenneth Uildriks wrote: > On Sun, Jan 10, 2010 at 12:38 PM, Garrison Venn wrote: > > Won't passing llvm::Function* around vs strings (function names), also work, at code generation time, > > without the need for a module A dec to module B impl. mapping? > > > > Garrison > > Nope. You cannot place a call instruction into one module whose > callee is a Function from another module. You have to put a > declaration into the same module, and have your call instruction call > that. And then they need to be linked together, either by llvm-link > or (if JITting) by addGlobalMapping. > Actually, this is the first thing I tried, and the correct function does seem to get called - but it looks like the Verifier complains about it, which leads me to believe that there may be broader issues involved. ============================================================================= michaelMuller = mmuller at enduden.com | http://www.mindhog.net/~mmuller ----------------------------------------------------------------------------- Scnozwangers? Vermicious Knids? What kind of rubbish is that? - Mr. 
Salt, "Willy Wonka and the Chocolate Factory" ============================================================================= From pazzodalegare at gmail.com Sun Jan 10 14:29:11 2010 From: pazzodalegare at gmail.com (Pazzo Da Legare) Date: Sun, 10 Jan 2010 21:29:11 +0100 Subject: [LLVMdev] building a llvm-arm-elf crosscompiler on OSX 10.5 In-Reply-To: References: <1a26b4921001081755y17c35002hf1cc50c951a178cc@mail.gmail.com> <1a26b4921001091522k78640a2ahc212af3417df6f1a@mail.gmail.com> Message-ID: <1a26b4921001101229p3237c5f9i467d888fce05780c@mail.gmail.com> Dear Anton, Thank you again for your help! I tried with the following options (adding --with-cpu=arm7tdmi and using binutils from cvs snapshot): ../llvm-gcc4.2-2.6.source/configure --prefix=/usr/local/cross-llvm-gcc-arm-elf-4.2-2.6 --program-prefix=llvm- --enable-llvm=/Users/dummy/Develop/llvm/llvm-build --enable-languages=c,c++ --host=i686-apple-darwin9 --build=i686-apple-darwin9 --target=arm-elf --with-cpu=arm7tdmi --with-gxx-include-dir=/usr/include/c++/4.0.0 --enable-interwork --enable-multilib --with-newlib --with-header=../newlib-1.18.0/newlib/libc/include ...but after a while I got same errors: /Users/dummy/Develop/llvm/llvm-gcc-build/./gcc/xgcc -B/Users/dummy/Develop/llvm/llvm-gcc-build/./gcc/ -B/usr/local/cross-llvm-gcc-arm-elf-4.2-2.6/arm-elf/bin/ -B/usr/local/cross-llvm-gcc-arm-elf-4.2-2.6/arm-elf/lib/ -isystem /usr/local/cross-llvm-gcc-arm-elf-4.2-2.6/arm-elf/include -isystem /usr/local/cross-llvm-gcc-arm-elf-4.2-2.6/arm-elf/sys-include -O2 -O2 -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fno-inline -g -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -Dinhibit_libc -I. -I. -I../../llvm-gcc4.2-2.6.source/gcc -I../../llvm-gcc4.2-2.6.source/gcc/. 
-I../../llvm-gcc4.2-2.6.source/gcc/../include -I./../intl -I../../llvm-gcc4.2-2.6.source/gcc/../libcpp/include -I../../llvm-gcc4.2-2.6.source/gcc/../libdecnumber -I../libdecnumber -I/Users/dummy/Develop/llvm/llvm-build/include -I/Users/dummy/Develop/llvm/llvm-2.6/include -mthumb -fexceptions -c ../../llvm-gcc4.2-2.6.source/gcc/unwind-dw2-fde.c -o libgcc/thumb/unwind-dw2-fde.o /var/folders/7f/7fiRIEm-FruFfT7mGbk3uk+++TI/-Tmp-//ccEIiJQ4.s: Assembler messages: /var/folders/7f/7fiRIEm-FruFfT7mGbk3uk+++TI/-Tmp-//ccEIiJQ4.s:96: Error: selected processor does not support `sxtb r5,r5' /var/folders/7f/7fiRIEm-FruFfT7mGbk3uk+++TI/-Tmp-//ccEIiJQ4.s:537: Error: selected processor does not support `sxtb r6,r6' /var/folders/7f/7fiRIEm-FruFfT7mGbk3uk+++TI/-Tmp-//ccEIiJQ4.s:705: Error: selected processor does not support `sxtb r1,r1' /var/folders/7f/7fiRIEm-FruFfT7mGbk3uk+++TI/-Tmp-//ccEIiJQ4.s:711: Error: selected processor does not support `sxtb r1,r1' make[3]: *** [libgcc/thumb/unwind-dw2-fde.o] Error 1 make[2]: *** [stmp-multilib] Error 2 make[1]: *** [all-gcc] Error 2 make: *** [all] Error 2 Any clue? Thank in advance for help. pz From anton at korobeynikov.info Sun Jan 10 14:50:37 2010 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Sun, 10 Jan 2010 23:50:37 +0300 Subject: [LLVMdev] building a llvm-arm-elf crosscompiler on OSX 10.5 In-Reply-To: <1a26b4921001101229p3237c5f9i467d888fce05780c@mail.gmail.com> References: <1a26b4921001081755y17c35002hf1cc50c951a178cc@mail.gmail.com> <1a26b4921001091522k78640a2ahc212af3417df6f1a@mail.gmail.com> <1a26b4921001101229p3237c5f9i467d888fce05780c@mail.gmail.com> Message-ID: Hello, Pazzo > Any clue? Yes. Sorry, my fault - next time I should check ARM docs before replying. ARM7TDMI is ARMv4T and this is not supported by LLVM (LLVM does v5+ codegen). 
-- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From dllaurence at dslextreme.com Sun Jan 10 15:15:16 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Sun, 10 Jan 2010 13:15:16 -0800 Subject: [LLVMdev] LangRef 'external' patch Message-ID: <4B4A4364.3030402@laurences.net> Here is a patch for LangRef.html that adds a section for 'external' linkage. It probably needs love from someone with more knowledge, but perhaps the patch will motivate that person to improve it. Dustin -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: LangRef.patch Url: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100110/7c12fc65/attachment.pl From pazzodalegare at gmail.com Sun Jan 10 17:17:36 2010 From: pazzodalegare at gmail.com (Pazzo Da Legare) Date: Mon, 11 Jan 2010 00:17:36 +0100 Subject: [LLVMdev] building a llvm-arm-elf crosscompiler on OSX 10.5 In-Reply-To: References: <1a26b4921001081755y17c35002hf1cc50c951a178cc@mail.gmail.com> <1a26b4921001091522k78640a2ahc212af3417df6f1a@mail.gmail.com> <1a26b4921001101229p3237c5f9i467d888fce05780c@mail.gmail.com> Message-ID: <1a26b4921001101517h3453044fm649a68534842a40a@mail.gmail.com> Dear ML, Anton, Thank you for your answer and your help. I had a look at ARM.td of LLVM 2.6 (in lib/Target/ARM..) where I found following definitions: // V4T Processors. def : ProcNoItin<"arm7tdmi", [ArchV4T]>; def : ProcNoItin<"arm7tdmi-s", [ArchV4T]>; def : ProcNoItin<"arm710t", [ArchV4T]>; def : ProcNoItin<"arm720t", [ArchV4T]>; def : ProcNoItin<"arm9", [ArchV4T]>; def : ProcNoItin<"arm9tdmi", [ArchV4T]>; def : ProcNoItin<"arm920", [ArchV4T]>; def : ProcNoItin<"arm920t", [ArchV4T]>; def : ProcNoItin<"arm922t", [ArchV4T]>; def : ProcNoItin<"arm940t", [ArchV4T]>; def : ProcNoItin<"ep9312", [ArchV4T]>; I would like to understand if LLVM can be used for ArchV4T or not. 
Could you please indicate specific documentation for llvm ARM codegen? Does anybody use llvm with arm7tdmi ucontroller (e.g. at91sam7xxx) Thank you again for your help, pz 2010/1/10 Anton Korobeynikov : > Hello, Pazzo > >> Any clue? > Yes. Sorry, my fault - next time I should check ARM docs before replying. > ARM7TDMI is ARMv4T and this is not supported by LLVM (LLVM does v5+ codegen). > From felipe.lessa at gmail.com Sun Jan 10 20:04:51 2010 From: felipe.lessa at gmail.com (Felipe Lessa) Date: Mon, 11 Jan 2010 00:04:51 -0200 Subject: [LLVMdev] LangRef 'external' patch In-Reply-To: <4B4A4364.3030402@laurences.net> References: <4B4A4364.3030402@laurences.net> Message-ID: <20100111020451.GA4398@kira.casa> On Sun, Jan 10, 2010 at 01:15:16PM -0800, Dustin Laurence wrote: > + > +
> +
> +; file foo.ll
> +    @foo = global i32 0
> +...
> +; file usesfoo.ll
> +    @foo = external global i32; resolves to @foo in foo.ll
> +
> +
> + I'm sorry if this is a dumb comment, but isn't that on the 'global i32' line without a corresponding ? Cheers, -- Felipe. From guh at boisestate.edu Sun Jan 10 22:39:21 2010 From: guh at boisestate.edu (Gang-Ryung Uh) Date: Sun, 10 Jan 2010 21:39:21 -0700 Subject: [LLVMdev] Debugging LLVM opt pass Message-ID: *What would be the recommended way to debug LLVM opt pass? Is there any way to perform source level debugging on a particular opt pass? * *-- UGR* -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100110/e250208a/attachment.html From gyounghwakim at gmail.com Mon Jan 11 02:21:30 2010 From: gyounghwakim at gmail.com (Gyounghwa Kim) Date: Mon, 11 Jan 2010 17:21:30 +0900 Subject: [LLVMdev] [Help] How can we call an object's virtual function inside IR? In-Reply-To: <4B446875.7050402@free.fr> References: <61ad7e1f1001050235t42d6a5boacce28e6bab72d00@mail.gmail.com> <4B431B98.70403@free.fr> <61ad7e1f1001060210u6c047722s2ed072adbf616ef4@mail.gmail.com> <4B446875.7050402@free.fr> Message-ID: <61ad7e1f1001110021m45e7fad1ud09c693ab1a6c891@mail.gmail.com> Dear Duncan, Thank you very much for your answer. Actually, I was able to call a C function from LLVM IR, now I try to call a class member function from inside IR and tried to follow your instructions. 1. I am trying to create a ConstantInt pointer to the class member function. 2. Covert it to a Function Pointer and call it from IRBuilder. However, I can't figure out what to pass as the value of the pointer to the member function. Could you help me with this? My code parts are shown below. 
using namespace llvm; class Record { public: Record(){value=0;} Record(int a){value=a;} int getValue(){return value;} void setValue(int a){value=a;} int addOneToValue(){return value+1;} private: int value; }; typedef int (Record::*RecMemFunc)(); int main(int argc, char*args[]) { Record *rec= new Record(5); int y = rec->addOneToValue(); printf("Hi y = %d\n", y); RecMemFunc rp = &Record::addOneToValue; y = (rec->*rp)(); printf("Hi y = %d\n", y); Constant* constInt = ConstantInt::get(Type::Int64Ty, (int64)thePointer); Value* constPtr = ConstantExpr::getIntToPtr(constInt, PointerType::getUnqual(Type::Int32Ty)); //builder.CreateCall(myFunction, constPtr); : : } What will be the value to pass in instead of thePointer?????? Thank you very much for your help in advance. :) - Gyounghwa On Wed, Jan 6, 2010 at 7:39 PM, Duncan Sands wrote: > Hi Gyounghwa Kim, > >> First of all, thank you very much for your answer. >> I tried your suggestion and found out that it is not what I wanted. >> What I have to do is call a native C function from inside this >> generated function. >> Is there any way that we can find and call native C functions not >> created by LLVM IR? > > You can insert a declaration of the function into the IR, then call > it. Of course, for this to work you need to link in the native function > when running. If you are building a standalone application then this > is no problem. If you are running the IR using the JIT then it is also > possible, hopefully someone else will explain how. > >> I am asking this question because I want to fix this example to get a >> class member variable (ClassA * %record) and call the member function >> of it from inside LLVM IR. > > You are allowed to call a pointer, i.e. a function declaration is not > required. So you can just load the pointer out of %record, bitcast it > to the right function type, and call it. > > Ciao, > > Duncan. > > PS: Please reply to the list and not to me personally.
That way, others > may answer, and the discussion is archived which may help in the future > if someone else has the same question. > > If LLVM IR cannot access the member >> >> function of a class. If it is not supported, we can change class >> member functions like a c function. For example, ClassA->funca () can >> be created as funcb(&ClassA ) -a C style function. Then we need to >> call funcb from inside LLVM IR. >> >> Will that be possible? >> I tried to search web and documents, but really couldn't find it. >> >> [[a [10.00]] > [3.00]] >> ; ModuleID = 'ExprF' >> >> define i1 @expr(double* %record) { >> entry: >>       %0 = getelementptr double* %record, i32 0               ; >> [#uses=1] >>       %1 = load double* %0            ; [#uses=1] >>       %2 = frem double %1, 1.000000e+01               ; [#uses=1] >>       %3 = fcmp ogt double %2, 3.000000e+00           ; [#uses=1] >>       ret i1 %3 >> } >> >> On Tue, Jan 5, 2010 at 7:59 PM, Duncan Sands wrote: >>> >>> Hi Gyounghwa Kim, try pasting C++ code into http://llvm.org/demo/ >>> in order to see the LLVM IR that llvm-g++ turns it into. That way >>> you will see how this can be done. >>> >>> Best wishes, >>> >>> Duncan. >>> > > From hvdspek at liacs.nl Mon Jan 11 02:33:48 2010 From: hvdspek at liacs.nl (Harmen van der Spek) Date: Mon, 11 Jan 2010 09:33:48 +0100 Subject: [LLVMdev] Debugging LLVM opt pass In-Reply-To: References: Message-ID: Compile all sources in DEBUG mode. Then you can run opt in gdb for instance, set command line arguments using "set args" and set a breakpoint using "break file:linenum". This also works for debugging dynamically linked libraries loaded with the -load option of opt. Harmen Gang-Ryung Uh wrote: > *What would be the recommended way to debug LLVM opt pass? Is > there any way to perform source level debugging on a particular opt pass?
> * > *-- UGR* > > > ------------------------------------------------------------------------ > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From dllaurence at dslextreme.com Mon Jan 11 03:28:20 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Mon, 11 Jan 2010 01:28:20 -0800 Subject: [LLVMdev] LangRef 'struct' patch--preliminary Message-ID: <4B4AEF34.5090601@laurences.net> Here is a patch that cleans up a couple of bugs and makes what I think are a couple of small improvements based on the recent advice about structs that I got. There is more like this that could be done, but I wanted to see how this example was received. Dustin -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: LangRef.struct.patch Url: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100111/4a0d4752/attachment.pl From gvenn.cfe.dev at gmail.com Mon Jan 11 03:41:21 2010 From: gvenn.cfe.dev at gmail.com (Garrison Venn) Date: Mon, 11 Jan 2010 04:41:21 -0500 Subject: [LLVMdev] Using a function from another module In-Reply-To: <400d33ea1001101102w2d411f9ay5adc2ef36c82b6f0@mail.gmail.com> References: <16528.1262998152.232706.1794962677@succubus> <16528.1263135509.629890.1828059212@succubus> <400d33ea1001100731s4b61e6bv4bffacbdb34ec926@mail.gmail.com> <4455A7C8-6212-49F9-8F50-5178371199D1@gmail.com> <400d33ea1001101102w2d411f9ay5adc2ef36c82b6f0@mail.gmail.com> Message-ID: <7B73ECC5-A2C6-4FF1-982E-1F096298A145@gmail.com> Cool! I wouldn't have believed it until I saw my test results. Thanks for ed. Garrison On Jan 10, 2010, at 14:02, Kenneth Uildriks wrote: > On Sun, Jan 10, 2010 at 12:38 PM, Garrison Venn wrote: >> Won't passing llvm::Function* around vs strings (function names), also work, at code generation time, >> without the need for a module A dec to module B impl. mapping? 
>> >> Garrison > > Nope. You cannot place a call instruction into one module whose > callee is a Function from another module. You have to put a > declaration into the same module, and have your call instruction call > that. And then they need to be linked together, either by llvm-link > or (if JITting) by addGlobalMapping. From gvenn.cfe.dev at gmail.com Mon Jan 11 04:14:02 2010 From: gvenn.cfe.dev at gmail.com (Garrison Venn) Date: Mon, 11 Jan 2010 05:14:02 -0500 Subject: [LLVMdev] Using a function from another module In-Reply-To: <16528.1263151428.646196.74068491@succubus> References: <400d33ea1001101102w2d411f9ay5adc2ef36c82b6f0@mail.gmail.com> <16528.1263151428.646196.74068491@succubus> Message-ID: <2AD0CD8C-2958-40D1-B056-8D3CE8C833D6@gmail.com> So, having given my last response, I'm still bothered by this. I think what I find unusual is that one has to manually JIT and map (ExecutionEngine::addGlobalMapping(...)). Maybe I'm out there, but I keep on wanting to have the linkage supplied in the Module A decl. take care of this for me. So instead of an enum value of llvm::GlobalValue::ExternalLinkage, I could conceptually give it (the decl.), a module (module B in this case), or maybe an enum linkage, module pair. Of course representing this in IR might be a problem. Anyway what do I know. Garrison On Jan 10, 2010, at 14:23, Michael Muller wrote: > > Kenneth Uildriks wrote: >> On Sun, Jan 10, 2010 at 12:38 PM, Garrison Venn wrote: >>> Won't passing llvm::Function* around vs strings (function names), also work, at code generation time, >>> without the need for a module A dec to module B impl. mapping? >>> >>> Garrison >> >> Nope. You cannot place a call instruction into one module whose >> callee is a Function from another module. You have to put a >> declaration into the same module, and have your call instruction call >> that. And then they need to be linked together, either by llvm-link >> or (if JITting) by addGlobalMapping. 
>> > > Actually, this is the first thing I tried, and the correct function does seem > to get called - but it looks like the Verifier complains about it, which leads > me to believe that there may be broader issues involved. > > > ============================================================================= > michaelMuller = mmuller at enduden.com | http://www.mindhog.net/~mmuller > ----------------------------------------------------------------------------- > Scnozwangers? Vermicious Knids? What kind of rubbish is that? > - Mr. Salt, "Willy Wonka and the Chocolate Factory" > ============================================================================= From curtis.jones at gmail.com Mon Jan 11 07:07:56 2010 From: curtis.jones at gmail.com (Curtis Jones) Date: Mon, 11 Jan 2010 08:07:56 -0500 Subject: [LLVMdev] Optimization Help Message-ID: <66E5452F-D13D-45B9-9ABE-32EF02947E3A@gmail.com> I am working on a program which links with the FTDI USB driver (ftd2xx.dylib). Prior to a couple of hours ago I had only worked with Debug builds of my code (no compiler optimizations). In testing a production build, I found that using anything other than -O0 results in the FTDI driver returning bad data; just a lot of zeroes. I tested and re-tested to make sure that changing just that one option was the issue. And I have absolutely no idea how to go about figuring out what the actual issue is; or what a reasonable remedy is. If the likely cause isn't obvious based on the little bit of information I've provided, please tell me what details I can provide that would be useful. I don't know where to begin. Any help would be appreciated. Thanks. -- Curtis Jones curtisjones.us 404.723.3728 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 2427 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100111/3ff4ce12/attachment.bin From baldrick at free.fr Mon Jan 11 08:24:00 2010 From: baldrick at free.fr (Duncan Sands) Date: Mon, 11 Jan 2010 15:24:00 +0100 Subject: [LLVMdev] Cygwin llvm-gcc regression In-Reply-To: <9719867c1001101000s266a2f16j7f28e0c469d77b06@mail.gmail.com> References: <9719867c1001081510t75df73aek2b4b4e14829150fe@mail.gmail.com> <4B483776.5010904@free.fr> <9719867c1001090748v489013e6mda5a14fe7e28f53a@mail.gmail.com> <4B48B596.9000405@free.fr> <9719867c1001091011q2fa55428m1d1ac05828956964@mail.gmail.com> <4B49AB2E.30809@free.fr> <9719867c1001100848i2b47236arb9d2f4123d215547@mail.gmail.com> <9719867c1001101000s266a2f16j7f28e0c469d77b06@mail.gmail.com> Message-ID: <4B4B3480.4090108@free.fr> Hi Aaron, > Now, I am getting another regression in stage 2 configure :- > > "checking whether the C compiler works... configure: error: cannot > run C compiled programs." check config.log to see what happened - probably the compiler crashed. Ciao, Duncan. From junk at giantblob.com Mon Jan 11 09:07:32 2010 From: junk at giantblob.com (James Williams) Date: Mon, 11 Jan 2010 15:07:32 +0000 Subject: [LLVMdev] Operations on constant array value? Message-ID: Hi, I've read http://llvm.org/docs/LangRef.html#t_array and http://llvm.org/docs/GetElementPtr.html and if I've understood right there are no operations that act directly on arrays - instead I need to use getelementptr on a pointer to an array to get a pointer to an array element. I also understand that there is no 'address of' operation. As a result I can't figure out how to use constant derived types without assigning them to a global. 
Say I want to use the C bindings function LLVMValueRef LLVMConstString(char *, int, int) to get an int8* pointer to a C string constant - there doesn't seem to be any way to directly use the resulting [N x i8] value directly and there's no operator that gives me its address. The only way I can see to get a pointer to the string constant array is to go through a global variable, for example: g = LLVMAddGlobal(module, LLVMTypeOf(v), "__string_" + string_literal_number); string_literal_number = string_literal_number + 1; v = LLVMConstString(string_literal, string_literal.Length, 0); LLVMSetInitializer(g, v); elements = { LLVMConstInt(LLVMInt32Type(), 0L, 0), LLVMConstInt(LLVMInt32Type(), 0L, 0) }; return LLVMConstInBoundsGEP(g, elements, 2); Is it possible to get the address of an element of a constant array or struct without first initializing a global variable to the constant? Thanks in advance, -- James Williams -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100111/adb4421a/attachment.html From eli.friedman at gmail.com Mon Jan 11 09:29:09 2010 From: eli.friedman at gmail.com (Eli Friedman) Date: Mon, 11 Jan 2010 07:29:09 -0800 Subject: [LLVMdev] Operations on constant array value? In-Reply-To: References: Message-ID: On Mon, Jan 11, 2010 at 7:07 AM, James Williams wrote: > Hi, > > I've read http://llvm.org/docs/LangRef.html#t_array and > http://llvm.org/docs/GetElementPtr.html and if I've understood right there > are no operations that act directly on arrays - instead I need to use > getelementptr on a pointer to an array to get a pointer to an array element. > I also understand that there is no 'address of' operation. > > As a result I can't figure out how to use constant derived types without > assigning them to a global. 
Say I want to use the C bindings function > LLVMValueRef LLVMConstString(char *, int, int) to get an int8* pointer to a > C string constant - there doesn't seem to be any way to directly use the > resulting [N x i8] value directly and there's no operator that gives me its > address. > > The only way I can see to get a pointer to the string constant array is to > go through a global variable, for example: > > g = LLVMAddGlobal(module, LLVMTypeOf(v), "__string_" + > string_literal_number); > string_literal_number = string_literal_number + 1; > v = LLVMConstString(string_literal, string_literal.Length, 0); > LLVMSetInitializer(g, v); > elements = { LLVMConstInt(LLVMInt32Type(), 0L, 0), > LLVMConstInt(LLVMInt32Type(), 0L, 0) }; > return LLVMConstInBoundsGEP(g, elements, 2); > > Is it possible to get the address of an element of a constant array or > struct without first initializing a global variable to the constant? No. -Eli From gvenn.cfe.dev at gmail.com Mon Jan 11 09:49:52 2010 From: gvenn.cfe.dev at gmail.com (Garrison Venn) Date: Mon, 11 Jan 2010 10:49:52 -0500 Subject: [LLVMdev] Operations on constant array value? In-Reply-To: References: Message-ID: <6E61F25B-07CF-4AB2-BA4C-E99761C103A3@gmail.com> Does the C API have an equivalent of stack storage? Via the C++ APIs one can shove the string constant on the stack via a store instruction operation on an alloca instruction--the address needed is the alloca. For example: llvm::Value* stringVar = builder.CreateAlloca(stringConstant->getType()); builder.CreateStore(stringConstant, stringVar); The stringVar is your address. Garrison On Jan 11, 2010, at 10:07, James Williams wrote: > Hi, > > I've read http://llvm.org/docs/LangRef.html#t_array and http://llvm.org/docs/GetElementPtr.html and if I've understood right there are no operations that act directly on arrays - instead I need to use getelementptr on a pointer to an array to get a pointer to an array element. 
I also understand that there is no 'address of' operation. > > As a result I can't figure out how to use constant derived types without assigning them to a global. Say I want to use the C bindings function LLVMValueRef LLVMConstString(char *, int, int) to get an int8* pointer to a C string constant - there doesn't seem to be any way to directly use the resulting [N x i8] value directly and there's no operator that gives me its address. > > The only way I can see to get a pointer to the string constant array is to go through a global variable, for example: > > g = LLVMAddGlobal(module, LLVMTypeOf(v), "__string_" + string_literal_number); > string_literal_number = string_literal_number + 1; > v = LLVMConstString(string_literal, string_literal.Length, 0); > LLVMSetInitializer(g, v); > elements = { LLVMConstInt(LLVMInt32Type(), 0L, 0), LLVMConstInt(LLVMInt32Type(), 0L, 0) }; > return LLVMConstInBoundsGEP(g, elements, 2); > > Is it possible to get the address of an element of a constant array or struct without first initializing a global variable to the constant? > > Thanks in advance, > -- James Williams > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100111/096ca6cd/attachment.html From clattner at apple.com Mon Jan 11 10:54:36 2010 From: clattner at apple.com (Chris Lattner) Date: Mon, 11 Jan 2010 08:54:36 -0800 Subject: [LLVMdev] Optimization Help In-Reply-To: <66E5452F-D13D-45B9-9ABE-32EF02947E3A@gmail.com> References: <66E5452F-D13D-45B9-9ABE-32EF02947E3A@gmail.com> Message-ID: <109D7121-F429-46CF-BE0B-2718DCC82E64@apple.com> On Jan 11, 2010, at 5:07 AM, Curtis Jones wrote: > I am working on a program which links with the FTDI USB driver (ftd2xx.dylib). 
> Prior to a couple of hours ago I had only worked with Debug builds of my code (no compiler optimizations). In testing a production build, I found that using anything other than -O0 results in the FTDI driver returning bad data; just a lot of zeroes. I tested and re-tested to make sure that changing just that one option was the issue. And I have absolutely no idea how to go about figuring out what the actual issue is; or what a reasonable remedy is.
>
> If the likely cause isn't obvious based on the little bit of information I've provided, please tell me what details I can provide that would be useful. I don't know where to begin.

It's impossible to tell without more information, but the most likely cause of this is if your code has undefined behavior. For example, if it uses variables without initializing them, reads off the end of an array, etc, the code may significantly change behavior after optimization. Tools like valgrind are often helpful tracking these sorts of things down. If you're building with clang head, you can try the experimental -fcatch-undefined-behavior flag.

-Chris

From guh at boisestate.edu Mon Jan 11 12:57:29 2010
From: guh at boisestate.edu (Gang-Ryung Uh)
Date: Mon, 11 Jan 2010 11:57:29 -0700
Subject: [LLVMdev] LICM ilist question.
Message-ID:

I am using LLVM 2.6 and I have a question on the use of the BasicBlock::iterator to hoist loop invariant instructions to the loop preheader. When I process the instructions backward as shown in the following code, I got the following error right after the "hoist(I)" is done. Can anyone advise whether I am misusing BasicBlock::iterator?

/opt/llvms/src/llvm_26/include/llvm/ADT/ilist.h:213: llvm::ilist_iterator& llvm::ilist_iterator::operator--() [with NodeTy = llvm::Instruction]: Assertion `NodePtr && "--'d off the beginning of an ilist!"' failed.

LICM::HoistRegion(DomTreeNode *N) {
  assert(N != 0 && "Null dominator tree node?");
  BasicBlock *BB = N->getBlock();
  ...
  for (BasicBlock::iterator II = BB->end(); II != BB->begin(); ) {
    Instruction &I = *--II;
    if (isLoopInvariantInst(I) && canSinkOrHoistInst(I) &&
        isSafeToExecuteUnconditionally(I))
      hoist(I);
  }
  ..
}

--
UGR
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100111/7f0da2b3/attachment.html

From junk at giantblob.com Mon Jan 11 13:03:33 2010
From: junk at giantblob.com (James Williams)
Date: Mon, 11 Jan 2010 19:03:33 +0000
Subject: [LLVMdev] Operations on constant array value?
In-Reply-To:
References:
Message-ID:

2010/1/11 Eli Friedman

> On Mon, Jan 11, 2010 at 7:07 AM, James Williams
> wrote:
> > Hi,
> >
> > I've read http://llvm.org/docs/LangRef.html#t_array and
> > http://llvm.org/docs/GetElementPtr.html and if I've understood right
> there
> > are no operations that act directly on arrays - instead I need to use
> > getelementptr on a pointer to an array to get a pointer to an array
> element.
> > I also understand that there is no 'address of' operation.
> >
> > As a result I can't figure out how to use constant derived types without
> > assigning them to a global. Say I want to use the C bindings function
> > LLVMValueRef LLVMConstString(char *, int, int) to get an int8* pointer to
> a
> > C string constant - there doesn't seem to be any way to directly use the
> > resulting [N x i8] value directly and there's no operator that gives me
> its
> > address.
> > > > The only way I can see to get a pointer to the string constant array is > to > > go through a global variable, for example: > > > > g = LLVMAddGlobal(module, LLVMTypeOf(v), "__string_" + > > string_literal_number); > > string_literal_number = string_literal_number + 1; > > v = LLVMConstString(string_literal, string_literal.Length, 0); > > LLVMSetInitializer(g, v); > > elements = { LLVMConstInt(LLVMInt32Type(), 0L, 0), > > LLVMConstInt(LLVMInt32Type(), 0L, 0) }; > > return LLVMConstInBoundsGEP(g, elements, 2); > > > > Is it possible to get the address of an element of a constant array or > > struct without first initializing a global variable to the constant? > > No. > OK. I'd have preferred to have to avoid bloating the module symbol table with global symbols that will never be referenced but it's no big deal. > > -Eli > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100111/de036a05/attachment.html From viridia at gmail.com Mon Jan 11 13:10:26 2010 From: viridia at gmail.com (Talin) Date: Mon, 11 Jan 2010 11:10:26 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> Message-ID: Quick question - should unions enforce that all member types are unique? I realize that a union of { i32, i32 } doesn't make sense, but should the code actually forbid this? As far as constants go, as long as the initializer is an exact match for one of the member types, it should be no problem. On Fri, Jan 8, 2010 at 11:00 PM, Chris Lattner wrote: > > On Jan 6, 2010, at 12:45 PM, Talin wrote: > > This patch adds a UnionType to DerivedTypes.h. > > > Cool. When proposing an IR extension, it is usually best to start with a > LangRef.html patch so that we can discuss the semantics of the extension. > Please do write this before you get much farther. 
I assume that you want > unions usable in the same situations as a struct. However, how do "constant > unions" work? How do I initialize a global variable whose type is "union of > float and i32" for example? > > It also adds code to the bitcode reader / writer and the assembly parser > for the new type, as well as a tiny .ll test file in test/Assembler. It does > not contain any code related to code generation or type layout - I wanted to > see if this much was acceptable before I proceeded any further. > > > The .ll file isn't included in your patch, but I see that you chose a > syntax of 'union { i32, float }' which seems very reasonable. > > Unlike my previous patch, in which the Union type was implemented as a > packing option to type Struct (thereby re-using the machinery of Struct), > this patch defines Union as a completely separate type from Struct. > > > I think this approach makes sense. It means that things like TargetData > StructLayout won't work for unions, but you don't need them since all > elements are at offset 0. > > I was a little uncertain as to how to write the tests. I'd particularly > like to write tests for the bitcode reader/writer stuff, but I am not sure > how to do that. > > > A reasonable example are tests like test/Feature/newcasts.ll > > Here are some thoughts on your patch: > > +class UnionType : public CompositeType { > ... > + /// UnionType::get - Create an empty union type. > + /// > + static UnionType *get(LLVMContext &Context) { > + return get(Context, std::vector()); > + } > > I don't think that an empty union is going to be important enough to add a > special accessor for it. Is an empty union ever a useful thing to do? If > you completely disallow them from IR, it would end up simplifying some > things. We don't allow empty vectors, and it seems that an empty union has > exactly the same semantics as an empty struct, so having both empty structs > and empty unions doesn't seem necessary. 
> > + static UnionType *get(LLVMContext &Context, > + const std::vector &Params); > > Since this is new code, please have the constructor method take a 'const > Type*const* Elements, unsigned NumElements' instead of requiring the caller > to make an std::vector. This allows use of SmallVector etc. It is > desirable to do this for all the other type classes in DerivedTypes.h, but > we haven't gotten around to doing it yet. > > + /// UnionType::get - This static method is a convenience method for > + /// creating union types by specifying the elements as arguments. > + /// Note that this method always returns a non-packed struct. To get > + /// an empty struct, pass NULL, NULL. > + static UnionType *get(LLVMContext &Context, > + const Type *type, ...) END_WITH_NULL; > > Please update the comments. Also, if you disallow empty unions here, you > don't need to pass a context. > > > +++ include/llvm/Type.h (working copy) > @@ -86,6 +86,7 @@ > PointerTyID, ///< 12: Pointers > OpaqueTyID, ///< 13: Opaque: type with unknown structure > VectorTyID, ///< 14: SIMD 'packed' format, or other vector type > + UnionTyID, ///< 15: Unions > > Please put this up next to Struct for simplicity, the numbering here > doesn't need to be stable. The numbering in llvm-c/Core.h does need to be > stable though. > > > +bool UnionType::indexValid(const Value *V) const { > + // Union indexes require 32-bit integer constants. > + if (V->getType() == Type::getInt32Ty(V->getContext())) > > Please use V->getType()->isInteger(32) which is probably new since you > started your patch. > > > +UnionType::UnionType(LLVMContext &C, const std::vector > &Types) > + : CompositeType(C, UnionTyID) { > + ContainedTys = reinterpret_cast(this + 1); > + NumContainedTys = Types.size(); > + bool isAbstract = false; > + for (unsigned i = 0; i < Types.size(); ++i) { > > No need to evaluate Types.size() every time through the loop. > > > +bool LLParser::ParseUnionType(PATypeHolder &Result) { > ... 
> + if (!EatIfPresent(lltok::lbrace)) { > + return Error(EltTyLoc, "'{' expected after 'union'"); > + } > > Please use: > if (ParseToken(lltok::lbrace, "'{' expected after 'union'")) return true; > > > + EltTyLoc = Lex.getLoc(); > + if (ParseTypeRec(Result)) return true; > + ParamsList.push_back(Result); > + > + if (Result->isVoidTy()) > + return Error(EltTyLoc, "union element can not have void type"); > + if (!UnionType::isValidElementType(Result)) > + return Error(EltTyLoc, "invalid element type for union"); > + > + while (EatIfPresent(lltok::comma)) { > + EltTyLoc = Lex.getLoc(); > + if (ParseTypeRec(Result)) return true; > + > + if (Result->isVoidTy()) > + return Error(EltTyLoc, "union element can not have void type"); > + if (!UnionType::isValidElementType(Result)) > + return Error(EltTyLoc, "invalid element type for union"); > + > + ParamsList.push_back(Result); > + } > > This can be turned into a: > > do { > ... > } while (EatIfPresent(lltok::comma)); > > loop to avoid the duplication of code. > > +++ lib/Bitcode/Writer/BitcodeWriter.cpp (working copy) > ... > + case Type::UnionTyID: { > + const UnionType *ST = cast(T); > + // UNION: [eltty x N] > + Code = bitc::TYPE_CODE_UNION; > + // Output all of the element types. > + for (StructType::element_iterator I = ST->element_begin(), > + E = ST->element_end(); I != E; ++I) > + TypeVals.push_back(VE.getTypeID(*I)); > + AbbrevToUse = UnionAbbrev; > + break; > + } > > Please rename ST -> UT and use the right iterator type. > > I didn't look closely at the C bindings. If you eliminate empty unions > they should get a bit simpler. > > Otherwise the patch looks like a fine start. Lets please get the LangRef > spec ironed out, then you can start committing subsystems to support this. > My biggest concern about this extension is updating all the places in the > optimizer to know about it. 
To get adequate testing coverage on this, we > should probably switch llvm-gcc or clang to start using the union type in at > least some common case, which will allow us to get coverage on it through > the optimizer. > > Thanks for working on this Talin! > > -Chris > > > > > > > > > > > > -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100111/caedbf2c/attachment.html From junk at giantblob.com Mon Jan 11 13:11:32 2010 From: junk at giantblob.com (James Williams) Date: Mon, 11 Jan 2010 19:11:32 +0000 Subject: [LLVMdev] Operations on constant array value? In-Reply-To: <6E61F25B-07CF-4AB2-BA4C-E99761C103A3@gmail.com> References: <6E61F25B-07CF-4AB2-BA4C-E99761C103A3@gmail.com> Message-ID: 2010/1/11 Garrison Venn > Does the C API have an equivalent of stack storage? Via the C++ APIs one > can shove the string constant on the stack via > a store instruction operation on an alloca instruction--the address needed > is the alloca. For example: > > llvm::Value* stringVar = builder.CreateAlloca(stringConstant->getType()); > builder.CreateStore(stringConstant, stringVar); > > The stringVar is your address. > > Garrison > Thanks but I want something I can use to generate code for a pointer to a C string in any context so I should avoid alloca. Otherwise I'll risk massive stack growth if the generated code is within a loop for example. It looks like using a global variable is the canonical way to do this so I'll stick with it. -- James > > On Jan 11, 2010, at 10:07, James Williams wrote: > > Hi, > > I've read http://llvm.org/docs/LangRef.html#t_array and > http://llvm.org/docs/GetElementPtr.html and if I've understood right there > are no operations that act directly on arrays - instead I need to use > getelementptr on a pointer to an array to get a pointer to an array element. > I also understand that there is no 'address of' operation. 
> > As a result I can't figure out how to use constant derived types without > assigning them to a global. Say I want to use the C bindings function > LLVMValueRef LLVMConstString(char *, int, int) to get an int8* pointer to a > C string constant - there doesn't seem to be any way to directly use the > resulting [N x i8] value directly and there's no operator that gives me its > address. > > The only way I can see to get a pointer to the string constant array is to > go through a global variable, for example: > > g = LLVMAddGlobal(module, LLVMTypeOf(v), "__string_" + > string_literal_number); > string_literal_number = string_literal_number + 1; > v = LLVMConstString(string_literal, string_literal.Length, 0); > LLVMSetInitializer(g, v); > elements = { LLVMConstInt(LLVMInt32Type(), 0L, 0), > LLVMConstInt(LLVMInt32Type(), 0L, 0) }; > return LLVMConstInBoundsGEP(g, elements, 2); > > Is it possible to get the address of an element of a constant array or > struct without first initializing a global variable to the constant? > > Thanks in advance, > -- James Williams > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100111/2b59cd2f/attachment.html From jyasskin at google.com Mon Jan 11 13:20:02 2010 From: jyasskin at google.com (Jeffrey Yasskin) Date: Mon, 11 Jan 2010 11:20:02 -0800 Subject: [LLVMdev] LangRef 'struct' patch--preliminary In-Reply-To: <4B4AEF34.5090601@laurences.net> References: <4B4AEF34.5090601@laurences.net> Message-ID: Awesome, thanks! Committed as r93170 with the following change: s/local variables/registers/. "Local variable" refers to allocas in LLVM, rather than %whatever SSA "variables". 
On Mon, Jan 11, 2010 at 1:28 AM, Dustin Laurence wrote:
> Here is a patch that cleans up a couple of bugs and makes what I think
> are a couple of small improvements based on the recent advice about
> structs that I got.  There is more like this that could be done, but I
> wanted to see how this example was received.

From dllaurence at dslextreme.com Mon Jan 11 13:31:27 2010
From: dllaurence at dslextreme.com (Dustin Laurence)
Date: Mon, 11 Jan 2010 11:31:27 -0800
Subject: [LLVMdev] LangRef 'struct' patch--preliminary
In-Reply-To:
References: <4B4AEF34.5090601@laurences.net>
Message-ID: <4B4B7C8F.8080804@laurences.net>

On 01/11/2010 11:20 AM, Jeffrey Yasskin wrote:
> Awesome, thanks! Committed as r93170 with the following change:
>
> s/local variables/registers/. "Local variable" refers to allocas in
> LLVM, rather than %whatever SSA "variables".

Excellent. I was not actually happy with that term when I wrote it, but
wasn't sure of the standard terminology. It should certainly be
consistent, and that way makes more sense.

If these patches are useful I'll send more, but I should know one thing.
I notice that the example code in the LangRef is not formatted
consistently; sometimes in a grey box, sometimes just inline. My guess
is the preferred format changed at some point and older ones are just
not updated yet. I left the format as I found it this time, but tell me
which is preferred and I'll try to always put examples in the newer format.

My guess is the preferred format is the grey boxes. Text criticism says
that texts evolve toward greater complexity, not less, and I suspect
that is as true of technical documentation as it is of three thousand
year old Hebrew.
:-)

Dustin

From jyasskin at google.com Mon Jan 11 13:39:37 2010
From: jyasskin at google.com (Jeffrey Yasskin)
Date: Mon, 11 Jan 2010 11:39:37 -0800
Subject: [LLVMdev] Using a function from another module
In-Reply-To: <16528.1262998152.232706.1794962677@succubus>
References: <16528.1262998152.232706.1794962677@succubus>
Message-ID:

The JIT tries to handle this in some cases
(http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ExecutionEngine/ExecutionEngine.cpp?annotate=92771#l942),
but doesn't handle it for functions. There aren't any tests, so I'm
not surprised it's broken.

The JIT would be simpler if we just dropped multiple-module support
and asked people to link their modules together before trying to JIT
them. Is there a reason you can't do that?

If there is, could you write a test for
http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/ExecutionEngine/JIT/
that exercises the behavior you want, and file a bug with it attached?
I'm not likely to actually implement that any time soon, but having a
bug with a test will make it easier for someone else to pick it up.

Thanks,
Jeffrey

On Fri, Jan 8, 2010 at 4:49 PM, Michael Muller wrote:
>
> Hi all,
>
> I'm trying to use a function defined in one LLVM module from another module
> (in the JIT) but for some reason it's not working out.  My sequence of
> activity is roughly like this:
>
>  1) Create moduleA
>  2) Create moduleB with "func()"
>  3) execEng = ExecutionEngine::create(
>         new ExistingModuleProvider(moduleB));
>  4) execute "func()" (this works fine)
>  4) add "func()" to moduleA as a declaration (no code blocks) with External
>     linkage.
>  5) execEng->addModuleProvider(new ExistingModuleProvider(moduleA));
>  6) run a function in moduleA that calls "func()"
>
> I get:
>  LLVM ERROR: Program used external function 'func' which could not be resolved!
>
> I'm guessing I'm either going about this wrong or missing something.  Can
> anyone offer me some insight?
> > ============================================================================= > michaelMuller = mmuller at enduden.com | http://www.mindhog.net/~mmuller > ----------------------------------------------------------------------------- > We are the music-makers, and we are the dreamers of dreams > ?- Arthur O'Shaughnessy > ============================================================================= > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu ? ? ? ? http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From clattner at apple.com Mon Jan 11 14:02:11 2010 From: clattner at apple.com (Chris Lattner) Date: Mon, 11 Jan 2010 12:02:11 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> Message-ID: On Jan 11, 2010, at 11:10 AM, Talin wrote: > Quick question - should unions enforce that all member types are unique? I realize that a union of { i32, i32 } doesn't make sense, but should the code actually forbid this? Either way works for me. > As far as constants go, as long as the initializer is an exact match for one of the member types, it should be no problem. Right, please propose a syntax and a class to use (ConstantUnion?) for it, -Chris From gvenn.cfe.dev at gmail.com Mon Jan 11 14:36:04 2010 From: gvenn.cfe.dev at gmail.com (Garrison Venn) Date: Mon, 11 Jan 2010 15:36:04 -0500 Subject: [LLVMdev] Operations on constant array value? In-Reply-To: References: Message-ID: I have not tried this, but a linkage type of PrivateLinkage would not add to the symbol table according to the doc. 
LLVMSetLinkage(g, LLVMPrivateLinkage); Garrison On Jan 11, 2010, at 14:03, James Williams wrote: > 2010/1/11 Eli Friedman > On Mon, Jan 11, 2010 at 7:07 AM, James Williams wrote: > > Hi, > > > > I've read http://llvm.org/docs/LangRef.html#t_array and > > http://llvm.org/docs/GetElementPtr.html and if I've understood right there > > are no operations that act directly on arrays - instead I need to use > > getelementptr on a pointer to an array to get a pointer to an array element. > > I also understand that there is no 'address of' operation. > > > > As a result I can't figure out how to use constant derived types without > > assigning them to a global. Say I want to use the C bindings function > > LLVMValueRef LLVMConstString(char *, int, int) to get an int8* pointer to a > > C string constant - there doesn't seem to be any way to directly use the > > resulting [N x i8] value directly and there's no operator that gives me its > > address. > > > > The only way I can see to get a pointer to the string constant array is to > > go through a global variable, for example: > > > > g = LLVMAddGlobal(module, LLVMTypeOf(v), "__string_" + > > string_literal_number); > > string_literal_number = string_literal_number + 1; > > v = LLVMConstString(string_literal, string_literal.Length, 0); > > LLVMSetInitializer(g, v); > > elements = { LLVMConstInt(LLVMInt32Type(), 0L, 0), > > LLVMConstInt(LLVMInt32Type(), 0L, 0) }; > > return LLVMConstInBoundsGEP(g, elements, 2); > > > > Is it possible to get the address of an element of a constant array or > > struct without first initializing a global variable to the constant? > > No. > > OK. I'd have preferred to have to avoid bloating the module symbol table with global symbols that will never be referenced but it's no big deal. 
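On the concern quoted above about one global per string literal: independently of which linkage is chosen, a front end can keep the number of such globals down by creating one global per distinct literal and reusing it on later occurrences. What follows is only a minimal sketch under stated assumptions, not anything posted in this thread. `StringLiteralPool` and `globalNameFor` are hypothetical names, the `__string_N` naming is borrowed from the pseudocode quoted above, and no LLVM API is called; the comment marks where a real front end would create the global (for instance with LLVMAddGlobal and LLVMSetInitializer, as in the quoted code).

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: hands out one symbol name per distinct string literal.
class StringLiteralPool {
public:
  // Returns the global name for `literal`. A new name is generated only the
  // first time this exact literal is seen; later requests reuse it.
  const std::string &globalNameFor(const std::string &literal) {
    std::map<std::string, std::string>::iterator it = names_.find(literal);
    if (it != names_.end())
      return it->second;  // literal already has a global, reuse its name

    // Assumption: names follow the "__string_N" scheme from the thread.
    std::string name = "__string_" + std::to_string(names_.size());

    // A real front end would create and initialize the global here.
    return names_.insert(std::make_pair(literal, name)).first->second;
  }

  // Number of distinct literals, i.e. number of globals that would exist.
  std::size_t size() const { return names_.size(); }

private:
  std::map<std::string, std::string> names_;  // literal -> global name
};
```

With this shape, asking for the same literal twice yields the same name, so only as many globals are emitted as there are distinct literals. Whether that is worth the extra map lookup depends on how often a program repeats its literals.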
> > -Eli > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100111/4bfa79b7/attachment.html From kennethuil at gmail.com Mon Jan 11 15:05:54 2010 From: kennethuil at gmail.com (Kenneth Uildriks) Date: Mon, 11 Jan 2010 15:05:54 -0600 Subject: [LLVMdev] Using a function from another module In-Reply-To: References: <16528.1262998152.232706.1794962677@succubus> Message-ID: <400d33ea1001111305m41f76c6fp7cd826e261fc3c1e@mail.gmail.com> On Mon, Jan 11, 2010 at 1:39 PM, Jeffrey Yasskin wrote: > The JIT tries to handle this in some cases > (http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ExecutionEngine/ExecutionEngine.cpp?annotate=92771#l942), > but doesn't handle it for functions. There aren't any tests, so I'm > not surprised it's broken. > > The JIT would be simpler if we just dropped multiple-module support > and asked people to link their modules together before trying to JIT > them. ?Is there a reason you can't do that? I'd like to be able to JIT a module, call imported functions from it, and then write out that module without including the bodies of the imported functions in the output .bc file. From jyasskin at google.com Mon Jan 11 15:05:55 2010 From: jyasskin at google.com (Jeffrey Yasskin) Date: Mon, 11 Jan 2010 13:05:55 -0800 Subject: [LLVMdev] LangRef 'struct' patch--preliminary In-Reply-To: <4B4B7C8F.8080804@laurences.net> References: <4B4AEF34.5090601@laurences.net> <4B4B7C8F.8080804@laurences.net> Message-ID: On Mon, Jan 11, 2010 at 11:31 AM, Dustin Laurence wrote: > On 01/11/2010 11:20 AM, Jeffrey Yasskin wrote: >> Awesome, thanks! Committed as r93170 with the following change: >> >> s/local variables/registers/. 
"Local variable" refers to allocas in >> LLVM, rather than %whatever SSA "variables". > > Excellent. ?I was not actually happy with that term when I wrote it, but > wasn't sure of the standard terminology. ?It should certainly be > consistent, and that way makes more sense. > > If these patches are useful I'll send more, but I should know one thing. > I notice that the example code in the LangRef is not formatted > consistently; sometimes in a grey box, sometimes just inline. ?My guess > is the preferred format changed at some point and older ones are just > not updated yet. ?I left the format as I found it this time, but tell me > which is preferred and I'll try to always put examples in the newer format. Patches that make the docs easier to understand are definitely useful. I'm not sure about the desired state for example code. I'd probably leave things the way you find them. From junk at giantblob.com Mon Jan 11 15:34:30 2010 From: junk at giantblob.com (James Williams) Date: Mon, 11 Jan 2010 21:34:30 +0000 Subject: [LLVMdev] Operations on constant array value? In-Reply-To: References: Message-ID: 2010/1/11 Garrison Venn > I have not tried this, but a linkage type of PrivateLinkage would not add > to the symbol table according > to the doc. > > LLVMSetLinkage(g, LLVMPrivateLinkage); > Thanks - I hadn't thought of that. > > Garrison > > On Jan 11, 2010, at 14:03, James Williams wrote: > > 2010/1/11 Eli Friedman > >> On Mon, Jan 11, 2010 at 7:07 AM, James Williams >> wrote: >> > Hi, >> > >> > I've read http://llvm.org/docs/LangRef.html#t_array and >> > http://llvm.org/docs/GetElementPtr.html and if I've understood right >> there >> > are no operations that act directly on arrays - instead I need to use >> > getelementptr on a pointer to an array to get a pointer to an array >> element. >> > I also understand that there is no 'address of' operation. >> > >> > As a result I can't figure out how to use constant derived types without >> > assigning them to a global. 
Say I want to use the C bindings function >> > LLVMValueRef LLVMConstString(char *, int, int) to get an int8* pointer >> to a >> > C string constant - there doesn't seem to be any way to directly use the >> > resulting [N x i8] value directly and there's no operator that gives me >> its >> > address. >> > >> > The only way I can see to get a pointer to the string constant array is >> to >> > go through a global variable, for example: >> > >> > g = LLVMAddGlobal(module, LLVMTypeOf(v), "__string_" + >> > string_literal_number); >> > string_literal_number = string_literal_number + 1; >> > v = LLVMConstString(string_literal, string_literal.Length, 0); >> > LLVMSetInitializer(g, v); >> > elements = { LLVMConstInt(LLVMInt32Type(), 0L, 0), >> > LLVMConstInt(LLVMInt32Type(), 0L, 0) }; >> > return LLVMConstInBoundsGEP(g, elements, 2); >> > >> > Is it possible to get the address of an element of a constant array or >> > struct without first initializing a global variable to the constant? >> >> No. >> > > OK. I'd have preferred to have to avoid bloating the module symbol table > with global symbols that will never be referenced but it's no big deal. > >> >> -Eli >> >> > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100111/c92c8504/attachment.html From clattner at apple.com Mon Jan 11 16:33:31 2010 From: clattner at apple.com (Chris Lattner) Date: Mon, 11 Jan 2010 14:33:31 -0800 Subject: [LLVMdev] LangRef 'struct' patch--preliminary In-Reply-To: <4B4B7C8F.8080804@laurences.net> References: <4B4AEF34.5090601@laurences.net> <4B4B7C8F.8080804@laurences.net> Message-ID: <7F7BD8CA-807F-44B9-B06C-3357DE4A445D@apple.com> On Jan 11, 2010, at 11:31 AM, Dustin Laurence wrote: > > If these patches are useful I'll send more, but I should know one > thing. > I notice that the example code in the LangRef is not formatted > consistently; sometimes in a grey box, sometimes just inline. My > guess > is the preferred format changed at some point and older ones are just > not updated yet. I left the format as I found it this time, but > tell me > which is preferred and I'll try to always put examples in the newer > format. > Grey boxes are fine with me, -Chris From dllaurence at dslextreme.com Mon Jan 11 16:37:30 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Mon, 11 Jan 2010 14:37:30 -0800 Subject: [LLVMdev] LangRef 'struct' patch--preliminary In-Reply-To: <7F7BD8CA-807F-44B9-B06C-3357DE4A445D@apple.com> References: <4B4AEF34.5090601@laurences.net> <4B4B7C8F.8080804@laurences.net> <7F7BD8CA-807F-44B9-B06C-3357DE4A445D@apple.com> Message-ID: <4B4BA82A.4070802@laurences.net> On 01/11/2010 02:33 PM, Chris Lattner wrote: > Grey boxes are fine with me, OK, until told differently I will assume that the preferred outcome is for all code examples I touch to be put in that format if they aren't already. 
Dustin From clattner at apple.com Mon Jan 11 16:53:21 2010 From: clattner at apple.com (Chris Lattner) Date: Mon, 11 Jan 2010 14:53:21 -0800 Subject: [LLVMdev] LangRef 'struct' patch--preliminary In-Reply-To: <4B4BA82A.4070802@laurences.net> References: <4B4AEF34.5090601@laurences.net> <4B4B7C8F.8080804@laurences.net> <7F7BD8CA-807F-44B9-B06C-3357DE4A445D@apple.com> <4B4BA82A.4070802@laurences.net> Message-ID: <566BCF1B-E82C-47BB-B5E3-B817DAF850D9@apple.com> On Jan 11, 2010, at 2:37 PM, Dustin Laurence wrote: > On 01/11/2010 02:33 PM, Chris Lattner wrote: > >> Grey boxes are fine with me, > > OK, until told differently I will assume that the preferred outcome is > for all code examples I touch to be put in that format if they aren't > already. Sounds good, thanks. From gvenn.cfe.dev at gmail.com Mon Jan 11 16:55:09 2010 From: gvenn.cfe.dev at gmail.com (Garrison Venn) Date: Mon, 11 Jan 2010 17:55:09 -0500 Subject: [LLVMdev] Operations on constant array value? In-Reply-To: References: Message-ID: <43538E5B-44B1-412D-B496-8124CF266C91@gmail.com> Sorry to keep this thread alive, but I'm learning so ... There is more. The doc for GlobalValue::LinkageTypes or the C API LLVMLinkage is not as clear as the lang ref manual. See: http://llvm.org/docs/LangRef.html#linkage. I'm pointing this out because something like LinkerPrivateLinkage (LLVMLinkerPrivateLinkage), or another one, might be more appropriate to your throw away use case (if I understand your use correctly). One should test this of course, and/or one of the experts could chime in hint hint. :-) Anyway thought this info. might be useful, however rehashed. Garrison On Jan 11, 2010, at 16:34, James Williams wrote: > 2010/1/11 Garrison Venn > I have not tried this, but a linkage type of PrivateLinkage would not add to the symbol table according > to the doc. > > LLVMSetLinkage(g, LLVMPrivateLinkage); > > Thanks - I hadn't thought of that. 
> > Garrison > > On Jan 11, 2010, at 14:03, James Williams wrote: > >> 2010/1/11 Eli Friedman >> On Mon, Jan 11, 2010 at 7:07 AM, James Williams wrote: >> > Hi, >> > >> > I've read http://llvm.org/docs/LangRef.html#t_array and >> > http://llvm.org/docs/GetElementPtr.html and if I've understood right there >> > are no operations that act directly on arrays - instead I need to use >> > getelementptr on a pointer to an array to get a pointer to an array element. >> > I also understand that there is no 'address of' operation. >> > >> > As a result I can't figure out how to use constant derived types without >> > assigning them to a global. Say I want to use the C bindings function >> > LLVMValueRef LLVMConstString(char *, int, int) to get an int8* pointer to a >> > C string constant - there doesn't seem to be any way to directly use the >> > resulting [N x i8] value directly and there's no operator that gives me its >> > address. >> > >> snip >> > >> > Is it possible to get the address of an element of a constant array or >> > struct without first initializing a global variable to the constant? >> >> No. >> >> OK. I'd have preferred to have to avoid bloating the module symbol table with global symbols that will never be referenced but it's no big deal. >> >> -Eli >> >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100111/f86ce73d/attachment-0001.html From stoklund at 2pi.dk Mon Jan 11 17:50:26 2010 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Mon, 11 Jan 2010 15:50:26 -0800 Subject: [LLVMdev] Setting TARGET_LLCFLAGS in the environment Message-ID: <9E5E3501-787E-4E83-ABB6-6D461CDE94AD@2pi.dk> Weird issue beyond my make-fu: When running the test-suite, this works fine: make TARGET_LLCFLAGS='-mcpu=cortex-a8 -mattr=+thumb2' TEST=nightly report But this fails: export TARGET_LLCFLAGS='-mcpu=cortex-a8 -mattr=+thumb2' make TEST=nightly report It looks like the following line from Makefile.rules is executed multiple times: TARGET_LLCFLAGS += -relocation-model=pic -disable-fp-elim This causes llc to complain about the -relocation-model and -disable-fp-elim options being given multiple times. (Sometime they are repeated three or four times). Clearly there is some make magic I don't understand here. Does anyone know what is going on? From viridia at gmail.com Mon Jan 11 18:30:10 2010 From: viridia at gmail.com (Talin) Date: Mon, 11 Jan 2010 16:30:10 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> Message-ID: I'm working on a new version of the patch. Another thing I wanted to ask about - do you prefer to have one giant patch that has everything, or a series of incremental patches? I can see advantages either way. Normally I would want to do this as a series of incremental patches, however this is a rather large project and it may take me quite a while before it's completely done. I don't doubt that I will need some assistance when it comes to the trickier parts (like the optimization aspects you mentioned.) So there's a risk involved in submitting the first one or two patches, because the final patch might not be ready in time for the next release. 
On the other hand, it will be a lot easier for others to assist if we go ahead and submit the initial work.

On Mon, Jan 11, 2010 at 12:02 PM, Chris Lattner wrote:

>
> On Jan 11, 2010, at 11:10 AM, Talin wrote:
>
> > Quick question - should unions enforce that all member types are unique?
> > I realize that a union of { i32, i32 } doesn't make sense, but should the
> > code actually forbid this?
>
> Either way works for me.
>
> > As far as constants go, as long as the initializer is an exact match for
> > one of the member types, it should be no problem.
>
> Right, please propose a syntax and a class to use (ConstantUnion?) for it,
>
> -Chris

--
-- Talin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100111/5af69ee8/attachment.html

From mmuller at enduden.com Mon Jan 11 20:29:18 2010
From: mmuller at enduden.com (Michael Muller)
Date: Mon, 11 Jan 2010 21:29:18 -0500
Subject: [LLVMdev] Using a function from another module
References: <400d33ea1001111305m41f76c6fp7cd826e261fc3c1e@mail.gmail.com>
Message-ID: <16528.1263263358.322849.1903303283@succubus>

Kenneth Uildriks wrote:
> On Mon, Jan 11, 2010 at 1:39 PM, Jeffrey Yasskin wrote:
> > The JIT tries to handle this in some cases
> > (http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ExecutionEngine/ExecutionEngine.cpp?annotate=92771#l942),
> > but doesn't handle it for functions. There aren't any tests, so I'm
> > not surprised it's broken.
> >
> > The JIT would be simpler if we just dropped multiple-module support
> > and asked people to link their modules together before trying to JIT
> > them. Is there a reason you can't do that?
>
> I'd like to be able to JIT a module, call imported functions from it,
> and then write out that module without including the bodies of the
> imported functions in the output .bc file.
> Yeah, I have a similar use case - I want to be able to execute and dump the .bc file to a filesystem cache to avoid subsequent compiles. I've come up with a fairly elegant work-around, but I'll try to put together a unit test that illustrates the failure. If I feel ambitious, I'll also take a stab at a fix. ============================================================================= michaelMuller = mmuller at enduden.com | http://www.mindhog.net/~mmuller ----------------------------------------------------------------------------- Society in every state is a blessing, but government even in its best state is but a necessary evil; in its worst state an intolerable one... - Thomas Paine ============================================================================= From minwook.ahn at gmail.com Mon Jan 11 22:25:43 2010 From: minwook.ahn at gmail.com (minwook Ahn) Date: Tue, 12 Jan 2010 13:25:43 +0900 Subject: [LLVMdev] [LLVMDev] Does our own developed module and functions can go along with the future improved version of LLVM? Message-ID: <3d49ce701001112025v1801244wea5c19248d250796@mail.gmail.com> Hello. I am a compiler developer of our team. We try to build our own compiler for our own processor. We want to build our compiler based on LLVM by adding our own modules and functions which are specific to the features of our processor hardware. In case of our developed modules, is it guaranteed that the modules can work in the future version of LLVM? In order to do so, what guideline is required to do that? Thank you in advance. Minwook Ahn -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100112/2f698b72/attachment.html From baldrick at free.fr Tue Jan 12 02:27:05 2010 From: baldrick at free.fr (Duncan Sands) Date: Tue, 12 Jan 2010 09:27:05 +0100 Subject: [LLVMdev] [LLVMDev] Does our own developed module and functions can go along with the future improved version of LLVM? 
In-Reply-To: <3d49ce701001112025v1801244wea5c19248d250796@mail.gmail.com> References: <3d49ce701001112025v1801244wea5c19248d250796@mail.gmail.com> Message-ID: <4B4C3259.3010905@free.fr> Hi Minwook Ahn, > We want to build our compiler based on LLVM by adding our own modules > and functions > > which are specific to the features of our processor hardware. do you mean that you have files containing bitcode which contain useful routines for your processor, and that you use like a library? > In case of our developed modules, is it guaranteed that the modules can > work in the future version of LLVM? The LLVM policy is that old bitcode should continue to work in future releases. Usually bitcode is transparently upgraded when loaded by newer tools, however some future versions may require you to run an upgrade tool on the bitcode. This has happened in the past when the internal changes were large enough to make auto-upgrade impractical. Best wishes, Duncan. From junk at giantblob.com Tue Jan 12 02:45:14 2010 From: junk at giantblob.com (James Williams) Date: Tue, 12 Jan 2010 08:45:14 +0000 Subject: [LLVMdev] Operations on constant array value? In-Reply-To: <43538E5B-44B1-412D-B496-8124CF266C91@gmail.com> References: <43538E5B-44B1-412D-B496-8124CF266C91@gmail.com> Message-ID: 2010/1/11 Garrison Venn > Sorry to keep this thread alive, but I'm learning so ... > > There is more. The doc for GlobalValue::LinkageTypes or the C API > LLVMLinkage is not as clear as the > lang ref manual. See: http://llvm.org/docs/LangRef.html#linkage. I'm > pointing this out because something > like LinkerPrivateLinkage (LLVMLinkerPrivateLinkage), or another one, might > be more appropriate to > your throw away use case (if I understand your use correctly). One should > test this of course, and/or one > of the experts could chime in hint hint. 
:-) > I think I'd understand the LLVM linkage document better if I had more knowledge of ELF - I suspect there's a close correspondance between LLVM linkage types and corresponding features in ELF but I could be confused here. I also found the documentation on derived types less clear in places. I might have missed it but I don't think it's made explicit what operations are allowed on derived types (actually none?) Otherwise the LLVM documentation is generally very good and this together with the clear and orthogonal nature of the IR has enabled me to make huge progress towards replacing my compiler's back end with LLVM in a couple of weekends and a few evenings. -- James Anyway thought this info. might be useful, however rehashed. > > Garrison > > On Jan 11, 2010, at 16:34, James Williams wrote: > > 2010/1/11 Garrison Venn > >> I have not tried this, but a linkage type of PrivateLinkage would not add >> to the symbol table according >> to the doc. >> >> LLVMSetLinkage(g, LLVMPrivateLinkage); >> > > Thanks - I hadn't thought of that. > >> >> Garrison >> >> On Jan 11, 2010, at 14:03, James Williams wrote: >> >> 2010/1/11 Eli Friedman >> >>> On Mon, Jan 11, 2010 at 7:07 AM, James Williams >>> wrote: >>> > Hi, >>> > >>> > I've read http://llvm.org/docs/LangRef.html#t_array and >>> > http://llvm.org/docs/GetElementPtr.html and if I've understood right >>> there >>> > are no operations that act directly on arrays - instead I need to use >>> > getelementptr on a pointer to an array to get a pointer to an array >>> element. >>> > I also understand that there is no 'address of' operation. >>> > >>> > As a result I can't figure out how to use constant derived types >>> without >>> > assigning them to a global. 
Say I want to use the C bindings function >>> > LLVMValueRef LLVMConstString(char *, int, int) to get an int8* pointer >>> to a >>> > C string constant - there doesn't seem to be any way to directly use >>> the >>> > resulting [N x i8] value directly and there's no operator that gives me >>> its >>> > address. >>> > >>> snip >>> >>> > >>> > Is it possible to get the address of an element of a constant array or >>> > struct without first initializing a global variable to the constant? >>> >>> No. >>> >> >> OK. I'd have preferred to have to avoid bloating the module symbol table >> with global symbols that will never be referenced but it's no big deal. >> >>> >>> -Eli >>> >>> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100112/19f11e4a/attachment.html From pazzodalegare at gmail.com Tue Jan 12 06:45:18 2010 From: pazzodalegare at gmail.com (Pazzo Da Legare) Date: Tue, 12 Jan 2010 13:45:18 +0100 Subject: [LLVMdev] building a llvm-arm-elf crosscompiler on OSX 10.5 In-Reply-To: <1a26b4921001101517h3453044fm649a68534842a40a@mail.gmail.com> References: <1a26b4921001081755y17c35002hf1cc50c951a178cc@mail.gmail.com> <1a26b4921001091522k78640a2ahc212af3417df6f1a@mail.gmail.com> <1a26b4921001101229p3237c5f9i467d888fce05780c@mail.gmail.com> <1a26b4921001101517h3453044fm649a68534842a40a@mail.gmail.com> Message-ID: <1a26b4921001120445s7e01db87r6ccba7f4f92b700a@mail.gmail.com> Dear ML, Dear Anton, I'm writing you again to add more informations I discovered: looking at the following statement which gives me errors: /Users/dummy/Develop/llvm/llvm-gcc-build/./gcc/xgcc -B/Users/dummy/Develop/llvm/llvm-gcc-build/./gcc/ -B/usr/local/cross-llvm-gcc-arm-elf-4.2-2.6/arm-elf/bin/ 
-B/usr/local/cross-llvm-gcc-arm-elf-4.2-2.6/arm-elf/lib/ -isystem /usr/local/cross-llvm-gcc-arm-elf-4.2-2.6/arm-elf/include -isystem /usr/local/cross-llvm-gcc-arm-elf-4.2-2.6/arm-elf/sys-include -O2 -O2 -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fno-inline -g -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -Dinhibit_libc -I. -I. -I../../llvm-gcc4.2-2.6.source/gcc -I../../llvm-gcc4.2-2.6.source/gcc/. -I../../llvm-gcc4.2-2.6.source/gcc/../include -I./../intl -I../../llvm-gcc4.2-2.6.source/gcc/../libcpp/include -I../../llvm-gcc4.2-2.6.source/gcc/../libdecnumber -I../libdecnumber -I/Users/dummy/Develop/llvm/llvm-build/include -I/Users/dummy/Develop/llvm/llvm-2.6/include -mthumb -fexceptions -c ../../llvm-gcc4.2-2.6.source/gcc/unwind-dw2-fde.c -o libgcc/thumb/unwind-dw2-fde.o

I found that if I change -mthumb to -mthumb-interwork, I get no errors. How can I instruct configure to use -mthumb-interwork instead of -mthumb?

Thank you again for your attention,

mp

2010/1/11 Pazzo Da Legare :
> Dear ML,
>
> Anton, Thank you for your answer and your help.
> I had a look at ARM.td of LLVM 2.6 (in lib/Target/ARM..) where I found
> the following definitions:
>
> // V4T Processors.
> def : ProcNoItin<"arm7tdmi",        [ArchV4T]>;
> def : ProcNoItin<"arm7tdmi-s",      [ArchV4T]>;
> def : ProcNoItin<"arm710t",         [ArchV4T]>;
> def : ProcNoItin<"arm720t",         [ArchV4T]>;
> def : ProcNoItin<"arm9",            [ArchV4T]>;
> def : ProcNoItin<"arm9tdmi",        [ArchV4T]>;
> def : ProcNoItin<"arm920",          [ArchV4T]>;
> def : ProcNoItin<"arm920t",         [ArchV4T]>;
> def : ProcNoItin<"arm922t",         [ArchV4T]>;
> def : ProcNoItin<"arm940t",         [ArchV4T]>;
> def : ProcNoItin<"ep9312",          [ArchV4T]>;
>
> I would like to understand if LLVM can be used for ArchV4T or not.
> Could you please indicate specific documentation for llvm ARM codegen?
> > Does anybody use llvm with arm7tdmi ucontroller (e.g. at91sam7xxx) > > Thank you again for your help, > > pz > > 2010/1/10 Anton Korobeynikov : >> Hello, Pazzo >> >>> Any clue? >> Yes. Sorry, my fault - next time I should check ARM docs before replying. >> ARM7TDMI is ARMv4T and this is not supported by LLVM (LLVM does v5+ codegen). >> > From gabor at mac.com Tue Jan 12 09:22:42 2010 From: gabor at mac.com (Gabor Greif) Date: Tue, 12 Jan 2010 16:22:42 +0100 Subject: [LLVMdev] LICM ilist question. Message-ID: <4B4C93C2.9010006@mac.com> Hi Gang-Ryung! Your reverse iteration of instructions in the BB > * for (BasicBlock::iterator II = BB->end(); II != BB->begin(); ) *{ > > Instruction &I = *--II; > > if (isLoopInvariantInst(I) && canSinkOrHoistInst(I) && > isSafeToExecuteUnconditionally(I)) > * hoist(I);* > } looks perfectly valid. If I remember correctly, the (operator--) on Instruction has a buggy assert, but that should not trigger in your case. (Adding unit tests for reverse iteration is on my TODO list.) I suspect that your "hoist(I)" call removes the instruction "I" from the BB and puts it into the first position of another basic block. This could mess up the "II != BB->begin()" test. Hope this helps! Cheers, Gabor From pkk at spth.de Tue Jan 12 10:31:33 2010 From: pkk at spth.de (Philipp Klaus Krause) Date: Tue, 12 Jan 2010 17:31:33 +0100 Subject: [LLVMdev] How to use llvm to cross-compile to C Message-ID: <4B4CA3E5.5010308@spth.de> I want to use llvm to develop for an embedded system. The embedded system has a C compiler. I hope llvm will provide a way to use other languages. How can I do this? How do I tell llvm e.g. the size of ints on the target, etc? Philipp From rnk at mit.edu Tue Jan 12 11:05:27 2010 From: rnk at mit.edu (Reid Kleckner) Date: Tue, 12 Jan 2010 12:05:27 -0500 Subject: [LLVMdev] [LLVMDev] Does our own developed module and functions can go along with the future improved version of LLVM? 
In-Reply-To: <4B4C3259.3010905@free.fr> References: <3d49ce701001112025v1801244wea5c19248d250796@mail.gmail.com> <4B4C3259.3010905@free.fr> Message-ID: <9a9942201001120905p4d977226gf29ce9cd3121b066@mail.gmail.com> On Tue, Jan 12, 2010 at 3:27 AM, Duncan Sands wrote: > Hi Minwook Ahn, > >> We want to build our compiler based on LLVM by adding our own modules >> and functions >> >> which are specific to the features of our processor hardware. > > do you mean that you have files containing bitcode which contain useful > routines for your processor, and that you use like a library? I think the question was, can they write their own backend for LLVM (a new Target) and will their code automatically work with future releases of LLVM. In that case, the answer is yes, you can develop your own backend, but no, LLVM does not provide API stability. As new versions of LLVM are released you would have to update your code to the new API or stay with the old version of LLVM. Reid From grosser at fim.uni-passau.de Tue Jan 12 11:59:45 2010 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Tue, 12 Jan 2010 18:59:45 +0100 Subject: [LLVMdev] Make LoopBase inherit from "RegionBase"? In-Reply-To: <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> <4B473032.5080803@gmail.com> <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> Message-ID: <4B4CB891.5090902@fim.uni-passau.de> On 01/08/10 14:20, ether wrote: > sorry that i forgot to change the subjuect > > > hi all, Hi ether, now a kind of more complete answer. > On 2010-1-7 0:11, John Mosby wrote: >> In LLVM we could add support for generalized CFG regions and >> RegionPasses. 
A region is a part of the CFG. The only information we >> have is, that it has one entry and one exit, this it can be optimized >> separately. >> I think this is the best way to add region analysis. I must admit this >> approach >> helps me on another, similar project I'm working on in parallel (no >> pun intended). >> Tobias, is this how you are architecting your region analysis already? >> >> John >> > > i just implementing the skeleton of Region/RegionInfo like LoopBase and > LoopInfoBase[1] in the llvm existing codes, and found that theres lots > of common between "Region" and "Loop": > > 1. both of them are consist of several BasicBlocks Correct. > 2. both of them have some kind of nested structures, so both a loop and > a region could have parent or childrens Correct. > 3. both of them have a BasicBlocks(header of a loop and "entry" of a > region) that dominates all others Correct. > and the Region class will have the most stuffs very similar in LoopBase, > like: ParentRegion, SubRegions, Blocks, getRegionDepth(), Correct. > getExitBlock(), getExitingBlock() ...... This might need some thoughts, > so, could us just treat "Loop" as some kind of general "Region" of > BasicBlocks, and make Loop and Region inherit from "RegionBase"? I would like to do so, as I like the structure of this approach. However until now my pass was written on the side, as a proof of concept. I wrote two Passes: 1. Regions Detect the regions and print the regions tree. Try it with: opt -regions -analyze file.bc 2. RegionsWithoutLoops Find the maximal regions that do not contain any loops. Try it with: opt -regions-without-loops file.bc opt -view-regions-without-loops file.bc (needs graphviz) Both ATM only work on BasicBlocks. However I have seen the patches in your sandbox and I really like the idea to keep the analysis general. If you are interested you could have a look at my sandbox (not yet well documented and cleanly formatted). We might want to think about, how to merge our work. 
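The shared-base-class idea discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from either sandbox: `Block`, `RegionBase`, `Region` and `Loop` are hypothetical stand-ins (not the real LLVM classes), and the base class carries only the members the thread lists as common to loops and regions (parent, sub-regions, blocks, nesting depth).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a basic block; only a name is needed here.
struct Block {
  std::string Name;
};

// Hypothetical common base for loops and regions. DerivedT is the concrete
// type (CRTP), so parent and child pointers keep their precise type.
template <typename DerivedT>
class RegionBase {
public:
  explicit RegionBase(Block *Entry) { Blocks.push_back(Entry); }

  // The entry block (loop header or region entry) is always stored first.
  Block *getEntry() const { return Blocks.front(); }
  DerivedT *getParent() const { return Parent; }
  const std::vector<DerivedT *> &getSubRegions() const { return SubRegions; }
  const std::vector<Block *> &getBlocks() const { return Blocks; }

  void addBlock(Block *B) { Blocks.push_back(B); }

  // Registers Child as nested inside this region and sets its parent link.
  void addSubRegion(DerivedT *Child) {
    assert(Child && "sub-region must not be null");
    RegionBase &Base = *Child;
    Base.Parent = static_cast<DerivedT *>(this);
    SubRegions.push_back(Child);
  }

  // Depth 1 for a top-level region, plus one for each enclosing parent.
  unsigned getDepth() const {
    unsigned Depth = 1;
    for (const DerivedT *P = Parent; P; P = P->getParent())
      ++Depth;
    return Depth;
  }

private:
  DerivedT *Parent = nullptr;
  std::vector<DerivedT *> SubRegions;
  std::vector<Block *> Blocks;
};

// A general single-entry, single-exit region additionally records its exit.
struct Region : RegionBase<Region> {
  Region(Block *Entry, Block *ExitBlock)
      : RegionBase<Region>(Entry), Exit(ExitBlock) {}
  Block *Exit;
};

// A loop is the special case whose entry block is the loop header.
struct Loop : RegionBase<Loop> {
  explicit Loop(Block *Header) : RegionBase<Loop>(Header) {}
  Block *getHeader() const { return getEntry(); }
};
```

With this layout, nesting an inner `Region` via `addSubRegion` gives it depth 2 and a parent pointer to the outer region, while `Loop` reuses the same bookkeeping and only adds loop-specific accessors.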
Tobi From samuraileumas at yahoo.com Tue Jan 12 12:14:14 2010 From: samuraileumas at yahoo.com (Samuel Crow) Date: Tue, 12 Jan 2010 10:14:14 -0800 (PST) Subject: [LLVMdev] Extra command line options from the LLVM CommandLine parser Message-ID: <142526.58008.qm@web62008.mail.re1.yahoo.com> Hello everybody, We've just got our command-line driven program to compile after having converted our command-line option parser to use LLVM's commandline class. The problem we've run into is that we get the following for llvmpeg --help : USAGE: llvmpeg [options] OPTIONS: -I= - Specify an additional include path -help - Display available options (--help-hidden for more) -o= - Specify output filename -stats - Enable statistics output from program -time-passes - Time each pass, printing elapsed time for each on exit -verbose - Print informational messages -verify-dom-info - Verify dominator info (time consuming) -version - Display the version of this program The problem is that we never asked for the -stats, -time-passes, and -verify-dom-info options. We're using LLVM 2.6. Has this been fixed in trunk? 
Thanks, --Sam From bob.wilson at apple.com Tue Jan 12 12:21:25 2010 From: bob.wilson at apple.com (Bob Wilson) Date: Tue, 12 Jan 2010 10:21:25 -0800 Subject: [LLVMdev] Setting TARGET_LLCFLAGS in the environment In-Reply-To: <9E5E3501-787E-4E83-ABB6-6D461CDE94AD@2pi.dk> References: <9E5E3501-787E-4E83-ABB6-6D461CDE94AD@2pi.dk> Message-ID: <8C96415A-51EB-4FCF-8790-DAC405020955@apple.com> On Jan 11, 2010, at 3:50 PM, Jakob Stoklund Olesen wrote: > Weird issue beyond my make-fu: > > When running the test-suite, this works fine: > > make TARGET_LLCFLAGS='-mcpu=cortex-a8 -mattr=+thumb2' TEST=nightly report > > But this fails: > > export TARGET_LLCFLAGS='-mcpu=cortex-a8 -mattr=+thumb2' > make TEST=nightly report > > It looks like the following line from Makefile.rules is executed multiple times: > > TARGET_LLCFLAGS += -relocation-model=pic -disable-fp-elim > > This causes llc to complain about the -relocation-model and -disable-fp-elim options being given multiple times. (Sometime they are repeated three or four times). > > Clearly there is some make magic I don't understand here. Does anyone know what is going on? The variables set on the command-line to make are passed along to recursive makes via $(MAKEFLAGS). If you set TARGET_LLCFLAGS on the command-line, every recursive invocation is going to get that same value and will add "-relocation-model=pic -disable-fp-elim" only once. But, if you don't set it on the command-line, make will pick up the value from the environment and add those extra options to whatever it finds. I bet it is then updating the environment value, so that a child make process will see the value that already has the extra options. If that child make also includes Makefile.rules, it will then add another copy of the extra options. Each level of recursive make will keep adding another copy. (The way that make interacts with the environment is confusing, so I'm not 100% sure this is what's happening but it matches the symptoms you are seeing.) 
-------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100112/55f3e902/attachment.html From stoklund at 2pi.dk Tue Jan 12 12:30:52 2010 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Tue, 12 Jan 2010 10:30:52 -0800 Subject: [LLVMdev] Setting TARGET_LLCFLAGS in the environment In-Reply-To: <8C96415A-51EB-4FCF-8790-DAC405020955@apple.com> References: <9E5E3501-787E-4E83-ABB6-6D461CDE94AD@2pi.dk> <8C96415A-51EB-4FCF-8790-DAC405020955@apple.com> Message-ID: <1DED81C0-DA5B-4596-851B-46460F5393E7@2pi.dk> On Jan 12, 2010, at 10:21 AM, Bob Wilson wrote: > > On Jan 11, 2010, at 3:50 PM, Jakob Stoklund Olesen wrote: > >> Weird issue beyond my make-fu: >> >> When running the test-suite, this works fine: >> >> make TARGET_LLCFLAGS='-mcpu=cortex-a8 -mattr=+thumb2' TEST=nightly report >> >> But this fails: >> >> export TARGET_LLCFLAGS='-mcpu=cortex-a8 -mattr=+thumb2' >> make TEST=nightly report >> >> It looks like the following line from Makefile.rules is executed multiple times: >> >> TARGET_LLCFLAGS += -relocation-model=pic -disable-fp-elim >> >> This causes llc to complain about the -relocation-model and -disable-fp-elim options being given multiple times. (Sometime they are repeated three or four times). >> >> Clearly there is some make magic I don't understand here. Does anyone know what is going on? > > The variables set on the command-line to make are passed along to recursive makes via $(MAKEFLAGS). If you set TARGET_LLCFLAGS on the command-line, every recursive invocation is going to get that same value and will add "-relocation-model=pic -disable-fp-elim" only once. But, if you don't set it on the command-line, make will pick up the value from the environment and add those extra options to whatever it finds. I bet it is then updating the environment value, so that a child make process will see the value that already has the extra options. 
If that child make also includes Makefile.rules, it will then add another copy of the extra options. Each level of recursive make will keep adding another copy. (The way that make interacts with the environment is confusing, so I'm not 100% sure this is what's happening but it matches the symptoms you are seeing.)

That makes sense. I think the number of extra arguments added depended on the directory level of the test case.

One rather horrible fix would be

TARGET_LLCFLAGS := $(filter-out -relocation-model=pic, $(TARGET_LLCFLAGS)) -relocation-model=pic

Or I can just keep on typing in the long command lines...

Thanks!
/jakob

From clattner at apple.com Tue Jan 12 16:10:31 2010
From: clattner at apple.com (Chris Lattner)
Date: Tue, 12 Jan 2010 14:10:31 -0800
Subject: [LLVMdev] Extra command line options from the LLVM CommandLine parser
In-Reply-To: <142526.58008.qm@web62008.mail.re1.yahoo.com>
References: <142526.58008.qm@web62008.mail.re1.yahoo.com>
Message-ID: <87A45A74-A739-486C-83AE-B16C490574BE@apple.com>

On Jan 12, 2010, at 10:14 AM, Samuel Crow wrote:

> Hello everybody,
>
> We've just got our command-line driven program to compile after
> having converted our command-line option parser to use LLVM's
> commandline class. The problem we've run into is that we get the
> following for llvmpeg --help :

You'll get these if you link in the LLVM pass manager, Statistics.cpp or other LLVM files that use cl::opt. There isn't a really great way to define multiple different namespaces for command line options to live in at this point.
-Chris > > USAGE: llvmpeg [options] > > OPTIONS: > -I= - Specify an additional include path > -help - Display available options (--help-hidden for more) > -o= - Specify output filename > -stats - Enable statistics output from program > -time-passes - Time each pass, printing elapsed time for each > on exit > -verbose - Print informational messages > -verify-dom-info - Verify dominator info (time consuming) > -version - Display the version of this program > > The problem is that we never asked for the -stats, -time-passes, and > -verify-dom-info options. We're using LLVM 2.6. Has this been > fixed in trunk? > > Thanks, > > --Sam > > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From clattner at apple.com Tue Jan 12 16:11:28 2010 From: clattner at apple.com (Chris Lattner) Date: Tue, 12 Jan 2010 14:11:28 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> Message-ID: <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> On Jan 11, 2010, at 4:30 PM, Talin wrote: > I'm working on a new version of the patch. > > Another thing I wanted to ask about - do you prefer to have one > giant patch that has everything, or a series of incremental patches? > I can see advantages either way. A series of incremental patches is strongly preferred, starting with LangRef.html. > Normally I would want to do this as a series of incremental patches, > however this is a rather large project and it may take me quite a > while before it's completely done. I don't doubt that I will need > some assistance when it comes to the trickier parts (like the > optimization aspects you mentioned.) So there's a risk involved in > submitting the first one or two patches, because the final patch > might not be ready in time for the next release. 
> > On the other hand, it will be a lot easier for others to assist if > we go ahead and submit the initial work. No problem, just submit it as you go. When the langref piece goes in, just say in it that this is an experimental feature in development. Thanks Talin, -Chris From jan_sjodin at yahoo.com Tue Jan 12 17:56:57 2010 From: jan_sjodin at yahoo.com (Jan Sjodin) Date: Tue, 12 Jan 2010 15:56:57 -0800 (PST) Subject: [LLVMdev] Make LoopBase inherit from "RegionBase"? In-Reply-To: <4B4CB891.5090902@fim.uni-passau.de> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> <4B473032.5080803@gmail.com> <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> <4B4CB891.5090902@fim.uni-passau.de> Message-ID: <314301.28741.qm@web55601.mail.re4.yahoo.com> Why not use the "standard" algorithm for detecting SESE-regions and building a program structure tree? It should handle everything you want. It also becomes much simpler to specify a connected SESE-region by entry/exit edges, while a disconnected region is specified by entry/exit blocks. Only defining regions on blocks is not enough to be able to quickly determine how to replace/move a region in a CFG. The algorithm can be found in this paper: The Program Structure Tree: Computing Control Regions in Linear Time by Richard Johnson , David Pearson , KeshavPingali - Jan ----- Original Message ---- From: Tobias Grosser To: ether Cc: LLVM Developers Mailing List Sent: Tue, January 12, 2010 12:59:45 PM Subject: Re: [LLVMdev] Make LoopBase inherit from "RegionBase"? On 01/08/10 14:20, ether wrote: > sorry that i forgot to change the subjuect > > > hi all, Hi ether, now a kind of more complete answer. 
> On 2010-1-7 0:11, John Mosby wrote: >> In LLVM we could add support for generalized CFG regions and >> RegionPasses. A region is a part of the CFG. The only information we >> have is, that it has one entry and one exit, this it can be optimized >> separately. >> I think this is the best way to add region analysis. I must admit this >> approach >> helps me on another, similar project I'm working on in parallel (no >> pun intended). >> Tobias, is this how you are architecting your region analysis already? >> >> John >> > > i just implementing the skeleton of Region/RegionInfo like LoopBase and > LoopInfoBase[1] in the llvm existing codes, and found that theres lots > of common between "Region" and "Loop": > > 1. both of them are consist of several BasicBlocks Correct. > 2. both of them have some kind of nested structures, so both a loop and > a region could have parent or childrens Correct. > 3. both of them have a BasicBlocks(header of a loop and "entry" of a > region) that dominates all others Correct. > and the Region class will have the most stuffs very similar in LoopBase, > like: ParentRegion, SubRegions, Blocks, getRegionDepth(), Correct. > getExitBlock(), getExitingBlock() ...... This might need some thoughts, > so, could us just treat "Loop" as some kind of general "Region" of > BasicBlocks, and make Loop and Region inherit from "RegionBase"? I would like to do so, as I like the structure of this approach. However until now my pass was written on the side, as a proof of concept. I wrote two Passes: 1. Regions Detect the regions and print the regions tree. Try it with: opt -regions -analyze file.bc 2. RegionsWithoutLoops Find the maximal regions that do not contain any loops. Try it with: opt -regions-without-loops file.bc opt -view-regions-without-loops file.bc (needs graphviz) Both ATM only work on BasicBlocks. However I have seen the patches in your sandbox and I really like the idea to keep the analysis general. 
If you are interested you could have a look at my sandbox (not yet well documented and cleanly formatted). We might want to think about, how to merge our work. Tobi _______________________________________________ LLVM Developers mailing list LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From viridia at gmail.com Tue Jan 12 19:01:30 2010 From: viridia at gmail.com (Talin) Date: Tue, 12 Jan 2010 17:01:30 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> Message-ID: Here is the LangRef part of the patch. On Tue, Jan 12, 2010 at 2:11 PM, Chris Lattner wrote: > > On Jan 11, 2010, at 4:30 PM, Talin wrote: > > I'm working on a new version of the patch. >> >> Another thing I wanted to ask about - do you prefer to have one giant >> patch that has everything, or a series of incremental patches? I can see >> advantages either way. >> > > A series of incremental patches is strongly preferred, starting with > LangRef.html. > > > Normally I would want to do this as a series of incremental patches, >> however this is a rather large project and it may take me quite a while >> before it's completely done. I don't doubt that I will need some assistance >> when it comes to the trickier parts (like the optimization aspects you >> mentioned.) So there's a risk involved in submitting the first one or two >> patches, because the final patch might not be ready in time for the next >> release. >> >> On the other hand, it will be a lot easier for others to assist if we go >> ahead and submit the initial work. >> > > No problem, just submit it as you go. When the langref piece goes in, just > say in it that this is an experimental feature in development. 
Thanks > Talin, > > -Chris > > -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100112/de2f07ac/attachment.html -------------- next part -------------- A non-text attachment was scrubbed... Name: unionref.patch Type: application/octet-stream Size: 8331 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100112/de2f07ac/attachment.obj From grosser at fim.uni-passau.de Tue Jan 12 19:14:33 2010 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Wed, 13 Jan 2010 02:14:33 +0100 Subject: [LLVMdev] Make LoopBase inherit from "RegionBase"? In-Reply-To: <25423_1263340620_4B4D0C4B_25423_3544_1_314301.28741.qm@web55601.mail.re4.yahoo.com> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> <4B473032.5080803@gmail.com> <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> <4B4CB891.5090902@fim.uni-passau.de> <25423_1263340620_4B4D0C4B_25423_3544_1_314301.28741.qm@web55601.mail.re4.yahoo.com> Message-ID: <4B4D1E79.8070807@fim.uni-passau.de> On 01/13/10 00:56, Jan Sjodin wrote: > Why not use the "standard" algorithm for detecting SESE-regions and building a program structure tree? > It should handle everything you want. It also becomes much simpler to specify a connected SESE-region > by entry/exit edges, while a disconnected region is specified by entry/exit blocks. Only defining regions on > blocks is not enough to be able to quickly determine how to replace/move a region in a CFG. 
>
> The algorithm can be found in this paper:
> The Program Structure Tree: Computing Control Regions in Linear Time
> by Richard Johnson, David Pearson, Keshav Pingali

Hi Jan,

great to read you again. And thanks for pointing me to this paper, I read it but did some further research.

I like the idea of using edges to define the regions, however it does not catch all regions. Defining regions using just edges stops us from detecting a lot of very common regions.

Example: A very common CFG:

     \ /
      1
     / \
    2   3
    |   |
    4   6
    |   |
    5   7
     \ /
      8
     / \
    9   10
    |   |
    11  12
     \ /
      13
     / \

I would detect these two regions:

Region A: 1 -> 8 containing {1,2,3,4,5,6,7}
Region B: 8 -> 13 containing {8,9,10,11,12}

If I use edges to define the regions the detection is not possible at all.

After region detection the CFG can always be split up to create single entry single exit edges, if they are needed e.g. for code generation.

     \ /
     1_a
      |
      1
     / \
    2   3
    |   |
    4   6
    |   |
    5   7
     \ /
     8_a
      |
      8
     / \
    9   10
    |   |
    11  12
     \ /
     13_a
      |
      13
     / \

Now the regions can be defined using edges:

Region A: (1_a,1) -> (8_a, 8) containing {1,2,3,4,5,6,7,8_a}
Region B: (8_a, 8) -> (13_a, 13) containing {8,9,10,11,12,13_a}

In general this approach saves a preliminary pass that has to insert new bbs, to generate these edges. As we do not have to modify the CFG other passes like dominance information are still valid, and we do not have to create a lot of auxiliary bbs, to be able to detect all regions. This saves memory and runtime.
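The block-based definition used in the example above can be made concrete with a small brute-force check. This is only a sketch under stated assumptions, not the dominance-based detection the pass performs: the names `Cfg`, `collectBlocks`, `isRegion` and `exampleCfg` are hypothetical, and blocks 0 and 14 are assumed stand-ins for the unnamed predecessor and successor of the diagram. A pair (entry, exit) is accepted when every edge coming from outside the block set targets the entry block; edges leaving the set can only target the exit block because of how the set is collected.

```cpp
#include <map>
#include <set>
#include <vector>

// A CFG as adjacency lists: block id -> successor block ids.
using Cfg = std::map<int, std::vector<int>>;

// Blocks reachable from `entry` without passing through `exit`.
// By construction every successor of a collected block is either
// collected as well or is `exit` itself.
std::set<int> collectBlocks(const Cfg &cfg, int entry, int exit) {
  std::set<int> seen;
  std::vector<int> work{entry};
  while (!work.empty()) {
    int bb = work.back();
    work.pop_back();
    if (bb == exit || !seen.insert(bb).second)
      continue;  // stop at the exit block and at already visited blocks
    auto it = cfg.find(bb);
    if (it != cfg.end())
      for (int succ : it->second)
        work.push_back(succ);
  }
  return seen;
}

// True if (entry, exit) delimits a region in the block-based sense:
// `exit` is reached from the block set, and no edge from outside
// enters the set anywhere except at `entry`.
bool isRegion(const Cfg &cfg, int entry, int exit) {
  const std::set<int> blocks = collectBlocks(cfg, entry, exit);
  bool reachesExit = false;
  for (const auto &node : cfg) {
    const bool inside = blocks.count(node.first) != 0;
    for (int succ : node.second) {
      if (inside && succ == exit)
        reachesExit = true;
      // An edge from outside may only enter through the entry block.
      if (!inside && blocks.count(succ) != 0 && succ != entry)
        return false;
    }
  }
  return reachesExit;
}

// The "very common CFG" from the example. Blocks 0 and 14 are assumed
// placeholders for the unnamed predecessor and successor.
Cfg exampleCfg() {
  return {
      {0, {1}},
      {1, {2, 3}}, {2, {4}}, {4, {5}}, {3, {6}}, {6, {7}},
      {5, {8}},    {7, {8}},
      {8, {9, 10}}, {9, {11}}, {10, {12}}, {11, {13}}, {12, {13}},
      {13, {14}},  {14, {}},
  };
}
```

Calling `isRegion(exampleCfg(), 1, 8)` and `isRegion(exampleCfg(), 8, 13)` accepts Region A and Region B without inserting any auxiliary blocks, while a pair such as (1, 5) is rejected because the edge 5 -> 8 enters the collected block set from outside.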
In general it is probably not too easy to decide where to insert these bbs either:

CFG:
        0
        |
        1
       / |
      2  |
     / \ 3
    4   5 |
    |   | |
    6   7 8
     \  | /
      \ |/      region A: 1 -> 9 {1,2,3,4,5,6,7,8}
        9       region B: 2 -> 9 {2,4,5,6,7}

So we need one bb that joins 6 and 7 and one that joins the two regions

CFG:
        0
        |
        1
       / |
      2  |
     / \ 3
    4   5 |
    |   | |
    6   7 8
     \  | |
      \ | |     region A: (0,1) -> (9b,9) {1,2,3,4,5,6,7,8,9a,9b}
       9a |     region B: (1,2) -> (9a,9b) {2,4,5,6,7,9a}
        \ /
        9b
         |
         9

My approach is comparable to this paper:
The Refined Process Structure Tree by Jussi Vanhatalo, Hagen Völzer, Jana Koehler

The implementation however takes advantage of the existence of Dominance/PostDominance information. Therefore it is simpler and hopefully faster. At the moment run time is comparable to dominance tree calculation.

If you want, have a look into some results I got with a pass extracting maximal non trivial regions that do not contain loops from libbz2:

http://tobias.osaft.eu/llvm/region/bz2NoLoops/

Interesting ones:

regions_without_loops.BZ2_bzBuffToBuffCompress.dot.png
has a lot of exit edges.

regions_without_loops.bzopen_or_bzdopen.dot.png
the first region has two entry edges. One is the loop latch.
(Keep in mind all regions have the same color, so if it seems there is an edge into a region, there are just two regions close by)

Without a prepass that exposes the edges almost no region could be detected with the "standard" approach.

It would be great to hear your opinion about this.

See you
Tobias

From gohman at apple.com Tue Jan 12 19:46:14 2010
From: gohman at apple.com (Dan Gohman)
Date: Tue, 12 Jan 2010 17:46:14 -0800
Subject: [LLVMdev] [PATCH] - Union types, attempt 2
In-Reply-To:
References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com>
Message-ID:

On Jan 12, 2010, at 5:01 PM, Talin wrote:

> Here is the LangRef part of the patch.
> +

The union type is used to represent a set of possible data types which can
> + exist at a given location in memory (also known as an "untagged"
> + union). [...]

This wording is somewhat misleading; memory in LLVM has no types. How about:

"A union type describes an object with size and alignment suitable for an object of any one of a given set of types."

Also, is it really useful to support insertvalue/extractvalue/getelementptr on unions? The benefit of unions that I'm aware of is that they allow target-independent IR to work with appropriately sized and aligned memory. This doesn't require any special support for accessing union members; for example:

  %p = alloca union { i32, double }
  %q = bitcast union { i32, double }* %p to double*
  store double 2.0, double* %q

Would this be a reasonable approach?

Dan

From jan_sjodin at yahoo.com Tue Jan 12 22:09:10 2010
From: jan_sjodin at yahoo.com (Jan Sjodin)
Date: Tue, 12 Jan 2010 20:09:10 -0800 (PST)
Subject: [LLVMdev] Make LoopBase inherit from "RegionBase"?
In-Reply-To: <4B4D1E79.8070807@fim.uni-passau.de>
References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> <4B473032.5080803@gmail.com> <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> <4B4CB891.5090902@fim.uni-passau.de> <25423_1263340620_4B4D0C4B_25423_3544_1_314301.28741.qm@web55601.mail.re4.yahoo.com> <4B4D1E79.8070807@fim.uni-passau.de>
Message-ID: <457655.70882.qm@web55601.mail.re4.yahoo.com>

Hi Tobias

> In general this approach saves a preliminary pass that has to insert new
> bbs, to generate these edges. As we do not have to modify the CFG other
> passes like dominance information are still valid, and we do not have to
> create a lot of auxiliary bbs, to be able to detect all regions.
This
> saves memory and runtime. In general it is probably not too easy to
> decide where to insert these bbs either:

The general rule is to split all blocks with multiple in-edges and multiple out-edges into blocks with either multiple in-edges or multiple out-edges, but not both. One option is to keep this as an invariant throughout the compiler and make use of the merge blocks (multiple in-edges) to contain only PHI-nodes, and all other code in regular basic blocks. There are different variations on this that may or may not be useful.

> CFG:
>         0
>         |
>         1
>        / |
>       2  |
>      / \ 3
>     4   5 |
>     |   | |
>     6   7 8
>      \  | /
>       \ |/      region A: 1 -> 9 {1,2,3,4,5,6,7,8}
>         9       region B: 2 -> 9 {2,4,5,6,7}
>
> So we need one bb that joins 6 and 7 and one that joins the two regions
>
> CFG:
>         0
>         |
>         1
>        / |
>       2  |
>      / \ 3
>     4   5 |
>     |   | |
>     6   7 8
>      \  | |
>       \ | |     region A: (0,1) -> (9b,9) {1,2,3,4,5,6,7,8,9a,9b}
>        9a |     region B: (1,2) -> (9a,9b) {2,4,5,6,7,9a}
>         \ /
>         9b
>          |
>          9

It is fairly simple to use the information from the algorithm to decide where those merges should be inserted to get the expected regions. This may be needed in the cases where a sub-region is too complicated to be represented and must be abstracted into a "black box".

> My approach is comparable to this paper:
> The Refined Process Structure Tree by Jussi Vanhatalo, Hagen Völzer,
> Jana Koehler

I was looking through some slides that described their algorithm. One case that seems to be legal is this:

      |
    Entry
    /   \
   R0   R1
    \   /
     Exit
      |

With two fragments: Entry->R0->Exit and Entry->R1->Exit, which means that a fragment cannot be identified using only the entry and exit blocks, but the internal blocks or edges will also need to be listed. I don't know if this is relevant to your implementation.

> The implementation however takes advantage of the existence of
> Dominance/PostDominance information. Therefore it is simpler and
> hopefully faster. At the moment run time is comparable to dominance tree
> calculation.

Both algorithms are linear so there is really no big difference in time imo. I believe the biggest difference that you mention is that you can capture more complicated regions without having to modify the CFG with the current algorithm.

> If you want, have a look into some results I got with a pass extracting
> maximal non trivial regions that do not contain loops from libbz2:
>
> http://tobias.osaft.eu/llvm/region/bz2NoLoops/
>
> Interesting ones:
>
> regions_without_loops.BZ2_bzBuffToBuffCompress.dot.png
> has a lot of exit edges.

I think this example proves the strengths and weaknesses of both approaches. Making that region into structured control flow would add a lot of additional blocks. This will also happen after generating code from the polyhedral model, so either way the cost is there if the optimization is successful.

The second case is where the optimization fails (no profitable transformation found) and the CFG can remain untouched.

The third case is if one of those blocks contains something complicated. I believe the current algorithm simply fails and cannot detect the region. If the CFG is modified this would allow an internal SESE-region to become a black box, and the outer regions could be optimized.

> regions_without_loops.bzopen_or_bzdopen.dot.png
> the first region has two entry edges. One is the loop latch.
> (Keep in mind all regions have the same color, so if it seems there is
> an edge into a region, there are just two regions close by)
>
> Without a prepass that exposes the edges almost no region could be
> detected with the "standard" approach.

Indeed the CFG will have to be modified for these cases. It seems to me that the trade-off between the two approaches is that the algorithm that you currently have is cheaper up front, but may be less capable in some cases, while the "standard" algorithm will be more expensive, but can handle problematic regions better. Would you agree?
- Jan From grosser at fim.uni-passau.de Wed Jan 13 03:19:09 2010 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Wed, 13 Jan 2010 10:19:09 +0100 Subject: [LLVMdev] Make LoopBase inherit from "RegionBase"? In-Reply-To: <25423_1263356804_4B4D4B83_25423_4709_1_457655.70882.qm@web55601.mail.re4.yahoo.com> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> <4B473032.5080803@gmail.com> <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> <4B4CB891.5090902@fim.uni-passau.de> <25423_1263340620_4B4D0C4B_25423_3544_1_314301.28741.qm@web55601.mail.re4.yahoo.com> <4B4D1E79.8070807@fim.uni-passau.de> <25423_1263356804_4B4D4B83_25423_4709_1_457655.70882.qm@web55601.mail.re4.yahoo.com> Message-ID: <4B4D900D.6040601@fim.uni-passau.de> On 01/13/10 05:09, Jan Sjodin wrote: > Hi Tobias > >> In general this approach saves a preliminary pass that has to insert new > >> bbs, to generate these edges. As we do not have to modify the CFG other >> passes like dominance information are still valid, and we do not have to >> create a lot of auxiliary bbs, to be able to detect all regions. This >> saves memory and runtime. In general it is probably not too easy to >> decide where to insert these bbs either: > > The general rule is to split all blocks with multiple in-edges and multiple out-edges > into blocks with either multiple in-edges or multiple out-edges, but not both. This is not sufficient, as shown in the example below. It would allow only one region. > One option is to keep this as an invariant throughout the compiler and make use > of the merge blocks (multiple in-edges) to contain only PHI-nodes, and all other code > in regular basic blocks. 
There are different variations on this that may or may not be
> useful.

This might be possible, however probably doubling the number of bbs.

>
>> CFG:
>>         0
>>         |
>>         1
>>        / |
>>       2  |
>>      / \ 3
>>     4   5 |
>>     |   | |
>>     6   7 8
>>      \  | /
>>       \ |/      region A: 1 -> 9 {1,2,3,4,5,6,7,8}
>>         9       region B: 2 -> 9 {2,4,5,6,7}
>>
>> So we need one bb that joins 6 and 7 and one that joins the two regions
>>
>> CFG:
>>         0
>>         |
>>         1
>>        / |
>>       2  |
>>      / \ 3
>>     4   5 |
>>     |   | |
>>     6   7 8
>>      \  | |
>>       \ | |     region A: (0,1) -> (9b,9) {1,2,3,4,5,6,7,8,9a,9b}
>>        9a |     region B: (1,2) -> (9a,9b) {2,4,5,6,7,9a}
>>         \ /
>>         9b
>>          |
>>          9
>
> It is fairly simple to use the information from the algorithm to decide
> where those merges should be inserted to get the expected regions.
> This may be needed in the cases where a sub-region is too complicated
> to be represented and must be abstracted into a "black box".

From which algorithm? The program structure tree does not give this information, does it?

>> My approach is comparable to this paper:
>> The Refined Process Structure Tree by Jussi Vanhatalo, Hagen Völzer,
>> Jana Koehler
>
> I was looking through some slides that described their algorithm. One case that
> seems to be legal is this:
>
>       |
>     Entry
>     /   \
>    R0   R1
>     \   /
>      Exit
>       |
>
> With two fragments: Entry->R0->Exit and Entry->R1->Exit, which means
> that a fragment cannot be identified using only the entry and exit blocks, but
> the internal blocks or edges will also need to be listed. I don't know if this is
> relevant to your implementation.

No. The ideas are comparable, however I believe their implementation is a little complicated. ;-)

I would mark the regions as:

Region A: R0 -> Exit, containing {R0}
Region B: R1 -> Exit, containing {R1}

>> The implementation however takes advantage of the existence of
>> Dominance/PostDominance information. Therefore it is simpler and
>> hopefully faster. At the moment run time is comparable to dominance tree
>> calculation.
> > Both algorithms are linear so there is really no big difference in time imo. Sure. However in terms of maintainability it is nice to be able to reuse existing analysis instead of write another triconnected component analysis upfront. > I believe the biggest difference that you mention is that you can capture more > complicated regions without having to modify the CFG with the current > algorithm. Yes. >> If you want, have a look into some results I got with a pass extracting >> maximal non trivial regions that do not contain loops from libbz2: >> >> http://tobias.osaft.eu/llvm/region/bz2NoLoops/ >> >> Interesting ones: >> >> regions_without_loops.BZ2_bzBuffToBuffCompress.dot.png >> has a lot of exit edges. > > I think this example proves the strengths and weaknesses of both > approaches. Making that region into structured control flow would add a lot > of additional blocks. This will also happen after generating code > from the polyhedral model, so either way the cost is there if the optimization > is successful. Yes, but just in this case and not for regions where the model cannot be applied. If the regions pass is used for analysis purposes, nothing has to be touched. > The second case is where the optimization fails (no profitable > transformation found) and the CFG can remain untouched. > > The third case is if one of those blocks contains something complicated. > I believe the current algorithm simply fails and cannot detect the region. Which algorithm? The one in Graphite? Or the region detection I wrote here? This is just plain region detection, that does not even look at the content but builds a region tree (program structure tree). It just detects every possible region. The selection would be a later pass. > If the > CFG is modified this would allow an internal SESE-region to become a black box, and the > the outer regions could be optimized. This is an optimization, however I think it is orthogonal to the region detection problem. 
Say it works with any algorithm. >> regions_without_loops.bzopen_or_bzdopen.dot.png >> the first region has two entry edges. One is the loop latch. >> (Keep in mind all regions have the same color, so if it seems there is >> an edge into a region, there are just two regions close by) >> >> Without a prepass that exposes the edges almost no region could be >> detected with the "standard" approach. > > Indeed the CFG will have to be modified for these cases. I it seems to me that the trade-off > between the two approaches is that the algorithm that you currently have is a cheaper up > front, but may be less capable in some cases, while the "standard" algorithm will be more > expensive, but can handle problematic regions better. Would you agree? I agree that the algorithm I have is cheaper upfront, but I do not yet see a case where the algorithm is less capable. Would you mind to give an example or to highlight the relevant part of the discussion? Thanks a lot Tobias From dllaurence at dslextreme.com Wed Jan 13 03:40:30 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Wed, 13 Jan 2010 01:40:30 -0800 Subject: [LLVMdev] invoke/unwind Message-ID: <4B4D950E.9010605@laurences.net> I put invoke/unwind aside because I couldn't get them to work, but I'm working on my evaluator now and it would be nice to figure this out so I don't have to unwind the stack manually. This was the reason for my earlier question about global declarations, and as that's cleared up I can easily pass exception data...if I can make unwind return out of some deep recursion. The behavior I get is sort of odd and took a while to characterize--unwind returns every time if done one level deep in the same translation unit. If I try it across translation units, or more than one call deep, I get a seg fault every time. I have reduced the problem to a trivial test case which I can post if there is an obvious "you idiot, everyone knows you have to frob your gismoids" answer. 

Dustin

From baldrick at free.fr Wed Jan 13 04:07:44 2010
From: baldrick at free.fr (Duncan Sands)
Date: Wed, 13 Jan 2010 11:07:44 +0100
Subject: [LLVMdev] invoke/unwind
In-Reply-To: <4B4D950E.9010605@laurences.net>
References: <4B4D950E.9010605@laurences.net>
Message-ID: <4B4D9B70.5070802@free.fr>

Hi Dustin, the code generators do not support unwind, only the interpreter does.

Ciao,

Duncan.

From gvenn.cfe.dev at gmail.com Wed Jan 13 06:08:19 2010
From: gvenn.cfe.dev at gmail.com (Garrison Venn)
Date: Wed, 13 Jan 2010 07:08:19 -0500
Subject: [LLVMdev] invoke/unwind
In-Reply-To: <4B4D950E.9010605@laurences.net>
References: <4B4D950E.9010605@laurences.net>
Message-ID: <0EA6FEC3-08CC-4A85-8396-A529E40DDA60@gmail.com>

If it helps to see what is involved outside of a pure IR context, see the example code and doc at:

http://wiki.llvm.org/HowTo:_Build_JIT_based_Exception_mechanism#Source_Code:_exceptionDemo.cpp

Although this is a pure example that shows several test cases, including foreign exception interaction, it is not an IR example, but rather an LLVM IR API example. It would be interesting to see a pure IR version of a personality function. I don't see why this would not be possible, although costly in terms of effort. Clang would help.

There are also ways to lower your invoke/unwind into a setjmp/longjmp implementation, but I do not know how to do this in IR, as it requires function pass setup which is outside the scope of IR.

Garrison

On Jan 13, 2010, at 4:40, Dustin Laurence wrote:

> I put invoke/unwind aside because I couldn't get them to work, but I'm
> working on my evaluator now and it would be nice to figure this out so I
> don't have to unwind the stack manually. This was the reason for my
> earlier question about global declarations, and as that's cleared up I
> can easily pass exception data...if I can make unwind return out of some
> deep recursion.
> > The behavior I get is sort of odd and took a while to > characterize--unwind returns every time if done one level deep in the > same translation unit. If I try it across translation units, or more > than one call deep, I get a seg fault every time. > > I have reduced the problem to a trivial test case which I can post if > there is an obvious "you idiot, everyone knows you have to frob your > gismoids" answer. > > Dustin > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From kennethuil at gmail.com Wed Jan 13 08:28:57 2010 From: kennethuil at gmail.com (Kenneth Uildriks) Date: Wed, 13 Jan 2010 08:28:57 -0600 Subject: [LLVMdev] invoke/unwind In-Reply-To: <4B4D9B70.5070802@free.fr> References: <4B4D950E.9010605@laurences.net> <4B4D9B70.5070802@free.fr> Message-ID: <400d33ea1001130628g4575b510n5c2d29450bb6cdb4@mail.gmail.com> On Wed, Jan 13, 2010 at 4:07 AM, Duncan Sands wrote: > Hi Dustin, the code generators do not support unwind, only the > interpreter does. > > Ciao, > > Duncan. > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu ? ? ? ? http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > I thought someone implemented invoke/unwind for x86 a few months ago. I remember seeing a post to that effect. From baldrick at free.fr Wed Jan 13 08:37:27 2010 From: baldrick at free.fr (Duncan Sands) Date: Wed, 13 Jan 2010 15:37:27 +0100 Subject: [LLVMdev] invoke/unwind In-Reply-To: <400d33ea1001130628g4575b510n5c2d29450bb6cdb4@mail.gmail.com> References: <4B4D950E.9010605@laurences.net> <4B4D9B70.5070802@free.fr> <400d33ea1001130628g4575b510n5c2d29450bb6cdb4@mail.gmail.com> Message-ID: <4B4DDAA7.3040800@free.fr> Hi Kenneth, > I thought someone implemented invoke/unwind for x86 a few months ago. > I remember seeing a post to that effect. 
unwind will work as a rethrow, i.e. you do an invoke and in the landing pad or downstream of it you can use invoke to rethrow the exception that caused you to branch to the landing pad. But that's it as far as I know. Ciao, Duncan. From mark.i.r.muir at gmail.com Wed Jan 13 10:38:15 2010 From: mark.i.r.muir at gmail.com (Mark Muir) Date: Wed, 13 Jan 2010 16:38:15 +0000 Subject: [LLVMdev] Cross-module function inlining Message-ID: I've developed a working LLVM back-end (based on LLVM 2.6) for a custom architecture with its own tool chain. This tool chain creates stand-alone programs from a single assembly. We used to use GCC, which supported producing a single machine assembly from multiple source files. I modified Clang to accept the architecture, but discovered that clang-cc (or the Clang Tool subclass inside Clang) doesn't allow multiple source files to be lowered to a single machine assembly. The ToolChain subclasses inside Clang make use of the normal system linker to combine multiple modules, but this isn't possible on our system. So, I created a new Clang ToolChain subclass that forms a tool pipeline based on the following: - Run the existing Clang tool on each source file, using -emit-llvm to generate a .bc file for each module. - Run llvm-link to merge them into a single .bc file. - Run llc to generate a complete machine assembly. The last two were implemented together in a single Tool, performing the job of the linker. Optimisation options are passed onto each tool. This does the trick. However, with optimisations enabled, the resulting code is not as efficient as it would be if all the code were in a single module. In particular, function inlining is only performed by clang (i.e. only on a module-by-module basis), and not by llvm-link or llc. 
This can be seen in the resulting pass options with -O3 (obtained using '-Xclang -debug-only=Execution' and '-Xlinker -debug-only=Execution'): Clang: Pass Arguments: -raiseallocs -simplifycfg -domtree -domfrontier -mem2reg -globalopt -globaldce -ipconstprop -deadargelim -instcombine -simplifycfg -basiccg -prune-eh -functionattrs -inline -argpromotion -simplify-libcalls -instcombine -jump-threading -simplifycfg -domtree -domfrontier -scalarrepl -instcombine -break-crit-edges -condprop -tailcallelim -simplifycfg -reassociate -domtree -loops -loopsimplify -domfrontier -lcssa -loop-rotate -licm -lcssa -loop-unswitch -instcombine -scalar-evolution -lcssa -iv-users -indvars -loop-deletion -lcssa -loop-unroll -instcombine -memdep -gvn -memdep -memcpyopt -sccp -instcombine -break-crit-edges -condprop -domtree -memdep -dse -adce -simplifycfg -strip-dead-prototypes -print-used-types -deadtypeelim -constmerge llc: Pass Arguments: -preverify -domtree -verify -loops -loopsimplify -scalar-evolution -iv-users -loop-reduce -lowerinvoke -unreachableblockelim -codegenprepare -stack-protector -machine-function-analysis -machinedomtree -machine-loops -machinelicm -machine-sink -unreachable-mbb-elimination -livevars -phi-node-elimination -twoaddressinstruction -liveintervals -simple-register-coalescing -livestacks -virtregmap -linearscan-regalloc -stack-slot-coloring -prologepilog -machinedomtree -machine-loops -machine-loops I'm sure I can hack away to manually add these passes, but I'd prefer an informed opinion on the best way to achieve this, or if there's a more proper way to achieve the same thing (i.e. inter-module function inlining). 
Also, I've noticed another problem with this approach: when function declarations are 'inline __attribute__((always_inline))' in header files, where the corresponding function definition is in a separate module to where the function is being called, LLVM will not inline the function call at the call site, but will happily strip away the function body, resulting in broken code. Is there a way to stop this? Any guidance is much appreciated. Regards, - Mark From nicholas at mxc.ca Wed Jan 13 10:43:54 2010 From: nicholas at mxc.ca (Nick Lewycky) Date: Wed, 13 Jan 2010 08:43:54 -0800 Subject: [LLVMdev] Cross-module function inlining In-Reply-To: References: Message-ID: <4B4DF84A.6090903@mxc.ca> Mark Muir wrote: > I've developed a working LLVM back-end (based on LLVM 2.6) for a custom architecture with its own tool chain. This tool chain creates stand-alone programs from a single assembly. We used to use GCC, which supported producing a single machine assembly from multiple source files. > > I modified Clang to accept the architecture, but discovered that clang-cc (or the Clang Tool subclass inside Clang) doesn't allow multiple source files to be lowered to a single machine assembly. The ToolChain subclasses inside Clang make use of the normal system linker to combine multiple modules, but this isn't possible on our system. > > So, I created a new Clang ToolChain subclass that forms a tool pipeline based on the following: > - Run the existing Clang tool on each source file, using -emit-llvm to generate a .bc file for each module. > - Run llvm-link to merge them into a single .bc file. > - Run llc to generate a complete machine assembly. > The last two were implemented together in a single Tool, performing the job of the linker. Optimisation options are passed onto each tool. > > This does the trick. > > However, with optimisations enabled, the resulting code is not as efficient as it would be if all the code were in a single module. 
In particular, function inlining is only performed by clang (i.e. only on a module-by-module basis), and not by llvm-link or llc. This can be seen in the resulting pass options with -O3 (obtained using '-Xclang -debug-only=Execution' and '-Xlinker -debug-only=Execution'): It sounds like you're not running the LTO optimizations. You could try replacing llvm-link with llvm-ld which will, or run 'opt -std-link-opts' between llvm-link and llc. > Clang: > Pass Arguments: -raiseallocs -simplifycfg -domtree -domfrontier -mem2reg -globalopt -globaldce -ipconstprop -deadargelim -instcombine -simplifycfg -basiccg -prune-eh -functionattrs -inline -argpromotion -simplify-libcalls -instcombine -jump-threading -simplifycfg -domtree -domfrontier -scalarrepl -instcombine -break-crit-edges -condprop -tailcallelim -simplifycfg -reassociate -domtree -loops -loopsimplify -domfrontier -lcssa -loop-rotate -licm -lcssa -loop-unswitch -instcombine -scalar-evolution -lcssa -iv-users -indvars -loop-deletion -lcssa -loop-unroll -instcombine -memdep -gvn -memdep -memcpyopt -sccp -instcombine -break-crit-edges -condprop -domtree -memdep -dse -adce -simplifycfg -strip-dead-prototypes -print-used-types -deadtypeelim -constmerge This pass list is fine, it's equivalent to 'opt -std-compile-opts'. 
Nick > llc: > Pass Arguments: -preverify -domtree -verify -loops -loopsimplify -scalar-evolution -iv-users -loop-reduce -lowerinvoke -unreachableblockelim -codegenprepare -stack-protector -machine-function-analysis -machinedomtree -machine-loops -machinelicm -machine-sink -unreachable-mbb-elimination -livevars -phi-node-elimination -twoaddressinstruction -liveintervals -simple-register-coalescing -livestacks -virtregmap -linearscan-regalloc -stack-slot-coloring -prologepilog -machinedomtree -machine-loops -machine-loops > > I'm sure I can hack away to manually add these passes, but I'd prefer an informed opinion on the best way to achieve this, or if there's a more proper way to achieve the same thing (i.e. inter-module function inlining). > > Also, I've noticed another problem with this approach: when function declarations are 'inline __attribute__((always_inline))' in header files, where the corresponding function definition is in a separate module to where the function is being called, LLVM will not inline the function call at the call site, but will happily strip away the function body, resulting in broken code. Is there a way to stop this? > > Any guidance is much appreciated. > > Regards, > > - Mark > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From dllaurence at dslextreme.com Wed Jan 13 11:27:42 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Wed, 13 Jan 2010 09:27:42 -0800 Subject: [LLVMdev] invoke/unwind In-Reply-To: <4B4D9B70.5070802@free.fr> References: <4B4D950E.9010605@laurences.net> <4B4D9B70.5070802@free.fr> Message-ID: <4B4E028E.7050200@laurences.net> On 01/13/2010 02:07 AM, Duncan Sands wrote: > Hi Dustin, the code generators do not support unwind, only the > interpreter does. Ah, the secret is not to even try to frob the gnorts. Manual unwinding, here I come. 
:-( I was going to say the interpreter doesn't either, but then I recalled it JITs when it can. I don't know how to call into libc from the interpreter to test that. So how is clang doing C++ exceptions? Dustin From Robert.Quill at imgtec.com Wed Jan 13 11:42:50 2010 From: Robert.Quill at imgtec.com (Robert Quill) Date: Wed, 13 Jan 2010 17:42:50 -0000 Subject: [LLVMdev] Identifying recursive functions in a backend Message-ID: <0B912ECB965FB9488D1FEFB4AFCA1C0767FA3D@klmail.kl.imgtec.org> Hi, I was wondering if it was possible to detect if a function is recursive in a back-end. For instance, I'd like to be able to say: "If this function we are about to call is recursive, store the return address to the stack, if it isn't we don't need a stack so do nothing". Does anyone know if this is possible? Thanks, Rob - This message is subject to Imagination Technologies' e-mail terms: http://www.imgtec.com/e-mail.htm Imagination Technologies Ltd is a limited company registered in England No: 1306335 Registered Office: Imagination House, Home Park Estate, Kings Langley, Hertfordshire, WD4 8LZ. Email to and from the company may be monitored for compliance and other administrative purposes. - -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100113/f960f540/attachment.html From dllaurence at dslextreme.com Wed Jan 13 11:46:53 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Wed, 13 Jan 2010 09:46:53 -0800 Subject: [LLVMdev] invoke/unwind In-Reply-To: <0EA6FEC3-08CC-4A85-8396-A529E40DDA60@gmail.com> References: <4B4D950E.9010605@laurences.net> <0EA6FEC3-08CC-4A85-8396-A529E40DDA60@gmail.com> Message-ID: <4B4E070D.7020906@laurences.net> On 01/13/2010 04:08 AM, Garrison Venn wrote: > If it helps, to see what is involved, outside of a pure IR context, > see the example code, and doc at: > > http://wiki.llvm.org/HowTo:_Build_JIT_based_Exception_mechanism#Source_Code:_exceptionDemo.cpp It does, although in the "let me show you why this is too much to tackle" way. > Although this is a pure example that shows several test cases, > including foreign exception interaction, it is not an IR example, but > rather a LLVM IR API example. It would be interesting to see a pure > IR version of a personality function. I don't see why this would not > be possible, although costly in terms of effort. Clang would help. Beyond the scope of the project, I guess. Sounds too far out on the diminishing returns curve for knowledge. If I spend too much time handcoding IR the first extension to the project would be to write a "high-level IR" front-end that provides a 1-1 mapping of the semantics, but with handcoding-friendly syntax and tools. It would actually save time at some level, given that I'm manually #including headers just to reduce the amount of code duplication to saying it twice instead of many times. Complete aside--I hate when people tell me something is impossible, even me. :-) So after I said you couldn't do without CPP-style #includes a few days ago I was annoyed enough to design in my head an import/export mechanism using only unix tools everyone has laying around. Just to prove myself wrong, I guess. 
I'm not sure I'll implement it given that I already have a lot of code written the other way, but LLVM syntax is simple enough that it could be done without parsing the IR. I don't know if I have enough IR left to justify switching over, but it would be satisfying in principle to get rid of the duplication of headers. > There are also ways to lower your invoke/unwind into a > setjump/longjump implementation, but I do not know how to do this in > IR, as it requires function pass setup which is outside the scope of > IR. I don't know enough about how setjmp/longjmp are implemented to have a clue. If I'm getting into uncharted territory it's easier to just unwind the evaluator stack by hand, just as I already did with the parser when unwinding didn't work. The focus is on learning IR and about the simple lisp evaluation model. There are actually limits to my madness, you know. :-) It would be more profitable to learn another aspect of the system by implementing a MMIX back-end or something. Or, and I know this is just *crazy* talk, I could actually follow the intended learning path and use the main C++ API for something. :-) Dustin From dllaurence at dslextreme.com Wed Jan 13 12:02:51 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Wed, 13 Jan 2010 10:02:51 -0800 Subject: [LLVMdev] LangRef.html invoke/unwind patch Message-ID: <4B4E0ACB.8020902@laurences.net> Here is a small doc patch based on answers from the list and from the links mentioned. For stylistic consistency I've followed the language in the va_arg description for the analogous situation. Dustin -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: LangRef.unwind.patch Url: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100113/809032a7/attachment.pl From anton at korobeynikov.info Wed Jan 13 12:26:43 2010 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Wed, 13 Jan 2010 21:26:43 +0300 Subject: [LLVMdev] Identifying recursive functions in a backend In-Reply-To: <0B912ECB965FB9488D1FEFB4AFCA1C0767FA3D@klmail.kl.imgtec.org> References: <0B912ECB965FB9488D1FEFB4AFCA1C0767FA3D@klmail.kl.imgtec.org> Message-ID: Hello, Robert > I was wondering if it was possible to detect if a function is recursive in a back-end. For instance, I'd like to be able to say: "If this function we are about to call is recursive, store the return address to the stack, if it isn't we don't need a stack so do nothing". Does anyone know if this is possible? Well, in general - no. Think about the function which does an indirect call... If you want some approximation - find all call sites in the function and check the destinations. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From mrs at apple.com Wed Jan 13 13:11:10 2010 From: mrs at apple.com (Mike Stump) Date: Wed, 13 Jan 2010 11:11:10 -0800 Subject: [LLVMdev] invoke/unwind In-Reply-To: <4B4E028E.7050200@laurences.net> References: <4B4D950E.9010605@laurences.net> <4B4D9B70.5070802@free.fr> <4B4E028E.7050200@laurences.net> Message-ID: <3C472436-88AE-4C98-93EB-936EE94FB16A@apple.com> On Jan 13, 2010, at 9:27 AM, Dustin Laurence wrote: > So how is clang doing C++ exceptions? 
Roughly, like so: $ cat t.cc int main() { try { throw 1; } catch (int i) { } } $ clang -S t.cc -emit-llvm $ cat t.s ; ModuleID = 't.cc' target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" target triple = "x86_64-apple-darwin10.0" @_ZTIi = external constant i8* ; [#uses=1] define i32 @main() ssp { entry: %retval = alloca i32 ; [#uses=2] %exception.ptr = alloca i8* ; [#uses=1] %_rethrow = alloca i8* ; [#uses=4] %i = alloca i32, align 4 ; [#uses=1] %cleanup.dst = alloca i32 ; [#uses=3] %cleanup.dst6 = alloca i32 ; [#uses=5] store i32 0, i32* %retval %exception = call i8* @__cxa_allocate_exception(i64 4) ; [#uses=3] store i8* %exception, i8** %exception.ptr %0 = bitcast i8* %exception to i32* ; [#uses=1] store i32 1, i32* %0 invoke void @__cxa_throw(i8* %exception, i8* bitcast (i8** @_ZTIi to i8*), i8* null) noreturn to label %invoke.cont unwind label %try.handler invoke.cont: ; preds = %entry unreachable terminate.handler: ; preds = %match.end %exc = call i8* @llvm.eh.exception() ; [#uses=1] %1 = call i32 (i8*, i8*, ...)* @llvm.eh.selector(i8* %exc, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*), i32 1) ; [#uses=0] call void @_ZSt9terminatev() noreturn nounwind unreachable try.handler: ; preds = %entry %exc1 = call i8* @llvm.eh.exception() ; [#uses=3] %selector = call i32 (i8*, i8*, ...)* @llvm.eh.selector(i8* %exc1, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*), i8* bitcast (i8** @_ZTIi to i8*), i8* null) ; [#uses=1] %2 = call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTIi to i8*)) ; [#uses=1] %3 = icmp eq i32 %selector, %2 ; [#uses=1] br i1 %3, label %match, label %catch.next match: ; preds = %try.handler %4 = call i8* @__cxa_begin_catch(i8* %exc1) ; [#uses=1] %5 = bitcast i8* %4 to i32* ; [#uses=1] %6 = load i32* %5 ; [#uses=1] store i32 %6, i32* %i store i32 1, i32* %cleanup.dst br label %match.end match.handler: ; No predecessors! 
%exc2 = call i8* @llvm.eh.exception() ; [#uses=2] %7 = call i32 (i8*, i8*, ...)* @llvm.eh.selector(i8* %exc2, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*), i32 0) ; [#uses=0] store i8* %exc2, i8** %_rethrow store i32 2, i32* %cleanup.dst br label %match.end cleanup.pad: ; preds = %cleanup.switch store i32 1, i32* %cleanup.dst6 br label %finally cleanup.pad3: ; preds = %cleanup.switch store i32 2, i32* %cleanup.dst6 br label %finally match.end: ; preds = %match.handler, %match invoke void @__cxa_end_catch() to label %invoke.cont4 unwind label %terminate.handler invoke.cont4: ; preds = %match.end br label %cleanup.switch cleanup.switch: ; preds = %invoke.cont4 %tmp = load i32* %cleanup.dst ; [#uses=1] switch i32 %tmp, label %cleanup.end [ i32 1, label %cleanup.pad i32 2, label %cleanup.pad3 ] cleanup.end: ; preds = %cleanup.switch %exc5 = call i8* @llvm.eh.exception() ; [#uses=1] store i8* %exc5, i8** %_rethrow store i32 2, i32* %cleanup.dst6 br label %finally catch.next: ; preds = %try.handler store i8* %exc1, i8** %_rethrow store i32 2, i32* %cleanup.dst6 br label %finally finally: ; preds = %catch.next, %cleanup.end, %cleanup.pad3, %cleanup.pad br label %cleanup.switch8 cleanup.switch8: ; preds = %finally %tmp7 = load i32* %cleanup.dst6 ; [#uses=1] switch i32 %tmp7, label %cleanup.end9 [ i32 1, label %finally.end i32 2, label %finally.throw ] cleanup.end9: ; preds = %cleanup.switch8 br label %finally.end finally.throw: ; preds = %cleanup.switch8 %8 = load i8** %_rethrow ; [#uses=1] call void @_Unwind_Resume_or_Rethrow(i8* %8) unreachable finally.end: ; preds = %cleanup.end9, %cleanup.switch8 %9 = load i32* %retval ; [#uses=1] ret i32 %9 } declare i32 @__gxx_personality_v0(...) declare i8* @llvm.eh.exception() nounwind readonly declare i32 @llvm.eh.selector(i8*, i8*, ...) 
nounwind declare i8* @__cxa_allocate_exception(i64) declare void @__cxa_throw(i8*, i8*, i8*) declare void @_ZSt9terminatev() declare i32 @llvm.eh.typeid.for(i8*) nounwind declare i8* @__cxa_begin_catch(i8*) declare void @__cxa_end_catch() declare void @_Unwind_Resume_or_Rethrow(i8*) From mark.i.r.muir at gmail.com Wed Jan 13 14:05:56 2010 From: mark.i.r.muir at gmail.com (Mark Muir) Date: Wed, 13 Jan 2010 20:05:56 +0000 Subject: [LLVMdev] Cross-module function inlining In-Reply-To: <4B4DF84A.6090903@mxc.ca> References: <4B4DF84A.6090903@mxc.ca> Message-ID: <7D3B5734-0544-4698-90FE-9CB5EC5D7B8F@gmail.com> On 13 Jan 2010, at 16:43, Nick Lewycky wrote: > Mark Muir wrote: >> - Run the existing Clang tool on each source file, using -emit-llvm to generate a .bc file for each module. >> - Run llvm-link to merge them into a single .bc file. >> - Run llc to generate a complete machine assembly. >> >> However, with optimisations enabled, the resulting code is not as efficient as it would be if all the code were in a single module. In particular, function inlining is only performed by clang (i.e. only on a module-by-module basis), and not by llvm-link or llc. > > It sounds like you're not running the LTO optimizations. You could try replacing llvm-link with llvm-ld which will, or run 'opt -std-link-opts' between llvm-link and llc. > Yep, that sorted inlining. Thanks. But... now there's a small problem with library calls. Symbols such as 'memset', 'malloc', etc. are being removed by global dead code elimination. They are implemented in one of the bitcode modules that are linked together (implementations are based on newlib). I get the same behaviour of them being stripped even when they are live, by the following: opt -internalize -globaldce Other (not standard-library) functions implemented in different modules than where they are called, are correctly seen as live. So, could this be something to do with what is declared as a built-in? 
I haven't provided any list of built-ins (or overridden the defaults), nor could I figure out how exactly to do that. I've also noticed other problems related to built-ins - in one example, code made use of abs(), but didn't #include . The resulting code compiled without warning or error, but the resulting code was broken, due to the arguments not being seen as live, e.g.: Without #include : 0x181e8b0: i32 = TargetGlobalAddress 0 [TF=1] => JUMP_CALLi [TF=1], %r2, %r3, %r4, %r5, %r6, %r7, %r8, %r9, %r10 With #include : 0x181e8b0: i32 = TargetGlobalAddress 0 [TF=1] => JUMP_CALLi [TF=1], %r3, %r2, %r3, %r4, %r5, %r6, %r7, %r8, %r9, %r10 Where r2 is the link register, and r3 to r10 are argument/retval registers. LowerFormalArguments() doesn't see any arguments in the former, and consequently doesn't add input register nodes to the DAG. I guess I need help with the concept of built-ins, and what code is related to them in the Clang driver and back-end. Regards, - Mark From viridia at gmail.com Wed Jan 13 14:11:59 2010 From: viridia at gmail.com (Talin) Date: Wed, 13 Jan 2010 12:11:59 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> Message-ID: On Tue, Jan 12, 2010 at 5:46 PM, Dan Gohman wrote: > > On Jan 12, 2010, at 5:01 PM, Talin wrote: > > > Here is the LangRef part of the patch. > > > +

The union type is used to represent a set of possible data types
> which can
> > + exist at a given location in memory (also known as an "untagged"
> > + union).
> [...]
>
> This wording is somewhat misleading; memory in LLVM has no types.
> How about:
>
> "A union type describes an object with size and alignment suitable for
> an object of any one of a given set of types."
>

OK

>
> Also, is it really useful to support
> insertvalue/extractvalue/getelementptr on unions? The benefit of unions
> that I'm aware of is it allows target-independent IR to work with
> appropriately sized and aligned memory. This doesn't require any special
> support for accessing union members; for example:
>
>   %p = alloca union { i32, double }
>   %q = bitcast union { i32, double }* %p to double*
>   store double 2.0, double* %q
>
> Would this be a reasonable approach?
>

It depends on whether unions can be passed around as SSA values. I can think of situations where you would want to. In particular, GEP is useful because you can avoid the bitcast above - GEP to element 0 if you want an int (in the example above), or to element 1 if you want a double.

Also, I'm thinking that insertvalue might be the best way to construct a constant union. Right now there's a bit of a problem in that the data type of a constant must match exactly the declared type; however, in the case of a union what we want is for the data type of the initializer to exactly match the type of one of the union members. I thought this would be relatively easy to do, but it's a little trickier than I realized.

--
-- Talin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100113/2be8f496/attachment.html From nlewycky at google.com Wed Jan 13 14:34:20 2010 From: nlewycky at google.com (Nick Lewycky) Date: Wed, 13 Jan 2010 12:34:20 -0800 Subject: [LLVMdev] Cross-module function inlining In-Reply-To: <7D3B5734-0544-4698-90FE-9CB5EC5D7B8F@gmail.com> References: <4B4DF84A.6090903@mxc.ca> <7D3B5734-0544-4698-90FE-9CB5EC5D7B8F@gmail.com> Message-ID: On 13 January 2010 12:05, Mark Muir wrote: > On 13 Jan 2010, at 16:43, Nick Lewycky wrote: > > > Mark Muir wrote: > >> - Run the existing Clang tool on each source file, using -emit-llvm to > generate a .bc file for each module. > >> - Run llvm-link to merge them into a single .bc file. > >> - Run llc to generate a complete machine assembly. > >> > >> However, with optimisations enabled, the resulting code is not as > efficient as it would be if all the code were in a single module. In > particular, function inlining is only performed by clang (i.e. only on a > module-by-module basis), and not by llvm-link or llc. > > > > It sounds like you're not running the LTO optimizations. You could try > replacing llvm-link with llvm-ld which will, or run 'opt -std-link-opts' > between llvm-link and llc. > > > > Yep, that sorted inlining. Thanks. > > But... now there's a small problem with library calls. Symbols such as > 'memset', 'malloc', etc. are being removed by global dead code elimination. > They are implemented in one of the bitcode modules that are linked together > (implementations are based on newlib). And what problems does that cause? If malloc is linked in, we're free to inline it everywhere and delete the symbol. If you meant for it to be visible to the optimizers but you don't want it to be part of the code generated for your program (ie., you'll link it against newlib later), you should mark the functions with available_externally linkage. 
> I get the same behaviour of them being stripped even when they are live, by > the following: > > opt -internalize -globaldce > > Other (not standard-library) functions implemented in different modules > than where they are called, are correctly seen as live. So, could this be > something to do with what is declared as a built-in? I haven't provided any > list of built-ins (or overridden the defaults), nor could I figure out how > exactly to do that. > Alternately, if you wanted malloc, memset and friends to be externally visible (compiled as part of your program and dlsym'able), you could create a public api file which contains a one per line list of the names of the functions that may not be marked internal linkage by internalize. Pass that in to opt with -internalize-public-api-file filename ...other flags... Nick I've also noticed other problems related to built-ins - in one example, code > made use of abs(), but didn't #include . The resulting code > compiled without warning or error, but the resulting code was broken, due to > the arguments not being seen as live, e.g.: > > Without #include : > > 0x181e8b0: i32 = TargetGlobalAddress 0 [TF=1] > => JUMP_CALLi [TF=1], %r2, %r3, > %r4, %r5, %r6, %r7, > %r8, %r9, %r10 > > With #include : > > 0x181e8b0: i32 = TargetGlobalAddress 0 [TF=1] > => JUMP_CALLi [TF=1], %r3, %r2, %r3, > %r4, %r5, %r6, %r7, > %r8, %r9, %r10 > > Where r2 is the link register, and r3 to r10 are argument/retval registers. > LowerFormalArguments() doesn't see any arguments in the former, and > consequently doesn't add input register nodes to the DAG. > > I guess I need help with the concept of built-ins, and what code is related > to them in the Clang driver and back-end. 
> > Regards, > > - Mark > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100113/685a6b84/attachment.html From mark.i.r.muir at gmail.com Wed Jan 13 15:03:17 2010 From: mark.i.r.muir at gmail.com (Mark Muir) Date: Wed, 13 Jan 2010 21:03:17 +0000 Subject: [LLVMdev] Cross-module function inlining In-Reply-To: References: <4B4DF84A.6090903@mxc.ca> <7D3B5734-0544-4698-90FE-9CB5EC5D7B8F@gmail.com> Message-ID: <39479E87-0DB7-48B3-9D1E-0E99DC5B602A@gmail.com> On 13 Jan 2010, at 20:34, Nick Lewycky wrote: > On 13 January 2010 12:05, Mark Muir wrote: > > But... now there's a small problem with library calls. Symbols such as 'memset', 'malloc', etc. are being removed by global dead code elimination. They are implemented in one of the bitcode modules that are linked together (implementations are based on newlib). > > And what problems does that cause? If malloc is linked in, we're free to inline it everywhere and delete the symbol. If you meant for it to be visible to the optimizers but you don't want it to be part of the code generated for your program (ie., you'll link it against newlib later), you should mark the functions with available_externally linkage. > Sorry, I should've been more clear - the calls to _malloc and _free weren't being inlined (see example below). I'm not sure why (happens with or without -simplify-libcalls). So, the resulting .bc file from 'opt' contains live references to symbols that were in its input .bc, but for some reason it stripped them. #include int entries = 3; int result; int main() { int i; // Allocate and populate the initial array. 
    int* values = malloc(entries * sizeof(int));
    for (i = 0; i < entries; i ++)
        values[i] = i + 1;

    // Calculate the sum, using a dynamically allocated accumulator.
    int* acc = malloc(sizeof(int));
    *acc = 0;
    for (i = 0; i < entries; i ++)
        *acc += values[i];
    result = *acc;

    // Deallocate the memory.
    free(values);
    free(acc);

    return 0;
}

Here's a fragment of the final machine assembly (with -O3):

_main:
    ADDCOMP out=r1 in1=r1 in2=4 conf=`ADDCOMP_SUB
    WMEM in=r2 in_addr=r1 conf=`WMEM_SI
    CONST_16B out=r3 conf=12
    JUMP nl_out=r2/*RA*/ addr_in=&_malloc conf=`JUMP_ALWAYS_ABS // Call

In case this is important, here are the relevant declarations from the 'stdlib.h' that is in use:

_PTR _EXFUN(malloc,(size_t __size));
_VOID _EXFUN(free,(_PTR));

where:

#define _PTR void *
#define _EXFUN(name, proto) name proto

and from 'newlib.c':

void *
malloc (size_t sz)
{
    ...
}

i.e. they look like any other function call, which is why I suspect it has something to do with special behaviour given to built-ins.

>
> Alternately, if you wanted malloc, memset and friends to be externally visible (compiled as part of your program and dlsym'able), you could create a public api file which contains a one per line list of the names of the functions that may not be marked internal linkage by internalize. Pass that in to opt with -internalize-public-api-file filename ...other flags...
>

I saw that. I was thinking of only using that option as a last resort, due to maintainability.

> > I guess I need help with the concept of built-ins, and what code is related to them in the Clang driver and back-end.

Thanks.

- Mark
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100113/1305cf18/attachment.html From baldrick at free.fr Wed Jan 13 15:52:18 2010 From: baldrick at free.fr (Duncan Sands) Date: Wed, 13 Jan 2010 22:52:18 +0100 Subject: [LLVMdev] LangRef.html invoke/unwind patch In-Reply-To: <4B4E0ACB.8020902@laurences.net> References: <4B4E0ACB.8020902@laurences.net> Message-ID: <4B4E4092.7090405@free.fr> Hi Dustin, > Here is a small doc patch based on answers from the list and from the > links mentioned. For stylistic consistency I've followed the language > in the va_arg description for the analogous situation. as I mentioned in another email, unwind is not completely unsupported: it does work for rethrowing an exception. Ciao, Duncan. From dllaurence at dslextreme.com Wed Jan 13 16:05:25 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Wed, 13 Jan 2010 14:05:25 -0800 Subject: [LLVMdev] LangRef.html invoke/unwind patch In-Reply-To: <4B4E4092.7090405@free.fr> References: <4B4E0ACB.8020902@laurences.net> <4B4E4092.7090405@free.fr> Message-ID: <4B4E43A5.5090400@laurences.net> On 01/13/2010 01:52 PM, Duncan Sands wrote: > as I mentioned in another email, unwind is not completely unsupported: > it does work for rethrowing an exception. Good point. Not understanding how languages implement exceptions under the hood, I lose the nuances that should be in a reference document. How's this version? Dustin -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: LangRef.unwind.patch Url: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100113/300cc8bd/attachment.pl From dag at cray.com Wed Jan 13 16:10:23 2010 From: dag at cray.com (David Greene) Date: Wed, 13 Jan 2010 16:10:23 -0600 Subject: [LLVMdev] [PATCH] SelectionDAG Debugging Message-ID: <201001131610.23444.dag@cray.com> This patch adds a couple of interfaces to dump full or partial SelectionDAGs. The current code only prints the top-level SDNode. 
This patch makes it much easier to understand CannotYetSelect errors and those sorts of things. In particular, it helped me track down PR6019.

Any objections to committing?

-Dave
-------------- next part --------------
A non-text attachment was scrubbed...
Name: selectiondagdump.patch
Type: text/x-diff
Size: 3807 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100113/86c18306/attachment.bin

From dllaurence at dslextreme.com Wed Jan 13 16:24:03 2010
From: dllaurence at dslextreme.com (Dustin Laurence)
Date: Wed, 13 Jan 2010 14:24:03 -0800
Subject: [LLVMdev] LangRef.html invoke/unwind patch
In-Reply-To: <4B4E4092.7090405@free.fr>
References: <4B4E0ACB.8020902@laurences.net> <4B4E4092.7090405@free.fr>
Message-ID: <4B4E4803.4040908@laurences.net>

On 01/13/2010 01:52 PM, Duncan Sands wrote:

> as I mentioned in another email, unwind is not completely unsupported:
> it does work for rethrowing an exception.

PS: note that my text is very conservative because I am reluctant to say anything about what invoke/unwind can do right now because, frankly, I don't have the knowledge to get the details correct. It would be A Good Thing, but needs a more knowledgeable author. But I think having the note in a place where an interested person is likely to find it is better than nothing.

I was tempted to put in some of the explanatory links I was given, but I didn't because they might age badly and because that involves policy choices about how and where information should be documented. I'll be happy to add them to the patch if it's deemed appropriate, because they're useful.

Dustin

From junk at giantblob.com Wed Jan 13 16:27:17 2010
From: junk at giantblob.com (James Williams)
Date: Wed, 13 Jan 2010 22:27:17 +0000
Subject: [LLVMdev] How to create forward reference to BasicBlock?
Message-ID:

Hi,

Can anyone tell me if there's a straightforward way to create a new BasicBlock without inserting it into a function's basic block list?
I want to do this so I can create a forward reference to a block whose position in the function is not yet known.

I've tried:

Function function
Builder builder;

bb = BasicBlock::Create(function,...)
bb.eraseFromParent()

...
add other blocks to function and build instructions in those blocks
...

function.getBasicBlockList.push_back(bb)

but I get an assert failure from push_back:

t1: SymbolTableListTraitsImpl.h:68: void llvm::SymbolTableListTraits<ValueSubClass, ItemParentClass>::addNodeToList(ValueSubClass*) [with ValueSubClass = llvm::BasicBlock, ItemParentClass = llvm::Function]: Assertion `V->getParent() == 0 && "Value already in a container!!"' failed.

Should I expect this to work? Can a BasicBlock only exist within a Function's basic block list? Is there a better way to create a forward reference for branch instructions to IR code that is not yet generated?

Thanks in advance,
-- James
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100113/26dd41f8/attachment.html

From gvenn.cfe.dev at gmail.com Wed Jan 13 16:28:42 2010
From: gvenn.cfe.dev at gmail.com (Garrison Venn)
Date: Wed, 13 Jan 2010 17:28:42 -0500
Subject: [LLVMdev] invoke/unwind
In-Reply-To: <4B4E070D.7020906@laurences.net>
References: <4B4D950E.9010605@laurences.net> <0EA6FEC3-08CC-4A85-8396-A529E40DDA60@gmail.com> <4B4E070D.7020906@laurences.net>
Message-ID: <58D71DE3-01B5-479A-8CF5-AD8AD2CE994B@gmail.com>

On Jan 13, 2010, at 12:46, Dustin Laurence wrote:

> On 01/13/2010 04:08 AM, Garrison Venn wrote:
>
>> If it helps, to see what is involved, outside of a pure IR context,
>> see the example code, and doc at:
>>
>> http://wiki.llvm.org/HowTo:_Build_JIT_based_Exception_mechanism#Source_Code:_exceptionDemo.cpp
>
> It does, although in the "let me show you why this is too much to
> tackle" way.
>

Yeah, I hear you. The LLVM developer fly trap got me.
;-)

>> Although this is a pure example that shows several test cases,
>> including foreign exception interaction, it is not an IR example, but
>> rather an LLVM IR API example. It would be interesting to see a pure
>> IR version of a personality function. I don't see why this would not
>> be possible, although costly in terms of effort. Clang would help.
>
snip

>> There are also ways to lower your invoke/unwind into a
>> setjmp/longjmp implementation, but I do not know how to do this in
>> IR, as it requires function pass setup which is outside the scope of
>> IR.
>
> I don't know enough about how setjmp/longjmp are implemented to have a
> clue. If I'm getting into uncharted territory it's easier to just
> unwind the evaluator stack by hand, just as I already did with the
> parser when unwinding didn't work. The focus is on learning IR and
> about the simple lisp evaluation model.
>

For pedagogical purposes, the lowering is accomplished by an IR to IR graph transformation that you add to a function pass manager. I personally view LLVM as a term re-writing system where the rules are controlled by the developer a priori. The above IR to IR transformation is one of these rules, which in LLVM parlance, and from a compiler viewpoint, is a pass. See -lowerinvoke in http://llvm.org/docs/Passes.html for the command line option. See llvm::createLowerInvokePass(...) in Scalar.h; note the comments. However, this kind of implementation does not do stack unwinding but rather creates the standard longjmp-to-a-previous-setjmp behavior. This is why I thought pursuing the zero-cost unwind approach (no cost for exception setup when nothing is thrown) was worth being caught by the Venus fly trap.

> There are actually limits to my madness, you know. :-) It would be more
> profitable to learn another aspect of the system by implementing a MMIX
> back-end or something.

Funny, I was thinking the same thing. Implementing MIX would be a cool way to learn the other side of LLVM (backends).
I didn't even know there was a MMIX until your email forced me to query.

>
> Or, and I know this is just *crazy* talk, I could actually follow the
> intended learning path and use the main C++ API for something. :-)
>

Well, even though I did not take your route, I still use the IR ref. doc as my true documentation. It is fairly isomorphic to the C++ IR API. So I think your approach is worthwhile.

> Dustin
>

Garrison

From devang.patel at gmail.com Wed Jan 13 16:54:39 2010
From: devang.patel at gmail.com (Devang Patel)
Date: Wed, 13 Jan 2010 14:54:39 -0800
Subject: [LLVMdev] How to create forward reference to BasicBlock?
In-Reply-To:
References:
Message-ID: <352a1fb21001131454n17e2699cpdb01962016493a21@mail.gmail.com>

On Wed, Jan 13, 2010 at 2:27 PM, James Williams wrote:
> Hi,
>
> Can anyone tell me if there's a straightforward way to create a new
> BasicBlock without inserting it into a function's basic block list? I want
> to do this so I can create a forward reference to a block whose position in
> the function is not yet known.
>
> I've tried:
>
> Function function
> Builder builder;
>
> bb = BasicBlock::Create(function,...)
> bb.eraseFromParent()
>
> ...
> add other blocks to function and build instructions in those blocks
> ...
>
> function.getBasicBlockList.push_back(bb)
>
>
> but I get an assert failure from push_back:
>
> t1: SymbolTableListTraitsImpl.h:68: void
> llvm::SymbolTableListTraits<ValueSubClass,
> ItemParentClass>::addNodeToList(ValueSubClass*) [with ValueSubClass =
> llvm::BasicBlock, ItemParentClass = llvm::Function]: Assertion
> `V->getParent() == 0 && "Value already in a container!!"' failed.
>
> Should I expect this to work? Can a BasicBlock only exist within a
> Function's basic block list? Is there a better way to create a forward
> reference for branch instructions to IR code that is not yet generated?

You can create BasicBlock without inserting it into a Function. Parent is an optional parameter for BasicBlock constructor.
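For illustration, here is a minimal sketch of the pattern described in this reply, written against toy C++ types rather than the LLVM API. `Block`, `Function`, `append` and `buildWithForwardReference` are hypothetical names made up for this sketch; `Block` only stands in for a BasicBlock that was created with no parent. The sketch assumes the key point is that a branch refers to the block object itself rather than to a position in the function, so the target can be created detached, branched to, and appended to the block list only later.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the LLVM classes.
struct Function;

struct Block {
    std::string  name;
    Function*    parent = nullptr;        // null while the block is detached
    const Block* branchTarget = nullptr;  // models "br label %target", if any
    explicit Block(std::string n) : name(std::move(n)) {}
};

struct Function {
    std::vector<Block*> blocks;  // layout order of the blocks

    // Mirrors the check quoted in the thread: a block may sit in one list only.
    void append(Block& b) {
        assert(b.parent == nullptr && "block already in a container");
        b.parent = this;
        blocks.push_back(&b);
    }
};

// Builds entry -> done with a forward reference and returns the final layout.
std::string buildWithForwardReference() {
    Function f;
    Block entry("entry"), body("body"), done("done");  // all start detached

    f.append(entry);
    entry.branchTarget = &done;      // forward reference: done has no position yet
    assert(done.parent == nullptr);  // still detached at this point

    f.append(body);                  // other blocks are generated in between
    body.branchTarget = &done;

    f.append(done);                  // the position is decided only now

    std::string layout;
    for (const Block* b : f.blocks) {
        layout += b->name;
        if (b->branchTarget) layout += "->" + b->branchTarget->name;
        layout += ";";
    }
    return layout;
}
```

Calling `buildWithForwardReference()` yields `entry->done;body->done;done;`: both branches were recorded before the target block had a place in the function. With the real API, the detached block would presumably come from creating the BasicBlock with the optional Parent argument left out, as suggested above.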
See include/llvm/BasicBlock.h

-
Devang

From junk at giantblob.com Wed Jan 13 17:00:54 2010
From: junk at giantblob.com (James Williams)
Date: Wed, 13 Jan 2010 23:00:54 +0000
Subject: [LLVMdev] How to create forward reference to BasicBlock?
In-Reply-To: <352a1fb21001131454n17e2699cpdb01962016493a21@mail.gmail.com>
References: <352a1fb21001131454n17e2699cpdb01962016493a21@mail.gmail.com>
Message-ID:

2010/1/13 Devang Patel

> On Wed, Jan 13, 2010 at 2:27 PM, James Williams
> wrote:
> > Hi,
> >
> > Can anyone tell me if there's a straightforward way to create a new
> > BasicBlock without inserting it into a function's basic block list? I
> want
> > to do this so I can create a forward reference to a block whose position
> in
> > the function is not yet known.
> >
> > I've tried:
> >
> > Function function
> > Builder builder;
> >
> > bb = BasicBlock::Create(function,...)
> > bb.eraseFromParent()
> >
> > ...
> > add other blocks to function and build instructions in those blocks
> > ...
> >
> > function.getBasicBlockList.push_back(bb)
> >
> >
> > but I get an assert failure from push_back:
> >
> > t1: SymbolTableListTraitsImpl.h:68: void
> > llvm::SymbolTableListTraits<ValueSubClass,
> > ItemParentClass>::addNodeToList(ValueSubClass*) [with ValueSubClass =
> > llvm::BasicBlock, ItemParentClass = llvm::Function]: Assertion
> > `V->getParent() == 0 && "Value already in a container!!"' failed.
> >
> > Should I expect this to work? Can a BasicBlock only exist within a
> > Function's basic block list? Is there a better way to create a forward
> > reference for branch instructions to IR code that is not yet generated?
>
> You can create BasicBlock without inserting it into a Function. Parent
> is an optional parameter for BasicBlock constructor. See
> include/llvm/BasicBlock.h
>

Thanks.

-- James

>
> -
> Devang
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100113/69791734/attachment.html From dllaurence at dslextreme.com Wed Jan 13 17:03:30 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Wed, 13 Jan 2010 15:03:30 -0800 Subject: [LLVMdev] invoke/unwind In-Reply-To: <58D71DE3-01B5-479A-8CF5-AD8AD2CE994B@gmail.com> References: <4B4D950E.9010605@laurences.net> <0EA6FEC3-08CC-4A85-8396-A529E40DDA60@gmail.com> <4B4E070D.7020906@laurences.net> <58D71DE3-01B5-479A-8CF5-AD8AD2CE994B@gmail.com> Message-ID: <4B4E5142.80307@laurences.net> On 01/13/2010 02:28 PM, Garrison Venn wrote: > I > personally view LLVM as a term re-writing system where the rules are > controlled by the developer a priori. Hopefully I'll remember that comment when I understand its significance better. :-) > Funny I was thinking the same thing. Implementing MIX would be a cool > way to learn the other side of LLVM (backends). It seemed appropriate, especially since I've always been too lazy to really learn MIX and that's unfortunate when one wants to go to the source instead of read one of Knuth's interpreters. I haven't needed to do that often, but one should always have the option. Also, I have common hardware so I have no real motivation to target a real machine (the only possible reason I could see is if I wanted to buy a board and do robotics with my boy, and at five he's not ready for that yet). So doing an (M)MIX backend would have the salutary effect of making me able to read Knuth better and that's more motivating than real hardware I'm not actually using myself. Plus, a priori I'd guess that (M)MIX is very likely more consistent and less quirky than any real architecture, as it has no practical constraints or opportunities to exploit. (M)Mix would probably be a good choice for a backend-writing tutorial. 
I think expectations would be suitably modest--I don't think anyone is going to port the Linux kernel to MIX or anything, so presumably one wouldn't get endless requests to tweak the code gen to within an inch of its life. The existence of both MIX and MMIX could even be an advantage if both were supported, as one would have examples of both CISC and RISC style architectures. > ...I didn't even know > there was a MMIX until your email forced me to query. I guess MMIX is to MIX as x86-64 is to 16-bit x86. Hopefully that rather than it being like ia64 is to x86. :-) Of course, the usefulness of MMIX more or less depends on Knuth finishing stuff. :-) > Well, even though I did not take your route, I still use the IR ref. > doc as my true documentation. It is fairly isomorphic to C++ IR API. > So I think your approach is worth while. I hope so. Though of course I have an agenda for learning LLVM too, and if that pans out I won't be able to escape doing things normally. I do not envision writing interpreters for anything more complex than Forth or Lisp in IR. One advantage this backwards approach has is exposing more of the real machine nature than even C. I'd like to think that makes one a better compiler user in the end. It's nice to know what all those nice high-level semantics are really costing you. I think part of my motivation, besides just doing the unexpected, is that long ago someone told me they took a class in "assembly and lisp"; basically, they taught programming by teaching you how to implement a higher-level language. Being young and stupid, I didn't see the point, but eventually I figured it out. It's never too late to re-do your childhood right, is it? I also think that the effort to write good code at such a low level is very good discipline. At least, I find it so, because the consequences of good and bad design become magnified. 
The absence of scope nesting and the difficulty of doing many simple operations really makes factoring out a vocabulary of small toolkit functions useful, for example, and that's not a bad discipline to reinforce. I just created a couple of functions whose body is a single shift simply because it enforced some abstraction and the names are documentation. I can always move the body into the header and let LLVM inline them if I want to optimize away the function call overhead (not that there is any great need to do that in a learning tool). (It's easy to tell which parts of the code I care about. The expression representation is pretty cleanly divided into a toolkit. The user interaction loop is a big fat function I didn't take the time to decompose.) Apropos of nothing, learning that I'm not going to use invoke/unwind puts me back a bit while I bloat and uglify the evaluator code with exception tests and unwinding, but I'm not that far from Turing completeness now and that's kind of a good feeling. :-) I probably didn't oblige myself to go further than that unless I just want to. Dustin From wendling at apple.com Wed Jan 13 18:20:19 2010 From: wendling at apple.com (Bill Wendling) Date: Wed, 13 Jan 2010 16:20:19 -0800 Subject: [LLVMdev] Presenting Unsafe Math Flag to Optimizer Message-ID: <3FD96E23-0798-4AC3-8C2C-98820FCDA0F7@apple.com> Hi all, A quick question: The current implementation of the "allow unsafe math" option is to specify it via the TargetOptions object. However, this prevents the target-independent optimizer from using it. Are there any opinions (ha!) on how this could be achieved in a nice clean manner which doesn't require using the TargetOptions object in the optimizer? 
-bw From devang.patel at gmail.com Wed Jan 13 18:50:37 2010 From: devang.patel at gmail.com (Devang Patel) Date: Wed, 13 Jan 2010 16:50:37 -0800 Subject: [LLVMdev] Presenting Unsafe Math Flag to Optimizer In-Reply-To: <3FD96E23-0798-4AC3-8C2C-98820FCDA0F7@apple.com> References: <3FD96E23-0798-4AC3-8C2C-98820FCDA0F7@apple.com> Message-ID: <352a1fb21001131650t48d4f151jfa174c4d9d8aa6b0@mail.gmail.com> On Wed, Jan 13, 2010 at 4:20 PM, Bill Wendling wrote: > Hi all, > > A quick question: > > The current implementation of the "allow unsafe math" option is to specify it via the TargetOptions object. However, this prevents the target-independent optimizer from using it. Are there any opinions (ha!) on how this could be achieved in a nice clean manner which doesn't require using the TargetOptions object in the optimizer? > function attribute. - Devang From wendling at apple.com Wed Jan 13 19:12:00 2010 From: wendling at apple.com (Bill Wendling) Date: Wed, 13 Jan 2010 17:12:00 -0800 Subject: [LLVMdev] Presenting Unsafe Math Flag to Optimizer In-Reply-To: <352a1fb21001131650t48d4f151jfa174c4d9d8aa6b0@mail.gmail.com> References: <3FD96E23-0798-4AC3-8C2C-98820FCDA0F7@apple.com> <352a1fb21001131650t48d4f151jfa174c4d9d8aa6b0@mail.gmail.com> Message-ID: <2175B48E-0D56-4491-BA5C-808EEA4257AD@apple.com> On Jan 13, 2010, at 4:50 PM, Devang Patel wrote: > On Wed, Jan 13, 2010 at 4:20 PM, Bill Wendling wrote: >> Hi all, >> >> A quick question: >> >> The current implementation of the "allow unsafe math" option is to specify it via the TargetOptions object. However, this prevents the target-independent optimizer from using it. Are there any opinions (ha!) on how this could be achieved in a nice clean manner which doesn't require using the TargetOptions object in the optimizer? >> > > function attribute. Not a bad idea. However, how should it behave during inlining for LTO? (I really don't know the answer to this.) 
There are three options, which you mentioned off-line: A) Caller wins: This could result in something the programmer didn't expect, possibly resulting in an incorrect answer. B) Don't inline: We potentially miss important optimizations. C) Safety first: The programmer could get code they didn't expect, but at least it won't result in an "incorrect" answer. I.e., it will be correct modulo LLVM bugs, but lacking any unsafe transforms they were expecting. -bw From resistor at mac.com Wed Jan 13 19:52:43 2010 From: resistor at mac.com (Owen Anderson) Date: Wed, 13 Jan 2010 17:52:43 -0800 Subject: [LLVMdev] Presenting Unsafe Math Flag to Optimizer In-Reply-To: <2175B48E-0D56-4491-BA5C-808EEA4257AD@apple.com> References: <3FD96E23-0798-4AC3-8C2C-98820FCDA0F7@apple.com> <352a1fb21001131650t48d4f151jfa174c4d9d8aa6b0@mail.gmail.com> <2175B48E-0D56-4491-BA5C-808EEA4257AD@apple.com> Message-ID: <5C255F8C-606F-4546-9CAB-104479E179D9@mac.com> On Jan 13, 2010, at 5:12 PM, Bill Wendling wrote: > > C) Safety first > The programmer could get code they didn't expect, but at least it won't result in an "incorrect" answer. I.e., it will be correct modulo LLVM bugs, but lacking any unsafe transforms they were expecting. > This one sounds like the sensible option to me. -Owen
From nicholas at mxc.ca Wed Jan 13 23:20:23 2010 From: nicholas at mxc.ca (Nick Lewycky) Date: Wed, 13 Jan 2010 21:20:23 -0800 Subject: [LLVMdev] Cross-module function inlining In-Reply-To: <39479E87-0DB7-48B3-9D1E-0E99DC5B602A@gmail.com> References: <4B4DF84A.6090903@mxc.ca> <7D3B5734-0544-4698-90FE-9CB5EC5D7B8F@gmail.com> <39479E87-0DB7-48B3-9D1E-0E99DC5B602A@gmail.com> Message-ID: <4B4EA997.10705@mxc.ca> Mark Muir wrote: > > On 13 Jan 2010, at 20:34, Nick Lewycky wrote: > >> On 13 January 2010 12:05, Mark Muir > > wrote: >> >> >> But... now there's a small problem with library calls. Symbols >> such as 'memset', 'malloc', etc. are being removed by global dead >> code elimination. They are implemented in one of the bitcode >> modules that are linked together (implementations are based on >> newlib). >> >> >> And what problems does that cause? If malloc is linked in, we're free >> to inline it everywhere and delete the symbol. If you meant for it to >> be visible to the optimizers but you don't want it to be part of the >> code generated for your program (i.e., you'll link it against newlib >> later), you should mark the functions with available_externally linkage. > > Sorry, I should've been clearer - the calls to _malloc and _free > weren't being inlined (see example below). I'm not sure why (happens > with or without -simplify-libcalls). So, the resulting .bc file from > 'opt' contains live references to symbols that were in its input .bc, > but for some reason it stripped them. Okay. Could you post an .ll (run 'llvm-dis < foo.bc') example of where this happens? Just the input and the opt commands to run are fine. It's very frustrating to look at C and assembly when the problem is in the IR -> IR transform itself.
Nick > #include > > int entries = 3; > int result; > > int main() > { > int i; > > // Allocate and populate the initial array. > int* values = malloc(entries * sizeof(int)); > for (i = 0; i < entries; i ++) > values[i] = i + 1; > > // Calculate the sum, using a dynamically allocated accumulator. > int* acc = malloc(sizeof(int)); > *acc = 0; > for (i = 0; i < entries; i ++) > *acc += values[i]; > result = *acc; > > // Deallocate the memory. > free(values); > free(acc); > > return 0; > } > > > Here's a fragment of the final machine assembly (with -O3): > > _main: > ADDCOMP out=r1 in1=r1 in2=4 conf=`ADDCOMP_SUB > WMEM in=r2 in_addr=r1 conf=`WMEM_SI > CONST_16B out=r3 conf=12 > JUMP nl_out=r2/*RA*/ addr_in=&_malloc conf=`JUMP_ALWAYS_ABS // Call > > > In case this is important, here is the relevant declarations from the > 'stdlib.h' that is in use: > > _PTR _EXFUN(malloc,(size_t __size)); > _VOID _EXFUN(free,(_PTR)); > > > where: > > #define _PTR void * > #define _EXFUN(name, proto) name proto > > > and from 'newlib.c': > > void * > malloc (size_t sz) > { > ... > } > > > i.e. They look like any other function call, which is why I suspect it > has something to do with special behaviour given to built-ins. > >> >> Alternately, if you wanted malloc, memset and friends to be externally >> visible (compiled as part of your program and dlsym'able), you could >> create a public api file which contains a one per line list of the >> names of the functions that may not be marked internal linkage by >> internalize. Pass that in to opt with -internalize-public-api-file >> filename ...other flags... >> > > I saw that. I was thinking of only using that option as a last resort, > due to maintainability. > >> >> I guess I need help with the concept of built-ins, and what code >> is related to them in the Clang driver and back-end. > > Thanks. 
> > - Mark > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From nicholas at mxc.ca Wed Jan 13 23:37:04 2010 From: nicholas at mxc.ca (Nick Lewycky) Date: Wed, 13 Jan 2010 21:37:04 -0800 Subject: [LLVMdev] Presenting Unsafe Math Flag to Optimizer In-Reply-To: <2175B48E-0D56-4491-BA5C-808EEA4257AD@apple.com> References: <3FD96E23-0798-4AC3-8C2C-98820FCDA0F7@apple.com> <352a1fb21001131650t48d4f151jfa174c4d9d8aa6b0@mail.gmail.com> <2175B48E-0D56-4491-BA5C-808EEA4257AD@apple.com> Message-ID: <4B4EAD80.5090202@mxc.ca> Bill Wendling wrote: > On Jan 13, 2010, at 4:50 PM, Devang Patel wrote: > >> On Wed, Jan 13, 2010 at 4:20 PM, Bill Wendling wrote: >>> Hi all, >>> >>> A quick question: >>> >>> The current implementation of the "allow unsafe math" option is to specify it via the TargetOptions object. However, this prevents the target-independent optimizer from using it. Are there any opinions (ha!) on how this could be achieved in a nice clean manner which doesn't require using the TargetOptions object in the optimizer? >>> >> >> function attribute. > > Not a bad idea. However, how should it behave during inlining for LTO? (I really don't know the answer to this.) A bit on the instruction, not unlike nsw/nuw/exact/inbounds. We could mark whether the fadd is reassociable or not: http://nondot.org/sabre/LLVMNotes/FloatingPointChanges.txt This handles inlining properly. Nick > There are three options, that you mentioned off-line: > > A) Caller wins > This could result in something the programmer didn't expect, possibly resulting in an incorrect answer. > > B) Don't inline > We potentially miss important optimizations. > > C) Safety first > The programmer could get code they didn't expect, but at least it won't result in an "incorrect" answer. I.e., it will be correct modulo LLVM bugs, but lacking any unsafe transforms they were expecting. 
> > -bw > From baldrick at free.fr Thu Jan 14 04:01:07 2010 From: baldrick at free.fr (Duncan Sands) Date: Thu, 14 Jan 2010 11:01:07 +0100 Subject: [LLVMdev] Presenting Unsafe Math Flag to Optimizer In-Reply-To: <3FD96E23-0798-4AC3-8C2C-98820FCDA0F7@apple.com> References: <3FD96E23-0798-4AC3-8C2C-98820FCDA0F7@apple.com> Message-ID: <4B4EEB63.3070400@free.fr> Hi Bill, > The current implementation of the "allow unsafe math" option is to specify it via the TargetOptions object. However, this prevents the target-independent optimizer from using it. Are there any opinions (ha!) on how this could be achieved in a nice clean manner which doesn't require using the TargetOptions object in the optimizer? a flag on each floating point operation, saying whether it does "exact" math or not? Ciao, Duncan. From mark.i.r.muir at gmail.com Thu Jan 14 04:32:20 2010 From: mark.i.r.muir at gmail.com (Mark Muir) Date: Thu, 14 Jan 2010 10:32:20 +0000 Subject: [LLVMdev] Cross-module function inlining In-Reply-To: <4B4EA997.10705@mxc.ca> References: <4B4DF84A.6090903@mxc.ca> <7D3B5734-0544-4698-90FE-9CB5EC5D7B8F@gmail.com> <39479E87-0DB7-48B3-9D1E-0E99DC5B602A@gmail.com> <4B4EA997.10705@mxc.ca> Message-ID: <19D3AAC1-5D26-491A-8C62-622B4FED431A@gmail.com> On 14 Jan 2010, at 05:20, Nick Lewycky wrote: >> calls to _malloc and _free >> weren't being inlined (see example below). I'm not sure why (happens >> with or without -simplify-libcalls). So, the resulting .bc file from >> 'opt' contains live references to symbols that were in its input .bc, >> but for some reason it stripped them. > > Okay. Could you post an .ll (run 'llvm-dis < foo.bc') example of where this happens? Just the input and opt commands to run is fine.
It's very frustrating to look at C and assembly when the problem is in the IR -> IR transform itself. I've attached the relevant IR (stripped down to the bare minimum). The following commands will reproduce the problem (using vanilla 2.6 versions of the LLVM tools): llvm-as test_malloc.ll -o - | opt -std-link-opts -o - | llvm-dis -o - That strips everything except for @main. The stripping of the two global variables is fine, and there are no references to them left in the IR. But there are live references to @malloc and @free. The minimum options required for this behaviour are: llvm-as test_malloc.ll -o - | opt -internalize -globaldce -o - | llvm-dis -o - If I use -disable-internalize with -std-link-opts, then global dead code elimination doesn't remove anything, but inlining still takes place. So that is the solution I'm using at the moment. But I'd like to know why this behaviour is happening, and it would be nice to have global DCE so that the resulting machine assembly is easier to work with (for manual debugging on this architecture). Thanks for looking at this. Regards, - Mark -------------- next part -------------- A non-text attachment was scrubbed... Name: test_malloc.ll.bz2 Type: application/x-bzip2 Size: 2417 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100114/f2ce3118/attachment.bz2 From arplynn at gmail.com Thu Jan 14 05:09:52 2010 From: arplynn at gmail.com (Alastair Lynn) Date: Thu, 14 Jan 2010 11:09:52 +0000 Subject: [LLVMdev] Presenting Unsafe Math Flag to Optimizer In-Reply-To: <4B4EEB63.3070400@free.fr> References: <3FD96E23-0798-4AC3-8C2C-98820FCDA0F7@apple.com> <4B4EEB63.3070400@free.fr> Message-ID: Hi- Would this not be a good place to use the new metadata feature? If metadata indicating that a value is OK to have unsafe optimisations done on it is dropped, everything will still work correctly. 
Alastair On 14 Jan 2010, at 10:01, Duncan Sands wrote: > Hi Bill, > >> The current implementation of the "allow unsafe math" option is to specify it via the TargetOptions object. However, this prevents the target-independent optimizer from using it. Are there any opinions (ha!) on how this could be achieved in a nice clean manner which doesn't require using the TargetOptions object in the optimizer? > > a flag on each floating point operation, saying whether it does "exact" math or > not? > > Ciao, > > Duncan. From morten at hue.no Thu Jan 14 08:07:58 2010 From: morten at hue.no (Morten Ofstad) Date: Thu, 14 Jan 2010 15:07:58 +0100 Subject: [LLVMdev] Presenting Unsafe Math Flag to Optimizer In-Reply-To: <2175B48E-0D56-4491-BA5C-808EEA4257AD@apple.com> References: <3FD96E23-0798-4AC3-8C2C-98820FCDA0F7@apple.com> <352a1fb21001131650t48d4f151jfa174c4d9d8aa6b0@mail.gmail.com> <2175B48E-0D56-4491-BA5C-808EEA4257AD@apple.com> Message-ID: > There are three options, that you mentioned off-line: > > A) Caller wins > This could result in something the programmer didn't expect, possibly > resulting in an incorrect answer. > > B) Don't inline > We potentially miss important optimizations. > > C) Safety first > The programmer could get code they didn't expect, but at least it won't > result in an "incorrect" answer. I.e., it will be correct modulo LLVM > bugs, but lacking any unsafe transforms they were expecting. From having worked extensively on FP code, I would prefer option B. Often the most important property is that two calls to the same function with the same parameters produce the same result; you don't want the function to produce different results depending on whether it has been inlined, even if the inlined result is more 'correct' (which would be the case with C).
One example of the kind of problems you can get into is if you have a test to see which side of a plane a point is on and it produces different results from two different calls with the same point and the same plane. - Morten From criswell at uiuc.edu Thu Jan 14 09:23:48 2010 From: criswell at uiuc.edu (John Criswell) Date: Thu, 14 Jan 2010 09:23:48 -0600 Subject: [LLVMdev] Identifying recursive functions in a backend In-Reply-To: References: <0B912ECB965FB9488D1FEFB4AFCA1C0767FA3D@klmail.kl.imgtec.org> Message-ID: <4B4F3704.3060104@uiuc.edu> Anton Korobeynikov wrote: > Hello, Robert > > >> I was wondering if it was possible to detect if a function is recursive in a back-end. For instance, I'd like to be able to say: "If this function we are about to call is recursive, store the return address to the stack, if it isn't we don't need a stack so do nothing". Does anyone know if this is possible? >> > Well, in general - no. Think about the function which does an indirect call... > > If you want some approximation - find all call sites in the function > and check the destinations. > More completely, you need to find Strongly Connected Components (SCCs) in the call graph. Indirect function calls will simply make the call graph more conservative than it needs to be because you will have conservative estimates of what functions can be called at the indirect function call site. What you could do is write a pure LLVM pass that locates SCCs and marks functions that are part of an SCC with a special attribute. Your code generator pass could then check this attribute when generating code for each function. -- John T. 
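As a rough illustration of the SCC idea described above, here is a minimal standalone C++ sketch using Tarjan's algorithm. It does not use any LLVM API: the `CallGraph` type, the integer function ids, and `findRecursiveFunctions` are all hypothetical names invented for this example. An indirect call site would be modelled by listing every function it might conservatively reach. A function is flagged as possibly recursive if it sits in an SCC with more than one member, or if it calls itself directly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical call-graph representation (not an LLVM type):
// callees[f] lists the ids of the functions that f may call.
using CallGraph = std::vector<std::vector<std::size_t>>;

namespace {

constexpr std::size_t kUnvisited = static_cast<std::size_t>(-1);

// State for one run of Tarjan's strongly-connected-components algorithm.
struct TarjanState {
  const CallGraph &graph;
  std::vector<std::size_t> index;    // DFS discovery number per function
  std::vector<std::size_t> lowlink;  // smallest index reachable from the node
  std::vector<bool> onStack;         // is the node on the SCC stack?
  std::vector<bool> recursive;       // result: may the function recurse?
  std::vector<std::size_t> stack;
  std::size_t nextIndex = 0;

  explicit TarjanState(const CallGraph &g)
      : graph(g), index(g.size(), kUnvisited), lowlink(g.size(), 0),
        onStack(g.size(), false), recursive(g.size(), false) {}

  void visit(std::size_t f) {
    index[f] = lowlink[f] = nextIndex++;
    stack.push_back(f);
    onStack[f] = true;
    bool selfCall = false;

    for (std::size_t callee : graph[f]) {
      assert(callee < graph.size() && "callee id out of range");
      if (callee == f)
        selfCall = true;  // direct recursion: f calls f
      if (index[callee] == kUnvisited) {
        visit(callee);
        lowlink[f] = std::min(lowlink[f], lowlink[callee]);
      } else if (onStack[callee]) {
        lowlink[f] = std::min(lowlink[f], index[callee]);
      }
    }

    // f is the root of an SCC: pop all of its members off the stack.
    if (lowlink[f] == index[f]) {
      std::vector<std::size_t> scc;
      std::size_t member;
      do {
        member = stack.back();
        stack.pop_back();
        onStack[member] = false;
        scc.push_back(member);
      } while (member != f);

      // A multi-member SCC is a call cycle; a single-member SCC is
      // recursive only if the function calls itself.
      if (scc.size() > 1 || selfCall)
        for (std::size_t m : scc)
          recursive[m] = true;
    }
  }
};

}  // namespace

// Returns, for each function id, whether it is part of a call cycle.
std::vector<bool> findRecursiveFunctions(const CallGraph &graph) {
  TarjanState state(graph);
  for (std::size_t f = 0; f < graph.size(); ++f)
    if (state.index[f] == kUnvisited)
      state.visit(f);
  return state.recursive;
}
```

A code generator could then consult the resulting flags when deciding whether a function needs to save its return address. The sketch recurses on the C++ stack, so a very deep call graph would need an iterative formulation.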
> -- > With best regards, Anton Korobeynikov > Faculty of Mathematics and Mechanics, Saint Petersburg State University > From stefano.delliponti at gmail.com Thu Jan 14 10:03:24 2010 From: stefano.delliponti at gmail.com (Stefano Delli Ponti) Date: Thu, 14 Jan 2010 17:03:24 +0100 Subject: [LLVMdev] FYI: libcpu Message-ID: Interesting project I discovered yesterday: http://www.libcpu.org/ Stefano From nicholas at mxc.ca Thu Jan 14 11:03:42 2010 From: nicholas at mxc.ca (Nick Lewycky) Date: Thu, 14 Jan 2010 09:03:42 -0800 Subject: [LLVMdev] Cross-module function inlining In-Reply-To: <19D3AAC1-5D26-491A-8C62-622B4FED431A@gmail.com> References: <4B4DF84A.6090903@mxc.ca> <7D3B5734-0544-4698-90FE-9CB5EC5D7B8F@gmail.com> <39479E87-0DB7-48B3-9D1E-0E99DC5B602A@gmail.com> <4B4EA997.10705@mxc.ca> <19D3AAC1-5D26-491A-8C62-622B4FED431A@gmail.com> Message-ID: <4B4F4E6E.5020704@mxc.ca> Mark Muir wrote: > On 14 Jan 2010, at 05:20, Nick Lewycky wrote: > >>> calls to _malloc and _free >>> weren't being inlined (see example below). I'm not sure why (happens >>> with or without -simplify-libcalls). So, the resulting .bc file from >>> 'opt' contains live references to symbols that were in its input .bc, >>> but for some reason it stripped them. >> >> Okay. Could you post an .ll (run 'llvm-dis < foo.bc') example of where this happens? Just the input and opt commands to run is fine. It's very frustrating to look at C and assembly when the problem is in the IR -> IR transform itself. > > > I've attached the relevant IR (stripped down to the bare minimum). The following commands will reproduce the problem (using vanilla 2.6 versions of the LLVM tools): > > llvm-as test_malloc.ll -o - | opt -std-link-opts -o - | llvm-dis -o - > > That strips everything except for @main.
The stripping of the two global variables is fine, and there are no references to them left in the IR. But there are live references to @malloc and @free. > > The minimum options required for this behaviour are: > > llvm-as test_malloc.ll -o - | opt -internalize -globaldce -o - | llvm-dis -o - > > If I use -disable-internalize with -std-link-opts, then global dead code elimination doesn't remove anything, but inlining still takes place. So that is the solution I'm using at the moment. But I'd like to know why this behaviour is happening, and it would be nice to have global DCE so that the resulting machine assembly is easier to work with (for manual debugging on this architecture). > > Thanks for looking at this. Thanks, I think it's now pretty clear what's going on. The .ll you posted has a @free function with no calls to it. Since it's never called, it can be deleted after -internalize. What happened to your free() calls in the C code is that they turned into free instructions in LLVM. You can fix that by passing -ffreestanding, but realize that this may trigger other missed optimizations as clang/gcc will cease assuming that functions with certain names do certain things. Nick From clattner at apple.com Thu Jan 14 11:45:26 2010 From: clattner at apple.com (Chris Lattner) Date: Thu, 14 Jan 2010 09:45:26 -0800 Subject: [LLVMdev] Presenting Unsafe Math Flag to Optimizer In-Reply-To: <4B4EEB63.3070400@free.fr> References: <3FD96E23-0798-4AC3-8C2C-98820FCDA0F7@apple.com> <4B4EEB63.3070400@free.fr> Message-ID: On Jan 14, 2010, at 2:01 AM, Duncan Sands wrote: > Hi Bill, > >> The current implementation of the "allow unsafe math" option is to >> specify it via the TargetOptions object. However, this prevents the >> target-independent optimizer from using it. Are there any opinions >> (ha!) on how this could be achieved in a nice clean manner which >> doesn't require using the TargetOptions object in the optimizer? 
> > a flag on each floating point operation, saying whether it does > "exact" math or > not? Yes, the right approach for this is to add flags to each fp operations just like the NUW/NSW bits on integer ops. We want the ability to represent the C99 pragmas which are scoped more tightly than a function body. This is actually really easy to do, the big issue is defining the 'bits' that we want to carry on each operation. For example, I think it would be reasonable to have an "assume finite" bit (saying no nan's / inf), it would also be useful to know you can do reassociation etc, useful to know that you don't care about signed zero, etc. I don't have enough expertise to propose exactly how this should work. -Chris From wendling at apple.com Thu Jan 14 13:04:45 2010 From: wendling at apple.com (Bill Wendling) Date: Thu, 14 Jan 2010 11:04:45 -0800 Subject: [LLVMdev] Presenting Unsafe Math Flag to Optimizer In-Reply-To: References: <3FD96E23-0798-4AC3-8C2C-98820FCDA0F7@apple.com> <352a1fb21001131650t48d4f151jfa174c4d9d8aa6b0@mail.gmail.com> <2175B48E-0D56-4491-BA5C-808EEA4257AD@apple.com> Message-ID: On Jan 14, 2010, at 6:07 AM, Morten Ofstad wrote: >> There are three options, that you mentioned off-line: >> >> A) Caller wins >> This could result in something the programmer didn't expect, possibly resulting in an incorrect answer. >> >> B) Don't inline >> We potentially miss important optimizations. >> >> C) Safety first >> The programmer could get code they didn't expect, but at least it won't result in an "incorrect" answer. I.e., it will be correct modulo LLVM bugs, but lacking any unsafe transforms they were expecting. 
> > From having worked extensively on FP code, I would prefer option B -- often the most important property is that two calls to the same function with the same parameters produce the same result, you don't want the function to produce different results if it has been inlined or not even if the inlined result is more 'correct' (which would be the case with C). > > One example of the kind of problems you can get into is if you have a test to see which side of a plane a point is on and it produces different results from two different calls with the same point and the same plane. > Good point. :-) Though it looks like we would bypass this and go with flags on the individual instructions. Do you have any insight into Chris's response? -bw From st at iss.tu-darmstadt.de Thu Jan 14 15:56:23 2010 From: st at iss.tu-darmstadt.de (ST) Date: Thu, 14 Jan 2010 22:56:23 +0100 Subject: [LLVMdev] Register Spilling and SSA Message-ID: <201001142256.23296.st@iss.tu-darmstadt.de> Hi I just stumbled upon this paper. While i just skimmed over it it seems as if the authors say that their algorithm is more efficient than the llvm 2.3 algorithm? So i thought that might be interesting? http://pp.info.uni-karlsruhe.de/uploads/publikationen/braun09cc.pdf Disclaimer: I have no affiliation with the authors and stumbled in a slightly unrelated search over this paper. ST From dag at cray.com Thu Jan 14 16:35:27 2010 From: dag at cray.com (David Greene) Date: Thu, 14 Jan 2010 16:35:27 -0600 Subject: [LLVMdev] Register Spilling and SSA In-Reply-To: <201001142256.23296.st@iss.tu-darmstadt.de> References: <201001142256.23296.st@iss.tu-darmstadt.de> Message-ID: <201001141635.28530.dag@cray.com> On Thursday 14 January 2010 15:56, ST wrote: > Hi > > I just stumbled upon this paper. While i just skimmed over it it seems as > if the authors say that their algorithm is more efficient than the llvm 2.3 > algorithm? So i thought that might be interesting? 
> > http://pp.info.uni-karlsruhe.de/uploads/publikationen/braun09cc.pdf Don't trust it. The abstract clearly states they're counting the number of dynamic spills. That has almost nothing to do with performance. Someone would have to reproduce their experiment to verify that performance indeed improves. And alas, our field is notoriously unscientific in this respect. Reproducibility of experiments is nonexistent. -Dave From kremenek at apple.com Thu Jan 14 16:37:17 2010 From: kremenek at apple.com (Ted Kremenek) Date: Thu, 14 Jan 2010 14:37:17 -0800 Subject: [LLVMdev] reminder: internship in Apple's Clang team (applications due Jan 18) Message-ID: <084AEC6B-8FBE-4D3C-A7F0-27BF825700E6@apple.com> I just wanted to send a follow-up announcement about a paid developer internship this coming summer in Apple's Clang team. The final day to apply is next Monday, January 18. Please email resumes and a brief statement of interest directly to me. Only students or former students who have just graduated are eligible. Cheers, Ted From etherzhhb at gmail.com Thu Jan 14 20:03:49 2010 From: etherzhhb at gmail.com (ether) Date: Fri, 15 Jan 2010 10:03:49 +0800 Subject: [LLVMdev] Make LoopBase inherit from "RegionBase"?
In-Reply-To: <314301.28741.qm@web55601.mail.re4.yahoo.com> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> <4B473032.5080803@gmail.com> <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> <4B4CB891.5090902@fim.uni-passau.de> <314301.28741.qm@web55601.mail.re4.yahoo.com> Message-ID: <4B4FCD05.60107@gmail.com>
Hi,
About the region class: I think providing a method to extract the top-level loop from a given region would be useful in Polly (our polyhedral optimization framework), because we only optimize regions that contain loops.
And a long-term consideration about a "region pass": if we want to integrate a region analysis and optimization framework into LLVM, I think we can use an approach similar to loop analysis and optimization: write a class "regionpass" that inherits from "pass", and a corresponding pass manager "RegionPassManager" (a kind of function pass). If we follow this approach, we need to push the region pass manager onto the LLVM pass manager stack. The first question with this approach is: what is the relationship between the "loop pass manager" and the "region pass manager"?
Way 1: put the region pass manager below the loop pass manager in the stack.
pass manager stack:
bb pass manager <---top
loop pass manager
region pass manager
function pass manager
... <---bottom
In this case the region hierarchy needs to be reconstructed when a loop transform changes it.
Way 2: put the region pass manager above the loop pass manager in the stack.
pass manager stack:
bb pass manager <---top
region pass manager
loop pass manager
function pass manager
... <---bottom
In this case the loop hierarchy needs to be reconstructed when a region pass changes it.
now we need to choose a way to minimize the loop reconstruction or region reconstruction. i think that the chance that a region transform affect the loop structure is smaller, so maybe way 2 is better. at last, i have some idea about finding a interesting region: (maybe make the region analysis too complex) we can introduce some thing like "region filter" that determine the property of a region, the region filter will like a "pass", which can run on an instruction at a time, a basic block at a time, or even a sub region at a time, then we can write a "filter manager" like "pass manager " to stream the filtering process, so that we can promote the the performance of the region finding process. comment and suggestion is appreciate. best regards --ether On 2010-1-13 7:56, Jan Sjodin wrote: > Why not use the "standard" algorithm for detecting SESE-regions and building a program structure tree? > It should handle everything you want. It also becomes much simpler to specify a connected SESE-region > by entry/exit edges, while a disconnected region is specified by entry/exit blocks. Only defining regions on > blocks is not enough to be able to quickly determine how to replace/move a region in a CFG. > > The algorithm can be found in this paper: > The Program Structure Tree: Computing Control Regions in Linear Time > by Richard Johnson , David Pearson , KeshavPingali > > - Jan > > ----- Original Message ---- > From: Tobias Grosser > To: ether > Cc: LLVM Developers Mailing List > Sent: Tue, January 12, 2010 12:59:45 PM > Subject: Re: [LLVMdev] Make LoopBase inherit from "RegionBase"? > > On 01/08/10 14:20, ether wrote: > >> sorry that i forgot to change the subjuect >> >> >> hi all, >> > Hi ether, > > now a kind of more complete answer. > > >> On 2010-1-7 0:11, John Mosby wrote: >> >>> In LLVM we could add support for generalized CFG regions and >>> RegionPasses. A region is a part of the CFG. 
The only information we >>> have is, that it has one entry and one exit, this it can be optimized >>> separately. >>> I think this is the best way to add region analysis. I must admit this >>> approach >>> helps me on another, similar project I'm working on in parallel (no >>> pun intended). >>> Tobias, is this how you are architecting your region analysis already? >>> >>> John >>> >>> >> i just implementing the skeleton of Region/RegionInfo like LoopBase and >> LoopInfoBase[1] in the llvm existing codes, and found that theres lots >> of common between "Region" and "Loop": >> >> 1. both of them are consist of several BasicBlocks >> > Correct. > > >> 2. both of them have some kind of nested structures, so both a loop and >> a region could have parent or childrens >> > Correct. > > >> 3. both of them have a BasicBlocks(header of a loop and "entry" of a >> region) that dominates all others >> > Correct. > > > >> and the Region class will have the most stuffs very similar in LoopBase, >> like: ParentRegion, SubRegions, Blocks, getRegionDepth(), >> > Correct. > > >> getExitBlock(), getExitingBlock() ...... >> > This might need some thoughts, > > > >> so, could us just treat "Loop" as some kind of general "Region" of >> BasicBlocks, and make Loop and Region inherit from "RegionBase"? >> > I would like to do so, as I like the structure of this approach. > However until now my pass was written on the side, as a proof of concept. > > I wrote two Passes: > > 1. Regions > Detect the regions and print the regions tree. Try it with: > opt -regions -analyze file.bc > > 2. RegionsWithoutLoops > Find the maximal regions that do not contain any loops. Try it with: > opt -regions-without-loops file.bc > opt -view-regions-without-loops file.bc (needs graphviz) > > Both ATM only work on BasicBlocks. However I have seen the patches in > your sandbox and I really like the idea to keep the analysis general. 
> > If you are interested you could have a look at my sandbox (not yet well > documented and cleanly formatted). > > We might want to think about, how to merge our work. > > Tobi > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > From hbrenkun at yahoo.cn Thu Jan 14 20:39:33 2010 From: hbrenkun at yahoo.cn (=?gb2312?B?yM7ApCAg?=) Date: Fri, 15 Jan 2010 10:39:33 +0800 (CST) Subject: [LLVMdev] getting from MachineOperand is just attribute from logic. Message-ID: <816113.38107.qm@web92410.mail.cnh.yahoo.com> Hi, I have ported LLC to a RISC CPU. It passes the benchmarks I currently have. Now I want to do some optimization after register allocation by adjusting register usage. I scan each MachineBasicBlock and analyze the operands' IsKill, IsDead and IsDef attributes to get a physical register's live range. But I get a strange case, shown in MBB.jpg. R4 is marked at MBB0. If I scan R4's live range along [MBB0->MBB1->MBB2], I find that R4 is first killed and then used. That is illogical. Actually R4 just is . This causes my optimization pass to crash. (Actually, I ignore the live-in information of each MBB; I recollect the live-in information in my pass.) 1. Is the attribute of R4 at MBB0 an unimportant, redundant piece of information, or a small bug? 2. Is it unreliable to get a physical register's live range from the Kill and Dead information in a MachineBasicBlock? -------------- next part -------------- A non-text attachment was scrubbed...
Name: MBB.jpg Type: image/jpeg Size: 26396 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100115/2c00f5cc/attachment.jpg From stoklund at 2pi.dk Thu Jan 14 21:44:39 2010 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Thu, 14 Jan 2010 19:44:39 -0800 Subject: [LLVMdev] getting from MachineOperand is just attribute from logic. In-Reply-To: <816113.38107.qm@web92410.mail.cnh.yahoo.com> References: <816113.38107.qm@web92410.mail.cnh.yahoo.com> Message-ID: <12CB2E12-716D-4572-990F-0EAEC6096E47@2pi.dk> On Jan 14, 2010, at 6:39 PM, ???? wrote: > But I want do some optimization after register alloction by adjusting > register using. I scan MachineBasicBlock to analyze operand's IsKill, IsDead , IsDef attribute to get a physical register's liverange. But I get a strange case at MBB.jpg. You can also look at RegisterScavenging.cpp and MachineVerifier.cpp. They are doing the same thing. > R4 is marked at MBB0. If I scan R4's liverange by [MBB0->MBB1->MBB2]. I will find R4 first is killed, then is used. It can not unlogisch. Attually R4 just is . It will cause my optimization pass crash(Actually, I ingore Live In message of MBB. I recollect live in messges at my pass.). A register should not be used after it is killed, and if it is needed by a successor block, it should be live out. Note that a register in the live-in list of an MBB is not always live-out from all predecessors. A register defined by IMPLICIT_DEF can be optimized away entirely. > 1. Does attribute of R4 at MBB0 is a unimportant and redundancy messages, Or a little bug??? You have probably found a bug. Can you reproduce it with one of the normal back ends? > 2. Is it unreliable to get a physical register's liverange by Kill, Dead messages from MachineBasicBlock?? You also need to use the live-in list for each MBB, but otherwise it should be reliable. Look at how RegisterScavenger is doing it. 
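The bookkeeping described above (start from the block's live-in list, then follow the def, kill and dead flags) can be sketched roughly as follows. This is only an illustration in Python, not how RegisterScavenger or MachineVerifier are actually written; `Operand`, `Instr` and `scan_block` are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Operand:
    reg: str
    is_def: bool = False
    is_kill: bool = False  # this use is the last use of the register
    is_dead: bool = False  # this def produces a value that is never read


@dataclass
class Instr:
    name: str
    operands: list = field(default_factory=list)


def scan_block(live_in, instrs):
    """Walk one block forward and return (live_out, errors)."""
    live = set(live_in)  # seeding from the live-in list is the key step
    errors = []
    for ins in instrs:
        # Uses are read before defs are written, so handle uses first.
        for op in ins.operands:
            if op.is_def:
                continue
            if op.reg not in live:
                errors.append(f"{ins.name}: use of {op.reg} while not live")
            if op.is_kill:
                live.discard(op.reg)
        for op in ins.operands:
            if op.is_def:
                if op.is_dead:
                    live.discard(op.reg)
                else:
                    live.add(op.reg)
    return live, errors


# Hypothetical block: R4 is killed by the first instruction and then
# used again by the second one, which the scan reports as an error.
block = [
    Instr("add", [Operand("R5", is_def=True), Operand("R4", is_kill=True)]),
    Instr("mov", [Operand("R6", is_def=True), Operand("R4")]),
]
live_out, errors = scan_block({"R4"}, block)
```

Without the live-in seed, the very first use of R4 would already look like a use of a register that is not live, which is why the flags alone are not enough.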
/jakob -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1929 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100114/b0907dbf/attachment.bin From viridia at gmail.com Thu Jan 14 22:27:09 2010 From: viridia at gmail.com (Talin) Date: Thu, 14 Jan 2010 20:27:09 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> Message-ID: I'm still working on the next patch, it's going somewhat slowly. I wanted to create a unit test that actually created a union, and in order to do that I had to implement constant unions. And rather than creating a special syntax for constructing a union, I decided that it was simplest to implement the insertvalue instruction for a constant union expression: @foo = constant union { i32, float } insertvalue union { i32, float } undef, i32 4, 0 What this says is to start with an undef, and then insert the value '4' into the integer field (the zeroth field) of the union. The reason for doing it this way is that to construct a union, you really need 4 pieces of information: The type of the union, the type and value of the member to be initialized, and the index of which member is being initialized. Originally I thought about having the last be detected automatically by what type of initializer was used: @foo = constant union { i32, float } i32 4 However, from a syntactical standpoint what you get is two types in a row - "union { i32, float }" followed by "i32". That is completely unlike any other IR syntax and doesn't fit well into the parser. Using insertvalue as an initializer has the advantage that it's parameters supply all of information we need. 
The disadvantage is that you have to type the union type signature twice, but I doubt that will be a major issue since IR isn't meant to be typed by hand anyway. On Tue, Jan 12, 2010 at 5:01 PM, Talin wrote: > Here is the LangRef part of the patch. > > > On Tue, Jan 12, 2010 at 2:11 PM, Chris Lattner wrote: > >> >> On Jan 11, 2010, at 4:30 PM, Talin wrote: >> >> I'm working on a new version of the patch. >>> >>> Another thing I wanted to ask about - do you prefer to have one giant >>> patch that has everything, or a series of incremental patches? I can see >>> advantages either way. >>> >> >> A series of incremental patches is strongly preferred, starting with >> LangRef.html. >> >> >> Normally I would want to do this as a series of incremental patches, >>> however this is a rather large project and it may take me quite a while >>> before it's completely done. I don't doubt that I will need some assistance >>> when it comes to the trickier parts (like the optimization aspects you >>> mentioned.) So there's a risk involved in submitting the first one or two >>> patches, because the final patch might not be ready in time for the next >>> release. >>> >>> On the other hand, it will be a lot easier for others to assist if we go >>> ahead and submit the initial work. >>> >> >> No problem, just submit it as you go. When the langref piece goes in, >> just say in it that this is an experimental feature in development. Thanks >> Talin, >> >> -Chris >> >> > > > -- > -- Talin > -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100114/1fd36eca/attachment.html From rnk at mit.edu Thu Jan 14 23:01:45 2010 From: rnk at mit.edu (Reid Kleckner) Date: Fri, 15 Jan 2010 00:01:45 -0500 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> Message-ID: <9a9942201001142101k719f199ar1298a6ed19045a85@mail.gmail.com> On Tue, Jan 12, 2010 at 8:46 PM, Dan Gohman wrote: > > On Jan 12, 2010, at 5:01 PM, Talin wrote: > >> Here is the LangRef part of the patch. > >> +

The union type is used to represent a set of possible data types which can >> + exist at a given location in memory (also known as an "untagged" >> + union). > [...] > > This wording is somewhat misleading; memory in LLVM has no types. > How about: > > "A union type describes an object with size and alignment suitable for > an object of any one of a given set of types." > > Also, is it really useful to support > insertvalue/extractvalue/getelementptr on unions? The benefit of unions > that I'm aware of is it allows target-independent IR to work with > appropriately sized and aligned memory. This doesn't require any special > support for accessing union members; for example: > > %p = alloca union { i32, double } > %q = bitcast union { i32, double }* %p to double* > store double 2.0, double* %q > > Would this be a reasonable approach? I can think of another benefit of the insertvalue/extractvalue/etc approach, which is that then LLVM checks your types for you. In other words, it makes sure you don't put an i64 into a union of i32 and double, whereas you can always bitcast an i64 to that union. Reid From me22.ca at gmail.com Thu Jan 14 23:25:24 2010 From: me22.ca at gmail.com (me22) Date: Fri, 15 Jan 2010 00:25:24 -0500 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> Message-ID: 2010/1/14 Talin : > The reason for doing it this way is that to construct a union, you really > need 4 pieces of information: The type of the union, the type and value of > the member to be initialized, and the index of which member is being > initialized. Does requiring the index mean that uniquing the union type will have to re-write many of the corresponding insertvalue calls? For instance, how would this round-trip?
@foo = constant union { float, i32 } insertvalue union { i32, float } undef, i32 4, 0 @bar = constant union { i32, float } insertvalue union { float, i32 } undef, i32 4, 1 I'm very glad to see a non-bitcast method of using unions, BTW. From hbrenkun at yahoo.cn Thu Jan 14 23:33:14 2010 From: hbrenkun at yahoo.cn (=?utf-8?B?5Lu75Z2kICA=?=) Date: Fri, 15 Jan 2010 13:33:14 +0800 (CST) Subject: [LLVMdev] getting from MachineOperand is just attribute from logic. In-Reply-To: <12CB2E12-716D-4572-990F-0EAEC6096E47@2pi.dk> Message-ID: <603599.43863.qm@web92402.mail.cnh.yahoo.com> Hi, Jakob: Thanks for your answer. I hope to trace all physical register liverange in MachineBasicBlock. In my test, I find LiveIn message of MBB can not give all livein physical register. So I write a pass to recollect livein message by scan MBB. Current case tell me that just to scan MachineOperand's isDef, isKill, IsDead attribute to rebuild physical register's livein will have bug. If I use add missing live-in into , Could I can know which physical register is live at any time? If yes, it is easy for my pass. If not, I need to treat isKill and isDead as isUse, then implement a pass to anaylze CFG to delete unvalid livein message. --- 10?1?15????, Jakob Stoklund Olesen ??? > ???: Jakob Stoklund Olesen > ??: Re: [LLVMdev] getting from MachineOperand is just attribute from logic. > ???: "??" > ??: "llvm" > ??: 2010?1?15?,??,??11:44 > > On Jan 14, 2010, at 6:39 PM, ?? wrote: > > > But I want do some optimization after register > alloction by adjusting > > register using. I scan MachineBasicBlock to analyze > operand's IsKill, IsDead , IsDef attribute to get a physical > register's liverange. But I get a strange case at MBB.jpg. > > You can also look at RegisterScavenging.cpp and > MachineVerifier.cpp. They are doing the same thing. > > >? R4 is marked at MBB0.? If I > scan R4's liverange by [MBB0->MBB1->MBB2]. I will find > R4 first is killed, then is used. It can not unlogisch. 
> Attually R4 just is . It will cause my > optimization pass crash(Actually, I ingore Live In message > of MBB. I recollect live in messges at my pass.). > > A register should not be used after it is killed, and if it > is needed by a successor block, it should be live out. > > Note that a register in the live-in list of an MBB is not > always live-out from all predecessors. A register defined by > IMPLICIT_DEF can be optimized away entirely. > > >? 1. Does attribute of R4 at MBB0 is > a unimportant? and redundancy messages, Or a little > bug??? > > You have probably found a bug. Can you reproduce it with > one of the normal back ends? > > >? 2. Is it unreliable to get a physical register's > liverange by Kill, Dead messages from MachineBasicBlock?? > > You also need to use the live-in list for each MBB, but > otherwise it should be reliable. Look at how > RegisterScavenger is doing it. > > /jakob > > ___________________________________________________________ ????????????????? http://card.mail.cn.yahoo.com/ From viridia at gmail.com Thu Jan 14 23:55:30 2010 From: viridia at gmail.com (Talin) Date: Thu, 14 Jan 2010 21:55:30 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> Message-ID: On Thu, Jan 14, 2010 at 9:25 PM, me22 wrote: > 2010/1/14 Talin : > > The reason for doing it this way is that to construct a union, you really > > need 4 pieces of information: The type of the union, the type and value > of > > the member to be initialized, and the index of which member is being > > initialized. > > Does requiring the index mean that uniquing the union type will have > to re-write many of the corresponding insertvalue calls? > > For instance, how would this round-trip? 
> > @foo = constant union { float, i32 } insertvalue union { i32, > float } undef, i32 4, 0 > @bar = constant union { i32, float } insertvalue union { float, > i32 } undef, i32 4, 1 > Well, the fact that union members have to be indexed by number means that the ordering has to be part of the type - so even though type-theoretically union { i32, float } is the same as union { float, i32 }, in my implementation they are distinct types. However, from the standpoint of a frontend, this is not a great concern, because the frontend will most likely sort the list of types before constructing the IR type. By always putting the types in a canonical order, regardless of the order that they appear in the source code, you can ensure that unions of equal types are always compatible. In other words, you can treat the members like an ordered set rather than like a list. > > I'm very glad to see a non-bitcast method of using unions, BTW. > -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100114/4a60bcd3/attachment.html From me22.ca at gmail.com Fri Jan 15 00:08:51 2010 From: me22.ca at gmail.com (me22) Date: Fri, 15 Jan 2010 01:08:51 -0500 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> Message-ID: 2010/1/15 Talin : > On Thu, Jan 14, 2010 at 9:25 PM, me22 wrote: >> >> ? ?@foo = constant union { float, i32 } insertvalue union { i32, >> float } undef, i32 4, 0 >> ? ?@bar = constant union { i32, float } insertvalue union { float, >> i32 } undef, i32 4, 1 >> > > Well, the fact that union members have to be indexed by number means that > the ordering has to be part of the type. > Does that mean that my example above is ill-formed, since the insertvalue gives a different type than the constant wants? 
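Talin's suggestion that a frontend put the member types in a canonical order before building the IR type could look something like the sketch below. It is a Python illustration only: type names are plain strings, sorting by name is an assumed canonical order, and `canonical_union` and `member_index` are hypothetical helpers, not part of any LLVM API.

```python
def canonical_union(member_types):
    """Return the member types as a sorted, duplicate-free tuple."""
    return tuple(sorted(set(member_types)))


def member_index(union, member_type):
    """Index to use when inserting a value of member_type into the union."""
    return union.index(member_type)


# The same set of members, written in two different source orders.
a = canonical_union(["float", "i32"])
b = canonical_union(["i32", "float"])
same_type = a == b               # both orders give one union type
idx_i32 = member_index(a, "i32")  # the index is now stable as well
```

With this convention the two spellings in the example above would name one and the same union type, and the index passed to insertvalue would follow from the member's type alone.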
From lessen42 at gmail.com Fri Jan 15 00:13:11 2010 From: lessen42 at gmail.com (David Conrad) Date: Fri, 15 Jan 2010 01:13:11 -0500 Subject: [LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz Message-ID: <1B7B6FBB-0B43-4B08-A1B3-C618CDB88387@gmail.com> Hi, On ARMv6T2 this turns cttz into rbit, clz instead of the 4 instruction sequence it is now. I'm not sure if adding RBIT to ARMISD and doing this optimization in the legalize pass is the best option, but the only better way I could think of doing it was to add a bitreverse intrinsic to llvm ir, which itself might not be the best option since bitreverse probably isn't too common. Other targets that I know of that could potentially benefit from this optimization being global (that have a clz and bitreverse instruction but not ctz) are AVR32 and C64x, neither of which llvm has backends for yet. -------------- next part -------------- A non-text attachment was scrubbed... Name: llvm-ctz-arm.diff Type: application/octet-stream Size: 5160 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100115/9da482e2/attachment-0001.obj From viridia at gmail.com Fri Jan 15 00:29:03 2010 From: viridia at gmail.com (Talin) Date: Thu, 14 Jan 2010 22:29:03 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> Message-ID: On Thu, Jan 14, 2010 at 10:08 PM, me22 wrote: > 2010/1/15 Talin : > > On Thu, Jan 14, 2010 at 9:25 PM, me22 wrote: > >> > >> @foo = constant union { float, i32 } insertvalue union { i32, > >> float } undef, i32 4, 0 > >> @bar = constant union { i32, float } insertvalue union { float, > >> i32 } undef, i32 4, 1 > >> > > > > Well, the fact that union members have to be indexed by number means that > > the ordering has to be part of the type. > > > > Does that mean that my example above is ill-formed, since the > insertvalue gives a different type than the constant wants? > Yes. 
-- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100114/82bcddeb/attachment.html From jo at durchholz.org Fri Jan 15 02:41:00 2010 From: jo at durchholz.org (Joachim Durchholz) Date: Fri, 15 Jan 2010 09:41:00 +0100 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> Message-ID: <4B502A1C.10201@durchholz.org> Talin wrote: > Well, the fact that union members have to be indexed by number means > that the ordering has to be part of the type - so even though > type-theoretically union { i32, float } is the same as union { float, > i32 }, in my implementation they are distinct types. However, from the > standpoint of a frontend, this is not a great concern, because the > frontend will most likely sort the list of types before constructing the > IR type. Hm... it's placing a burden on the frontend developer. More importantly, it's something that the frontend developer must not forget to do, so you had better make sure this is documented in capital letters in a place where the frontend developer is likely to look when preparing code generation. Most importantly, however, this will create a lot of hassles when making code interoperable between compilers: compiler writers need to agree on a language-independent canonical ordering. That said, if the ordering is canonical, it could be established at the IR level, e.g. by ordering alphabetically. When coding, please consider that many languages establish assignment compatibility between union types. E.g. a union {i32, float} value could be assigned to a name that's typed as a union {i32, i64, float}. This probably means the need for conversion operators, and it definitely means that indexes aren't meaningful by themselves, only in conjunction with their union type.
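To illustrate the last point, here is a small Python sketch of such a conversion. It models a union value as a hypothetical (member index, payload) pair purely for illustration (the proposed IR unions are untagged, so in practice the index would only exist statically in the frontend), and `widen` is an invented helper name.

```python
def widen(value, src_union, dst_union):
    """Convert an (index, payload) pair from src_union to dst_union."""
    index, payload = value
    member_type = src_union[index]  # the index only makes sense with its union
    if member_type not in dst_union:
        raise TypeError(f"{member_type} is not a member of the target union")
    return (dst_union.index(member_type), payload)


narrow = ("i32", "float")       # union {i32, float}
wide = ("i32", "i64", "float")  # union {i32, i64, float}

# The float member has index 1 in the narrow union but index 2 in the wide one.
converted = widen((1, 2.5), narrow, wide)
```

The payload is unchanged; only the index is rewritten, which is the kind of conversion operator the paragraph above alludes to.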
> By always putting the types in a canonical order, regardless of > the order that they appear in the source code, you can ensure that > unions of equal types are always compatible. In other words, you can > treat the members like an ordered set rather than like a list. Yes, that's closer to the frontend semantics: the variants of a union type don't have any natural ordering, so list semantics could cause problems. Regards, Jo From grosser at fim.uni-passau.de Fri Jan 15 02:55:46 2010 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Fri, 15 Jan 2010 09:55:46 +0100 Subject: [LLVMdev] Make LoopBase inherit from "RegionBase"? In-Reply-To: <6363_1263521041_4B4FCD10_6363_3705_1_4B4FCD05.60107@gmail.com> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> <4B473032.5080803@gmail.com> <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> <4B4CB891.5090902@fim.uni-passau.de> <314301.28741.qm@web55601.mail.re4.yahoo.com> <6363_1263521041_4B4FCD10_6363_3705_1_4B4FCD05.60107@gmail.com> Message-ID: <4B502D92.5070704@fim.uni-passau.de> On 01/15/10 03:03, ether wrote: > hi, Hi, > about region class, i think provide some method to extract the top level > loop from a given region is useful in polly(our polyhedral optimization > framework), because the we are only doing optimization on the region > with loop. Hi ether, I think this should be implemented as a RegionFilter, that checks if a region contains a loop, and that can be asked for further information. In general I do not think this kind of analysis belongs to a region, but as you proposed some kind of filter could be applied. In the short term the passes who need this information could get it on their own. 
> and a long term consideration about "region pass": > > if we want to integrate region analysis and optimization framework into > llvm, i think we can use an approach that similar to loop analysis and > optimization: write a class "regionpass" inherit from "pass", and the > corresponding pass manger "RegionPassManager". (a kind of function pass) > if we follow this approach, we need to push the region pass manager into > the llvm pass manager stack. > the first question of this approach is, whats the relationship between > "loop pass manager" and "region pass manager"? > > way 1: make region pass manager below loop pass manager in the stack > > pass manager stack: > > bb pass manager <---top > loop pass manager > region pass manager > function pass manager > ... <---bottom > > in this way the region hierarchy need to be reconstruct when a loop > transform change it. > > way 2: make region pass manager above loop pass manager in the stack > > pass manager stack: > > bb pass manager <---top > region pass manager > loop pass manager > function pass manager > ... <---bottom > > in this way the loop hierarchy need to be reconstruct when a region pass > change it. > > now we need to choose a way to minimize the loop reconstruction or > region reconstruction. i think that the chance that a region transform > affect the loop structure is smaller, so maybe way 2 is better. This would need some thoughts. Ideal I think we would not order them, but if a region changed, just reconstruct the loops that are in this region and if a loop changed just reconstruct the regions in this loop. 
> at last, i have some idea about finding a interesting region: (maybe > make the region analysis too complex) > > we can introduce some thing like "region filter" that determine the > property of a region, the region filter will like a "pass", which can > run on an instruction at a time, a basic block at a time, or even a sub > region at a time, then we can write a "filter manager" like "pass > manager " to stream the filtering process, so that we can promote the > the performance of the region finding process. Yes, I like this idea. So the basic design would be that we have some passes like: Maximal Region regarding an Analysis/Filter; Minimal Region regarding an Analysis/Filter; All Regions regarding an Analysis/Filter. So a pass can ask the regionpass manager for a specific kind of region. It is then invoked only for regions that fulfill this requirement. Tobias From baldrick at free.fr Fri Jan 15 03:10:17 2010 From: baldrick at free.fr (Duncan Sands) Date: Fri, 15 Jan 2010 10:10:17 +0100 Subject: [LLVMdev] [PATCH] Add simple cross-block DSE. In-Reply-To: References: Message-ID: <4B5030F9.6060002@free.fr> Hi Gianluca, did you look at Jakub Staszak's "Non-local DSE" patch he posted to the mailing list a while ago? Ciao, Duncan. From morten at hue.no Fri Jan 15 03:24:32 2010 From: morten at hue.no (Morten Ofstad) Date: Fri, 15 Jan 2010 10:24:32 +0100 Subject: [LLVMdev] Presenting Unsafe Math Flag to Optimizer In-Reply-To: References: <3FD96E23-0798-4AC3-8C2C-98820FCDA0F7@apple.com><4B4EEB63.3070400@free.fr> Message-ID: <95F5A530019147B5A24ACB15C014D374@Radeon> > This is actually really easy to do, the big issue is defining the > 'bits' that we want to carry on each operation. For example, I think > it would be reasonable to have an "assume finite" bit (saying no > nan's / inf), it would also be useful to know you can do reassociation > etc, useful to know that you don't care about signed zero, etc.
I think the main issues are: 1) special values (+0, -0, NaN, +Inf, -Inf) to be taken into account - this can be represented with an 'assume_finite' bit and an 'assume_no_signed_zero' bit 2) rounding: the x86 FPU has 80 bits of internal precision, so you get inconsistent results depending on whether intermediate results are spilled or kept in registers. One usual way of handling this is that any assignment in the source code will truncate to the memory representation, while intermediate results in an expression are allowed to be kept at 80 bits precision (i.e. the frontend decides which operations must be rounded). - this can be represented with an 'exact_precision' bit 3) exceptions: you might need to have the right number of exceptions triggered in the right order, so basically no optimizations are allowed. - this can be represented with a 'trapping_math' and/or 'signaling_NaN' bit, or maybe it can be encoded as 'no_reorder' 'no_duplicate' see: http://gcc.gnu.org/onlinedocs/gcc-3.4.6/gcc/Optimize-Options.html (look for -ffloat-store) http://msdn.microsoft.com/en-us/library/e7s85ffb.aspx (Title: /fp (Specify Floating-Point Behavior)) - Morten From me22.ca at gmail.com Fri Jan 15 08:45:03 2010 From: me22.ca at gmail.com (me22) Date: Fri, 15 Jan 2010 09:45:03 -0500 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> Message-ID: 2010/1/14 Talin : > Originally I thought about having the last be detected > automatically by what type of initializer was used: > @foo = constant union { i32, float } i32 4 > However, from a syntactical standpoint what you get is two types in a row - > "union { i32, float }" followed by "i32". That is completely unlike any > other IR syntax and doesn't fit well into the parser.
> It seems to me like that's similar to the ptrtoint and such instructions that change the type and not necessarily the value, so perhaps the main union operations should be: %foo = elementtounion i32 4 to union { i32, float } %bar = uniontoelement union { i32, float } %foo to i32 Conceptually I dislike the insertvalue, since the point of an insert is to keep some other part of the value intact, something not needed with unions. From dag at cray.com Fri Jan 15 08:57:25 2010 From: dag at cray.com (David Greene) Date: Fri, 15 Jan 2010 08:57:25 -0600 Subject: [LLVMdev] [PATCH] SelectionDAG Debugging In-Reply-To: <201001131610.23444.dag@cray.com> References: <201001131610.23444.dag@cray.com> Message-ID: <201001150857.25798.dag@cray.com> On Wednesday 13 January 2010 16:10, David Greene wrote: > This patch adds a couple of interfaces to dump full or partial > SelectionDAGs. The current code only prints the top-level > SDNode. This patch makes it much easier to understand > CannotYetSelect errors and those sorts of things. In particular, > it helped me track down PR6019. > > Any objections to committing? Ping? -Dave From matthias.braun at kit.edu Fri Jan 15 09:08:56 2010 From: matthias.braun at kit.edu (Matthias Braun) Date: Fri, 15 Jan 2010 16:08:56 +0100 Subject: [LLVMdev] Register Spilling and SSA In-Reply-To: <201001141635.28530.dag@cray.com> References: <201001142256.23296.st@iss.tu-darmstadt.de> <201001141635.28530.dag@cray.com> Message-ID: <1263568136.19013.10.camel@i44pc66.info.uni-karlsruhe.de> Am Donnerstag, den 14.01.2010, 16:35 -0600 schrieb David Greene: > On Thursday 14 January 2010 15:56, ST wrote: > > Hi > > > > I just stumbled upon this paper. While i just skimmed over it it seems as > > if the authors say that their algorithm is more efficient than the llvm 2.3 > > algorithm? So i thought that might be interesting? 
> > > > http://pp.info.uni-karlsruhe.de/uploads/publikationen/braun09cc.pdf As the author of this paper I have to defend it now (of course ;-) > > Don't trust it. The abstract clearly states they're counting the number of > dynamic spills. That has almost nothing to do with performance. So if not the number of dynamic spills/reloads and rematerialisations. What else can you do to measure the quality of your spilling algorithm? I admit that the impressive numbers in spill translate to smaller gains in the range of 1-5% also it's hard to compare 2 different compilers. Nonetheless we produce faster x86 code than llvm-2.3 for several spec benchmarks (and produce way slower code for some others where I suspect missing architecture neutral optimisations in firm). > > Someone would have to reproduce their experiment to verify that performance > indeed improves. You can easily download libfirm-1.17.0 and cparser-0.9.9 from http://www.libfirm.org and see for yourself. If you're interested in the valgrind hacks to count the spills/reloads, I uploaded them here: pp.info.uni-karlsruhe.de/~matze/valgrind-3.5.0-countmem.tgz (there's a new valgrind plugin called 'countmem' in it). Greetings, Matthias Braun From dag at cray.com Fri Jan 15 09:54:54 2010 From: dag at cray.com (David Greene) Date: Fri, 15 Jan 2010 09:54:54 -0600 Subject: [LLVMdev] Register Spilling and SSA In-Reply-To: <1263568136.19013.10.camel@i44pc66.info.uni-karlsruhe.de> References: <201001142256.23296.st@iss.tu-darmstadt.de> <201001141635.28530.dag@cray.com> <1263568136.19013.10.camel@i44pc66.info.uni-karlsruhe.de> Message-ID: <201001150954.54814.dag@cray.com> On Friday 15 January 2010 09:08, Matthias Braun wrote: > As the author of this paper I have to defend it now (of course ;-) I hope you don't take my comments personally. They are directed to all research organizations. The fact that our field doesn't demand reproducability is a scandal. 
Hard sciences require independent verification for publication and we should do the same. We should also publish negative results. The fact that such papers are rejected outright is another pet peeve of mine. So many cycles wasted chasing after things other people already found aren't useful... > > Don't trust it. The abstract clearly states they're counting the number > > of dynamic spills. That has almost nothing to do with performance. > > So if not the number of dynamic spills/reloads and rematerialisations. > What else can you do to measure the quality of your spilling algorithm? Run time. We have to measure run time. > I admit that the impressive numbers in spill translate to smaller gains > in the range of 1-5% also it's hard to compare 2 different compilers. Why not compare a single compiler on a single machine using two different allocation algorithms? > Nonetheless we produce faster x86 code than llvm-2.3 for several spec > benchmarks (and produce way slower code for some others where I suspect > missing architecture neutral optimisations in firm). Are there any plans to try this with LLVM? Do you have an LLVM version of your algorithm? Or a firm version of LLVM's algorithm? > > Someone would have to reproduce their experiment to verify that > > performance indeed improves. > > You can easily download libfirm-1.17.0 and cparser-0.9.9 from > http://www.libfirm.org and see for yourself. That's good! Too few research groups release their code and experiment setup. I applaud you for doing so. Much experience has taught me not to trust register allocation papers. They never actually talk about performance. If I were a reviewer, I might accept a paper based on novelty of the algorithm (far too many papers are rejected simply because they can't show a 20% speedup) but I wouldn't give points for reducing the number of spills and reloads. Those counts simply don't mean anything in the real world.
-Dave From glguida at me.com Fri Jan 15 10:08:47 2010 From: glguida at me.com (Gianluca Guida) Date: Fri, 15 Jan 2010 17:08:47 +0100 Subject: [LLVMdev] [PATCH] Add simple cross-block DSE. In-Reply-To: <4B5030F9.6060002@free.fr> References: <4B5030F9.6060002@free.fr> Message-ID: <97EBCA3B-61A7-4B0C-A24A-3D6EC5E9926B@me.com> Hi Duncan, On Jan 15, 2010, at 10:10 AM, Duncan Sands wrote: > Hi Gianluca, did you look at Jakub Staszak's "Non-local DSE" patch > he posted to the mailing list a while ago? Yes, and as it happens, I just found his patch *after* I've send mine. Of course it is way more complete and smarter than mine -- which tend to solve as unintrusively as possible a common problem I found in a few IRs I've been generating recently (working on a JIT I tend to like simpler optimization passes or I just try to produce a better IR from my side). I'll try to have a look at Jakub's patch, I'm sure it'll handle my case as well. Ah, another thing I found is that my way of sending patches was completely ignoring the LLVM developers policies. My fault. :-) Ciao ;) Gianluca > > Ciao, > > Duncan. From clattner at apple.com Fri Jan 15 12:03:39 2010 From: clattner at apple.com (Chris Lattner) Date: Fri, 15 Jan 2010 10:03:39 -0800 Subject: [LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz In-Reply-To: <1B7B6FBB-0B43-4B08-A1B3-C618CDB88387@gmail.com> References: <1B7B6FBB-0B43-4B08-A1B3-C618CDB88387@gmail.com> Message-ID: <79F7C763-5126-4542-BE40-D1703CFA5CB0@apple.com> On Jan 14, 2010, at 10:13 PM, David Conrad wrote: > Hi, > > On ARMv6T2 this turns cttz into rbit, clz instead of the 4 > instruction sequence it is now. > > I'm not sure if adding RBIT to ARMISD and doing this optimization in > the legalize pass is the best option, but the only better way I > could think of doing it was to add a bitreverse intrinsic to llvm > ir, which itself might not be the best option since bitreverse > probably isn't too common. 
I haven't looked at the patch in detail, but this approach makes sense to me. > Other targets that I know of that could potentially benefit from > this optimization being global (that have a clz and bitreverse > instruction but not ctz) are AVR32 and C64x, neither of which llvm > has backends for yet. When/if another target wants this, we could add a ISD::RBIT operation, it doesn't need to be added at the llvm ir level, -Chris From clattner at apple.com Fri Jan 15 12:08:49 2010 From: clattner at apple.com (Chris Lattner) Date: Fri, 15 Jan 2010 10:08:49 -0800 Subject: [LLVMdev] LangRef.html invoke/unwind patch In-Reply-To: <4B4E43A5.5090400@laurences.net> References: <4B4E0ACB.8020902@laurences.net> <4B4E4092.7090405@free.fr> <4B4E43A5.5090400@laurences.net> Message-ID: On Jan 13, 2010, at 2:05 PM, Dustin Laurence wrote: > On 01/13/2010 01:52 PM, Duncan Sands wrote: > >> as I mentioned in another email, unwind is not completely >> unsupported: >> it does work for rethrowing an exception. > > Good point. Not understanding how languages implement exceptions > under > the hood, I lose the nuances that should be in a reference document. > How's this version? Seems fine to me, committed as r93518, thanks! From clattner at apple.com Fri Jan 15 12:11:44 2010 From: clattner at apple.com (Chris Lattner) Date: Fri, 15 Jan 2010 10:11:44 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> Message-ID: <9C45E97A-C44A-4B9A-94AD-B5832208E279@apple.com> On Jan 13, 2010, at 12:11 PM, Talin wrote: > > It depends on whether or not unions can be passed around as SSA > values or not. I can think of situations where you would want to. > > In particular, GEP is useful because you can avoid the bitcast above > - GEP to element 0 if you want an int (in the example above), or to > element 1 if you want a double. 
> > Also, I'm thinking that insertvalue might be the best way to > construct a constant union. Right now there's a bit of a problem in > that the data type of a constant must match exactly the declared > type; However in the case of a union what we want is for the data > type of the initializer to exactly match the type of one of the > union members. I thought this would be relatively easy to do, but > it's a little trickier than I realized. I think it is useful to support insert/extract on unions just for orthogonality alone. Stuff that works on structs should generally work on unions too. -Chris From clattner at apple.com Fri Jan 15 12:14:16 2010 From: clattner at apple.com (Chris Lattner) Date: Fri, 15 Jan 2010 10:14:16 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> Message-ID: On Jan 14, 2010, at 8:27 PM, Talin wrote: > I'm still working on the next patch, it's going somewhat slowly. I > wanted to create a unit test that actually created a union, and in > order to do that I had to implement constant unions. And rather than > creating a special syntax for constructing a union, I decided that > it was simplest to implement the insertvalue instruction for a > constant union expression: > > @foo = constant union { i32, float } insertvalue union { i32, > float } undef, i32 4, 0 > > What this says is to start with an undef, and then insert the value > '4' into the integer field (the zeroth field) of the union. Insertvalue constant exprs should work on these, but that should fall out from insertvalue just working on unions.
However: > > The reason for doing it this way is that to construct a union, you > really need 4 pieces of information: The type of the union, the type > and value of the member to be initialized, and the index of which > member is being initialized. Originally I thought about having the > last be detected automatically by what type of initializer was used: > > @foo = constant union { i32, float } i32 4 I think we really do need a "ConstantUnion" class. How about syntax like this: @foo = constant union { i32, float, double, i32*, i32 } { i32 4 } That seems simple and unambiguous, and analogous to structs. -Chris From gohman at apple.com Fri Jan 15 13:02:39 2010 From: gohman at apple.com (Dan Gohman) Date: Fri, 15 Jan 2010 11:02:39 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> Message-ID: On Jan 13, 2010, at 12:11 PM, Talin wrote: > > It depends on whether or not unions can be passed around as SSA values or not. I can think of situations where you would want to. I'm skeptical that you *really* want to (i.e. that you wouldn't be better off just writing helper functions in your front-end which do the addressing and load/store and then moving on). But, I'm not really interested in getting in the way here. Dan From gohman at apple.com Fri Jan 15 13:16:46 2010 From: gohman at apple.com (Dan Gohman) Date: Fri, 15 Jan 2010 11:16:46 -0800 Subject: [LLVMdev] [PATCH] SelectionDAG Debugging In-Reply-To: <201001150857.25798.dag@cray.com> References: <201001131610.23444.dag@cray.com> <201001150857.25798.dag@cray.com> Message-ID: <938484F3-1995-44EE-97BE-E069C92154B1@apple.com> On Jan 15, 2010, at 6:57 AM, David Greene wrote: > On Wednesday 13 January 2010 16:10, David Greene wrote: >> This patch adds a couple of interfaces to dump full or partial >> SelectionDAGs. The current code only prints the top-level >> SDNode. 
This patch makes it much easier to understand >> CannotYetSelect errors and those sorts of things. In particular, >> it helped me track down PR6019. >> >> Any objections to committing? > > Ping? Is it ever desirable to pass false to the "limit" argument? Otherwise this looks ok. Dan From dag at cray.com Fri Jan 15 13:31:34 2010 From: dag at cray.com (David Greene) Date: Fri, 15 Jan 2010 13:31:34 -0600 Subject: [LLVMdev] [PATCH] SelectionDAG Debugging In-Reply-To: <938484F3-1995-44EE-97BE-E069C92154B1@apple.com> References: <201001131610.23444.dag@cray.com> <201001150857.25798.dag@cray.com> <938484F3-1995-44EE-97BE-E069C92154B1@apple.com> Message-ID: <201001151331.34587.dag@cray.com> On Friday 15 January 2010 13:16, Dan Gohman wrote: > Is it ever desirable to pass false to the "limit" argument? Not in the usual course of things but I figured someday someone might want to dig deeper. "limit" is just a heuristic and it could be wrong. Maybe the SelectionDAG is really just huge. > Otherwise this looks ok. I'll check it in and if we think "limit" should go away, I'll follow up with another patch. -Dave From richard at xmos.com Fri Jan 15 13:37:21 2010 From: richard at xmos.com (Richard Osborne) Date: Fri, 15 Jan 2010 19:37:21 +0000 Subject: [LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz In-Reply-To: <79F7C763-5126-4542-BE40-D1703CFA5CB0@apple.com> References: <1B7B6FBB-0B43-4B08-A1B3-C618CDB88387@gmail.com> <79F7C763-5126-4542-BE40-D1703CFA5CB0@apple.com> Message-ID: On 15 Jan 2010, at 18:03, Chris Lattner wrote: > On Jan 14, 2010, at 10:13 PM, David Conrad wrote: > >> Other targets that I know of that could potentially benefit from >> this optimization being global (that have a clz and bitreverse >> instruction but not ctz) are AVR32 and C64x, neither of which llvm >> has backends for yet. 
> > When/if another target wants this, we could add a ISD::RBIT operation, > it doesn't need to be added at the llvm ir level, The XCore also has ctlz and bitreverse instructions and not cttz. At the moment in the XCore backend cttz is marked as legal and expanded to this pair of instructions in a pattern in the InstrInfo.td. -- Richard Osborne | XMOS http://www.xmos.com From viridia at gmail.com Fri Jan 15 13:37:03 2010 From: viridia at gmail.com (Talin) Date: Fri, 15 Jan 2010 11:37:03 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: <4B502A1C.10201@durchholz.org> References: <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> <4B502A1C.10201@durchholz.org> Message-ID: On Fri, Jan 15, 2010 at 12:41 AM, Joachim Durchholz wrote: > Talin schrieb: > > Well, the fact that union members have to be indexed by number means that >> the ordering has to be part of the type - so even though type-theoretically >> union { i32, float } is the same as union { float, i32 }, in my >> implementation they are distinct types. However, from the standpoint of a >> frontend, this is not a great concern, because the frontend will most likely >> sort the list of types before constructing the IR type. >> > > Hm... it's placing a burden on the frontend developer. > > More importantly, it's something that the fronend developer must not forget > to do, so you better make sure this is documented in capital letters in a > place where the frontend developer is likely to look when preparing code > generation. > > Most importantly, however, this will create a lot of hassles when making > code interoperable between compilers: Compiler writers need to agree on a > language-independent canonical ordering. > That said, if the ordering is canonical, it could be established at the IR > level. E.g. by ordering alphabetically. > > When coding, please consider that many languages establish assignment > compatibility between union types. E.g. 
a union {i32, float} value could be > assigned to a name that's typed as a union {i32, i64, float}. > This probably means the need for conversion operators, and it definitely > means that indexes aren't meaningful by themselves, only in conjunction with > their union type. > > I really feel that these issues should be addressed on a layer above IR. LLVM IR always requires that all types match exactly, and any conversions or promotions must be inserted explicitly by the frontend. Making unions do automatic conversions would make them dramatically different from every other IR type. > > > By always putting the types in a canonical order, regardless of > >> the order that they appear in the source code, you can ensure that unions >> of equal types are always compatible. In other words, you can treat the >> members like an ordered set rather than like a list. >> > > Yes, that's closer to the frontend semantics: the variants of a union type > don't have any natural ordering, so list semantics could cause problems. > > Regards, > Jo > -- -- Talin From gohman at apple.com Fri Jan 15 13:41:35 2010 From: gohman at apple.com (Dan Gohman) Date: Fri, 15 Jan 2010 11:41:35 -0800 Subject: [LLVMdev] [PATCH] SelectionDAG Debugging In-Reply-To: <201001151331.34587.dag@cray.com> References: <201001131610.23444.dag@cray.com> <201001150857.25798.dag@cray.com> <938484F3-1995-44EE-97BE-E069C92154B1@apple.com> <201001151331.34587.dag@cray.com> Message-ID: <13D95484-39C9-4C68-B4EB-7E6F7DF20E6D@apple.com> On Jan 15, 2010, at 11:31 AM, David Greene wrote: > On Friday 15 January 2010 13:16, Dan Gohman wrote: > >> Is it ever desirable to pass false to the "limit" argument? > > Not in the usual course of things but I figured someday someone > might want to dig deeper. "limit" is just a heuristic and it > could be wrong.
Maybe the SelectionDAG is really just huge. "limit" is just the flag that controls whether or not a message is printed. It seems the message would always be either useful or harmless. > >> Otherwise this looks ok. > > I'll check it in and if we think "limit" should go away, I'll > follow up with another patch. Ok. Dan From dag at cray.com Fri Jan 15 13:51:48 2010 From: dag at cray.com (David Greene) Date: Fri, 15 Jan 2010 13:51:48 -0600 Subject: [LLVMdev] [PATCH] SelectionDAG Debugging In-Reply-To: <13D95484-39C9-4C68-B4EB-7E6F7DF20E6D@apple.com> References: <201001131610.23444.dag@cray.com> <201001151331.34587.dag@cray.com> <13D95484-39C9-4C68-B4EB-7E6F7DF20E6D@apple.com> Message-ID: <201001151351.49193.dag@cray.com> On Friday 15 January 2010 13:41, Dan Gohman wrote: > On Jan 15, 2010, at 11:31 AM, David Greene wrote: > > On Friday 15 January 2010 13:16, Dan Gohman wrote: > >> Is it ever desirable to pass false to the "limit" argument? > > > > Not in the usual course of things but I figured someday someone > > might want to dig deeper. "limit" is just a heuristic and it > > could be wrong. Maybe the SelectionDAG is really just huge. > > "limit" is just the flag that controls whether or not a message > is printed. It seems the message would always be either useful > or harmless. Ah, yes, you're correct. I goofed there. The message should be printed and "limit" should control whether we actually check the depth. Sound good? 
-Dave From deeppatel1987 at gmail.com Fri Jan 15 14:04:24 2010 From: deeppatel1987 at gmail.com (Sandeep Patel) Date: Fri, 15 Jan 2010 20:04:24 +0000 Subject: [LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz In-Reply-To: <79F7C763-5126-4542-BE40-D1703CFA5CB0@apple.com> References: <1B7B6FBB-0B43-4B08-A1B3-C618CDB88387@gmail.com> <79F7C763-5126-4542-BE40-D1703CFA5CB0@apple.com> Message-ID: <305d6f61001151204x6c01f4fap2ebfbf979a35022@mail.gmail.com> On Fri, Jan 15, 2010 at 6:03 PM, Chris Lattner wrote: > > On Jan 14, 2010, at 10:13 PM, David Conrad wrote: > >> Hi, >> >> On ARMv6T2 this turns cttz into rbit, clz instead of the 4 >> instruction sequence it is now. >> >> I'm not sure if adding RBIT to ARMISD and doing this optimization in >> the legalize pass is the best option, but the only better way I >> could think of doing it was to add a bitreverse intrinsic to llvm >> ir, which itself might not be the best option since bitreverse >> probably isn't too common. > > I haven't looked at the patch in detail, but this approach makes sense > to me. > >> Other targets that I know of that could potentially benefit from >> this optimization being global (that have a clz and bitreverse >> instruction but not ctz) are AVR32 and C64x, neither of which llvm >> has backends for yet. > > When/if another target wants this, we could add a ISD::RBIT operation, > it doesn't need to be added at the llvm ir level, Bit reversal turns up in most FFT algorithms, so it wouldn't hurt to be able to add an instcombine that recognizes it, etc. 
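The rbit + clz expansion discussed in this thread rests on the identity cttz(x) == ctlz(bitreverse(x)). Below is a minimal sketch of that identity in portable C, assuming 32-bit words. The function names are hypothetical and this is not the code from the patch or from any backend; it only illustrates what the two-instruction sequence computes.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the bit order of a 32-bit word (the job ARM's RBIT does in
   one instruction), using the usual swap-adjacent-groups approach. */
uint32_t bit_reverse32(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}

/* Count leading zero bits; this sketch returns 32 for x == 0. */
unsigned count_leading_zeros32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        ++n;
    }
    return n;
}

/* Trailing zeros of x are the leading zeros of the reversed word. */
unsigned count_trailing_zeros32(uint32_t x)
{
    return count_leading_zeros32(bit_reverse32(x));
}

/* Small self-check of the identity on a few values. */
void cttz_self_check(void)
{
    assert(count_trailing_zeros32(1u) == 0);
    assert(count_trailing_zeros32(8u) == 3);
    assert(count_trailing_zeros32(0x80000000u) == 31);
}
```

The same bit_reverse32 helper is the kind of operation the FFT remark above refers to, although FFT index reversal usually works on fewer than 32 bits.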
deep From scanon at apple.com Fri Jan 15 12:32:17 2010 From: scanon at apple.com (Stephen Canon) Date: Fri, 15 Jan 2010 10:32:17 -0800 Subject: [LLVMdev] Presenting Unsafe Math Flag to Optimizer In-Reply-To: <95F5A530019147B5A24ACB15C014D374@Radeon> References: <3FD96E23-0798-4AC3-8C2C-98820FCDA0F7@apple.com> <4B4EEB63.3070400@free.fr> <95F5A530019147B5A24ACB15C014D374@Radeon> Message-ID: On Jan 15, 2010, at 1:24 AM, Morten Ofstad wrote: > I think the main issues are: > > 1) special values (+0, -0, NaN, +Inf, -Inf) to be taken into account > - this can be represented with an 'assume_finite' bit and an 'assume_no_signed_zero' bit Sounds right to me. > 2) rounding, the x86 FPU has 80 bits of internal precision, so you get inconsistent results depending > on intermediate results being spilled or being kept in registers. One usual way of handling this is > that any assignment in the source code will truncate to the memory representation, while intermediate > results in an expression are allowed to be kept at 80 bits precision (i.e. frontend decides which operations must be rounded). > - this can be represented with a 'exact_precision' bit Does LLVM even support generating float and double arithmetic on x87? Certainly the default should be to use SSE/SSE2 and avoid this problem entirely. If legacy x87 codegen is supported, it would be nice to have float-store be the default behavior, and require a flag "-fnon-portable-extra-precision" or something similarly menacing to enable the other behavior. > 3) exceptions, you might need to have the right number of exceptions triggered in the right order so basically no optimizations are allowed. 
> - this can be represented with a 'trapping_math' and/or 'signaling_NaN' bit, or maybe it can be encoded as 'no_reorder' 'no_duplicate' Some reordering should be inhibited not only by trapping math, but also by the default IEEE-754 exception handling (nonstop execution with status flags), at least when #pragma STDC FENV_ACCESS ON is active. If the reordering affects only the order in which flags could be raised, and not which flags could be raised, then it could be allowed with the default exception handling. - Steve From corina_fff at yahoo.com Fri Jan 15 14:54:07 2010 From: corina_fff at yahoo.com (corina s) Date: Fri, 15 Jan 2010 12:54:07 -0800 (PST) Subject: [LLVMdev] LLVM-gcc for ARM Message-ID: <328335.13577.qm@web45315.mail.sp1.yahoo.com> Hello, I am building llvm-gcc4.2-2.6 for the ARM target. I used the following command line options: >> .../configure --enable-languages=c,c++ --enable-checking --target=arm-eabi >> and then >> make target_alias=arm-eabi >> And then I get the following error: In file included from ../../gcc/config/arm/arm.c:59: ../../../libcpp/internal.h: In function 'ufputs': ../../../libcpp/internal.h:693: warning: implicit declaration of function 'fputs_unlocked' .../../gcc/config/arm/arm.c: At top level: .../../gcc/config/arm/arm.c:514: error: 'MASK_INTERWORK' undeclared here (not in a function) .../../gcc/config/arm/arm.c: In function 'optimization_options': .../../gcc/config/arm/arm.c:23444: warning: unused parameter 'level' What would be the problem? Is the configure line OK? Thanks, Corina
From gohman at apple.com Fri Jan 15 15:09:08 2010 From: gohman at apple.com (Dan Gohman) Date: Fri, 15 Jan 2010 13:09:08 -0800 Subject: [LLVMdev] [PATCH] SelectionDAG Debugging In-Reply-To: <201001151351.49193.dag@cray.com> References: <201001131610.23444.dag@cray.com> <201001151331.34587.dag@cray.com> <13D95484-39C9-4C68-B4EB-7E6F7DF20E6D@apple.com> <201001151351.49193.dag@cray.com> Message-ID: On Jan 15, 2010, at 11:51 AM, David Greene wrote: > On Friday 15 January 2010 13:41, Dan Gohman wrote: >> On Jan 15, 2010, at 11:31 AM, David Greene wrote: >>> On Friday 15 January 2010 13:16, Dan Gohman wrote: >>>> Is it ever desirable to pass false to the "limit" argument? >>> >>> Not in the usual course of things but I figured someday someone >>> might want to dig deeper. "limit" is just a heuristic and it >>> could be wrong. Maybe the SelectionDAG is really just huge. >> >> "limit" is just the flag that controls whether or not a message >> is printed. It seems the message would always be either useful >> or harmless. > > Ah, yes, you're correct. I goofed there. The message should be > printed and "limit" should control whether we actually check > the depth. > > Sound good? reimplement Unlimited-recursion dumping is what the existing dump routines already do, so it's a little odd to have a flag to allow these new dump routines to do the same thing. I guess you could refactor the old ones to call the new ones and eliminate some redundant code, if you wanted. Dan From anton at korobeynikov.info Fri Jan 15 15:19:35 2010 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Sat, 16 Jan 2010 00:19:35 +0300 Subject: [LLVMdev] LLVM-gcc for ARM In-Reply-To: <328335.13577.qm@web45315.mail.sp1.yahoo.com> References: <328335.13577.qm@web45315.mail.sp1.yahoo.com> Message-ID: Hello > What would be the problem? You're building llvm-gcc w/o LLVM. > Is the configure line OK? No.
Please do read readme.llvm file in the llvm-gcc's source directory. In short: you missed --enable-llvm option -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From gvenn.cfe.dev at gmail.com Fri Jan 15 15:20:05 2010 From: gvenn.cfe.dev at gmail.com (Garrison Venn) Date: Fri, 15 Jan 2010 16:20:05 -0500 Subject: [LLVMdev] mkpatch Message-ID: <68F13E05-85FD-4CA8-A99F-12446061B059@gmail.com> The document: http://llvm.org/docs/DeveloperPolicy.html refers to using utils/mkpatch. However when using mkpatch it complains about directories no longer under version control. Specifically directories lib/Debugger and win32 in mkpatch's source line: svn diff -x -u >> "$NAME".patch.raw 2>&1 \ autoconf docs utils include lib/System lib/Support lib/VMCore lib/AsmParser \ lib/Bitcode lib/Analysis lib/Transforms lib/CodeGen lib/Target \ lib/ExecutionEngine lib/Debugger lib/Linker \ tools test unittests runtime projects examples win32 Xcode are two of the culprits. Is mkpatch no longer used? Removing these entries seems to fix the issue except I do not know what else is now required if mkpatch is out of date and therefore no longer in use. I noticed like most diffs, new directories and new files are not covered. Is there doc on how to submit such additions to the list for review? Garrison PS: I can submit a patch for mkpatch with the above references deleted if desired. From clattner at apple.com Fri Jan 15 15:30:52 2010 From: clattner at apple.com (Chris Lattner) Date: Fri, 15 Jan 2010 13:30:52 -0800 Subject: [LLVMdev] mkpatch In-Reply-To: <68F13E05-85FD-4CA8-A99F-12446061B059@gmail.com> References: <68F13E05-85FD-4CA8-A99F-12446061B059@gmail.com> Message-ID: On Jan 15, 2010, at 1:20 PM, Garrison Venn wrote: > The document: http://llvm.org/docs/DeveloperPolicy.html refers to > using utils/mkpatch. > However when using mkpatch it complains about directories no longer > under version > control. 
Specifically directories lib/Debugger and win32 in > mkpatch's source line: > > svn diff -x -u >> "$NAME".patch.raw 2>&1 \ > autoconf docs utils include lib/System lib/Support lib/VMCore lib/ > AsmParser \ > lib/Bitcode lib/Analysis lib/Transforms lib/CodeGen lib/Target \ > lib/ExecutionEngine lib/Debugger lib/Linker \ > tools test unittests runtime projects examples win32 Xcode > > are two of the culprits. Is mkpatch no longer used? Removing these > entries seems > to fix the issue except I do not know what else is now required if > mkpatch is out of > date and therefore no longer in use. I noticed like most diffs, new > directories and > new files are not covered. Is there doc on how to submit such > additions to the list > for review? > > Garrison > > PS: I can submit a patch for mkpatch with the above references > deleted if desired. I think that removing mkpatch and the references make sense, svn diff is enough. -Chris From dag at cray.com Fri Jan 15 15:38:19 2010 From: dag at cray.com (David Greene) Date: Fri, 15 Jan 2010 15:38:19 -0600 Subject: [LLVMdev] [PATCH] SelectionDAG Debugging In-Reply-To: References: <201001131610.23444.dag@cray.com> <201001151351.49193.dag@cray.com> Message-ID: <201001151538.19812.dag@cray.com> On Friday 15 January 2010 15:09, Dan Gohman wrote: > > Sound good? reimplement > > Unlimited-recursion dumping is what the existing dump routines > already do, so it's a little odd to have a flag to allow these Which existing dump routines are you referring to? > new dump routines to do the same thing. I guess you could > refactor the old ones to call the new ones and eliminate some > redundant code, if you wanted. We probably should. 
-Dave From dllaurence at dslextreme.com Fri Jan 15 15:51:36 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Fri, 15 Jan 2010 13:51:36 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> <4B502A1C.10201@durchholz.org> Message-ID: <4B50E368.4000108@laurences.net> On 01/15/2010 11:37 AM, Talin wrote: > Yes, that's closer to the frontend semantics: the variants of a > union type don't have any natural ordering, so list semantics could > cause problems. I agree. I probably shouldn't even comment, as I know so little about LLVM. But I've hand-written a couple kLOC of IR now and am starting to get a feel for the syntax, so I'll just say what "feels" right based on that and leave it to others to decide if I've absorbed enough to make any kind of sense. Just imagining myself using such a language extension, I really would not want an ordering imposed where no natural one exists. Indices feel very wrong. Isn't a union basically just a convenient alternate interface to the various other conversion operators like bitcast, inttoptr, trunc, zext, and the rest? (In fact that's how I manipulate my expressions, the three-bit tag in the low-order bits tell me how to treat the high-order bits.) The "index" doesn't (generally) represent any kind of offset, but rather an interpretation of the bits, and none of the offset arithmetic implied by getelementptr or physical register choice implied by extractvalue will occur (except perhaps to satisfy alignment constraints, but that would be architecture dependent and I assume should therefore be invisible). Correct? 
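As a rough C analogue of the point just made (an illustration only, resting on the assumption that a C union behaves like the proposed IR union in this respect; the type and function names are hypothetical), every member of a union sits at offset 0, so choosing a member picks an interpretation of the storage rather than a position inside it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C counterpart of the IR type union { i32, float }. */
union IntOrFloat {
    int32_t as_i32;
    float   as_float;
};

/* Wrap an integer in the union (roughly "value to union"). */
union IntOrFloat union_from_i32(int32_t v)
{
    union IntOrFloat u;
    u.as_i32 = v;
    return u;
}

/* Read the integer member back (roughly "union to value"). */
int32_t union_to_i32(union IntOrFloat u)
{
    return u.as_i32;
}

void union_self_check(void)
{
    /* Both members start at offset 0: no per-member address arithmetic. */
    assert(offsetof(union IntOrFloat, as_i32) == 0);
    assert(offsetof(union IntOrFloat, as_float) == 0);
    /* Writing and then reading the same member round-trips the value. */
    assert(union_to_i32(union_from_i32(4)) == 4);
}
```

Nothing here depends on the order in which the members are declared, which is the property the argument against indices relies on.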
If that argument is persuasive, then the following seems a bit more consistent with the existing syntax: ; Manipulation of a union register variable %myUnion = unioncast i32, %myValue to union {i32, float} %fieldValue = unioncast union {i32, float} %myUnion to i32 ; %fieldValue == %myValue This specialized union cast fits the pattern of having specialized cast operations between value and pointer as opposed to two values or two pointers. That's enough, as you could require that unions be loaded and stored as unions and then elements extracted. But if you want to make it a bit less syntactically noisy, and also allow the same flexibility that getelementptr would allow in accessing a single member through a pointer, you could allow ; Load/store of one particular union field store i32 %myValue, union {i32, float}* %myUnionPtr %fieldValue = load union {i32, float}* %myUnionPtr as i32 ; %fieldValue == %myValue Where I've added a preposition 'as' to the load instruction by analogy with what the cast operators do with 'to'. I don't know that I'd argue the point much, but offhand it "feels" consistent with the rest of the syntax to have a specialized 'unioncast' operator analogous with the other specialized conversions, but overload load/store as I illustrated so that pointers to unions are conceptually just funny kinds of pointers to their fields (which they are). So in that vein, if you want a pointer to one of the alternatives in the union you'd just cast one pointer to another; to avoid alignment adjustments on what is supposed to be a no-op that cast probably shouldn't be bitcast. 
So what about %intPtr = unioncast union {i32, float}* %myUnionPtr to i32* %newUnionPtr = unioncast i32* %intPtr to union {i32, float}* ; %newUnionPtr == %myUnionPtr I'm not necessarily advocating overloading one keyword ('unioncast') that way, though I note that it should always be unambiguous based on whether the operands are values or pointers (LLVM seems to have a strong notion of what is and is not a pointer, so this makes some kind of conceptual sense to me). Whether it's OK to create two new keywords is perhaps too fine a detail for me to have a good sense of. What would matter to me is not imposing order on unordered interpretations. Dustin From gohman at apple.com Fri Jan 15 16:23:26 2010 From: gohman at apple.com (Dan Gohman) Date: Fri, 15 Jan 2010 14:23:26 -0800 Subject: [LLVMdev] [PATCH] SelectionDAG Debugging In-Reply-To: <201001151538.19812.dag@cray.com> References: <201001131610.23444.dag@cray.com> <201001151351.49193.dag@cray.com> <201001151538.19812.dag@cray.com> Message-ID: <48E6FEC0-7280-446A-83C4-AA9302B2EEC8@apple.com> On Jan 15, 2010, at 1:38 PM, David Greene wrote: > On Friday 15 January 2010 15:09, Dan Gohman wrote: > >>> Sound good? reimplement >> >> Unlimited-recursion dumping is what the existing dump routines >> already do, so it's a little odd to have a flag to allow these > > Which existing dump routines are you referring to? dumpr(). I guess it wasn't commented. It is now :-). > >> new dump routines to do the same thing. I guess you could >> refactor the old ones to call the new ones and eliminate some >> redundant code, if you wanted. > > We probably should. Ok. 
Dan From dag at cray.com Fri Jan 15 16:30:08 2010 From: dag at cray.com (David Greene) Date: Fri, 15 Jan 2010 16:30:08 -0600 Subject: [LLVMdev] [PATCH] SelectionDAG Debugging In-Reply-To: <48E6FEC0-7280-446A-83C4-AA9302B2EEC8@apple.com> References: <201001131610.23444.dag@cray.com> <201001151538.19812.dag@cray.com> <48E6FEC0-7280-446A-83C4-AA9302B2EEC8@apple.com> Message-ID: <201001151630.09281.dag@cray.com> On Friday 15 January 2010 16:23, Dan Gohman wrote: > > Which existing dump routines are you referring to? > > dumpr(). I guess it wasn't commented. It is now :-). > > >> new dump routines to do the same thing. I guess you could > >> refactor the old ones to call the new ones and eliminate some > >> redundant code, if you wanted. > > > > We probably should. > > Ok. All right, I'll work on that. -Dave From dag at cray.com Fri Jan 15 16:34:07 2010 From: dag at cray.com (David Greene) Date: Fri, 15 Jan 2010 16:34:07 -0600 Subject: [LLVMdev] [PATCH] SelectionDAG Debugging In-Reply-To: <48E6FEC0-7280-446A-83C4-AA9302B2EEC8@apple.com> References: <201001131610.23444.dag@cray.com> <201001151538.19812.dag@cray.com> <48E6FEC0-7280-446A-83C4-AA9302B2EEC8@apple.com> Message-ID: <201001151634.07534.dag@cray.com> On Friday 15 January 2010 16:23, Dan Gohman wrote: > On Jan 15, 2010, at 1:38 PM, David Greene wrote: > > On Friday 15 January 2010 15:09, Dan Gohman wrote: > >>> Sound good? reimplement > >> > >> Unlimited-recursion dumping is what the existing dump routines > >> already do, so it's a little odd to have a flag to allow these > > > > Which existing dump routines are you referring to? > > dumpr(). I guess it wasn't commented. It is now :-). Ah, one thing. dumpr uses DumpNodesr which does the "once" thing. I actually would prefer a full dump. Perhaps we shouldn't try to unify them, or maybe provide a flag to control behavior. In the past separate APIs have been preferred over flags. I can rename the ones I added to be more consistent with the existing stuff. Opinions? 
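To make the distinction between the two dump styles concrete, here is a small C++ sketch on a toy DAG. `Node`, `dumpOnce` and `dumpFull` are hypothetical stand-ins and not the SelectionDAG API; the actual dumpr()/DumpNodesr code is not reproduced here, and the "(see above)" marker is an invented placeholder for whatever the real routine prints.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in for a DAG node (hypothetical; not llvm::SDNode).
struct Node {
    std::string name;
    std::vector<const Node*> operands;
};

// "Once" style: a shared operand is expanded the first time it is reached
// and only referenced by name afterwards.
void dumpOnce(const Node* n, unsigned indent, std::set<const Node*>& seen,
              std::ostream& os) {
    os << std::string(indent, ' ') << n->name;
    if (!seen.insert(n).second) {
        os << " (see above)\n";
        return;
    }
    os << '\n';
    for (const Node* op : n->operands)
        dumpOnce(op, indent + 2, seen, os);
}

// Full style: every operand is expanded at every use, so shared subtrees
// are printed repeatedly.
void dumpFull(const Node* n, unsigned indent, std::ostream& os) {
    os << std::string(indent, ' ') << n->name << '\n';
    for (const Node* op : n->operands)
        dumpFull(op, indent + 2, os);
}
```

The full style is easier to read for a single use chain, but its output can grow very quickly when many nodes share operands, which is presumably why a depth limit or separate entry points come up in this thread.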
-Dave From grosbach at apple.com Fri Jan 15 16:52:34 2010 From: grosbach at apple.com (Jim Grosbach) Date: Fri, 15 Jan 2010 14:52:34 -0800 Subject: [LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz In-Reply-To: References: <1B7B6FBB-0B43-4B08-A1B3-C618CDB88387@gmail.com> <79F7C763-5126-4542-BE40-D1703CFA5CB0@apple.com> Message-ID: On Jan 15, 2010, at 11:37 AM, Richard Osborne wrote: > > On 15 Jan 2010, at 18:03, Chris Lattner wrote: > >> On Jan 14, 2010, at 10:13 PM, David Conrad wrote: >> >>> Other targets that I know of that could potentially benefit from >>> this optimization being global (that have a clz and bitreverse >>> instruction but not ctz) are AVR32 and C64x, neither of which llvm >>> has backends for yet. >> >> When/if another target wants this, we could add a ISD::RBIT >> operation, >> it doesn't need to be added at the llvm ir level, > > The XCore also has ctlz and bitreverse instructions and not cttz. At > the moment in the XCore backend cttz is marked as legal and expanded > to this pair of instructions in a pattern in the InstrInfo.td. In that case, perhaps it makes sense to add it as an ISD::RBIT operation straight away. The rest of the patch looks good to me. -Jim From viridia at gmail.com Fri Jan 15 17:13:36 2010 From: viridia at gmail.com (Talin) Date: Fri, 15 Jan 2010 15:13:36 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <492CB430-77EE-412A-9A68-2681DD49CEE5@apple.com> <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> Message-ID: On Fri, Jan 15, 2010 at 11:02 AM, Dan Gohman wrote: > > On Jan 13, 2010, at 12:11 PM, Talin wrote: > > > > It depends on whether or not unions can be passed around as SSA values or > not. I can think of situations where you would want to. > > I'm skeptical that you *really* want to (i.e. that you wouldn't > be better off just writing helper functions in your front-end > which do the addressing and load/store and then moving on). 
> But, I'm not really interested in getting in the way here. > > Let me give you a use case then: Say I have a function which returns either a floating-point number or an error code (like divide by zero or something). The way that I would represent this return result is: { i1, union { float, i32 } } In other words, what we have is a small struct that contains a one-bit discriminator field, followed by a union of float and i32. The discriminator field tells us what type is stored in the union - 0 = float, 1 = i32, so this is a typical 'tagged' union. (We can also have untagged or "C-style" unions, as long as the programmer has some other means of knowing what type is stored in the union.) Using a union here (as opposed to using bitcast) solves a number of problems: 1) The size of the struct is automatically calculated by taking the largest field of the union. Without unions, your frontend would have to calculate the size of each possible field, as well as their alignment, and use that to figure the maximum structure size. If your front-end is target-agnostic, you may not even know how to calculate the correct struct size. 2) The struct is small enough to be returned as a first-class SSA value, and with a union you can use it directly. Since bitcast only works on pointers, in order to use it you would have to alloca some temporary memory to hold the function result, store the result into it, then use a combination of GEP and bitcast to get a correctly-typed pointer to the second field, and finally load the value. With a union, you can simply extract the second field without ever having to muck about with pointers and allocas. 3) The union provides an additional layer of type safety, since you can only extract types which are declared in the union, and not any arbitrary type that you could get with a bitcast. (Although I consider this a relatively minor point since type safety isn't a major concern in IR.) 
4) It's possible that some future version of the optimizer could use the additional type information provided by the union which the bitcast does not. Perhaps an optimizer which knows that all of the union members are numbers and not pointers could make some additional assumptions... -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100115/d12df0d7/attachment-0001.html From viridia at gmail.com Fri Jan 15 17:19:40 2010 From: viridia at gmail.com (Talin) Date: Fri, 15 Jan 2010 15:19:40 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> Message-ID: On Fri, Jan 15, 2010 at 3:13 PM, Talin wrote: > On Fri, Jan 15, 2010 at 11:02 AM, Dan Gohman wrote: > >> >> On Jan 13, 2010, at 12:11 PM, Talin wrote: >> > >> > It depends on whether or not unions can be passed around as SSA values >> or not. I can think of situations where you would want to. >> >> I'm skeptical that you *really* want to (i.e. that you wouldn't >> be better off just writing helper functions in your front-end >> which do the addressing and load/store and then moving on). >> But, I'm not really interested in getting in the way here. >> >> Let me give you a use case then: > > Say I have a function which returns either a floating-point number or an > error code (like divide by zero or something). The way that I would > represent this return result is: > > { i1, union { float, i32 } } > > In other words, what we have is a small struct that contains a one-bit > discriminator field, followed by a union of float and i32. The discriminator > field tells us what type is stored in the union - 0 = float, 1 = i32, so > this is a typical 'tagged' union. (We can also have untagged or "C-style" > unions, as long as the programmer has some other means of knowing what type > is stored in the union.) 
> > Using a union here (as opposed to using bitcast) solves a number of > problems: > > 1) The size of the struct is automatically calculated by taking the largest > field of the union. Without unions, your frontend would have to calculate > the size of each possible field, as well as their alignment, and use that to > figure the maximum structure size. If your front-end is target-agnostic, you > may not even know how to calculate the correct struct size. > > 2) The struct is small enough to be returned as a first-class SSA value, > and with a union you can use it directly. Since bitcast only works on > pointers, in order to use it you would have to alloca some temporary memory > to hold the function result, store the result into it, then use a > combination of GEP and bitcast to get a correctly-typed pointer to the > second field, and finally load the value. With a union, you can simply > extract the second field without ever having to muck about with pointers and > allocas. > > 3) The union provides an additional layer of type safety, since you can > only extract types which are declared in the union, and not any arbitrary > type that you could get with a bitcast. (Although I consider this a > relatively minor point since type safety isn't a major concern in IR.) > > 4) It's possible that some future version of the optimizer could use the > additional type information provided by the union which the bitcast does > not. Perhaps an optimizer which knows that all of the union members are > numbers and not pointers could make some additional assumptions... > > 5) Something I forgot to mention - by allowing GEP and extractvalue to work with unions, we can handle unions nested inside structs and vice versa with a single GEP instruction. This is my main argument against having special instructions for dealing with unions. 
For example, in the case of { i1, union { float, i32 } }* we can use a GEP with indices [0, 1, 0] to get access to the float field in a single GEP instruction. So just as GEP allows chaining together operations on structs, pointers and arrays, we can also chain them together with operations on unions. This can be quite powerful I think. -- -- Talin From corina_fff at yahoo.com Fri Jan 15 17:54:45 2010 From: corina_fff at yahoo.com (corina s) Date: Fri, 15 Jan 2010 15:54:45 -0800 (PST) Subject: [LLVMdev] LLVM-gcc for ARM Message-ID: <280368.92320.qm@web45303.mail.sp1.yahoo.com> OK. I am now getting this error: .../llvm-gcc4.2-2.6.source/configure --prefix=`pwd`/../install --program-prefix=llvm- --enable-llvm=/home/LLVM/llvm-2.6 --enable-languages=c,c++ --target=arm-eabi exec: 2: -meabi=4: not found make[4]: *** [crtbegin.o] Error 1 Thanks for your help, Corina --- On Fri, 1/15/10, Anton Korobeynikov wrote: From: Anton Korobeynikov Subject: Re: [LLVMdev] LLVM-gcc for ARM To: "corina s" Cc: llvmdev at cs.uiuc.edu Date: Friday, January 15, 2010, 1:19 PM Hello > What would be the problem? You're building llvm-gcc w/o LLVM. > It is OK the configure line? No. Please do read readme.llvm file in the llvm-gcc's source directory. In short: you missed --enable-llvm option -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University
From anton at korobeynikov.info Fri Jan 15 18:03:41 2010 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Sat, 16 Jan 2010 03:03:41 +0300 Subject: [LLVMdev] LLVM-gcc for ARM In-Reply-To: <879108.7084.qm@web45309.mail.sp1.yahoo.com> References: <879108.7084.qm@web45309.mail.sp1.yahoo.com> Message-ID: Hello > exec: 2: -meabi=4: not found > make[4]: *** [crtbegin.o] Error 1 It seems you don't have cross-binutils for arm-eabi installed. Note that ARM binutils are known to be buggy - you should use the fresh CVS snapshot. PS: Please use "Reply All" button - this way the copy will be sent to llvm-dev ML and others will be able to comment / use the information as well. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From gohman at apple.com Fri Jan 15 18:14:44 2010 From: gohman at apple.com (Dan Gohman) Date: Fri, 15 Jan 2010 16:14:44 -0800 Subject: [LLVMdev] [PATCH] SelectionDAG Debugging In-Reply-To: <201001151634.07534.dag@cray.com> References: <201001131610.23444.dag@cray.com> <201001151538.19812.dag@cray.com> <48E6FEC0-7280-446A-83C4-AA9302B2EEC8@apple.com> <201001151634.07534.dag@cray.com> Message-ID: <7E0488E5-339E-42BD-9B00-55881D8E3EA2@apple.com> On Jan 15, 2010, at 2:34 PM, David Greene wrote: > On Friday 15 January 2010 16:23, Dan Gohman wrote: >> On Jan 15, 2010, at 1:38 PM, David Greene wrote: >>> On Friday 15 January 2010 15:09, Dan Gohman wrote: >>>>> Sound good? reimplement >>>> >>>> Unlimited-recursion dumping is what the existing dump routines >>>> already do, so it's a little odd to have a flag to allow these >>> >>> Which existing dump routines are you referring to? >> >> dumpr(). I guess it wasn't commented. It is now :-). > > Ah, one thing. dumpr uses DumpNodesr which does the "once" thing. > I actually would prefer a full dump.
Perhaps we shouldn't try to > unify them, or maybe provide a flag to control behavior. In the > past separate APIs have been preferred over flags. I can rename > the ones I added to be more consistent with the existing stuff. > > Opinions? I use the GraphViz viewer almost exclusively, so I don't have a strong opinion. Methods with lots of flags are inconvenient to call from a debugger. I'd suggest coming up with a few common use cases, and providing interfaces to cover those use cases, and not trying to provide lots of extra generality. If SDNode::dumpr() had built-in cycle detection, and indicated cycles with big capital letters, would you still want a recursive dump which doesn't do the "once" thing? Or, if the "once" thing had a more human-oriented syntax, would it be usable? Dan From viridia at gmail.com Fri Jan 15 18:36:01 2010 From: viridia at gmail.com (Talin) Date: Fri, 15 Jan 2010 16:36:01 -0800 Subject: [LLVMdev] [patch] Union Types - work in progress Message-ID: Here's a work in progress of the union patch. Note that the test "union.ll" does not work, so you probably don't want to check this in as is. However, I'd be interested in any feedback you're willing to give. -- -- Talin From gvenn.cfe.dev at gmail.com Fri Jan 15 19:46:43 2010 From: gvenn.cfe.dev at gmail.com (Garrison Venn) Date: Fri, 15 Jan 2010 20:46:43 -0500 Subject: [LLVMdev] ExceptionDemo patch Message-ID: <314B100E-74E3-4436-B4DF-A6A71C40BFD5@gmail.com> Attached is a patch which will add an exception handling example to the examples directory. This patch is a version of what I added to the wiki which in addition meets the LLVM build and coding standards requirements. The patch was tested for a debug build on CentOS LINUX, and both a debug and release build on OS X 10.6.2.
Because of an #include , I do not know if the patch results are portable beyond OS X 10.6.x, and LINUX.. Both zero cost domain specific, and foreign exception handling is demoed. Even though an attempt was made to make the code self documenting (doxygen doc. was not tested), more documentation can be found at: http://wiki.llvm.org/HowTo:_Build_JIT_based_Exception_mechanism. The wiki version's code though, does not meet the LLVM coding standards, and is therefore not the same source as what results from applying the attached patch. The example's documentation fully explains how to run the example. Comments would be very useful. ;-) Garrison -------------- next part -------------- A non-text attachment was scrubbed... Name: ExceptionDemo.patch Type: application/octet-stream Size: 86667 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100115/bacc6329/attachment-0001.obj From gvenn.cfe.dev at gmail.com Fri Jan 15 20:12:33 2010 From: gvenn.cfe.dev at gmail.com (Garrison Venn) Date: Fri, 15 Jan 2010 21:12:33 -0500 Subject: [LLVMdev] mkpatch patch Message-ID: <42219856-8E7C-4F92-8EED-470857E8F705@gmail.com> I've included a patch which does not remove mkpatch but does remove diff search directories which caused a failure because those directories were no longer in svn. I was uncomfortable removing mkpatch since I believe it helps document creating patches for beginners who do not use separate source and build (object) root directories. Its existence is also expected by readers of: http://llvm.org/docs/DeveloperPolicy.html#patches. Garrison -------------- next part -------------- A non-text attachment was scrubbed... 
Name: mkpatch.patch Type: application/octet-stream Size: 701 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100115/b1990eea/attachment.obj From guh at boisestate.edu Fri Jan 15 20:21:02 2010 From: guh at boisestate.edu (Gang-Ryung Uh) Date: Fri, 15 Jan 2010 19:21:02 -0700 Subject: [LLVMdev] llvm opt phase ordering Message-ID: I wonder whether this question is appropriate to this forum or not; if not, please educate me. For the following command line arguments, what happens to the optimization phases when the licm phase moves out loop invariant instructions to loop preheaders? opt -simplifycfg -instcombine -inline -globaldce -instcombine -simplifycfg -scalarrepl -mem2reg -verify -sccp -adce -licm -instcombine -dce -simplifycfg -deadargelim -globaldce -deadtypeelim < test.bc > test-opt.bc -- UGR From rajika at wso2.com Fri Jan 15 20:49:14 2010 From: rajika at wso2.com (Rajika Kumarasiri) Date: Sat, 16 Jan 2010 08:19:14 +0530 Subject: [LLVMdev] Build failure in llvm trunk Message-ID: <2cbbf7ce1001151849m5246dfd8m5accb94048b9b93f@mail.gmail.com> hi, I am trying to build the llvm trunk with make. I am trying a debug build and I end up with the following build error: /home/rajika/projects/llvm/llvm/lib/Target/TargetLoweringObjectFile.cpp: In member function ‘virtual bool llvm::TargetLoweringObjectFileMachO::shouldEmitUsedDirectiveFor(const llvm::GlobalValue*, llvm::Mangler*) const’: /home/rajika/projects/llvm/llvm/lib/Target/TargetLoweringObjectFile.cpp:959: error: ‘NameTmp’
was not declared in this scope make[2]: *** [/home/rajika/projects/llvm/llvm-objects/lib/Target/Debug/TargetLoweringObjectFile.o] Error 1 make[2]: Leaving directory `/home/rajika/projects/llvm/llvm-objects/lib/Target' make[1]: *** [Target/.makeall] Error 2 make[1]: Leaving directory `/home/rajika/projects/llvm/llvm-objects/lib' make: *** [all] Error 1 Rajika From clattner at apple.com Fri Jan 15 21:38:43 2010 From: clattner at apple.com (Chris Lattner) Date: Fri, 15 Jan 2010 19:38:43 -0800 Subject: [LLVMdev] Build failure in llvm trunk In-Reply-To: <2cbbf7ce1001151849m5246dfd8m5accb94048b9b93f@mail.gmail.com> References: <2cbbf7ce1001151849m5246dfd8m5accb94048b9b93f@mail.gmail.com> Message-ID: Doh, that's my fault. Fixed in r93628 On Jan 15, 2010, at 6:49 PM, Rajika Kumarasiri wrote: > hi, > I am trying to build the llvm trunk with make. I am trying a debug build and I am ending with the following build error. > > /home/rajika/projects/llvm/llvm/lib/Target/TargetLoweringObjectFile.cpp: In member function ‘virtual bool llvm::TargetLoweringObjectFileMachO::shouldEmitUsedDirectiveFor(const llvm::GlobalValue*, llvm::Mangler*) const’: > /home/rajika/projects/llvm/llvm/lib/Target/TargetLoweringObjectFile.cpp:959: error: ‘NameTmp’
was not declared in this scope > make[2]: *** [/home/rajika/projects/llvm/llvm-objects/lib/Target/Debug/TargetLoweringObjectFile.o] Error 1 > make[2]: Leaving directory `/home/rajika/projects/llvm/llvm-objects/lib/Target' > make[1]: *** [Target/.makeall] Error 2 > make[1]: Leaving directory `/home/rajika/projects/llvm/llvm-objects/lib' > make: *** [all] Error 1 > > Rajika > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From nicholas at mxc.ca Fri Jan 15 23:44:01 2010 From: nicholas at mxc.ca (Nick Lewycky) Date: Fri, 15 Jan 2010 21:44:01 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: <4B50E368.4000108@laurences.net> References: <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> <4B502A1C.10201@durchholz.org> <4B50E368.4000108@laurences.net> Message-ID: <4B515221.4060206@mxc.ca> Dustin Laurence wrote: > On 01/15/2010 11:37 AM, Talin wrote: > >> Yes, that's closer to the frontend semantics: the variants of a >> union type don't have any natural ordering, so list semantics could >> cause problems. > > I agree. I probably shouldn't even comment, as I know so little about > LLVM. But I've hand-written a couple kLOC of IR now and am starting to > get a feel for the syntax, so I'll just say what "feels" right based on > that and leave it to others to decide if I've absorbed enough to make > any kind of sense. > > Just imagining myself using such a language extension, I really would > not want an ordering imposed where no natural one exists. Indices feel > very wrong. Isn't a union basically just a convenient alternate > interface to the various other conversion operators like bitcast, > inttoptr, trunc, zext, and the rest? Almost, but you're forgetting one important attribute: you can 'alloca' a union type and get something the size of the largest entry. 
This way, you can allocate a union of {i32, i8} and i8* without knowing in your frontend whether your system has 32 or 64-bit pointers. This is important to people who want to write fully platform neutral code in LLVM. Nick > (In fact that's how I manipulate > my expressions, the three-bit tag in the low-order bits tell me how to > treat the high-order bits.) The "index" doesn't (generally) represent > any kind of offset, but rather an interpretation of the bits, and none > of the offset arithmetic implied by getelementptr or physical register > choice implied by extractvalue will occur (except perhaps to satisfy > alignment constraints, but that would be architecture dependent and I > assume should therefore be invisible). Correct? > > If that argument is persuasive, then the following seems a bit more > consistent with the existing syntax: > > ; Manipulation of a union register variable > %myUnion = unioncast i32, %myValue to union {i32, float} > %fieldValue = unioncast union {i32, float} %myUnion to i32 > ; %fieldValue == %myValue > > This specialized union cast fits the pattern of having specialized cast > operations between value and pointer as opposed to two values or two > pointers. > > That's enough, as you could require that unions be loaded and stored as > unions and then elements extracted. But if you want to make it a bit > less syntactically noisy, and also allow the same flexibility that > getelementptr would allow in accessing a single member through a > pointer, you could allow > > ; Load/store of one particular union field > store i32 %myValue, union {i32, float}* %myUnionPtr > %fieldValue = load union {i32, float}* %myUnionPtr as i32 > ; %fieldValue == %myValue > > Where I've added a preposition 'as' to the load instruction by analogy > with what the cast operators do with 'to'. 
> > I don't know that I'd argue the point much, but offhand it "feels" > consistent with the rest of the syntax to have a specialized 'unioncast' > operator analogous with the other specialized conversions, but overload > load/store as I illustrated so that pointers to unions are conceptually > just funny kinds of pointers to their fields (which they are). So in > that vein, if you want a pointer to one of the alternatives in the union > you'd just cast one pointer to another; to avoid alignment adjustments > on what is supposed to be a no-op that cast probably shouldn't be > bitcast. So what about > > %intPtr = unioncast union {i32, float}* %myUnionPtr to i32* > %newUnionPtr = unioncast i32* %intPtr to union {i32, float}* > ; %newUnionPtr == %myUnionPtr > > I'm not necessarily advocating overloading one keyword ('unioncast') > that way, though I note that it should always be unambiguous based on > whether the operands are values or pointers (LLVM seems to have a strong > notion of what is and is not a pointer, so this makes some kind of > conceptual sense to me). Whether it's OK to create two new keywords is > perhaps too fine a detail for me to have a good sense of. What would > matter to me is not imposing order on unordered interpretations. 
> > Dustin > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From dllaurence at dslextreme.com Sat Jan 16 02:21:06 2010 From: dllaurence at dslextreme.com (Dustin Laurence) Date: Sat, 16 Jan 2010 00:21:06 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: <4B515221.4060206@mxc.ca> References: <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> <4B502A1C.10201@durchholz.org> <4B50E368.4000108@laurences.net> <4B515221.4060206@mxc.ca> Message-ID: <4B5176F2.9080804@laurences.net> On 01/15/2010 09:44 PM, Nick Lewycky wrote: > Dustin Laurence wrote: >> ...Isn't a union basically just a convenient alternate >> interface to the various other conversion operators like bitcast, >> inttoptr, trunc, zext, and the rest? > > Almost, but you're forgetting one important attribute: you can 'alloca' > a union type and get something the size of the largest entry. This way, > you can allocate a union of {i32, i8} and i8* without knowing in your > frontend whether your system has 32 or 64-bit pointers. This is > important to people who want to write fully platform neutral code in LLVM. OK, but how does ordering an unordered "bag of alternatives" help that? I wasn't trying to imply that union wasn't useful because you could just use the other conversions, though I see I worded it so it sounds that way. I just meant that the instructions for conversions seemed like a better model for manipulating unions than structures. I suspect it is C syntax that makes us think of structs and unions together, and I was trying to defeat my own tendency to do that by using a different model. The last time I felt like checking my little lisp code with it's numbers stuffed into pairs of pointer-sized words would build on either 32 or 64-bit x86. But I commited some sins to more or less eliminate word size dependence, at least sins against taste. 
Reducing the temptation to such sin seems worthwhile. :-) Dustin From corina_fff at yahoo.com Sat Jan 16 12:42:07 2010 From: corina_fff at yahoo.com (corina s) Date: Sat, 16 Jan 2010 10:42:07 -0800 (PST) Subject: [LLVMdev] LLVM-gcc for ARM In-Reply-To: Message-ID: <892210.5766.qm@web45315.mail.sp1.yahoo.com> From where can I take them? And how does that change the build procedure? Thank you, Corina --- On Fri, 1/15/10, Anton Korobeynikov wrote: From: Anton Korobeynikov Subject: Re: [LLVMdev] LLVM-gcc for ARM To: "corina s" Cc: "LLVM Developers Mailing List" Date: Friday, January 15, 2010, 4:03 PM Hello > exec: 2: -meabi=4: not found > make[4]: *** [crtbegin.o] Error 1 It seems you don't have cross-binutils for arm-eabi installed. Note that ARM binutils are known to be buggy - you should use the fresh CVS snapshot. PS: Please use "Reply All" button - this way the copy will be sent to llvm-dev ML and others will be able to comment / use the information as well. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From viridia at gmail.com Sat Jan 16 13:15:48 2010 From: viridia at gmail.com (Talin) Date: Sat, 16 Jan 2010 11:15:48 -0800 Subject: [LLVMdev] [patch] Union Types - work in progress In-Reply-To: References: Message-ID: OK here's the patch for real this time :) On Fri, Jan 15, 2010 at 4:36 PM, Talin wrote: > Here's a work in progress of the union patch. Note that the test "union.ll" > does not work, so you probably don't want to check this in as is. However, > I'd be interested in any feedback you're willing to give. > > -- > -- Talin > -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100116/48a7fc4f/attachment.html -------------- next part -------------- A non-text attachment was scrubbed... Name: union.patch Type: application/octet-stream Size: 44077 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100116/48a7fc4f/attachment.obj From corina_fff at yahoo.com Sat Jan 16 15:23:57 2010 From: corina_fff at yahoo.com (corina s) Date: Sat, 16 Jan 2010 13:23:57 -0800 (PST) Subject: [LLVMdev] LLVM-gcc for ARM In-Reply-To: <892210.5766.qm@web45315.mail.sp1.yahoo.com> Message-ID: <767388.19631.qm@web45313.mail.sp1.yahoo.com> OK, I put the binaries from this package, arm-2005q3-1-arm-none-linux-gnueabi-i686-pc-linux-gnu, in my classpath. Furthermore, in the configure options I specified --with-gnu-ld and --with-gnu-as, but I am getting the same error: exec: 2: -meabi=4: not found Any ideas? --- On Sat, 1/16/10, corina s wrote: From: corina s Subject: Re: [LLVMdev] LLVM-gcc for ARM To: "Anton Korobeynikov" , llvmdev at cs.uiuc.edu Date: Saturday, January 16, 2010, 10:42 AM From where can I take them? And how does that change the build procedure? Thank you, Corina --- On Fri, 1/15/10, Anton Korobeynikov wrote: From: Anton Korobeynikov Subject: Re: [LLVMdev] LLVM-gcc for ARM To: "corina s" Cc: "LLVM Developers Mailing List" Date: Friday, January 15, 2010, 4:03 PM Hello > exec: 2: -meabi=4: not found > make[4]: *** [crtbegin.o] Error 1 It seems you don't have cross-binutils for arm-eabi installed. Note that ARM binutils are known to be buggy - you should use the fresh CVS snapshot. PS: Please use "Reply All" button - this way the copy will be sent to llvm-dev ML and others will be able to comment / use the information as well.
-- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University -----Inline Attachment Follows----- _______________________________________________ LLVM Developers mailing list LLVMdev at cs.uiuc.edu? ? ? ???http://llvm.cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100116/f246154c/attachment.html From anton at korobeynikov.info Sat Jan 16 15:29:16 2010 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Sun, 17 Jan 2010 00:29:16 +0300 Subject: [LLVMdev] LLVM-gcc for ARM In-Reply-To: <767388.19631.qm@web45313.mail.sp1.yahoo.com> References: <892210.5766.qm@web45315.mail.sp1.yahoo.com> <767388.19631.qm@web45313.mail.sp1.yahoo.com> Message-ID: > exec: 2: -meabi=4: not found > > Any ideas? Yes. As it can be deduced from the name of the package, it's for arm-none-linux-gnueabi, not for arm-eabi. Also, note that the mentioned package is too old for anything useful. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From corina_fff at yahoo.com Sat Jan 16 16:20:24 2010 From: corina_fff at yahoo.com (corina s) Date: Sat, 16 Jan 2010 14:20:24 -0800 (PST) Subject: [LLVMdev] LLVM-gcc for ARM In-Reply-To: Message-ID: <540939.57343.qm@web45312.mail.sp1.yahoo.com> Can you be more specific? Where can I found it(arm-eabi)? please provide me with some basic steps. Thank you. --- On Sat, 1/16/10, Anton Korobeynikov wrote: From: Anton Korobeynikov Subject: Re: [LLVMdev] LLVM-gcc for ARM To: "corina s" Cc: llvmdev at cs.uiuc.edu Date: Saturday, January 16, 2010, 1:29 PM > exec: 2: -meabi=4: not found > > Any ideas? Yes. As it can be deduced from the name of the package, it's for arm-none-linux-gnueabi, not for arm-eabi. Also, note that the mentioned package is too old for anything useful. 
-- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100116/da6a9882/attachment.html From anton at korobeynikov.info Sat Jan 16 16:40:54 2010 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Sun, 17 Jan 2010 01:40:54 +0300 Subject: [LLVMdev] LLVM-gcc for ARM In-Reply-To: <540939.57343.qm@web45312.mail.sp1.yahoo.com> References: <540939.57343.qm@web45312.mail.sp1.yahoo.com> Message-ID: Hello > Can you be more specific? > Where can I found it(arm-eabi)? please provide me with some basic steps. Hrm, I assumed one going to build a compiler for bare metal target would know what to do... Have you looked here: http://www.codesourcery.com/sgpp/lite/arm ? Or maybe you linux distribution vendor provides some pre-built package? I usually build it from sources: http://sourceware.org/binutils/ Also, for bare metal target you will need some C library e.g. newlib. There is some script inside utils/crosstools to automate building of cross-compilers, but I never run it by myself. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From corina_fff at yahoo.com Sat Jan 16 19:44:15 2010 From: corina_fff at yahoo.com (corina s) Date: Sat, 16 Jan 2010 17:44:15 -0800 (PST) Subject: [LLVMdev] LLVM-gcc for ARM Message-ID: <684751.92381.qm@web45309.mail.sp1.yahoo.com> Hello, At this moment I have built from scratch a gcc compiler for ARM and I have in the classpath the binaries. arm-elf-gcc -v Using built-in specs. 
Target: arm-elf
Configured with: ../gcc-4.3.3/configure --target=arm-elf --prefix=/tmp/arm-cortex-toolchain --enable-interwork --enable-multilib --enable-languages=c,c++ --with-newlib --disable-shared --with-gnu-as --with-gnu-ld
Thread model: single
gcc version 4.3.3 (GCC)

arm-elf-as -version
GNU assembler (GNU Binutils) 2.19.1
Copyright 2007 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `arm-elf'.

Then

.../llvm-gcc4.2-2.6.source/configure --prefix=`pwd`/../install --program-prefix=llvm- --enable-llvm=/home/LLVM/llvm-2.6/ --enable-languages=c,c++ --with-gnu-ld=/home/arm/bin/arm-elf-ld --with-gnu-as=/home/arm/bin/arm-elf-as --target=arm-elf

Then the following errors appeared:

/tmp/ccm99Neh.s: Assembler messages:
/tmp/ccm99Neh.s:96: Error: selected processor does not support `sxtb r5,r5'
/tmp/ccm99Neh.s:537: Error: selected processor does not support `sxtb r6,r6'
/tmp/ccm99Neh.s:705: Error: selected processor does not support `sxtb r1,r1'
/tmp/ccm99Neh.s:711: Error: selected processor does not support `sxtb r1,r1'
make[3]: *** [libgcc/thumb/unwind-dw2-fde.o] Error 1

Any ideas?

Thank you,
Corina

From anton at korobeynikov.info Sat Jan 16 20:09:00 2010
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Sun, 17 Jan 2010 05:09:00 +0300
Subject: [LLVMdev] LLVM-gcc for ARM
In-Reply-To: <684751.92381.qm@web45309.mail.sp1.yahoo.com>
References: <684751.92381.qm@web45309.mail.sp1.yahoo.com>
Message-ID:

Hello

> /tmp/ccm99Neh.s:711: Error: selected processor does not support `sxtb r1,r1'
> make[3]: *** [libgcc/thumb/unwind-dw2-fde.o] Error 1
>
> Any ideas?

Yes.
LLVM defaults to ARMv5 in code generation and does not support ARMv4. Without any extra option arm-elf-as assumes ARMv4 and thus gives you these errors. So:

1. If your desired target platform is ARMv4 and not newer - then you're out of luck
2. Otherwise - add --with-cpu or --with-arch to llvm-gcc configure to select the processor / arch desired.

It seems that you're interested in Cortex CPUs ("/tmp/arm-cortex-toolchain"), so your desired arch is armv7; configure with --with-arch=armv7 or e.g. --with-cpu=cortex-a8

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From corina_fff at yahoo.com Sun Jan 17 03:00:20 2010
From: corina_fff at yahoo.com (corina s)
Date: Sun, 17 Jan 2010 01:00:20 -0800 (PST)
Subject: [LLVMdev] LLVM-gcc for ARM
Message-ID: <372640.44405.qm@web45305.mail.sp1.yahoo.com>

Thanks for your tips. But I'm still getting errors.

.../llvm-gcc4.2-2.6.source/configure --prefix=`pwd`/../install --program-prefix=llvm- --enable-llvm=/home/LLVM/llvm-2.6/ --enable-languages=c,c++ --with-gnu-ld=/home/arm/bin/arm-elf-ld --with-gnu-as=/home/arm/bin/arm-elf-as --with-cpu=cortex-a8 --target=arm-elf

Errors:

>>
checking for g++ that supports -ffunction-sections -fdata-sections... yes
configure: error: No support for this host/target combination.
make[1]: *** [configure-target-libstdc++-v3] Error 1
>>

Thank you for your help,
Corina.

From corina_fff at yahoo.com Sun Jan 17 03:36:27 2010
From: corina_fff at yahoo.com (corina s)
Date: Sun, 17 Jan 2010 01:36:27 -0800 (PST)
Subject: [LLVMdev] LLVM-gcc for ARM
In-Reply-To: <372640.44405.qm@web45305.mail.sp1.yahoo.com>
Message-ID: <651225.94442.qm@web45308.mail.sp1.yahoo.com>

Moreover,

../llvm-gcc4.2-2.6.source/configure --prefix=`pwd`/../install --program-prefix=llvm- --enable-llvm=/home/LLVM/llvm-2.6/ --enable-languages=c,c++ --with-gnu-ld=/home/arm/bin/arm-elf-ld --with-gnu-as=/home/arm/bin/arm-elf-as --with-arch=armv7 --target=arm-elf

and then

make target=arm-elf

gives the following error:

Unknown arch used in --with-arch=armv7

Thanks.

From mark.i.r.muir at gmail.com Sun Jan 17 04:56:14 2010
From: mark.i.r.muir at gmail.com (Mark Muir)
Date: Sun, 17 Jan 2010 10:56:14 +0000
Subject: [LLVMdev] Frame index arithmetic
Message-ID: <7AD1EE04-9CB7-4289-BF42-0DC0BA8A3E1E@gmail.com>

I've developed a working back-end for a custom architecture, based on LLVM 2.6. I'm now trying to cover more of the unique features of this architecture.
To make use of one such feature, I'm trying something cunning/crazy with the stack - implementing it in a type of memory that can only be addressed via immediates.

I've got this mostly working. However, I came across a problem which I've been unable to work around: lowering the IR (even without any optimisations enabled) often requires the pattern:

    i32 = FrameIndex

For normal memory, I was using the following instruction to match this pattern:

    // Get the address in memory corresponding to the given frame index, saving the address
    // in a register.
    def MOV_FI : PseudoInstr<(outs GPR:$dst), (ins frameIndex:$addr),
                             "// $dst := frame index $addr",
                             [(set GPR:$dst, frameIndex:$addr)]>;

Which is later replaced by a MOV (output register = stack pointer + constant offset) in eliminateFrameIndex().

However, it isn't appropriate to do this with the proposed stack memory - it doesn't make sense to move the address into a register (where arithmetic can be performed on it), as it isn't possible to move that back to the domain of an immediate. So I conditionally disabled this instruction. But that leads to most programs failing to select the above pattern.

The issue is that this pattern is required even in code that doesn't conceptually seem to need it (see the example below). I couldn't figure out how to avoid this during DAG legalisation. Most often, the resulting machine assembly when the above pattern is enabled simply stores a particular stack slot in a register, for later use in the same basic block, e.g.:

    MOV out=r4 in=SP+4
    LOAD out=r4 addr=r4

despite patterns existing for LOAD with a constant offset (which is successfully used by other stack slots in the same basic block), e.g.:

    LOAD out=r3 addr=SP off=8

Am I missing some other patterns that would avoid this? For example, is it possible to write patterns that allow for arithmetic involving only immediates, with the result being another immediate?
If all else fails, I was thinking of writing a custom pass to identify and remove these. But that could be a lot of work.

Thanks,

- Mark

Example:

    int result;

    int foo(int cond, int a, int b)
    {
        return cond? a : b;
    }

    int main()
    {
        return result = foo(1, 2, 3);  // Expected: result = 2.
    }

Resulting IR:

    @result = common global i32 0, align 4 ; [#uses=2]

    define i32 @foo(i32 %cond, i32 %a, i32 %b) nounwind {
    entry:
      %retval = alloca i32 ; [#uses=2]
      %cond.addr = alloca i32 ; [#uses=2]
      %a.addr = alloca i32 ; [#uses=2]
      %b.addr = alloca i32 ; [#uses=2]
      store i32 %cond, i32* %cond.addr
      store i32 %a, i32* %a.addr
      store i32 %b, i32* %b.addr
      %tmp = load i32* %cond.addr ; [#uses=1]
      %tobool = icmp ne i32 %tmp, 0 ; [#uses=1]
      %tmp1 = load i32* %a.addr ; [#uses=1]
      %tmp2 = load i32* %b.addr ; [#uses=1]
      %cond3 = select i1 %tobool, i32 %tmp1, i32 %tmp2 ; [#uses=1]
      store i32 %cond3, i32* %retval
      %0 = load i32* %retval ; [#uses=1]
      ret i32 %0
    }

    define i32 @main() nounwind {
    entry:
      %retval = alloca i32 ; [#uses=3]
      store i32 0, i32* %retval
      %call = call i32 @foo(i32 1, i32 2, i32 3) ; [#uses=1]
      store i32 %call, i32* @result
      %tmp = load i32* @result ; [#uses=1]
      store i32 %tmp, i32* %retval
      %0 = load i32* %retval ; [#uses=1]
      ret i32 %0
    }

(Note: for simplicity, the calling convention in use here places all arguments on the stack)

From corina_fff at yahoo.com Sun Jan 17 06:26:14 2010
From: corina_fff at yahoo.com (corina s)
Date: Sun, 17 Jan 2010 04:26:14 -0800 (PST)
Subject: [LLVMdev] LLVM-gcc for ARM
In-Reply-To: <651225.94442.qm@web45308.mail.sp1.yahoo.com>
Message-ID: <18097.80603.qm@web45314.mail.sp1.yahoo.com>

I recompiled the LLVM-gcc with these options:

.../llvm-gcc4.2-2.6.source/configure --prefix=`pwd`/../install --program-prefix=llvm- --enable-llvm=/home/LLVM/llvm-2.6/ --enable-languages=c --disable-libssp
--with-gnu-ld --with-gnu-as --with-arch=armv7-a --target=arm-elf

It compiles successfully, and so does make install. But when compiling a file:

llvm-gcc a.c
Assembler messages:
Fatal error: Invalid -march= option: `armv7-a'

Any ideas?

Thank you,
Corina

From mark.i.r.muir at gmail.com Sun Jan 17 07:00:06 2010
From: mark.i.r.muir at gmail.com (Mark Muir)
Date: Sun, 17 Jan 2010 13:00:06 +0000
Subject: [LLVMdev] Register Spilling and SSA
Message-ID: <37FC589A-3150-4805-BB65-1EEA9110994A@gmail.com>

> Much experience has taught me not to trust register allocation papers.
> They never actually talk about performance. If I were reviewer, I
> might accept a paper based on novelty of the algorithm (far too
> many papers are rejected simply because they can't show a 20% speedup)
> but I wouldn't give points for reducing the number of spills and
> reloads.
> Those counts simply don't mean anything in the real world.

Sorry to barge in on this thread, but that last sentence there caught me by surprise...
This certainly is the case with desktop CPUs, where the hardware designers have gone to a lot of bother adding hardware to perform dynamic rescheduling and register renaming, which effectively replace these stack accesses with registers or access to fast cache.

But, with upcoming architectures - particularly ones with a very large number of cores (e.g. something along the lines of Larrabee, or Ambric, and a plethora of others) - such hardware is too costly. As a result, needless stack activity consumes available memory bandwidth, which absolutely hammers instruction-level parallelism.

I certainly agree that a fast default register allocator is the best strategy for LLVM, considering its main use. But it would be very nice to have an optional allocator that does minimise spilling, at the cost of run time.

I thought it best to raise awareness of where common assumptions (which certainly came around for good reason) can break down in real-world situations, so that progress can be made.

Regards,

- Mark

From espindola at google.com Sun Jan 17 10:23:01 2010
From: espindola at google.com (Rafael Espindola)
Date: Sun, 17 Jan 2010 11:23:01 -0500
Subject: [LLVMdev] LLVM-gcc for ARM
In-Reply-To: <18097.80603.qm@web45314.mail.sp1.yahoo.com>
References: <651225.94442.qm@web45308.mail.sp1.yahoo.com> <18097.80603.qm@web45314.mail.sp1.yahoo.com>
Message-ID: <38a0d8451001170823l72e13f4bh54e6fbb75b64f4e2@mail.gmail.com>

> llvm-gcc a.c
> Assembler messages:
> Fatal error: Invalid -march= option: `armv7-a'
>
> Any ideas?

It is probably trying to use the wrong assembler. Run with -v and check.

>
> Thank you,
> Corina
>

Cheers,
--
Rafael Ávila de Espíndola

From corina_fff at yahoo.com Sun Jan 17 11:03:48 2010
From: corina_fff at yahoo.com (corina s)
Date: Sun, 17 Jan 2010 09:03:48 -0800 (PST)
Subject: [LLVMdev] LLVM-gcc for ARM
In-Reply-To: <38a0d8451001170823l72e13f4bh54e6fbb75b64f4e2@mail.gmail.com>
Message-ID: <208754.81838.qm@web45308.mail.sp1.yahoo.com>

Hello,

Well, I recompiled the LLVM-gcc

.../llvm-gcc4.2-2.6.source/configure --prefix=`pwd`/../install --program-prefix=llvm- --enable-llvm=/home/LLVM/llvm-2.6/ --enable-languages=c --disable-libssp --with-gnu-ld=/home/LLVM/llvm-gcc4.2-2.6.source/arm-elf-ld --with-gnu-as=/home/LLVM/llvm-gcc4.2-2.6.source/arm-elf-as --with-cpu=cortex-a8 --target=arm-elf

Everything is OK, but when trying to compile a HelloWorld source it gives me the following error:

llvm-gcc a.c
a.c:1:19: error: stdio.h: No such file or directory
a.c: In function ‘main’:
a.c:4: warning: incompatible implicit declaration of built-in function ‘printf’

llvm-gcc -v a.c
Using built-in specs.
Target: arm-elf
Configured with: ../llvm-gcc4.2-2.6.source/configure --prefix=/home/LLVM/obj/../install --program-prefix=llvm- --enable-llvm=/home/LLVM/llvm-2.6/ --enable-languages=c --disable-libssp --with-gnu-ld=/home/LLVM/llvm-gcc4.2-2.6.source/arm-elf-ld --with-gnu-as=/home/LLVM/llvm-gcc4.2-2.6.source/arm-elf-as --with-cpu=cortex-a8 --target=arm-elf
Thread model: single
gcc version 4.2.1 (Based on Apple Inc.
build 5649) (LLVM build)
 /home/LLVM/install/bin/../libexec/gcc/arm-elf/4.2.1/cc1 -quiet -v -iprefix /home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/ -D__USES_INITFINI__ a.c -quiet -dumpbase a.c -mcpu=cortex-a8 -auxbase a -version -o /tmp/ccEs2CuD.s
ignoring nonexistent directory "/home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/sys-include"
ignoring nonexistent directory "/home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/include"
ignoring duplicate directory "/home/LLVM/obj/../install/lib/gcc/arm-elf/4.2.1/include"
ignoring nonexistent directory "/home/LLVM/obj/../install/lib/gcc/arm-elf/4.2.1/../../../../arm-elf/sys-include"
ignoring nonexistent directory "/home/LLVM/obj/../install/lib/gcc/arm-elf/4.2.1/../../../../arm-elf/include"
#include "..." search starts here:
#include <...> search starts here:
 /home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/include
End of search list.
GNU C version 4.2.1 (Based on Apple Inc. build 5649) (LLVM build) (arm-elf)
        compiled by GNU C version 4.2.4 (Ubuntu 4.2.4-1ubuntu4).
GGC heuristics: --param ggc-min-expand=98 --param ggc-min-heapsize=131072
Compiler executable checksum: 2099b8949ecdc7b9705c423e43d0f9c7
a.c:1:19: error: stdio.h: No such file or directory
a.c: In function ‘main’:
a.c:4: warning: incompatible implicit declaration of built-in function ‘printf’

Why don't those directories exist?

Thanks for your help,
Corina.

P.S. I will recompile and run -v to see which version of the assembler is called.

From espindola at google.com Sun Jan 17 11:35:12 2010
From: espindola at google.com (Rafael Espindola)
Date: Sun, 17 Jan 2010 12:35:12 -0500
Subject: [LLVMdev] LLVM-gcc for ARM
In-Reply-To: <208754.81838.qm@web45308.mail.sp1.yahoo.com>
References: <38a0d8451001170823l72e13f4bh54e6fbb75b64f4e2@mail.gmail.com> <208754.81838.qm@web45308.mail.sp1.yahoo.com>
Message-ID: <38a0d8451001170935x744cf83fn977fa51ea2b4dd25@mail.gmail.com>

> llvm-gcc a.c
> a.c:1:19: error: stdio.h: No such file or directory
> a.c: In function ‘main’:
> a.c:4: warning: incompatible implicit declaration of built-in function ‘printf’

This is provided by libc. Do you have an ARM libc? You should configure llvm-gcc with --with-sysroot pointing to the libc install directory.

If you are building from scratch, you have to

*) Build binutils
*) Build llvm-gcc
*) Use that to build a libc (newlib, glibc, etc)
*) Build a new llvm-gcc that can use that libc

Cheers,
--
Rafael Ávila de Espíndola

From corina_fff at yahoo.com Sun Jan 17 12:57:41 2010
From: corina_fff at yahoo.com (corina s)
Date: Sun, 17 Jan 2010 10:57:41 -0800 (PST)
Subject: [LLVMdev] LLVM-gcc for ARM
In-Reply-To: <38a0d8451001170935x744cf83fn977fa51ea2b4dd25@mail.gmail.com>
Message-ID: <107534.95850.qm@web45304.mail.sp1.yahoo.com>

OK, thank you.

Are there some pre-built x86 binaries of LLVM-gcc for ARM? If yes, where can I download them from?
Thank you,
Corina

From corina_fff at yahoo.com Sun Jan 17 18:20:19 2010
From: corina_fff at yahoo.com (corina s)
Date: Sun, 17 Jan 2010 16:20:19 -0800 (PST)
Subject: [LLVMdev] LLVM-gcc for ARM
In-Reply-To: <107534.95850.qm@web45304.mail.sp1.yahoo.com>
Message-ID: <879728.8637.qm@web45315.mail.sp1.yahoo.com>

Hello,

I recompiled binutils, llvm-gcc, libc, and again llvm-gcc and now it gives me the following error:

llvm-gcc --v
Using built-in specs.
Target: arm-elf
Configured with: ......./ (reconfigured) ../llvm-gcc4.2-2.6.source/configure --target=arm-elf --enable-llvm=/home/LLVM/llvm-2.6/ --with-arch=armv7-a --enable-languages=c --prefix=/home/LLVM/install --enable-multilib --with-newlib --without-headers --disable-shared --with-gnu-as --with-gnu-ld --program-prefix=llvm- --disable-libssp --with-sysroot=/home/LLVM/build/arm-elf/newlib/

llvm-gcc HelloWorld.c
/home/LLVM/install/lib/gcc/arm-elf/4.2.1/../../../../arm-elf/bin/ld: this linker was not configured to use sysroots
collect2: ld returned 1 exit status

I would appreciate some help from you.
Corina

From espindola at google.com Sun Jan 17 18:23:47 2010
From: espindola at google.com (Rafael Espindola)
Date: Sun, 17 Jan 2010 19:23:47 -0500
Subject: [LLVMdev] LLVM-gcc for ARM
In-Reply-To: <879728.8637.qm@web45315.mail.sp1.yahoo.com>
References: <107534.95850.qm@web45304.mail.sp1.yahoo.com> <879728.8637.qm@web45315.mail.sp1.yahoo.com>
Message-ID: <38a0d8451001171623w79c1a4a2n82950dd07c889b1a@mail.gmail.com>

> llvm-gcc HelloWorld.c
> /home/LLVM/install/lib/gcc/arm-elf/4.2.1/../../../../arm-elf/bin/ld: this linker was not configured to use sysroots
> collect2: ld returned 1 exit status
>
> I would appreciate some help from you.

You have to pass --with-sysroot when building binutils too.

> Corina

Cheers,
--
Rafael Ávila de Espíndola

From anton at korobeynikov.info Mon Jan 18 01:22:52 2010
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Mon, 18 Jan 2010 10:22:52 +0300
Subject: [LLVMdev] LLVM-gcc for ARM
In-Reply-To: <38a0d8451001171623w79c1a4a2n82950dd07c889b1a@mail.gmail.com>
References: <107534.95850.qm@web45304.mail.sp1.yahoo.com> <879728.8637.qm@web45315.mail.sp1.yahoo.com> <38a0d8451001171623w79c1a4a2n82950dd07c889b1a@mail.gmail.com>
Message-ID:

>> llvm-gcc HelloWorld.c
>> /home/LLVM/install/lib/gcc/arm-elf/4.2.1/../../../../arm-elf/bin/ld: this linker was not configured to use sysroots
>> collect2: ld returned 1 exit status
>>
>> I would appreciate some help from you.
>
> You have to pass --with-sysroot when building binutils too.

It might be easier just to build newlib during the build of llvm-gcc. Also, I don't think libc is usable / exists for a bare-metal target (arm-elf/arm-eabi).
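
One possible reading of "build newlib during the build of llvm-gcc" is GCC's combined-tree layout, where the newlib sources are linked into the compiler source tree and a single configure/make builds both. The dry-run sketch below only prints the commands. The directory names are the ones mentioned in this thread, combined_tree_plan is a hypothetical helper name, and whether llvm-gcc 4.2 accepts this layout unchanged is an assumption, not something confirmed here.

```shell
#!/bin/sh
# Dry-run sketch of a "combined tree" build (an assumption about what the
# suggestion above means). Nothing is built: the function only prints
# the commands one would run, in order.
GCC_SRC=llvm-gcc4.2-2.6.source      # compiler source directory named in the thread
NEWLIB_SRC=newlib-1.17.0            # newlib source directory named in the thread
LLVM_OBJ=/home/LLVM/llvm-2.6/       # LLVM tree passed to --enable-llvm in the thread

combined_tree_plan() {
    # Link newlib and libgloss into the compiler sources.
    echo "ln -s ../$NEWLIB_SRC/newlib $GCC_SRC/newlib"
    echo "ln -s ../$NEWLIB_SRC/libgloss $GCC_SRC/libgloss"
    # Configure from a separate build directory; --with-newlib selects newlib as the C library.
    echo "../$GCC_SRC/configure --target=arm-elf --enable-llvm=$LLVM_OBJ --enable-languages=c --with-newlib --disable-libssp"
    echo "make"
}

combined_tree_plan
```

With this layout there is no separate libc install step, so no sysroot has to be passed to either binutils or the compiler.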
--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From kqyang at fudan.edu.cn Mon Jan 18 02:36:52 2010
From: kqyang at fudan.edu.cn (kqyang)
Date: Mon, 18 Jan 2010 16:36:52 +0800
Subject: [LLVMdev] question in LLVM IR
References: <107534.95850.qm@web45304.mail.sp1.yahoo.com> <879728.8637.qm@web45315.mail.sp1.yahoo.com> <38a0d8451001171623w79c1a4a2n82950dd07c889b1a@mail.gmail.com>
Message-ID: <201001181636519652161@fudan.edu.cn>

hi all

I read the LLVM manual and have a question about the LLVM IR.
The IR is a low-level SSA-based instruction set.

I want to know what the difference is between the LLVM IR and SSA form.
Are they the same, or is it similar to GCC's tree-ssa, which converts the GIMPLE tree to SSA and converts back when optimizations finish?
Or is the SSA info just attached to the LLVM IR?

Thanks for your feedback.

-Joey

From baldrick at free.fr Mon Jan 18 03:19:45 2010
From: baldrick at free.fr (Duncan Sands)
Date: Mon, 18 Jan 2010 10:19:45 +0100
Subject: [LLVMdev] question in LLVM IR
In-Reply-To: <201001181636519652161@fudan.edu.cn>
References: <107534.95850.qm@web45304.mail.sp1.yahoo.com> <879728.8637.qm@web45315.mail.sp1.yahoo.com> <38a0d8451001171623w79c1a4a2n82950dd07c889b1a@mail.gmail.com> <201001181636519652161@fudan.edu.cn>
Message-ID: <4B5427B1.1060705@free.fr>

Hi Joey,

> I read the LLVM manual and have a question about the LLVM IR.
> The IR is a low-level SSA-based instruction set.
>
> I want to know what the difference is between the LLVM IR and SSA form.
> Are they the same, or is it similar to GCC's tree-ssa, which converts the GIMPLE tree to SSA
> and converts back when optimizations finish?
> Or is the SSA info just attached to the LLVM IR?

LLVM IR is always in SSA form - there is no conversion between a non-SSA version and an SSA-version, there is only the SSA version.
Consider an LLVM load instruction:

  %tmp = load i32* %addr

Here %addr is the pointer being loaded from, and %tmp is the loaded value. Since LLVM IR is in SSA form, %tmp cannot be defined differently later, so %tmp is *equivalent* to the instruction "load i32* %addr". Thus (unlike in GIMPLE) there is no actual assignment going on in "%tmp = ..."; this textual form is just for the benefit of human readers, and means that %tmp is the name of the instruction on the right-hand side. This is why you will search in vain for an assignment instruction in the IR definition: there isn't one.

In order to get the effect of assigning multiple times to a variable, you either need to generate explicit phi nodes, or generate explicit stores to memory (unlike registers like %tmp, memory is not in SSA form; this is the same as in GIMPLE).

Ciao,

Duncan.

From corina_fff at yahoo.com Mon Jan 18 05:27:32 2010
From: corina_fff at yahoo.com (corina s)
Date: Mon, 18 Jan 2010 03:27:32 -0800 (PST)
Subject: [LLVMdev] LLVM-gcc for ARM
Message-ID: <703125.78294.qm@web45312.mail.sp1.yahoo.com>

Hi,

So... I followed the steps below in order to compile the llvm-gcc frontend. The only problem is the one mentioned in the previous message:

llvm-gcc HelloWorld.c
/home/LLVM/install/lib/gcc/arm-elf/4.2.1/../../../../arm-elf/bin/ld: this linker was not configured to use sysroots
collect2: ld returned 1 exit status

1. I compiled binutils for the arm-none-eabi target. Options: --target=arm-none-eabi --enable-multilib --with-gnu-as --with-gnu-ld --disable-nls --disable-libssp
2. I compiled the LLVM-gcc source code with the following options: --target=arm-none-eabi --enable-llvm=path/to/llvm --with-arch=armv7-a --enable-languages=c --enable-multilib --with-newlib --enable-internetwork --without-headers --disable-shared --with-gnu-as --with-gnu-ld --disable-libssp
3. Next, I took the newlib-1.17.0 source code and configured it with the following options: --target=arm-none-eabi --eanble-multilib --with-gnu-as --with-gnu-ld --dlsable-nls --disable-libssp
4. Finally, I repeated step 2, but with an extra option: --with-sysroot=/home/LLVM/build/arm-none-eabi/newlib/libc.

What am I supposed to do next? You said that I have to recompile binutils with the --with-sysroot option?

Thank you.

--- On Sun, 1/17/10, Anton Korobeynikov wrote:

From: Anton Korobeynikov
Subject: Re: [LLVMdev] LLVM-gcc for ARM
To: "Rafael Espindola"
Cc: "corina s" , llvmdev at cs.uiuc.edu
Date: Sunday, January 17, 2010, 11:22 PM

>> llvm-gcc HelloWorld.c
>> /home/LLVM/install/lib/gcc/arm-elf/4.2.1/../../../../arm-elf/bin/ld: this linker was not configured to use sysroots
>> collect2: ld returned 1 exit status
>>
>> I would appreciate some help from you.
>
> You have to pass --with-sysroot when building binutils too.

It might be easier just to build newlib during the build of llvm-gcc. Also, I don't think libc is usable/exists for a bare-metal target (arm-elf/arm-eabi).

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100118/3c8380c0/attachment.html

From espindola at google.com Mon Jan 18 06:26:19 2010
From: espindola at google.com (Rafael Espindola)
Date: Mon, 18 Jan 2010 07:26:19 -0500
Subject: [LLVMdev] LLVM-gcc for ARM
In-Reply-To: <703125.78294.qm@web45312.mail.sp1.yahoo.com>
References: <703125.78294.qm@web45312.mail.sp1.yahoo.com>
Message-ID: <38a0d8451001180426w4d91b3b0pf5dedfc64868a08@mail.gmail.com>

> What am I supposed to do next? You said that I have to recompile binutils with the --with-sysroot option?

It is not an extra step. You have to add --with-sysroot to step 1.

> Thank you.
Cheers,
--
Rafael Ávila de Espíndola

From miaoyisz at gmail.com Mon Jan 18 07:08:55 2010
From: miaoyisz at gmail.com (Mary_nju)
Date: Mon, 18 Jan 2010 05:08:55 -0800 (PST)
Subject: [LLVMdev] How to create a CallInst that calls a standard c function like "printf"
Message-ID: <27210247.post@talk.nabble.com>

I am working on a program based on LLVM. I want to modify the .bc file through the C++ APIs provided by LLVM, but I don't know how to create a CallInst that calls a standard C function like "printf". Can anyone help me with this problem? The attached file is the program I wrote. It compiles; however, the dump of the retrieved module is not correct (the global variable is missing, and it will cause the 'program use external function 'myprintf' which could not be resolved' problem).

http://old.nabble.com/file/p27210247/test.cpp test.cpp
--
View this message in context: http://old.nabble.com/How-to-create-a-CallInst-that-calls-a-standard-c-function-like-%22printf%22-tp27210247p27210247.html
Sent from the LLVM - Dev mailing list archive at Nabble.com.

From 49640f8a at gmail.com Mon Jan 18 10:36:34 2010
From: 49640f8a at gmail.com (Martins Mozeiko)
Date: Mon, 18 Jan 2010 18:36:34 +0200
Subject: [LLVMdev] JIT on ARM
Message-ID: <3C7EFA3E-DEE9-435E-9570-AA925A56815E@gmail.com>

Hi.

I am trying to run LLVM with JIT on an ARM processor (Android phone). Currently I have problems using external functions. Any call to an external function crashes and gives me signal 11 (SIGSEGV) at some random address.
I'm trying to run following C code: *** extern void add1(int* x); int main() { int a = 10; int b = 20; add1(&b); int c = a + b; return c; } *** It gives me following LL code: *** define i32 @main() nounwind { entry: %b = alloca i32, align 4 ; [#uses=3] store i32 20, i32* %b, align 4 call void @add1(i32* %b) nounwind %0 = load i32* %b, align 4 ; [#uses=1] %1 = add nsw i32 %0, 10 ; [#uses=1] ret i32 %1 } declare void @add1(i32*) *** When using llvm::DebugFlag=true JIT gives me following debug messages: ********** Function: main Ifcvt: function (0) 'main' block 0 offset 0 size 40 block 0 offset 0 size 40 JITTing function 'main' JIT: Starting CodeGen of Function main JIT: Emitting BB0 at [0x4512e010] JIT: 0x4512e010: STM %SP, 12, 14, %reg0, %R11, %LR 0xe92d4800 JIT: 0x4512e014: %SP = SUBri %SP, 8, 14, %reg0, %reg0 0xe24dd008 JIT: 0x4512e018: %R0 = MOVi 20, 14, %reg0, %reg0 0xe3a00014 JIT: 0x4512e01c: STR %R0, %SP, %reg0, 4, 14, %reg0, Mem:ST(4,4) [b + 0] 0xe58d0004 JIT: 0x4512e020: %R0 = ADDri %SP, 4, 14, %reg0, %reg0 0xe28d0004 JIT: 0x4512e024: BL , %R0, %R0, %R1, %R2, %R3, %R12, %LR, %D0, %D1, %D2, %D3, %D4, %D5, %D6, %D7, %D16, %D17, %D18, %D19, %D20, %D21, %D22, %D23, %D24, %D25, %D26, %D27, %D28, %D29, %D30, %D31, %CPSR 0xeb000000 JIT: 0x4512e028: %R0 = LDR %SP, %reg0, 4, 14, %reg0, Mem:LD(4,4) [b + 0] 0xe59d0004 JIT: 0x4512e02c: %R0 = ADDri %R0, 10, 14, %reg0, %reg0 0xe280000a JIT: 0x4512e030: %SP = ADDri %SP, 8, 14, %reg0, %reg0 0xe28dd008 JIT: 0x4512e034: LDM_RET %SP, 9, 14, %reg0, %R11, %PC 0xe8bd8800 JIT: Map 'add1' to [0x808e80b4] JIT: Stub emitted at [0x42540008] for function 'add1' JIT: Finished CodeGen of [0x4512e010] Function: main: 40 bytes of text, 1 relocations JIT: Binary code: JIT: 00000000: e92d4800 e24dd008 e3a00014 e58d0004 JIT: 00000010: e28d0004 eb5047f7 e59d0004 e280000a JIT: 00000020: e28dd008 e8bd8800 **** Just to be sure I inform LLVM about add1 function with following code: extern "C" void my_add1(int* x) { LOG("in add1, x=%i\n", *x); *x = 
*x + 1; } sys::DynamicLibrary::AddSymbol("add1", (void*)&my_add1); Is there something wrong with generated code? Same code, using same test program, runs fine using JIT on x86 under Windows. I have successfuly run program that doesn't use external functions. For example following C code **** int main() { int a = 10; int b = 20; int c = a + b; return c; } **** compiles to following LL code: **** define i32 @main() nounwind readnone { entry: ret i32 30 } **** which runs fine on ARM. When using llvm::DebugFlag=true JIT prints following debug messages: ********** Function: main Ifcvt: function (0) 'main' block 0 offset 0 size 8 block 0 offset 0 size 8 JITTing function 'main' JIT: Starting CodeGen of Function main JIT: Emitting BB0 at [0x4512e010] JIT: 0x4512e010: %R0 = MOVi 30, 14, %reg0, %reg0 0xe3a0001e JIT: 0x4512e014: BX_RET 14, %reg0, %R0 0xe12fff1e JIT: Finished CodeGen of [0x4512e010] Function: main: 8 bytes of text, 0 relocations JIT: Binary code: JIT: 00000000: e3a0001e e12fff1e *** I appreciate any suggestions what can I do in my situation. -- Martins Mozeiko From dag at cray.com Mon Jan 18 10:54:04 2010 From: dag at cray.com (David Greene) Date: Mon, 18 Jan 2010 10:54:04 -0600 Subject: [LLVMdev] [PATCH] SelectionDAG Debugging In-Reply-To: <7E0488E5-339E-42BD-9B00-55881D8E3EA2@apple.com> References: <201001131610.23444.dag@cray.com> <201001151634.07534.dag@cray.com> <7E0488E5-339E-42BD-9B00-55881D8E3EA2@apple.com> Message-ID: <201001181054.04541.dag@cray.com> On Friday 15 January 2010 18:14, Dan Gohman wrote: > > Ah, one thing. dumpr uses DumpNodesr which does the "once" thing. > > I actually would prefer a full dump. Perhaps we shouldn't try to > > unify them, or maybe provide a flag to control behavior. In the > > past separate APIs have been preferred over flags. I can rename > > the ones I added to be more consistent with the existing stuff. > > > > Opinions? > > I use the GraphViz viewer almost exclusively, so I don't have a > strong opinion. 
> > Methods with lots of flags are inconvenient to call from a debugger. > I'd suggesting coming up with a few common use cases, and providing > interfaces to cover those use cases, and not trying to provide > lots of extra generality. The interfaces I added are exactly the ones I needed to debug a problem. So I think they are pretty minimal. > If SDNode::dumpr() had built-in cycle detection, and indicated > cycles with big capital letters, would you still want a recursive > dump which doesn't do the "once" thing? Or, if the "once" thing > had a more human-oriented syntax, would it be usable? No, I don't think either would be sufficient. I really, really, really want to see the real DAG. I can't think of how "once" could give the same information. -Dave From dag at cray.com Mon Jan 18 10:57:33 2010 From: dag at cray.com (David Greene) Date: Mon, 18 Jan 2010 10:57:33 -0600 Subject: [LLVMdev] Register Spilling and SSA In-Reply-To: <37FC589A-3150-4805-BB65-1EEA9110994A@gmail.com> References: <37FC589A-3150-4805-BB65-1EEA9110994A@gmail.com> Message-ID: <201001181057.33520.dag@cray.com> On Sunday 17 January 2010 07:00, Mark Muir wrote: > Those counts simply don't mean anything in the real world. > This certainly is the case with desktop CPUs, where the hardware designers > have gone to a lot of bother adding hardware to perform dynamic > rescheduling and register renaming, which effectively replace these stack > accesses with registers or access to fast cache. > > But, with upcoming architectures - particularly ones with a very large > number of cores (e.g. something along the lines of Larrabee, or Ambric, > and a plethora of others) - such hardware is too costly. As a result, > needless stack activity consumes available memory bandwidth, which > absolutely hammers instruction-level parallelism. Granted, "anything" is a bit strong. But I stand by the idea that measuring spills isn't telling the real story. 
The real story often has more to do with *what* is spilled rather than how much is spilled. I believe that is true on in-order machines as well. It's true that instruction count generally tracks performance. But when trying to eke out the last bit of speedup, simple counts simply aren't enough. Changing a register allocator is a rather drastic thing to do, so I want to see how it really impacts performance. -Dave From corina_fff at yahoo.com Mon Jan 18 11:26:31 2010 From: corina_fff at yahoo.com (corina s) Date: Mon, 18 Jan 2010 09:26:31 -0800 (PST) Subject: [LLVMdev] LLVM-gcc for ARM In-Reply-To: <38a0d8451001180426w4d91b3b0pf5dedfc64868a08@mail.gmail.com> Message-ID: <754420.71552.qm@web45316.mail.sp1.yahoo.com> OK. Now I have obtained some other errors, I think that they are generated due to the operating system. /home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/bin/ld: ERROR: /home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/lib/crt0.o uses FPA instructions, whereas a.out does not /home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/bin/ld: ERROR: /home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/lib/crt0.o uses hardware FP, whereas a.out uses software FP /home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/bin/ld: failed to merge target specific data of file /home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/lib/crt0.o /home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/bin/ld: ERROR: /home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/lib/libc.a(lib_a-atexit.o) uses FPA instructions, whereas a.out does not /home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/bin/ld: ERROR: /home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/lib/libc.a(lib_a-atexit.o) uses hardware FP, whereas a.out uses software FP /home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/bin/ld: failed to merge target 
specific data of file /home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/lib/libc.a(lib_a-atexit.o)
/home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/bin/ld: ERROR: /home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/lib/libc.a(lib_a-exit.o) uses FPA instructions, whereas a.out does not
/home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/bin/ld: ERROR: /home/LLVM/install/bin/../lib/gcc/arm-elf/4.2.1/../../../../arm-elf/lib/libc.a(lib_a-exit.o) uses hardware FP, whereas a.out uses software FP

Can you tell me how I can fix this last (hopefully) problem?

Thanks.

--- On Mon, 1/18/10, Rafael Espindola wrote:

From: Rafael Espindola
Subject: Re: [LLVMdev] LLVM-gcc for ARM
To: "corina s"
Cc: anton at korobeynikov.info, llvmdev at cs.uiuc.edu
Date: Monday, January 18, 2010, 4:26 AM

> What I am supposed to do next? you said that I have to recompile binutils with --with-sysroot option?

It is not an extra step. You have to add --with-sysroot to step 1.

> Thank you.

Cheers,
--
Rafael Ávila de Espíndola
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100118/3681f0e0/attachment.html

From samuraileumas at yahoo.com Mon Jan 18 12:04:30 2010
From: samuraileumas at yahoo.com (Samuel Crow)
Date: Mon, 18 Jan 2010 10:04:30 -0800 (PST)
Subject: [LLVMdev] Finding the host datalayout
Message-ID: <62524.26201.qm@web62005.mail.re1.yahoo.com>

Hello all,

As we work the last few bugs out of our project for the last release, we need to find a way to set the default datalayout of the LLVM Assembly file we are generating to be that of the host machine. I've seen options for target triples in the Doxygen but not the datalayout. BTW, we're using version 2.6 of LLVM.
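
[Editorial sketch, not part of the original message.] To illustrate what a host datalayout string encodes, here is a minimal C sketch that derives only its first two fields from the compiler it is built with: the endianness letter ("e" or "E") and the pointer specification ("p:size:abi:pref"). It deliberately uses no LLVM API, since the thread does not say which LLVM 2.6 call returns this. The function names are hypothetical, and the preferred alignment is assumed to equal the ABI alignment.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void) {
    const unsigned int probe = 1u;
    return *(const unsigned char *)&probe == 1;
}

/* Helper struct: the offset of 'ptr' is the ABI alignment of a pointer. */
struct pointer_align_probe {
    char pad;
    void *ptr;
};

/* Hypothetical helper: writes the first two datalayout fields for the host
 * into 'out' and returns snprintf's result. Assumption: preferred alignment
 * is taken to be the same as the ABI alignment. */
static int host_datalayout_prefix(char *out, size_t out_size) {
    const unsigned ptr_bits = (unsigned)(sizeof(void *) * CHAR_BIT);
    const unsigned ptr_align_bits =
        (unsigned)(offsetof(struct pointer_align_probe, ptr) * CHAR_BIT);

    assert(out != NULL && out_size > 0);
    return snprintf(out, out_size, "%c-p:%u:%u:%u",
                    host_is_little_endian() ? 'e' : 'E',
                    ptr_bits, ptr_align_bits, ptr_align_bits);
}
```

Calling host_datalayout_prefix(buf, sizeof buf) fills buf with the endianness letter followed by the pointer field. A complete datalayout string also needs the integer, float, vector and aggregate fields, which are better taken from LLVM's own target information than computed by hand.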
--Sam From evan.cheng at apple.com Mon Jan 18 13:02:24 2010 From: evan.cheng at apple.com (Evan Cheng) Date: Mon, 18 Jan 2010 11:02:24 -0800 Subject: [LLVMdev] Frame index arithmetic In-Reply-To: <7AD1EE04-9CB7-4289-BF42-0DC0BA8A3E1E@gmail.com> References: <7AD1EE04-9CB7-4289-BF42-0DC0BA8A3E1E@gmail.com> Message-ID: <2B28F2EF-B836-4F14-A3DC-F7EE424FF1A5@apple.com> On Jan 17, 2010, at 2:56 AM, Mark Muir wrote: > I've developed a working back-end for a custom architecture, based on LLVM 2.6. I'm now trying to cover more of the unique features of this architecture. > > To make use of one such feature, I'm trying something cunning/crazy with the stack - implementing it in a type of memory that can only be addressed via immediates. > > I've got this mostly working. However, I came across a problem which I've been unable to work around: lowering the IR (even without any optimisations enabled) often requires the pattern: > > i32 = FrameIndex > > For normal memory, I was using the following instruction to match this pattern: > > // Get the address in memory corresponding to the given frame index, saving the address > // in a register. > def MOV_FI : PseudoInstr<(outs GPR:$dst), (ins frameIndex:$addr), > "// $dst := frame index $addr", > [(set GPR:$dst, frameIndex:$addr)]>; > > Which is later replaced by a MOV (output register = stack pointer + constant offset) in eliminateFrameIndex(). > > However, it isn't appropriate to do this with the proposed stack memory - it doesn't make sense to move the address into a register (where arithmetic can be performed on it), as it isn't possible to move that back to the domain of an immediate. So I conditionally disabled this instruction. But that leads to most programs failing to select the above pattern. > > The issue is that this pattern is required even in code that doesn't conceptually seem to need it (see the example below). I couldn't figure out how to avoid this during DAG legalisation. 
Most often, the resulting machine assembly when the above pattern is enabled, simply stores a particular stack slot in a register, for later use in the same basic block, e.g.: > > MOV out=r4 in=SP+4 > LOAD out=r4 addr=r4 > > despite patterns existing for LOAD with a constant offset (which is successfully used by other stack slots in the same basic block), e.g.: > > LOAD out=r3 addr=SP off=8 > > Am I missing some other patterns that would avoid this? For example, is it possible to write patterns that allow for arithmetic involving only immediates, with the result being another immediate? Sounds like your load / store address selection routine isn't working like what you expected. Evan > > If all else fails, I was thinking of writing a custom pass to identify and remove these. But that could be a lot of work. > > Thanks, > > - Mark > > > Example: > > int result; > > int foo(int cond, int a, int b) > { > return cond? a : b; > } > > int main() > { > return result = foo(1, 2, 3); > // Expected: result = 2. 
> } > > Resulting IR: > > @result = common global i32 0, align 4 ; [#uses=2] > > define i32 @foo(i32 %cond, i32 %a, i32 %b) nounwind { > entry: > %retval = alloca i32 ; [#uses=2] > %cond.addr = alloca i32 ; [#uses=2] > %a.addr = alloca i32 ; [#uses=2] > %b.addr = alloca i32 ; [#uses=2] > store i32 %cond, i32* %cond.addr > store i32 %a, i32* %a.addr > store i32 %b, i32* %b.addr > %tmp = load i32* %cond.addr ; [#uses=1] > %tobool = icmp ne i32 %tmp, 0 ; [#uses=1] > %tmp1 = load i32* %a.addr ; [#uses=1] > %tmp2 = load i32* %b.addr ; [#uses=1] > %cond3 = select i1 %tobool, i32 %tmp1, i32 %tmp2 ; [#uses=1] > store i32 %cond3, i32* %retval > %0 = load i32* %retval ; [#uses=1] > ret i32 %0 > } > > define i32 @main() nounwind { > entry: > %retval = alloca i32 ; [#uses=3] > store i32 0, i32* %retval > %call = call i32 @foo(i32 1, i32 2, i32 3) ; [#uses=1] > store i32 %call, i32* @result > %tmp = load i32* @result ; [#uses=1] > store i32 %tmp, i32* %retval > %0 = load i32* %retval ; [#uses=1] > ret i32 %0 > } > > (Note: for simplicity, the calling convention in use here places all arguments on the stack) > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100118/9b3be6b4/attachment.html From evan.cheng at apple.com Mon Jan 18 13:04:47 2010 From: evan.cheng at apple.com (Evan Cheng) Date: Mon, 18 Jan 2010 11:04:47 -0800 Subject: [LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz In-Reply-To: <305d6f61001151204x6c01f4fap2ebfbf979a35022@mail.gmail.com> References: <1B7B6FBB-0B43-4B08-A1B3-C618CDB88387@gmail.com> <79F7C763-5126-4542-BE40-D1703CFA5CB0@apple.com> <305d6f61001151204x6c01f4fap2ebfbf979a35022@mail.gmail.com> Message-ID: On Jan 15, 2010, at 12:04 PM, Sandeep Patel wrote: > On Fri, Jan 15, 2010 at 6:03 PM, Chris Lattner wrote: >> >> On Jan 14, 2010, at 10:13 PM, David Conrad wrote: >> >>> Hi, >>> >>> On ARMv6T2 this turns cttz into rbit, clz instead of the 4 >>> instruction sequence it is now. >>> >>> I'm not sure if adding RBIT to ARMISD and doing this optimization in >>> the legalize pass is the best option, but the only better way I >>> could think of doing it was to add a bitreverse intrinsic to llvm >>> ir, which itself might not be the best option since bitreverse >>> probably isn't too common. >> >> I haven't looked at the patch in detail, but this approach makes sense >> to me. >> >>> Other targets that I know of that could potentially benefit from >>> this optimization being global (that have a clz and bitreverse >>> instruction but not ctz) are AVR32 and C64x, neither of which llvm >>> has backends for yet. >> >> When/if another target wants this, we could add a ISD::RBIT operation, >> it doesn't need to be added at the llvm ir level, > > Bit reversal turns up in most FFT algorithms, so it wouldn't hurt to > be able to add an instcombine that recognizes it, etc. I agree with Chris it doesn't make sense to add a llvm instruction for this since it's rare. But it's something that can be recognized in dag combine / isel. Can you attach some examples? 
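
[Editorial sketch, not part of the original message.] The lowering discussed in this thread rests on the identity cttz(x) = ctlz(bitreverse(x)) for non-zero x. The portable C sketch below shows that identity together with the kind of shift-and-mask bit reversal that a combine would have to recognize. The function names are illustrative and are not taken from the patch; on ARMv6T2 the two helpers would correspond to single RBIT and CLZ instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the bit order of a 32-bit word by swapping progressively
 * larger groups: single bits, pairs, nibbles, bytes, half-words. */
static uint32_t bit_reverse32(uint32_t x) {
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}

/* Count leading zero bits of a non-zero word (simple loop version). */
static unsigned count_leading_zeros32(uint32_t x) {
    unsigned n = 0;
    assert(x != 0);
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        ++n;
    }
    return n;
}

/* Count trailing zero bits of a non-zero word: after reversing the bits,
 * the trailing zeros of x become the leading zeros of the result. */
static unsigned count_trailing_zeros32(uint32_t x) {
    assert(x != 0);
    return count_leading_zeros32(bit_reverse32(x));
}
```

The zero input is excluded here only to keep the sketch short; how the actual patch treats it is not stated in the thread.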
Evan > > deep > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From evan.cheng at apple.com Mon Jan 18 13:07:33 2010 From: evan.cheng at apple.com (Evan Cheng) Date: Mon, 18 Jan 2010 11:07:33 -0800 Subject: [LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz In-Reply-To: References: <1B7B6FBB-0B43-4B08-A1B3-C618CDB88387@gmail.com> <79F7C763-5126-4542-BE40-D1703CFA5CB0@apple.com> Message-ID: <78ADE3B0-50F6-4B00-B83D-1F20E8F2133D@apple.com> On Jan 15, 2010, at 2:52 PM, Jim Grosbach wrote: > > On Jan 15, 2010, at 11:37 AM, Richard Osborne wrote: > >> >> On 15 Jan 2010, at 18:03, Chris Lattner wrote: >> >>> On Jan 14, 2010, at 10:13 PM, David Conrad wrote: >>> >>>> Other targets that I know of that could potentially benefit from >>>> this optimization being global (that have a clz and bitreverse >>>> instruction but not ctz) are AVR32 and C64x, neither of which llvm >>>> has backends for yet. >>> >>> When/if another target wants this, we could add a ISD::RBIT >>> operation, >>> it doesn't need to be added at the llvm ir level, >> >> The XCore also has ctlz and bitreverse instructions and not cttz. At >> the moment in the XCore backend cttz is marked as legal and expanded >> to this pair of instructions in a pattern in the InstrInfo.td. > > In that case, perhaps it makes sense to add it as an ISD::RBIT > operation straight away. Since only a couple of targets can use this, it shouldn't block this patch from going in. Jim, can you commit this? Thanks, Evan > > The rest of the patch looks good to me. 
>
> -Jim
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

From vkutuzov at accesssoftek.com Mon Jan 18 13:52:09 2010
From: vkutuzov at accesssoftek.com (Viktor Kutuzov)
Date: Mon, 18 Jan 2010 11:52:09 -0800
Subject: [LLVMdev] LLVM-gcc for ARM
In-Reply-To: <328335.13577.qm@web45315.mail.sp1.yahoo.com>
References: <328335.13577.qm@web45315.mail.sp1.yahoo.com>
Message-ID: <6AE1604EE3EC5F4296C096518C6B77EE01884D4387@mail.accesssoftek.com>

Hello Corina,

I used a two-stage sequence to build llvm and llvm-gcc with the codesourcery toolchain and my custom-built arm toolchain. There are some scripted chunks for each step. I have attached them as a single file to this email. Maybe it will help you somehow.

Viktor.

---
From: llvmdev-bounces at cs.uiuc.edu [llvmdev-bounces at cs.uiuc.edu] On Behalf Of corina s [corina_fff at yahoo.com]
Sent: Friday, January 15, 2010 12:54 PM
To: llvmdev at cs.uiuc.edu
Subject: [LLVMdev] LLVM-gcc for ARM

Hello,

I am building llvm-gcc4.2-2.6 for ARM target. I used the next command line option:
>> ../configure --enable-languages=c,c++ --enable-checking --target=arm-eabi
>> and then
>> make target_alias=arm-eabi
>> And then I obtain the following error:
In file included from ../../gcc/config/arm/arm.c:59:
./../../libcpp/internal.h: In function ‘ufputs’:
./../../libcpp/internal.h:693: warning: implicit declaration of function ‘fputs_unlocked’
../../gcc/config/arm/arm.c: At top level:
../../gcc/config/arm/arm.c:514: error: ‘MASK_INTERWORK’ undeclared here (not in a function)
../../gcc/config/arm/arm.c: In function ‘optimization_options’:
../../gcc/config/arm/arm.c:23444: warning: unused parameter ‘level’
What would be the problem? Is the configure line OK?

Thanks,
Corina
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100118/9ebeb5a2/attachment.htm From grosbach at apple.com Mon Jan 18 13:59:47 2010 From: grosbach at apple.com (Jim Grosbach) Date: Mon, 18 Jan 2010 11:59:47 -0800 Subject: [LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz In-Reply-To: <78ADE3B0-50F6-4B00-B83D-1F20E8F2133D@apple.com> References: <1B7B6FBB-0B43-4B08-A1B3-C618CDB88387@gmail.com> <79F7C763-5126-4542-BE40-D1703CFA5CB0@apple.com> <78ADE3B0-50F6-4B00-B83D-1F20E8F2133D@apple.com> Message-ID: <062505A9-37E6-46E4-9729-4C9885051329@apple.com> On Jan 18, 2010, at 11:07 AM, Evan Cheng wrote: > > On Jan 15, 2010, at 2:52 PM, Jim Grosbach wrote: > >> >> On Jan 15, 2010, at 11:37 AM, Richard Osborne wrote: >> >>> >>> On 15 Jan 2010, at 18:03, Chris Lattner wrote: >>> >>>> On Jan 14, 2010, at 10:13 PM, David Conrad wrote: >>>> >>>>> Other targets that I know of that could potentially benefit from >>>>> this optimization being global (that have a clz and bitreverse >>>>> instruction but not ctz) are AVR32 and C64x, neither of which llvm >>>>> has backends for yet. >>>> >>>> When/if another target wants this, we could add a ISD::RBIT >>>> operation, >>>> it doesn't need to be added at the llvm ir level, >>> >>> The XCore also has ctlz and bitreverse instructions and not cttz. At >>> the moment in the XCore backend cttz is marked as legal and expanded >>> to this pair of instructions in a pattern in the InstrInfo.td. >> >> In that case, perhaps it makes sense to add it as an ISD::RBIT >> operation straight away. > > Since only a couple of targets can use this, it shouldn't block this patch from going in. Jim, can you commit this? > Works for me. Done in r93758. Thanks for doing this, David. 
-Jim From clattner at apple.com Mon Jan 18 15:09:17 2010 From: clattner at apple.com (Chris Lattner) Date: Mon, 18 Jan 2010 13:09:17 -0800 Subject: [LLVMdev] mkpatch patch In-Reply-To: <42219856-8E7C-4F92-8EED-470857E8F705@gmail.com> References: <42219856-8E7C-4F92-8EED-470857E8F705@gmail.com> Message-ID: <9C8FE0FE-0287-46C6-9A01-E83F40A8E986@apple.com> On Jan 15, 2010, at 6:12 PM, Garrison Venn wrote: > I've included a patch which does not remove mkpatch but does remove > diff search > directories which caused a failure because those directories were no > longer > in svn. I was uncomfortable removing mkpatch since I believe it > helps document > creating patches for beginners who do not use separate source and > build (object) > root directories. Its existence is also expected by readers of: http://llvm.org/docs/DeveloperPolicy.html#patches > . Applied in r93771, thanks! From clattner at apple.com Mon Jan 18 15:11:58 2010 From: clattner at apple.com (Chris Lattner) Date: Mon, 18 Jan 2010 13:11:58 -0800 Subject: [LLVMdev] [PATCH] - Union types, attempt 2 In-Reply-To: References: <299D5308-32B6-4D04-9D84-E8BFF3767F3B@apple.com> Message-ID: <0352B218-D9E0-4F7A-9898-AECA6A372AA4@apple.com> > Using a union here (as opposed to using bitcast) solves a number of > problems: > > 1) The size of the struct is automatically calculated by taking the > largest field of the union. Without unions, your frontend would have > to calculate the size of each possible field, as well as their > alignment, and use that to figure the maximum structure size. If > your front-end is target-agnostic, you may not even know how to > calculate the correct struct size. > > 2) The struct is small enough to be returned as a first-class SSA > value, and with a union you can use it directly. 
Since bitcast only > works on pointers, in order to use it you would have to alloca some > temporary memory to hold the function result, store the result into > it, then use a combination of GEP and bitcast to get a correctly- > typed pointer to the second field, and finally load the value. With > a union, you can simply extract the second field without ever having > to muck about with pointers and allocas. > > 3) The union provides an additional layer of type safety, since you > can only extract types which are declared in the union, and not any > arbitrary type that you could get with a bitcast. (Although I > consider this a relatively minor point since type safety isn't a > major concern in IR.) > > 4) It's possible that some future version of the optimizer could use > the additional type information provided by the union which the > bitcast does not. Perhaps an optimizer which knows that all of the > union members are numbers and not pointers could make some > additional assumptions... > > 5) Something I forgot to mention - by allowing GEP and extractvalue > to work with unions, we can handle unions nested inside structs and > vice versa with a single GEP instruction. This is my main argument > against having special instructions for dealing with unions. > > For example, in the case of { i1, union { float, i32 } }* we can use > a GEP with indices [0, 1, 0] to get access to the float field in a > single GEP instruction. > > So just as GEP allows chaining together operations on structs, > pointers and arrays, we can also chain them together with operations > on unions. This can be quite powerful I think. > Yes, this is all very compelling to me. Beyond all this, we don't support bitcast of aggregate values. 
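
[Editorial sketch, not part of the original message.] As a rough C analogue of points 1 and 2 in the quoted argument, the sketch below uses a tagged value shaped like { i1, union { float, i32 } }. The compiler computes the union's size from its largest member, and the whole struct is returned by value so a field can be read without any pointer casts. The type and function names are made up for illustration, and nothing here reflects the IR syntax of the in-progress patch.

```c
#include <assert.h>
#include <stddef.h>

/* The payload: storage is sized by the compiler to fit the largest member. */
union number {
    float as_float;
    int as_int;
};

/* A tag plus the payload, mirroring { i1, union { float, i32 } }. */
struct tagged_number {
    unsigned char is_int; /* 1 if 'value.as_int' is the active member */
    union number value;
};

/* Build a tagged float and return it by value. */
static struct tagged_number make_float(float f) {
    struct tagged_number t;
    t.is_int = 0;
    t.value.as_float = f;
    return t;
}

/* Build a tagged int and return it by value. */
static struct tagged_number make_int(int i) {
    struct tagged_number t;
    t.is_int = 1;
    t.value.as_int = i;
    return t;
}

/* Read the float member directly; only declared members are reachable. */
static float get_float(struct tagged_number t) {
    assert(!t.is_int);
    return t.value.as_float;
}

/* Read the int member directly. */
static int get_int(struct tagged_number t) {
    assert(t.is_int);
    return t.value.as_int;
}
```

A front end that modelled this with bitcasts instead would have to compute the payload size and alignment itself, which is the target-dependent work point 1 wants to avoid.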
-Chris From clattner at apple.com Mon Jan 18 15:40:41 2010 From: clattner at apple.com (Chris Lattner) Date: Mon, 18 Jan 2010 13:40:41 -0800 Subject: [LLVMdev] [patch] Union Types - work in progress In-Reply-To: References: Message-ID: <51F63373-36A9-4DD6-B375-B9FDEB33B796@apple.com> On Jan 16, 2010, at 11:15 AM, Talin wrote: > OK here's the patch for real this time :) > > On Fri, Jan 15, 2010 at 4:36 PM, Talin wrote: > Here's a work in progress of the union patch. Note that the test > "union.ll" does not work, so you probably don't want to check this > in as is. However, I'd be interested in any feedback you're willing > to give. Looking good so far, some thoughts: The LangRef.html patch looks great. One thing that I notice is that the term 'aggregate' is not defined anywhere. Please add it to the #t_classifications section and change the insert/extractvalue instructions to refer to that type classification instead of enumerating the options. The ConstantUnion ctor or ConstantUnion::get should assert that the constant has type that matches one of the elements of the union. @@ -928,7 +949,7 @@ /// if the elements of the array are all ConstantInt's. bool ConstantArray::isString() const { // Check the element type for i8... - if (!getType()->getElementType()->isInteger(8)) + if (getType()->getElementType() != Type::getInt8Ty(getContext())) return false; // Check the elements to make sure they are all integers, not constant // expressions. You have a couple of these things which revert a recent patch, please don't :) Funky indentation in ConstantUnion::replaceUsesOfWithOnConstant and implementation missing :) In UnionValType methods, please use "UT" instead of "ST" as an acronym. +bool UnionType::isValidElementType(const Type *ElemTy) { + return ElemTy->getTypeID() != VoidTyID && ElemTy->getTypeID() != LabelTyID && + ElemTy->getTypeID() != MetadataTyID && ! isa(ElemTy); +} Please use "!ElemTy->isVoidTy()" etc. 
--- lib/VMCore/ConstantsContext.h (revision 93451)
+template<>
+struct ConstantKeyData {
+  typedef Constant* ValType;
+  static ValType getValType(ConstantUnion *CS) {

CU not CS.

LLParser.cpp:

In LLParser::ParseUnionType, you can use SmallVector instead of std::vector for ParamsList & ParamsListTy.

@@ -2135,7 +2173,8 @@
         ParseToken(lltok::rparen, "expected ')' in extractvalue constantexpr"))
       return true;
-    if (!isa(Val->getType()) && !isa(Val->getType()))
+    if (!isa(Val->getType()) && !isa(Val->getType()) &&
+        !isa(Val->getType()))
       return Error(ID.Loc, "extractvalue operand must be array or struct");
     if (!ExtractValueInst::getIndexedType(Val->getType(), Indices.begin(),
                                           Indices.end()))
@@ -2156,7 +2195,8 @@
         ParseIndexList(Indices) ||
         ParseToken(lltok::rparen, "expected ')' in insertvalue constantexpr"))
       return true;
-    if (!isa(Val0->getType()) && !isa(Val0->getType()))
+    if (!isa(Val0->getType()) && !isa(Val0->getType()) &&
+        !isa(Val0->getType()))

How about changing this to use Type::isAggregateType() instead of enumerating? This happens a few times in LLParser.cpp

+    if (ID.ConstantVal->getType() != Ty) {
+      // Allow a constant struct with a single member to be converted
+      // to a union, if the union has a member which is the same type
+      // as the struct member.
+      if (const UnionType* utype = dyn_cast<UnionType>(Ty)) {
+        if (const StructType* stype = dyn_cast<StructType>(
+                ID.ConstantVal->getType())) {
+          if (stype->getNumContainedTypes() == 1) {
+            int index = utype->getElementTypeIndex(stype->getContainedType(0));
+            if (index >= 0) {
+              V = ConstantUnion::get(
+                  utype, cast(ID.ConstantVal->getOperand(0)));
+              return false;
+            }
+          }
+        }
+      }
+

Please split this out to a static helper function that uses early exits.
In this code you should be able to do something like:

  if (ID.ConstantVal->getType() != Ty)
    if (Constant *Elt = TryConvertingSingleElementStructToUnion(...))
      return Elt;

+++ lib/Bitcode/Reader/BitcodeReader.cpp (working copy)
@@ -584,6 +584,13 @@
       ResultTy = StructType::get(Context, EltTys, Record[0]);
       break;
     }
+    case bitc::TYPE_CODE_UNION: {  // UNION: [eltty x N]
+      std::vector EltTys;
+      for (unsigned i = 0, e = Record.size(); i != e; ++i)
+        EltTys.push_back(getTypeByID(Record[i], true));
+      ResultTy = UnionType::get(&EltTys[0], EltTys.size());
+      break;
+    }

This can use SmallVector.

Otherwise, the patch is looking great to me!

-Chris

From sahmad at adobe.com Mon Jan 18 17:42:52 2010
From: sahmad at adobe.com (Shad Ahmad)
Date: Mon, 18 Jan 2010 15:42:52 -0800
Subject: [LLVMdev] Any detailed instructions for building LLVM on Win XP?
Message-ID: <5D901C0A29C7A742B99B2D61CA13BE1E08D1B4@nambxv01a.corp.adobe.com>

I need to build LLVM on Win XP VM. Are there any detailed instructions for that?

Thank you
-Shad Ahmad | Release Manager, Adobe AIR | Adobe Systems, Inc | sahmad at adobe.com | 408.536.4101

From ChristophErhardt at gmx.de Mon Jan 18 18:19:50 2010
From: ChristophErhardt at gmx.de (Christoph Erhardt)
Date: Tue, 19 Jan 2010 01:19:50 +0100
Subject: [LLVMdev] Any detailed instructions for building LLVM on Win XP?
In-Reply-To: <5D901C0A29C7A742B99B2D61CA13BE1E08D1B4@nambxv01a.corp.adobe.com>
References: <5D901C0A29C7A742B99B2D61CA13BE1E08D1B4@nambxv01a.corp.adobe.com>
Message-ID: <4B54FAA6.306@gmx.de>

Hi,

I suppose you want to build LLVM with Visual Studio. The corresponding quickstart guide (http://llvm.org/docs/GettingStartedVS.html) should contain all the information you need.

Christoph

From ofv at wanadoo.es Mon Jan 18 18:20:43 2010
From: ofv at wanadoo.es (Óscar Fuentes)
Date: Tue, 19 Jan 2010 01:20:43 +0100
Subject: [LLVMdev] Any detailed instructions for building LLVM on Win XP?
References: <5D901C0A29C7A742B99B2D61CA13BE1E08D1B4@nambxv01a.corp.adobe.com>
Message-ID: <87fx63ymg4.fsf@telefonica.net>

Shad Ahmad writes:

> I need to build LLVM on Win XP VM. Are there any detailed instructions
> for that?

Visual Studio or MinGW?

For MinGW: http://www.llvm.org/docs/GettingStarted.html
For VS: http://www.llvm.org/docs/GettingStartedVS.html
Both: http://www.llvm.org/docs/CMake.html

--
Óscar

From stoklund at 2pi.dk Mon Jan 18 19:10:39 2010
From: stoklund at 2pi.dk (Jakob Stoklund Olesen)
Date: Mon, 18 Jan 2010 17:10:39 -0800
Subject: [LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
In-Reply-To: <79F7C763-5126-4542-BE40-D1703CFA5CB0@apple.com>
References: <1B7B6FBB-0B43-4B08-A1B3-C618CDB88387@gmail.com> <79F7C763-5126-4542-BE40-D1703CFA5CB0@apple.com>
Message-ID:

On Jan 15, 2010, at 10:03 AM, Chris Lattner wrote:
>
> When/if another target wants this, we could add a ISD::RBIT operation,
> it doesn't need to be added at the llvm ir level,

Blackfin can add with backwards carry, essentially doing

  (rbit (add (rbit a), (rbit b)))

This is used for FFTs.

I wasn't hoping to be able to pattern-match something so complicated.

From eli.friedman at gmail.com Mon Jan 18 20:29:51 2010
From: eli.friedman at gmail.com (Eli Friedman)
Date: Mon, 18 Jan 2010 18:29:51 -0800
Subject: [LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz
In-Reply-To:
References: <1B7B6FBB-0B43-4B08-A1B3-C618CDB88387@gmail.com> <79F7C763-5126-4542-BE40-D1703CFA5CB0@apple.com>
Message-ID:

On Mon, Jan 18, 2010 at 5:10 PM, Jakob Stoklund Olesen wrote:
>
> On Jan 15, 2010, at 10:03 AM, Chris Lattner wrote:
>>
>> When/if another target wants this, we could add a ISD::RBIT operation,
>> it doesn't need to be added at the llvm ir level,
>
> Blackfin can add with backwards carry, essentially doing
>
> (rbit (add (rbit a), (rbit b)))
>
> This is used for FFTs.
>
> I wasn't hoping to be able to pattern-match something so complicated.
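[Editor's sketch, not part of the original thread. The two bit tricks under discussion can be modelled in portable C++. The function names rbit32, clz32, ctz32 and add_reverse_carry are hypothetical; they are not LLVM nodes or compiler intrinsics. ctz32 models the rbit-then-clz sequence named in the patch subject, and add_reverse_carry models the quoted (rbit (add (rbit a), (rbit b))) expression, in which the carry runs from the most significant bit toward the least significant one. As an assumption of this sketch, ctz32(0) returns 32, which is what CLZ of zero yields on ARM; __builtin_ctz itself leaves a zero argument undefined.]

```cpp
#include <cassert>
#include <cstdint>

// Reverse the bit order of a 32-bit word (the effect of ARM's RBIT).
inline std::uint32_t rbit32(std::uint32_t x) {
  x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);  // swap adjacent bits
  x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);  // swap bit pairs
  x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);  // swap nibbles
  x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);  // swap bytes
  return (x >> 16) | (x << 16);                             // swap halfwords
}

// Count leading zero bits; returns 32 for a zero input.
inline unsigned clz32(std::uint32_t x) {
  if (x == 0) return 32;
  unsigned n = 0;
  while ((x & 0x80000000u) == 0) {
    x <<= 1;
    ++n;
  }
  return n;
}

// Count trailing zeros as "reverse the bits, then count leading zeros".
inline unsigned ctz32(std::uint32_t x) { return clz32(rbit32(x)); }

// Addition whose carry propagates from the top bit downward.
inline std::uint32_t add_reverse_carry(std::uint32_t a, std::uint32_t b) {
  return rbit32(rbit32(a) + rbit32(b));
}
```

[For example, adding 0x80000000 to itself with reverse carry moves the set bit one position down, to 0x40000000, instead of overflowing out of the word.]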
Feel free to add target intrinsics where appropriate...

-Eli

From junchao.zhang at gmail.com Mon Jan 18 21:58:42 2010
From: junchao.zhang at gmail.com (Junchao Zhang)
Date: Mon, 18 Jan 2010 21:58:42 -0600
Subject: [LLVMdev] Can I port LLVM as a source-to-source compiler?
Message-ID: <9f3253451001181958q94c6d17t9f89d9c54fba5165@mail.gmail.com>

Hello,
I am working on a project for a parallel programming language. I want to base our language on Java or C/C++. But Java is preferred.

Many similar projects adopt a source-to-source methodology, e.g., Berkeley UPC (using Open64), Titanium, and Rice University's Co-array Fortran. They output C code with calls to the runtime. I think there are at least three reasons: 1) using C as the output, it gets more portability. 2) leverage the front ends of existing compilers. 3) leverage optimizations in existing compilers.

I wonder if LLVM is suitable for this kind of work. Can LLVM experienced users give me some hints on this topic?

Thanks in advance.

Junchao Zhang

From mark.i.r.muir at gmail.com Tue Jan 19 06:27:43 2010
From: mark.i.r.muir at gmail.com (Mark Muir)
Date: Tue, 19 Jan 2010 12:27:43 +0000
Subject: [LLVMdev] Frame index arithmetic
In-Reply-To: <2B28F2EF-B836-4F14-A3DC-F7EE424FF1A5@apple.com>
References: <7AD1EE04-9CB7-4289-BF42-0DC0BA8A3E1E@gmail.com> <2B28F2EF-B836-4F14-A3DC-F7EE424FF1A5@apple.com>
Message-ID: <4F09CA97-5B07-43E2-8BEB-76E91E811985@gmail.com>

>> I'm trying something cunning/crazy with the stack - implementing it in a type of memory that can only be addressed via immediates.
>>
>> I've got this mostly working.
However, I came across a problem which I've been unable to work around: lowering the IR (even without any optimisations enabled) often requires the pattern:
>>
>> i32 = FrameIndex
>>
>> It isn't appropriate to do this with the proposed stack memory - it doesn't make sense to move the address into a register, as it isn't possible to move that back to the domain of an immediate. So I conditionally disabled this instruction. But that leads to most programs failing to select the above pattern.
>
> Sounds like your load / store address selection routine isn't working like what you expected.
>

Thanks for the reply. Unfortunately, this doesn't seem to be the problem.

I have the following definition for the frameIndex:

def frameIndex : Operand, ComplexPattern {
  let PrintMethod = "printFrameIndexOperand";
  let MIOperandInfo = (ops GPR);
}

And the following selection code:

bool MyDAGToDAGISel::
SelectFrameIndex(SDValue Op, SDValue N, SDValue& Address) {
  if (FrameIndexSDNode* FIN = dyn_cast<FrameIndexSDNode>(N)) {
    Address = CurDAG->getTargetFrameIndex(FIN->getIndex(), MVT::i32);
    return true;
  }
  return false;
}

In light of your comment, I tried extending this method to only allow cases where Op is ISD::LOAD or ISD::STORE. I found this made no difference to the behaviour. That was surprising, so I added code to print out each instruction seen by that method. And it turns out that all the operations were loads or stores anyway.

So it must be later on that the conversion happens, which turns the operation into some form of indirect addressing.
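[Editor's sketch, not part of the original thread. One rewrite that fits this symptom is LLVM's DAG combiner folding a select of two loaded values into a single load through a selected address. The DAG dumps later in this thread are consistent with that, but treating it as the cause here is an assumption. The C++ below only illustrates the two shapes; the function names are hypothetical. Both forms return the same value, but the second needs each address to exist as a value in its own right, which a stack addressable only through immediates cannot supply.]

```cpp
#include <cassert>

// Shape 1: load both values, then select between the loaded values.
inline int select_of_loads(bool cond, const int *a, const int *b) {
  int va = *a;  // load from the first slot
  int vb = *b;  // load from the second slot
  return cond ? va : vb;
}

// Shape 2: select between the two addresses, then perform one load.
// This is the form that requires an address held as a value.
inline int load_of_select(bool cond, const int *a, const int *b) {
  const int *p = cond ? a : b;
  return *p;
}
```

[The two functions are interchangeable on a conventional target, which is why an optimizer is free to prefer the second, cheaper shape.]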
To further explore the example I gave in my original email, I have an instruction matching the pattern:

[(set GPR:$dst, (select GPR:$sel, immOrGPR:$a, immOrGPR:$b))]

The target-independent IR (as shown in the original message):

> define i32 @foo(i32 %cond, i32 %a, i32 %b) nounwind {
> entry:
>   %retval = alloca i32 ; [#uses=2]
>   %cond.addr = alloca i32 ; [#uses=2]
>   %a.addr = alloca i32 ; [#uses=2]
>   %b.addr = alloca i32 ; [#uses=2]
>   store i32 %cond, i32* %cond.addr
>   store i32 %a, i32* %a.addr
>   store i32 %b, i32* %b.addr
>   %tmp = load i32* %cond.addr ; [#uses=1]
>   %tobool = icmp ne i32 %tmp, 0 ; [#uses=1]
>   %tmp1 = load i32* %a.addr ; [#uses=1]
>   %tmp2 = load i32* %b.addr ; [#uses=1]
>   %cond3 = select i1 %tobool, i32 %tmp1, i32 %tmp2 ; [#uses=1]

To me this seems to be doing what I want - i.e. storing arguments 'a' and 'b' into local stack slots, then selecting between the values stored in those stack slots. This appears to be what is seen during selection - SelectFrameIndex(), as indicated above.

The normal debug output from the back-end shows that the target-independent IR gets lowered to a select between the addresses of the two argument stack slots, rather than their values. That's a nice optimisation in general, but isn't allowed here, hence leading to the:

LLVM ERROR: Cannot yet select: 0x1811798: i32 = FrameIndex <3>

What transforms are performed during selection? I think this is where I should be looking, but I'm a bit lost.

Any help would be greatly appreciated.

- Mark

From baldrick at free.fr Tue Jan 19 06:55:55 2010
From: baldrick at free.fr (Duncan Sands)
Date: Tue, 19 Jan 2010 13:55:55 +0100
Subject: [LLVMdev] Frame index arithmetic
In-Reply-To: <4F09CA97-5B07-43E2-8BEB-76E91E811985@gmail.com>
References: <7AD1EE04-9CB7-4289-BF42-0DC0BA8A3E1E@gmail.com> <2B28F2EF-B836-4F14-A3DC-F7EE424FF1A5@apple.com> <4F09CA97-5B07-43E2-8BEB-76E91E811985@gmail.com>
Message-ID: <4B55ABDB.3040207@free.fr>

Hi Mark,

>> Sounds like your load / store address selection routine isn't working
>> like what you expected.
>>
>
> Thanks for the reply. Unfortunately, this doesn't seem to be the problem.

do you handle truncating stores and extending loads?

Ciao,

Duncan.

From mark.i.r.muir at gmail.com Tue Jan 19 07:14:30 2010
From: mark.i.r.muir at gmail.com (Mark Muir)
Date: Tue, 19 Jan 2010 13:14:30 +0000
Subject: [LLVMdev] Frame index arithmetic
In-Reply-To: <4B55ABDB.3040207@free.fr>
References: <7AD1EE04-9CB7-4289-BF42-0DC0BA8A3E1E@gmail.com> <2B28F2EF-B836-4F14-A3DC-F7EE424FF1A5@apple.com> <4F09CA97-5B07-43E2-8BEB-76E91E811985@gmail.com> <4B55ABDB.3040207@free.fr>
Message-ID:

>>> Sounds like your load / store address selection routine isn't working like what you expected.
>> Thanks for the reply. Unfortunately, this doesn't seem to be the problem.
>
> do you handle truncating stores and extending loads?
>

I have instructions matching patterns for zero- and sign-extending loads (8-bit to 32-bit, 16-bit to 32-bit), and truncating stores (32-bit to 16-bit, 32-bit to 8-bit). And I've also defined patterns to map 'extload*' to 'zextload*'.

A bit more info as to where the problem occurs (in case that helps): the debug output shows that at the 'Initial selection DAG' phase, the loads, stores, and select are all as I want them to be.
However, the very next dump 'Optimized lowered selection DAG' shows the select having been altered to operate on the addresses instead of the values. I've listed these two dumps below. Regards, - Mark === foo Initial selection DAG: SelectionDAG has 35 nodes: 0x1403e98: ch = EntryToken 0x1810e90: i32 = undef 0x1403e98: 0x1810e08: i32 = FrameIndex <-1> 0x1810e90: 0x1810f18: i32,ch = load 0x1403e98, 0x1810e08, 0x1810e90 alignment=4 0x1403e98: 0x1810fa0: i32 = FrameIndex <-2> 0x1810e90: 0x1811028: i32,ch = load 0x1403e98, 0x1810fa0, 0x1810e90 alignment=4 0x1403e98: 0x18110b0: i32 = FrameIndex <-3> 0x1810e90: 0x1811138: i32,ch = load 0x1403e98, 0x18110b0, 0x1810e90 alignment=4 0x18114f0: i32 = FrameIndex <1> 0x1811688: i32 = FrameIndex <2> 0x1811798: i32 = FrameIndex <3> 0x1403e98: 0x1810f18: 0x18114f0: 0x1810e90: 0x1811600: ch = store 0x1403e98, 0x1810f18, 0x18114f0, 0x1810e90 <0x14021fc:0> alignment=4 0x1811028: 0x1811688: 0x1810e90: 0x1811710: ch = store 0x1811600, 0x1811028, 0x1811688, 0x1810e90 <0x140226c:0> alignment=4 0x1811138: 0x1811798: 0x1810e90: 0x1811820: ch = store 0x1811710, 0x1811138, 0x1811798, 0x1810e90 <0x14022ac:0> alignment=4 0x1811820: 0x18114f0: 0x1810e90: 0x18118a8: i32,ch = load 0x1811820, 0x18114f0, 0x1810e90 <0x14021fc:0> alignment=4 0x1811820: 0x1811688: 0x1810e90: 0x1811a40: i32,ch = load 0x1811820, 0x1811688, 0x1810e90 <0x140226c:0> alignment=4 0x1811820: 0x1811798: 0x1810e90: 0x1811ac8: i32,ch = load 0x1811820, 0x1811798, 0x1810e90 <0x14022ac:0> alignment=4 0x1811bd8: i32 = FrameIndex <0> 0x18118a8: 0x1811a40: 0x1811ac8: 0x1811c60: ch = TokenFactor 0x18118a8:1, 0x1811a40:1, 0x1811ac8:1 0x18118a8: 0x1811578: i32 = Constant <0> 0x1811930: ch = setne 0x18119b8: i1 = setcc 0x18118a8, 0x1811578, 0x1811930 0x1811a40: 0x1811ac8: 0x1811b50: i32 = select 0x18119b8, 0x1811a40, 0x1811ac8 0x1811bd8: 0x1810e90: 0x1811ce8: ch = store 0x1811c60, 0x1811b50, 0x1811bd8, 0x1810e90 <0x14020fc:0> alignment=4 0x1403e98: 0x18111c0: i32 = Register #1024 
0x1810f18: 0x1811248: ch = CopyToReg 0x1403e98, 0x18111c0, 0x1810f18 0x1403e98: 0x18112d0: i32 = Register #1025 0x1811028: 0x1811358: ch = CopyToReg 0x1403e98, 0x18112d0, 0x1811028 0x1403e98: 0x18113e0: i32 = Register #1026 0x1811138: 0x1811468: ch = CopyToReg 0x1403e98, 0x18113e0, 0x1811138 0x1811ce8: 0x1822808: ch = TokenFactor 0x1811248, 0x1811358, 0x1811468, 0x1811ce8 0x1822890: i32 = Register r3 0x1811ce8: 0x1811bd8: 0x1810e90: 0x1811d70: i32,ch = load 0x1811ce8, 0x1811bd8, 0x1810e90 <0x14020fc:0> alignment=4 0x1822918: ch,flag = CopyToReg 0x1822808, 0x1822890, 0x1811d70 0x1822918: 0x1822918: 0x18229a0: ch = MYISD::RET_FLAG 0x1822918, 0x1822918:1 Replacing.1 0x1811d70: i32,ch = load 0x1811ce8, 0x1811bd8, 0x1810e90 <0x14020fc:0> alignment=4 With: 0x1811b50: i32 = select 0x18119b8, 0x1811a40, 0x1811ac8 and 1 other values Replacing.1 0x1811b50: i32 = select 0x18119b8, 0x1811a40, 0x1811ac8 With: 0x1822a28: i32,ch = load 0x1811820, 0x1811d70, 0x1810e90 <0x140226c:0> alignment=4 and 0 other values Replacing.1 0x1811a40: i32,ch = load 0x1811820, 0x1811688, 0x1810e90 <0x140226c:0> alignment=4 With: 0x1822a28: i32,ch = load 0x1811820, 0x1811d70, 0x1810e90 <0x140226c:0> alignment=4 and 1 other values Replacing.1 0x1811ac8: i32,ch = load 0x1811820, 0x1811798, 0x1810e90 <0x14022ac:0> alignment=4 With: 0x1822a28: i32,ch = load 0x1811820, 0x1811d70, 0x1810e90 <0x140226c:0> alignment=4 and 1 other values Replacing.1 0x1811c60: ch = TokenFactor 0x18118a8:1, 0x1822a28:1, 0x1822a28:1 With: 0x1811ac8: ch = TokenFactor 0x18118a8:1, 0x1822a28:1 and 0 other values Optimized lowered selection DAG: SelectionDAG has 33 nodes: 0x1403e98: ch = EntryToken 0x1810e90: i32 = undef 0x1403e98: 0x1810e08: i32 = FrameIndex <-1> 0x1810e90: 0x1810f18: i32,ch = load 0x1403e98, 0x1810e08, 0x1810e90 alignment=4 0x1403e98: 0x1810fa0: i32 = FrameIndex <-2> 0x1810e90: 0x1811028: i32,ch = load 0x1403e98, 0x1810fa0, 0x1810e90 alignment=4 0x1403e98: 0x18110b0: i32 = FrameIndex <-3> 0x1810e90: 0x1811138: 
i32,ch = load 0x1403e98, 0x18110b0, 0x1810e90 alignment=4 0x18114f0: i32 = FrameIndex <1> 0x1811688: i32 = FrameIndex <2> 0x1811798: i32 = FrameIndex <3> 0x1403e98: 0x1810f18: 0x18114f0: 0x1810e90: 0x1811600: ch = store 0x1403e98, 0x1810f18, 0x18114f0, 0x1810e90 <0x14021fc:0> alignment=4 0x1811028: 0x1811688: 0x1810e90: 0x1811710: ch = store 0x1811600, 0x1811028, 0x1811688, 0x1810e90 <0x140226c:0> alignment=4 0x1811138: 0x1811798: 0x1810e90: 0x1811820: ch = store 0x1811710, 0x1811138, 0x1811798, 0x1810e90 <0x14022ac:0> alignment=4 0x1811820: 0x18114f0: 0x1810e90: 0x18118a8: i32,ch = load 0x1811820, 0x18114f0, 0x1810e90 <0x14021fc:0> alignment=4 0x1403e98: 0x18111c0: i32 = Register #1024 0x1810f18: 0x1811248: ch = CopyToReg 0x1403e98, 0x18111c0, 0x1810f18 0x1403e98: 0x18112d0: i32 = Register #1025 0x1811028: 0x1811358: ch = CopyToReg 0x1403e98, 0x18112d0, 0x1811028 0x1403e98: 0x18113e0: i32 = Register #1026 0x1811138: 0x1811468: ch = CopyToReg 0x1403e98, 0x18113e0, 0x1811138 0x18118a8: 0x1822a28: 0x1811ac8: ch = TokenFactor 0x18118a8:1, 0x1822a28:1 0x1822a28: 0x1811bd8: i32 = FrameIndex <0> 0x1810e90: 0x1811ce8: ch = store 0x1811ac8, 0x1822a28, 0x1811bd8, 0x1810e90 <0x14020fc:0> alignment=4 0x1822808: ch = TokenFactor 0x1811248, 0x1811358, 0x1811468, 0x1811ce8 0x1822890: i32 = Register r3 0x1822a28: 0x1822918: ch,flag = CopyToReg 0x1822808, 0x1822890, 0x1822a28 0x1811820: 0x18118a8: 0x1811578: i32 = Constant <0> 0x1811930: ch = setne 0x18119b8: i1 = setcc 0x18118a8, 0x1811578, 0x1811930 0x1811688: 0x1811798: 0x1811d70: i32 = select 0x18119b8, 0x1811688, 0x1811798 0x1810e90: 0x1822a28: i32,ch = load 0x1811820, 0x1811d70, 0x1810e90 <0x140226c:0> alignment=4 0x1822918: 0x1822918: 0x18229a0: ch = MYISD::RET_FLAG 0x1822918, 0x1822918:1 Also, just after that, the dump includes: Legally typed node: 0x1811798: i32 = FrameIndex <3> Legally typed node: 0x1811688: i32 = FrameIndex <2> I'm not sure how to indicate that these aren't legally typed. 
Or at least, aren't legally typed when storing that i32 into a register.

From kennethuil at gmail.com Tue Jan 19 08:06:40 2010
From: kennethuil at gmail.com (Kenneth Uildriks)
Date: Tue, 19 Jan 2010 08:06:40 -0600
Subject: [LLVMdev] Can I port LLVM as a source-to-source compiler?
In-Reply-To: <9f3253451001181958q94c6d17t9f89d9c54fba5165@mail.gmail.com>
References: <9f3253451001181958q94c6d17t9f89d9c54fba5165@mail.gmail.com>
Message-ID: <400d33ea1001190606y55c62828nd0d47c327672af6e@mail.gmail.com>

On Mon, Jan 18, 2010 at 9:58 PM, Junchao Zhang wrote:
> Hello,
> I am working on a project for a parallel programming language. I want to base
> our language on Java or C/C++. But Java is preferred.
>
> Many similar projects adopt a source-to-source methodology, e.g., Berkeley
> UPC (using Open64), Titanium, and Rice University's Co-array Fortran. They
> output C code with calls to the runtime. I think there are at least three
> reasons: 1) using C as the output, it gets more portability. 2) leverage the
> front ends of existing compilers. 3) leverage optimizations in existing
> compilers.
>
> I wonder if LLVM is suitable for this kind of work. Can LLVM experienced
> users give me some hints on this topic?
>
> Thanks in advance.
>
> Junchao Zhang

LLVM can be made to output (horribly gnarly and non-portable) C code. However, I haven't tried it and I'm not sure what state that functionality is in.

From devlists at shadowlab.org Tue Jan 19 08:49:25 2010
From: devlists at shadowlab.org (Jean-Daniel Dupas)
Date: Tue, 19 Jan 2010 15:49:25 +0100
Subject: [LLVMdev] Can I port LLVM as a source-to-source compiler?
In-Reply-To: <400d33ea1001190606y55c62828nd0d47c327672af6e@mail.gmail.com>
References: <9f3253451001181958q94c6d17t9f89d9c54fba5165@mail.gmail.com> <400d33ea1001190606y55c62828nd0d47c327672af6e@mail.gmail.com>
Message-ID:

On Jan 19, 2010, at 15:06, Kenneth Uildriks wrote:

> On Mon, Jan 18, 2010 at 9:58 PM, Junchao Zhang wrote:
>> Hello,
>> I am working on a project for a parallel programming language. I want to base
>> our language on Java or C/C++. But Java is preferred.
>>
>> Many similar projects adopt a source-to-source methodology, e.g., Berkeley
>> UPC (using Open64), Titanium, and Rice University's Co-array Fortran. They
>> output C code with calls to the runtime. I think there are at least three
>> reasons: 1) using C as the output, it gets more portability. 2) leverage the
>> front ends of existing compilers. 3) leverage optimizations in existing
>> compilers.
>>
>> I wonder if LLVM is suitable for this kind of work. Can LLVM experienced
>> users give me some hints on this topic?
>>
>> Thanks in advance.
>>
>> Junchao Zhang
>
> LLVM can be made to output (horribly gnarly and non-portable) C code.
> However, I haven't tried it and I'm not sure what state that
> functionality is in.
>

Not very well supported I think:

http://llvm.org/docs/ReleaseNotes.html

LLVM 2.6 Release Notes

"The C Backend (-march=c) is no longer considered part of the LLVM release criteria. We still want it to work, but no one is maintaining it and it lacks support for arbitrary precision integers and other important IR features."

-- Jean-Daniel

From gac43 at cam.ac.uk Tue Jan 19 08:57:35 2010
From: gac43 at cam.ac.uk (Greg Chadwick)
Date: Tue, 19 Jan 2010 14:57:35 +0000
Subject: [LLVMdev] Can I port LLVM as a source-to-source compiler?
In-Reply-To: <9f3253451001181958q94c6d17t9f89d9c54fba5165@mail.gmail.com>
References: <9f3253451001181958q94c6d17t9f89d9c54fba5165@mail.gmail.com>
Message-ID: <4B55C85F.5070105@cam.ac.uk>

Junchao Zhang wrote:
> Hello,
> I am working on a project for a parallel programming language. I want to
> base our language on Java or C/C++. But Java is preferred.
>
> Many similar projects adopt a source-to-source methodology, e.g.,
> Berkeley UPC (using Open64), Titanium, and Rice University's Co-array
> Fortran. They output C code with calls to the runtime. I think there
> are at least three reasons: 1) using C as the output, it gets more
> portability. 2) leverage the front ends of existing compilers. 3)
> leverage optimizations in existing compilers.
>
> I wonder if LLVM is suitable for this kind of work. Can LLVM experienced
> users give me some hints on this topic?
>
> Thanks in advance.
>
> Junchao Zhang

As others have mentioned there is a C backend, though in this case I see little point in using it. If you produce an LLVM language front-end (i.e. something which takes your language and creates LLVM IR) then you get portability and a whole bunch of optimizations. If you think it's significantly easier to compile your language down to C rather than LLVM then you could pass that into one of the C language front ends (LLVM-GCC or Clang).

Cheers,
Greg Chadwick

From gac43 at cam.ac.uk Tue Jan 19 08:57:42 2010
From: gac43 at cam.ac.uk (Greg Chadwick)
Date: Tue, 19 Jan 2010 14:57:42 +0000
Subject: [LLVMdev] ComplexPattern
Message-ID: <4B55C866.5060809@cam.ac.uk>

Hi,

I was wondering if someone could explain precisely what the ComplexPattern tablegen class does? Here's the first line of the definition (from TargetSelectionDAG.td) for reference:

class ComplexPattern<ValueType ty, int numops, string fn, list<SDNode> roots = [], list<SDNodeProperty> props = [], list<CPAttribute> attrs = []>

As far as I can tell it gives the name of a selection function (fn) that will be called to match that particular ComplexPattern.
If that function returns true, the pattern has matched. The match function can also fill in some operands that can be used later on (their number is specified by numops); ty presumably specifies the type of node that this match can be attempted on. Is my understanding of this correct?

The thing I'm still unsure about is roots: what exactly does this do? The comment above the definition specifies that 'RootNodes are the list of possible root nodes of the sub-dags to match' (RootNodes is assigned from roots so they're the same) but I can't make any sense of this.

Cheers,
Greg Chadwick

From 49640f8a at gmail.com Tue Jan 19 09:07:38 2010
From: 49640f8a at gmail.com (Martins Mozeiko)
Date: Tue, 19 Jan 2010 17:07:38 +0200
Subject: [LLVMdev] Fwd: JIT on ARM
References:
Message-ID: <0A66C5C8-AE00-4B3B-A049-1BB47C03BC56@gmail.com>

I found out that even calling functions that are defined inside LLVM bitcode doesn't work. So the following C code, when compiled without any optimizations, also crashes with the JIT when the main function is called (with optimizations the function call is optimized away, of course).

***
static int fun(int x)
{
    return x + 1;
}
int main()
{
    return fun(30);
}
***

Are there any special conditions on ARM I need to check for LLVM to call functions correctly?

--
Martins Mozeiko

Begin forwarded message:

> From: Martins Mozeiko <49640f8a at gmail.com>
> Date: September 25, 2009 12:04:04 GMT+03:00
> To: llvmdev at cs.uiuc.edu
> Subject: JIT on ARM
>
> Hello.
>
> My goal is to use LLVM with the JIT compiler for ARM on an Android device.
>
> Currently I have successfully built and executed LLVM bitcode with the interpreter on Android. Speed is not so great, which is why I want to use the JIT.
> I tried building bitcode on Windows with the llvm-gcc that is provided on the LLVM home page. The resulting bitcode runs great in the interpreter, but it doesn't use the JIT. From what I understand of LLVM, that is because of the target triple or datalayout.
On PC in my case the target triple is "i386-mingw32", but the LLVM JIT for ARM expects something that starts with "armv" or "thumb". Also the datalayout differs a lot. If I understand correctly, that is because bitcode actually contains some platform-specific optimizations (like integer endianness), so I can't simply change the target triple string and expect everything to run.
>
> What I am now doing is trying to build gcc from the gcc-4.2-llvm-2.5 sources with the following configure line in a cygwin shell:
>
> ../llvm-gcc4.2-2.4.source/configure --enable-languages=c,c++ --enable-checking --enable-llvm=$PWD/../llvm-objects --disable-bootstrap --disable-multilib --disable-nls --target=arm-eabi
>
> My understanding is that this will produce an llvm-gcc compiler that will output LLVM bitcode with a target triple that starts with "arm", so LLVM will use the JIT on it.
>
> Currently I am getting compile problems when running make (see below).
> Has anyone done something similar who can explain what I am doing wrong, or how it should be done in some other way to get the JIT compiler running on ARM?
> > > make[2]: Entering directory `/cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc' > gcc -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wmissing-format-attribute -fno-common -DHAVE_CONFIG_H -DGENERATOR_FILE -o build/gengtype.exe \ > build/gengtype.o build/gengtype-lex.o build/gengtype-yacc.o build/errors.o ../build-i686-pc-cygwin/libiberty/libiberty.a > build/gengtype.o: In function `adjust_field_type': > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype.c:763: undefined reference to `_lexer_line' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype.c:771: undefined reference to `_lexer_line' > build/gengtype.o: In function `adjust_field_tree_exp': > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype.c:713: undefined reference to `_lexer_line' > build/gengtype.o: In function `adjust_field_rtx_def': > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype.c:488: undefined reference to `_lexer_line' > build/gengtype.o: In function `adjust_field_type': > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype.c:785: undefined reference to `_lexer_line' > build/gengtype.o:/cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype.c:725: more undefined references to `_lexer_line' follow > build/gengtype.o: In function `main': > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype.c:3070: undefined reference to `_parse_file' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype.c:3070: undefined reference to `_parse_file' > build/gengtype-yacc.o: In function `yyparse': > 
/cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:73: undefined reference to `_lexer_line' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:75: undefined reference to `_lexer_line' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:76: undefined reference to `_lexer_toplevel_done' > build/gengtype-yacc.o: In function `yyparse': > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/gengtype-yacc.c:1873: undefined reference to `_yyerror' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/gengtype-yacc.c:1379: undefined reference to `_yylex' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/gengtype-yacc.c:1877: undefined reference to `_yyerror' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/gengtype-yacc.c:1995: undefined reference to `_yyerror' > build/gengtype-yacc.o: In function `yyparse': > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:122: undefined reference to `_lexer_line' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:110: undefined reference to `_lexer_toplevel_done' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:102: undefined reference to `_lexer_line' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:97: undefined reference to `_lexer_line' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:92: undefined reference to `_lexer_line' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:239: undefined reference to `_lexer_line' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:235: undefined reference to `_lexer_line' > 
build/gengtype-yacc.o:/cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:231: more undefined references to `_lexer_line' follow > build/gengtype-yacc.o: In function `yyparse': > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:84: undefined reference to `_lexer_toplevel_done' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:203: undefined reference to `_lexer_line' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:203: undefined reference to `_lexer_line' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:176: undefined reference to `_lexer_line' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:176: undefined reference to `_lexer_line' > /cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:165: undefined reference to `_lexer_line' > build/gengtype-yacc.o:/cygdrive/r/android/llvm/llvm-gcc4.2-objects/gcc/../../llvm-gcc4.2-2.4.source/gcc/gengtype-yacc.y:165: more undefined references to `_lexer_line' follow > collect2: ld returned 1 exit status > make[2]: *** [build/gengtype.exe] Error 1 > > -- > Martins Mozeiko > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100119/0db6dab6/attachment.html From daniel at zuster.org Tue Jan 19 10:39:33 2010 From: daniel at zuster.org (Daniel Dunbar) Date: Tue, 19 Jan 2010 08:39:33 -0800 Subject: [LLVMdev] Finding the host datalayout In-Reply-To: <62524.26201.qm@web62005.mail.re1.yahoo.com> References: <62524.26201.qm@web62005.mail.re1.yahoo.com> Message-ID: <6a8523d61001190839x728ba80bo8401a7901326c38f@mail.gmail.com> If you create a TargetMachine for the target, you can ask getTargetData()->getStringRepresentation(). See llvm-gcc-4.2/gcc/llvm-backend.cpp:~499. - Daniel On Mon, Jan 18, 2010 at 10:04 AM, Samuel Crow wrote: > Hello all, > > As we work the last few bugs out of our project for the last release, we need to find a way to set the default datalayout of the LLVM Assembly file we are generating to be that of the host machine. ?I've seen options for target triples in the Doxygen but not the datalayout. ?BTW, we're using version 2.6 of LLVM. > > --Sam > > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu ? ? ? ? http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From jyasskin at google.com Tue Jan 19 10:43:03 2010 From: jyasskin at google.com (Jeffrey Yasskin) Date: Tue, 19 Jan 2010 08:43:03 -0800 Subject: [LLVMdev] JIT on ARM In-Reply-To: <3C7EFA3E-DEE9-435E-9570-AA925A56815E@gmail.com> References: <3C7EFA3E-DEE9-435E-9570-AA925A56815E@gmail.com> Message-ID: This sounds like http://llvm.org/bugs/show_bug.cgi?id=5201. The summary of that bug talks about x86, but it applies to arm too, and with a much smaller offset limit. I didn't remember it happening with external functions, but there could be cases where it does. To check, can you run your program in a debugger and look at the address of my_add1 and the address it actually calls (and crashes on)? 
If it's this bug, the low bits will be right, and the high bits will be the same as the caller. On Monday, January 18, 2010, Martins Mozeiko <49640f8a at gmail.com> wrote: > Hi. > I am trying to run LLVM with JIT on ARM processor (Android phone). > Currently I have problems using external functions. Any call to external function crashes and gives me signal 11 (SIGSEGV) at some random address. > > I'm trying to run following C code: > *** > extern void add1(int* x); > int main() > { > ? ?int a = 10; > ? ?int b = 20; > ? ?add1(&b); > ? ?int c = a + b; > ? ?return c; > } > *** > It gives me following LL code: > *** > define i32 @main() nounwind { > entry: > ?%b = alloca i32, align 4 ? ? ? ? ? ? ? ? ? ? ? ?; [#uses=3] > ?store i32 20, i32* %b, align 4 > ?call void @add1(i32* %b) nounwind > ?%0 = load i32* %b, align 4 ? ? ? ? ? ? ? ? ? ? ?; [#uses=1] > ?%1 = add nsw i32 %0, 10 ? ? ? ? ? ? ? ? ? ? ? ? ; [#uses=1] > ?ret i32 %1 > } > > declare void @add1(i32*) > *** > When using llvm::DebugFlag=true JIT gives me following debug messages: > > ********** Function: main > > Ifcvt: function (0) 'main' > block 0 offset 0 size 40 > block 0 offset 0 size 40 > JITTing function 'main' > JIT: Starting CodeGen of Function main > JIT: Emitting BB0 at [0x4512e010] > JIT: 0x4512e010: ? ? ? ?STM %SP, 12, 14, %reg0, %R11, %LR > ?0xe92d4800 > JIT: 0x4512e014: ? ? ? ?%SP = SUBri %SP, 8, 14, %reg0, %reg0 > ?0xe24dd008 > JIT: 0x4512e018: ? ? ? ?%R0 = MOVi 20, 14, %reg0, %reg0 > ?0xe3a00014 > JIT: 0x4512e01c: ? ? ? ?STR %R0, %SP, %reg0, 4, 14, %reg0, Mem:ST(4,4) [b + 0] > ?0xe58d0004 > JIT: 0x4512e020: ? ? ? ?%R0 = ADDri %SP, 4, 14, %reg0, %reg0 > ?0xe28d0004 > JIT: 0x4512e024: ? ? ? ?BL , %R0, %R0, %R1, %R2, %R3, %R12, %LR, %D0, %D1, %D2, %D3, %D4, %D5, %D6, %D7, %D16, %D17, %D18, %D19, %D20, %D21, %D22, %D23, %D24, %D25, %D26, %D27, %D28, %D29, %D30, %D31, %CPSR > ?0xeb000000 > JIT: 0x4512e028: ? ? ? ?%R0 = LDR %SP, %reg0, 4, 14, %reg0, Mem:LD(4,4) [b + 0] > ?0xe59d0004 > JIT: 0x4512e02c: ? 
? ? ?%R0 = ADDri %R0, 10, 14, %reg0, %reg0 > ?0xe280000a > JIT: 0x4512e030: ? ? ? ?%SP = ADDri %SP, 8, 14, %reg0, %reg0 > ?0xe28dd008 > JIT: 0x4512e034: ? ? ? ?LDM_RET %SP, 9, 14, %reg0, %R11, %PC > ?0xe8bd8800 > JIT: Map 'add1' to [0x808e80b4] > JIT: Stub emitted at [0x42540008] for function 'add1' > JIT: Finished CodeGen of [0x4512e010] Function: main: 40 bytes of text, 1 relocations > JIT: Binary code: > JIT: 00000000: e92d4800 e24dd008 e3a00014 e58d0004 > JIT: 00000010: e28d0004 eb5047f7 e59d0004 e280000a > JIT: 00000020: e28dd008 e8bd8800 > **** > > Just to be sure I inform LLVM about add1 function with following code: > extern "C" void my_add1(int* x) > { > ? ?LOG("in add1, x=%i\n", *x); > ? ?*x = *x + 1; > } > sys::DynamicLibrary::AddSymbol("add1", (void*)&my_add1); > > Is there something wrong with generated code? Same code, using same test program, runs fine using JIT on x86 under Windows. > > I have successfuly run program that doesn't use external functions. For example following C code > **** > int main() > { > ? ?int a = 10; > ? ?int b = 20; > ? ?int c = a + b; > ? ?return c; > } > **** > compiles to following LL code: > **** > define i32 @main() nounwind readnone { > entry: > ?ret i32 30 > } > **** > which runs fine on ARM. When using llvm::DebugFlag=true JIT prints following debug messages: > ********** Function: main > > Ifcvt: function (0) 'main' > block 0 offset 0 size 8 > block 0 offset 0 size 8 > JITTing function 'main' > JIT: Starting CodeGen of Function main > JIT: Emitting BB0 at [0x4512e010] > JIT: 0x4512e010: ? ? ? ?%R0 = MOVi 30, 14, %reg0, %reg0 > ?0xe3a0001e > JIT: 0x4512e014: ? ? ? ?BX_RET 14, %reg0, %R0 > ?0xe12fff1e > JIT: Finished CodeGen of [0x4512e010] Function: main: 8 bytes of text, 0 relocations > JIT: Binary code: > JIT: 00000000: e3a0001e e12fff1e > *** > > I appreciate any suggestions what can I do in my situation. 
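To make the suspected offset truncation concrete, here is a small sketch that decodes the BL word eb5047f7 from the binary dump above. It is not LLVM code: the helper names branchTarget and fitsInBranch are made up for illustration, and it assumes the standard ARM (A32) branch encoding, i.e. a signed 24-bit word offset relative to the instruction address plus 8.

```cpp
#include <cstdint>

// Decode the destination of an A32 B/BL instruction `insn` located at `addr`.
// Hypothetical helper for illustration only.
std::uint32_t branchTarget(std::uint32_t addr, std::uint32_t insn) {
  std::uint32_t imm24 = insn & 0x00FFFFFFu;
  // Sign-extend the 24-bit field to 32 bits.
  if (imm24 & 0x00800000u)
    imm24 |= 0xFF000000u;
  // The offset is counted in words, relative to addr + 8 (the ARM pipeline PC).
  // Unsigned arithmetic wraps modulo 2^32, which is what we want here.
  return addr + 8u + (imm24 << 2);
}

// True if a branch at `from` can reach `to` with a signed 24-bit word offset,
// i.e. the byte displacement lies in [-2^25, 2^25).
bool fitsInBranch(std::uint32_t from, std::uint32_t to) {
  std::int64_t disp = static_cast<std::int64_t>(to) -
                      (static_cast<std::int64_t>(from) + 8);
  return disp >= -(std::int64_t(1) << 25) && disp < (std::int64_t(1) << 25);
}
```

With the addresses from the log (BL at 0x4512e024, stub at 0x42540008), the required displacement is about -44 MB, outside the +/-32 MB reach of BL, and the emitted word decodes to 0x46540008: the low 26 bits match the stub address, the bits above do not.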
> > -- > Martins Mozeiko > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu ? ? ? ? http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From rnk at mit.edu Tue Jan 19 12:04:28 2010 From: rnk at mit.edu (Reid Kleckner) Date: Tue, 19 Jan 2010 13:04:28 -0500 Subject: [LLVMdev] Can I port LLVM as a source-to-source compiler? In-Reply-To: <9f3253451001181958q94c6d17t9f89d9c54fba5165@mail.gmail.com> References: <9f3253451001181958q94c6d17t9f89d9c54fba5165@mail.gmail.com> Message-ID: <9a9942201001191004l63eb5935x3f7369c3c95a417c@mail.gmail.com> On Mon, Jan 18, 2010 at 10:58 PM, Junchao Zhang wrote: > Hello, > I am working in a project on a parallel programming language. I want to base > our language on Java or C/C++. But Java is preferred. > > Many similar projects adopts a source-to-source methodology, e.g., Berkeley > UPC(using Open64), Titanium, and Rice University's Co-array Fortran. They > output C code with calls to the runtime. ?I think there are at least three > reasons: 1) using C as the output, it gets more portability. 2) leverage the > front ends of existing compilers. 3) leverage optimizations in existing > compilers. > > I wonder if LLVM is suitable for this kind of work. Can LLVM experienced > users give me some hints on this topic? No, I don't think that LLVM would be useful for making a source-to-source compiler. LLVM itself is mostly concerned with optimizations and code generation. It doesn't have any code for representing any high-level ASTs or doing type checking on them. You could look at clang, though, if for some reason you need an AST for C. 
Reid

From gvenn.cfe.dev at gmail.com Tue Jan 19 13:22:43 2010
From: gvenn.cfe.dev at gmail.com (Garrison Venn)
Date: Tue, 19 Jan 2010 14:22:43 -0500
Subject: [LLVMdev] compiler-rt project PATCH
Message-ID:

In lib/gcc_personality_v0.c, I believe the type of the exception_cleanup member of the struct _Unwind_Exception is incorrect. It has a return type of _Unwind_Reason_Code, whereas the return type should be void, as specified in http://refspecs.freestandards.org/abi-eh-1.21.html, and as can be seen in unwind.h on Linux and OS X 10.6.x (the latter's unwind.h is under the Developer directory tree).

The attached patch changes this return type to void.

Garrison
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gcc_personality_v0.patch
Type: application/octet-stream
Size: 702 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100119/237b26e5/attachment.obj

From devlists at shadowlab.org Tue Jan 19 16:18:12 2010
From: devlists at shadowlab.org (Jean-Daniel Dupas)
Date: Tue, 19 Jan 2010 23:18:12 +0100
Subject: [LLVMdev] Question about llvm::dbg()
Message-ID:

Maybe I am missing something, but I think there is a problem with this code in Support/Debug.cpp when compiling a Release build (NDEBUG defined):

namespace llvm {
  /// dbgs - Return dbgs().
  raw_ostream &dbgs() {
    return dbgs();
  }
}

I thought it would crash with a stack overflow due to infinite recursion, but llvm-gcc is smart enough to compile it as a jmp and create an infinite loop instead:

otool -tv Debug.o
llvm::dbgs():
0000000000000000 pushq %rbp
0000000000000001 movq %rsp,%rbp
0000000000000004 jmp 0x00000004

Anyway, I think neither a crash nor an infinite loop is the expected behavior, is it?
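For comparison, a minimal sketch of a non-recursive shape for the release-build definition. It uses std::ostream as a stand-in for raw_ostream; the namespace sketch and the choice to forward to the error stream are assumptions made for illustration, not necessarily the fix that ends up in the tree.

```cpp
#include <iostream>
#include <ostream>

namespace sketch {

// Stand-in for llvm::errs(): the process-wide error stream.
std::ostream &errs() { return std::cerr; }

// Release-build definition: forwards to a different function instead of
// calling itself, so there is neither recursion nor a self-jump.
std::ostream &dbgs() { return errs(); }

} // namespace sketch
```

The point is only that the function body must name something other than dbgs() itself; which stream it forwards to is a separate design decision.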
-- Jean-Daniel From viridia at gmail.com Tue Jan 19 20:20:22 2010 From: viridia at gmail.com (Talin) Date: Tue, 19 Jan 2010 18:20:22 -0800 Subject: [LLVMdev] Accessing Dwarf data at runtime Message-ID: Random question - suppose I wanted to access the DWARF section info from within my LLVM-generated executable. How would I do that? I know that the DWARF stuff is stored in ELF sections, how does one access an ELF section from the currently running binary? -- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100119/75901b74/attachment.html From criswell at uiuc.edu Tue Jan 19 22:10:28 2010 From: criswell at uiuc.edu (John Criswell) Date: Tue, 19 Jan 2010 22:10:28 -0600 Subject: [LLVMdev] LLVM 2.6 and Aggregate Return Values: 64 bit Message-ID: <4B568234.60106@uiuc.edu> Dear All. How well does LLVM 2.6 support aggregate return values for 64 bit targets? I'm currently working on 64 bit Mac OS X and 64 bit Linux. Are there any known problems or limitations? -- John T. From kledzik at apple.com Wed Jan 20 00:13:53 2010 From: kledzik at apple.com (Nick Kledzik) Date: Tue, 19 Jan 2010 22:13:53 -0800 Subject: [LLVMdev] compiler-rt project PATCH In-Reply-To: References: Message-ID: <233CDB2B-7A09-41D6-BC2C-54747AA43132@apple.com> Committed revision 93983. Thanks! -Nick On Jan 19, 2010, at 11:22 AM, Garrison Venn wrote: > In lib/gcc_personality_v0.c, I believe the type of the > exception_cleanup member of the struct > _Unwind_Exception is incorrect. It has a return type of > _Unwind_Reason_Code versus a return > type of void as specified in http://refspecs.freestandards.org/abi-eh-1.21.html > , and as can be seen > in unwind.h on Linux, and OS X 10.6.x (the latter's unwind.h is > under the Developer directory tree). > > The attached patch changes this return type to void. 
> > Garrison > > > < > gcc_personality_v0 > .patch>_______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From minwook.ahn at gmail.com Wed Jan 20 01:34:34 2010 From: minwook.ahn at gmail.com (minwook Ahn) Date: Wed, 20 Jan 2010 16:34:34 +0900 Subject: [LLVMdev] [LLVMDev] Is it possible to implement target specific optimizations which can be applied after instruction selection or later? Message-ID: <3d49ce701001192334r5bfeabadg73b6e93aebd0bef4@mail.gmail.com> Dear developers. My question is the same as the title. Is there any way to implement target specific optimizations after instruction selection or later? I cannot find any related document. Please let me know. Thanks in advance. Minwook Ahn -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100120/12439e9b/attachment.html From minwook.ahn at gmail.com Wed Jan 20 01:55:30 2010 From: minwook.ahn at gmail.com (minwook Ahn) Date: Wed, 20 Jan 2010 16:55:30 +0900 Subject: [LLVMdev] [LLVMDev] Is there any way to eliminate zero-extension instruction? Message-ID: <3d49ce701001192355p43cc28bak84cc66f322e7697b@mail.gmail.com> Dear developers. We try to make our own backend of llvm for our target machine. Assume that we have the following code in our source code. int i = ( a < b ); The code is translated into r0 <- gt r1 r2 r3 <- and r0 0x1 We think that r3 is not necessary. Is there any way to eliminate it by just modifying our backend? Thank you in advance. Minwook Ahn -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100120/13ad88bb/attachment.html

From richard at xmos.com Wed Jan 20 02:52:31 2010
From: richard at xmos.com (Richard Osborne)
Date: Wed, 20 Jan 2010 08:52:31 +0000
Subject: [LLVMdev] [LLVMDev] Is there any way to eliminate zero-extension instruction?
In-Reply-To: <3d49ce701001192355p43cc28bak84cc66f322e7697b@mail.gmail.com>
References: <3d49ce701001192355p43cc28bak84cc66f322e7697b@mail.gmail.com>
Message-ID:

On 20 Jan 2010, at 07:55, minwook Ahn wrote:
> Dear developers.
>
> We try to make our own backend of llvm for our target machine.
>
> Assume that we have the following code in our source code.
>
> int i = ( a < b );
>
> The code is translated into
>
> r0 <- gt r1 r2
> r3 <- and r0 0x1
>
> We think that r3 is not necessary. Is there any way to eliminate it by just modifying
>
> our backend?
>
> Thank you in advance.
>
> Minwook Ahn

Have you told LLVM that the result of setcc operations is 0 or 1? Add the following to the constructor of your ISelLowering class:

  setBooleanContents(ZeroOrOneBooleanContent);

--
Richard Osborne | XMOS
http://www.xmos.com

From omerigener at gmail.com Wed Jan 20 03:27:21 2010
From: omerigener at gmail.com (Gener Omer)
Date: Wed, 20 Jan 2010 11:27:21 +0200
Subject: [LLVMdev] Profile-Guided Optimization status
Message-ID:

Hi all,

I would like to know the status of profile-guided optimization. What enhancements could be made to the current implementation? Are the ideas for profile-guided transformations from [0] still available?

Many thanks.
Gener

[0] http://llvm.org/OpenProjects.html#profileguided
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100120/ad19ccc0/attachment.html From baldrick at free.fr Wed Jan 20 04:07:13 2010 From: baldrick at free.fr (Duncan Sands) Date: Wed, 20 Jan 2010 11:07:13 +0100 Subject: [LLVMdev] LLVM 2.6 and Aggregate Return Values: 64 bit In-Reply-To: <4B568234.60106@uiuc.edu> References: <4B568234.60106@uiuc.edu> Message-ID: <4B56D5D1.4050705@free.fr> Hi John, > How well does LLVM 2.6 support aggregate return values for 64 bit > targets? I'm currently working on 64 bit Mac OS X and 64 bit Linux. > Are there any known problems or limitations? on x86-64 it depends on what you are returning, but for example you should be able to return a 128 bit integer fine, but anything more will cause the code generator blow up. This limitation has been removed in the development version. Ciao, Duncan. From jon at ffconsultancy.com Wed Jan 20 07:20:59 2010 From: jon at ffconsultancy.com (Jon Harrop) Date: Wed, 20 Jan 2010 13:20:59 +0000 Subject: [LLVMdev] LLVM 2.6 and Aggregate Return Values: ARM In-Reply-To: <4B56D5D1.4050705@free.fr> References: <4B568234.60106@uiuc.edu> <4B56D5D1.4050705@free.fr> Message-ID: <201001201321.00010.jon@ffconsultancy.com> On Wednesday 20 January 2010 10:07:13 Duncan Sands wrote: > Hi John, > > > How well does LLVM 2.6 support aggregate return values for 64 bit > > targets? I'm currently working on 64 bit Mac OS X and 64 bit Linux. > > Are there any known problems or limitations? > > on x86-64 it depends on what you are returning, but for example you > should be able to return a 128 bit integer fine, but anything more > will cause the code generator blow up. This limitation has been > removed in the development version. I'm just curious but does this also work limitlessly on ARM? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. 
http://www.ffconsultancy.com/?e From kennethuil at gmail.com Wed Jan 20 08:13:11 2010 From: kennethuil at gmail.com (Kenneth Uildriks) Date: Wed, 20 Jan 2010 08:13:11 -0600 Subject: [LLVMdev] LLVM 2.6 and Aggregate Return Values: ARM In-Reply-To: <201001201321.00010.jon@ffconsultancy.com> References: <4B568234.60106@uiuc.edu> <4B56D5D1.4050705@free.fr> <201001201321.00010.jon@ffconsultancy.com> Message-ID: <400d33ea1001200613t78cf8867tb3967ede73f05a17@mail.gmail.com> On Wed, Jan 20, 2010 at 7:20 AM, Jon Harrop wrote: > On Wednesday 20 January 2010 10:07:13 Duncan Sands wrote: >> Hi John, >> >> > How well does LLVM 2.6 support aggregate return values for 64 bit >> > targets? ?I'm currently working on 64 bit Mac OS X and 64 bit Linux. >> > Are there any known problems or limitations? >> >> on x86-64 it depends on what you are returning, but for example you >> should be able to return a 128 bit integer fine, but anything more >> will cause the code generator blow up. ?This limitation has been >> removed in the development version. > > I'm just curious but does this also work limitlessly on ARM? Not yet, as far as I know. There is a small target-specific hook that needs to be implemented, and I've only done so for x86 since it's the only machine I own. As far as I know, no one has stepped up to fill in the hook for other processors. From jan_sjodin at yahoo.com Wed Jan 20 08:29:59 2010 From: jan_sjodin at yahoo.com (Jan Sjodin) Date: Wed, 20 Jan 2010 06:29:59 -0800 (PST) Subject: [LLVMdev] llvm-mc and JIT Message-ID: <383941.39059.qm@web55603.mail.re4.yahoo.com> I have a question about llvm-mc and allowing inline assembly in the JIT. Is this planned to be implemented in the near future? There seems to have a been a lot more activity integrating llvm-mc lately and I was wondering when this will be completed? 
Thanks, Jan From dag at cray.com Wed Jan 20 09:21:47 2010 From: dag at cray.com (David Greene) Date: Wed, 20 Jan 2010 09:21:47 -0600 Subject: [LLVMdev] Question about llvm::dbg() In-Reply-To: References: Message-ID: <201001200921.48370.dag@cray.com> On Tuesday 19 January 2010 16:18, Jean-Daniel Dupas wrote: > Maybe I miss something but I think there is a problem with this code in > Support/Debug.cpp when compiling in Release build (NDEBUG defined): > > namespace llvm { > /// dbgs - Return dbgs(). > raw_ostream &dbgs() { > return dbgs(); > } > } I'll fix this. -Dave From criswell at uiuc.edu Wed Jan 20 09:35:22 2010 From: criswell at uiuc.edu (John Criswell) Date: Wed, 20 Jan 2010 09:35:22 -0600 Subject: [LLVMdev] LLVM 2.6 and Aggregate Return Values: 64 bit In-Reply-To: <4B56D5D1.4050705@free.fr> References: <4B568234.60106@uiuc.edu> <4B56D5D1.4050705@free.fr> Message-ID: <4B5722BA.1030704@uiuc.edu> Duncan Sands wrote: > Hi John, > > >> How well does LLVM 2.6 support aggregate return values for 64 bit >> targets? I'm currently working on 64 bit Mac OS X and 64 bit Linux. >> Are there any known problems or limitations? >> > > on x86-64 it depends on what you are returning, but for example you > should be able to return a 128 bit integer fine, Just to make sure I understand, if the size of the aggregate return value is 128 bits or less, then it should work. Correct? > but anything more > will cause the code generator blow up. By "blow up," do you mean that the code generator will fail to generate code at all (e.g., it will hit an assertion) or do you mean that the code generator will generate incorrect code? I am seeing the latter in my code, and I currently suspect it's a code generator bug (though I could be wrong). -- John T. > This limitation has been > removed in the development version. > > Ciao, > > Duncan. 
>

From kennethuil at gmail.com Wed Jan 20 09:43:37 2010
From: kennethuil at gmail.com (Kenneth Uildriks)
Date: Wed, 20 Jan 2010 09:43:37 -0600
Subject: [LLVMdev] LLVM 2.6 and Aggregate Return Values: 64 bit
In-Reply-To: <4B5722BA.1030704@uiuc.edu>
References: <4B568234.60106@uiuc.edu> <4B56D5D1.4050705@free.fr> <4B5722BA.1030704@uiuc.edu>
Message-ID: <400d33ea1001200743w3da95263y4f3b05a3f71a14bf@mail.gmail.com>

On Wed, Jan 20, 2010 at 9:35 AM, John Criswell wrote:
> Duncan Sands wrote:
>> Hi John,
>>
>>
>>> How well does LLVM 2.6 support aggregate return values for 64 bit
>>> targets? I'm currently working on 64 bit Mac OS X and 64 bit Linux.
>>> Are there any known problems or limitations?
>>>
>>
>> on x86-64 it depends on what you are returning, but for example you
>> should be able to return a 128 bit integer fine,
> Just to make sure I understand, if the size of the aggregate return
> value is 128 bits or less, then it should work. Correct?
>> but anything more
>> will cause the code generator blow up.
> By "blow up," do you mean that the code generator will fail to generate
> code at all (e.g., it will hit an assertion) or do you mean that the
> code generator will generate incorrect code? I am seeing the latter in
> my code, and I currently suspect it's a code generator bug (though I
> could be wrong).

You will see the code generator hit an assertion if the return value isn't supported on your platform. If assertions are turned off, then of course it will generate bad code.

On the trunk, large struct returns *should* work on x86-64, but I haven't tested that... I've only tested x86.
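Until large first-class aggregate returns are supported on a given target, a front end can sidestep the limit by returning big aggregates through a caller-provided pointer. The sketch below shows that shape in plain C++ rather than LLVM IR; the type and function names (Pair128, Big, makePair, makeBig) are hypothetical, and the 128-bit threshold is taken from the discussion above rather than checked against any particular release.

```cpp
#include <cstdint>

// 128 bits in total: small enough to be returned by value per the thread.
struct Pair128 {
  std::uint64_t lo;
  std::uint64_t hi;
};

// 256 bits in total: larger than the size reported to work in 2.6.
struct Big {
  std::uint64_t v[4];
};

// Small aggregate: returned directly.
Pair128 makePair(std::uint64_t a, std::uint64_t b) {
  Pair128 p = {a, b};
  return p;
}

// Large aggregate: the caller allocates the result and passes its address,
// so the function itself returns void.
void makeBig(Big *out, std::uint64_t seed) {
  for (int i = 0; i < 4; ++i)
    out->v[i] = seed + static_cast<std::uint64_t>(i);
}
```

A caller would write `Big b; makeBig(&b, 10);` in place of `Big b = makeBig(10);`. The cost is an extra pointer argument and a stack slot in the caller, in exchange for not depending on how the code generator lowers the return value.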
From regehr at cs.utah.edu Wed Jan 20 09:54:55 2010 From: regehr at cs.utah.edu (John Regehr) Date: Wed, 20 Jan 2010 08:54:55 -0700 Subject: [LLVMdev] updated code size comparison Message-ID: <4B57274F.1020109@cs.utah.edu> Hi folks, I've posted an updated code size comparison between LLVM, GCC, and others here: http://embed.cs.utah.edu/embarrassing/ New in this version: - much larger collection of harvested functions: more than 360,000 - bug fixes and UI improvements - added the x86 Open64 compiler John From jan_sjodin at yahoo.com Wed Jan 20 11:05:50 2010 From: jan_sjodin at yahoo.com (Jan Sjodin) Date: Wed, 20 Jan 2010 09:05:50 -0800 (PST) Subject: [LLVMdev] Make LoopBase inherit from "RegionBase"? In-Reply-To: <4B4D900D.6040601@fim.uni-passau.de> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> <4B473032.5080803@gmail.com> <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> <4B4CB891.5090902@fim.uni-passau.de> <25423_1263340620_4B4D0C4B_25423_3544_1_314301.28741.qm@web55601.mail.re4.yahoo.com> <4B4D1E79.8070807@fim.uni-passau.de> <25423_1263356804_4B4D4B83_25423_4709_1_457655.70882.qm@web55601.mail.re4.yahoo.com> <4B4D900D.6040601@fim.uni-passau.de> Message-ID: <307129.59669.qm@web55608.mail.re4.yahoo.com> >>> bbs, to generate these edges. As we do not have to modify the CFG other >>> passes like dominance information are still valid, and we do not have to >>> create a lot of auxiliary bbs, to be able to detect all regions. This >>> saves memory and runtime. 
In general it is probably not too easy to
>>> decide where to insert these bbs either:
>>
>> The general rule is to split all blocks with multiple in-edges and multiple out-edges
>> into blocks with either multiple in-edges or multiple out-edges, but not both.
> This is not sufficient, as shown in the example below. It would allow
> only one region.

Yes, but with the insertion of merge-blocks it will allow more (see below).

>> One option is to keep this as an invariant throughout the compiler and make use
>> of the merge blocks (multiple in-edges) to contain only PHI-nodes, and all other code
>> in regular basic blocks. There are different variations on this that may or may not be
>> useful.
> This might be possible, however probably doubling the number of bbs.

I don't know if it will be double, but there will be more basic blocks for sure.

>>>> CFG:
>>>           0
>>>           |
>>>           1
>>>          / |
>>>         2  |
>>>        / \ 3
>>>       4   5 |
>>>       |   | |
>>>       6   7 8
>>>        \  | /
>>>         \ |/      region A: 1 -> 9 {1,2,3,4,5,6,7,8}
>>>           9       region B: 2 -> 9 {2,4,5,6,7}
>>>
>>> So we need one bb that joins 6 and 7 and one that joins the two regions
>>>
>>> CFG:      0
>>>           |
>>>           1
>>>          / |
>>>         2  |
>>>        / \ 3
>>>       4   5 |
>>>       |   | |
>>>       6   7 8
>>>        \  | |
>>>         \ | |     region A: (0,1) -> (9b,9) {1,2,3,4,5,6,7,8,9a,9b}
>>>          9a |     region B: (1,2) -> (9a,9b) {2,4,5,6,7,9a}
>>>           \ /
>>>            9b
>>>            |
>>>            9
>>
>> It is fairly simple to use the information from the algorithm to decide
>> where those merges should be inserted to get the expected regions.
>> This may be needed in the cases where a sub-region is too complicated
>> to be represented and must be abstracted into a "black box".
> From which algorithm? The program structure tree does not give this
> information, does it?

The algorithm that computes the SESE-regions can be used to determine where the merge-nodes should be inserted.
There are a couple of ways of doing it, but if the bracket sets on two edges can be intersected to match a third edge (which dominates the first two), you can insert a merge block for the two edges. You don't have to compute the dominators, but it helps to explain the problem that way.

>>>> My approach is comparable to this paper:
>>>> The Refined Process Structure Tree by Jussi Vanhatalo, Hagen Völzer,
>>>> Jana Koehler
>>>
>>> I was looking through some slides that described their algorithm. One case that
>>> seems to be legal is this:
>>>
>>>      |
>>>    Entry
>>>    /   \
>>>   R0    R1
>>>    \   /
>>>    Exit
>>>      |
>>>
>>> With two fragments: Entry->R0->Exit and Entry->R1->Exit, which means
>>> that a fragment cannot be identified using only the entry and exit blocks, but
>>> the internal blocks or edges will also need to be listed. I don't know if this is
>>> relevant to your implementation.
>
> No. The ideas are comparable, however I believe their implementation is
> a little complicated. ;-)

Do you have the same definition of a region and entry/exit blocks as they do?

> I would mark the regions as:
>
> Region A: R0 -> Exit, containing {R0}
> Region B: R1 -> Exit, containing {R1}

Is the entry always contained and is the exit never contained, or is that specified per region? Depending on the restrictions of entry and exit blocks a loop with a single basic block cannot be an entry or exit by itself. Example:

   |
   A
  /|  _
 / |/ \
B  R  |
 \ |\_/
  \|
   C
   |

If you only care about R in this case how is the region formed?

>>> The implementation however takes advantage of the existence of
>>> Dominance/PostDominance information. Therefore it is simpler and
>>> hopefully faster. At the moment run time is comparable to dominance tree
>>> calculation.
>>
>> Both algorithms are linear so there is really no big difference in time imo.
> Sure. However in terms of maintainability it is nice to be able to reuse
> existing analysis instead of write another triconnected component
> analysis upfront.
> >> I believe the biggest difference that you mention is that you can capture more >> complicated regions without having to modify the CFG with the current >> algorithm. > Yes. > >>> If you want, have a look into some results I got with a pass extracting >>> maximal non trivial regions that do not contain loops from libbz2: >>> >>> http://tobias.osaft.eu/llvm/region/bz2NoLoops/ >>> >>> Interesting ones: >>> >>> regions_without_loops.BZ2_bzBuffToBuffCompress.dot.png >>> has a lot of exit edges. >> >> I think this example proves the strengths and weaknesses of both >> approaches. Making that region into structured control flow would add a lot >> of additional blocks. This will also happen after generating code >> from the polyhedral model, so either way the cost is there if the optimization >> is successful. > Yes, but just in this case and not for regions where the model cannot be > applied. If the regions pass is used for analysis purposes, nothing has > to be touched. > >> The second case is where the optimization fails (no profitable >> transformation found) and the CFG can remain untouched. >> >> The third case is if one of those blocks contains something complicated. >> I believe the current algorithm simply fails and cannot detect the region. > Which algorithm? The one in Graphite? Or the region detection I wrote > here? This is just plain region detection, that does not even look at > the content but builds a region tree (program structure tree). It just > detects every possible region. > The selection would be a later pass. My assumption was that there is a selection in there somewhere. Do you plan to refine the regions in the selection phase in any way? >> If the >> CFG is modified this would allow an internal SESE-region to become a black box, and the >> the outer regions could be optimized. >This is an optimization, however I think it is orthogonal to the region >detection problem. Say it works with any algorithm. 
I believe that creating a black-box will map a lot more cleanly to an edge-based region definition, since block-based may include multiple entry/exit sub-regions that will not encapsulate control flow in a reasonable way. >>> regions_without_loops.bzopen_or_bzdopen.dot.png >>> the first region has two entry edges. One is the loop latch. >>> (Keep in mind all regions have the same color, so if it seems there is >>> an edge into a region, there are just two regions close by) >>> >>> Without a prepass that exposes the edges almost no region could be >>> detected with the "standard" approach. >> >> Indeed the CFG will have to be modified for these cases. I it seems to me that the trade-off >> between the two approaches is that the algorithm that you currently have is a cheaper up >> front, but may be less capable in some cases, while the "standard" algorithm will be more >> expensive, but can handle problematic regions better. Would you agree? > > I agree that the algorithm I have is cheaper upfront, but I do not yet > see a case where the algorithm is less capable. Would you mind to give > an example or to highlight the relevant part of the discussion? With the insertion of extra merge-blocks the code becomes more structured and the PST can be refined further. A more fine-grained PST may allow more cases to be handled. Thanks Jan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100120/6ce1a3c3/attachment.html From jan_sjodin at yahoo.com Wed Jan 20 11:16:37 2010 From: jan_sjodin at yahoo.com (Jan Sjodin) Date: Wed, 20 Jan 2010 09:16:37 -0800 (PST) Subject: [LLVMdev] Make LoopBase inherit from "RegionBase"? 
In-Reply-To: <4B502D92.5070704@fim.uni-passau.de> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> <4B473032.5080803@gmail.com> <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> <4B4CB891.5090902@fim.uni-passau.de> <314301.28741.qm@web55601.mail.re4.yahoo.com> <6363_1263521041_4B4FCD10_6363_3705_1_4B4FCD05.60107@gmail.com> <4B502D92.5070704@fim.uni-passau.de> Message-ID: <913893.66561.qm@web55606.mail.re4.yahoo.com> > I think this should be implemented as a RegionFilter, that checks if a > region contains a loop, and that can be asked for further information. > In general I do not think this kind of analysis belongs to a region, but > as you proposed some kind of filter could be applied. In the short term > the passes who need this information could get it on their own. > >> and a long term consideration about "region pass": >> >> if we want to integrate region analysis and optimization framework into >> llvm, i think we can use an approach that similar to loop analysis and >> optimization: write a class "regionpass" inherit from "pass", and the >> corresponding pass manger "RegionPassManager". (a kind of function pass) >> if we follow this approach, we need to push the region pass manager into >> the llvm pass manager stack. >> the first question of this approach is, whats the relationship between >> "loop pass manager" and "region pass manager"? >> >> way 1: make region pass manager below loop pass manager in the stack >> >> pass manager stack: >> >> bb pass manager <---top >> loop pass manager >> region pass manager >> function pass manager >> ... <---bottom >> >> in this way the region hierarchy need to be reconstruct when a loop >> transform change it. 
>> >> way 2: make region pass manager above loop pass manager in the stack >> >> pass manager stack: >> >> bb pass manager <---top >> region pass manager >> loop pass manager >> function pass manager >> ... <---bottom >> >> in this way the loop hierarchy need to be reconstruct when a region pass >> change it. >> >> now we need to choose a way to minimize the loop reconstruction or >> region reconstruction. i think that the chance that a region transform >> affect the loop structure is smaller, so maybe way 2 is better. > This would need some thoughts. Ideal I think we would not order them, but if > a region changed, just reconstruct the loops that are in this region and > if a > loop changed just reconstruct the regions in this loop. Imo, a loop is simply a special kind of region, so a "filter" is perhaps the way to go if you are interested in loops. Regions containing loops will have to be inspected using the PST. >> at last, i have some idea about finding a interesting region: (maybe >> make the region analysis too complex) >> >> we can introduce some thing like "region filter" that determine the >> property of a region, the region filter will like a "pass", which can >> run on an instruction at a time, a basic block at a time, or even a sub >> region at a time, then we can write a "filter manager" like "pass >> manager " to stream the filtering process, so that we can promote the >> the performance of the region finding process. > Yes, I like this idea. > So the basic design would be that we have some passes like: > Maximal Region regarding an Analysis/Filter > Minimal Region regarding an Analysis/Filter > All Regions regarding an Analysis/Filter > So a pass can ask the regionpass manager for a specific kind of regions. > It is than just invoked for regions, that fulfill this requirement. If you want to be able to manipulate specific regions you can have a generic region class, and then sub classes for loop, if, unstructured etc. 
That way it is easy to ask for the body of a loop or the true or false
regions of an if-region. It will also allow you to have different kinds of
loops (for, do-while), but still treat them in a uniform way in some cases.

- Jan

From clattner at apple.com Wed Jan 20 11:47:35 2010
From: clattner at apple.com (Chris Lattner)
Date: Wed, 20 Jan 2010 09:47:35 -0800
Subject: [LLVMdev] llvm-mc and JIT
In-Reply-To: <383941.39059.qm@web55603.mail.re4.yahoo.com>
References: <383941.39059.qm@web55603.mail.re4.yahoo.com>
Message-ID: <168809C2-8828-4D5E-9358-FB4D71E02003@apple.com>

On Jan 20, 2010, at 6:29 AM, Jan Sjodin wrote:

> I have a question about llvm-mc and allowing inline assembly in the
> JIT.
> Is this planned to be implemented in the near future? There seems
> to have been a lot more activity integrating llvm-mc lately and I
> was wondering
> when this will be completed?

No promises, but my goal is to get direct .o file writing to "beta"
quality for darwin/x86[-64] by the end of Feb. This will probably not
include inline asm support yet, but hopefully that will follow in the
month or two after that.

-Chris

From eli.friedman at gmail.com Wed Jan 20 12:16:28 2010
From: eli.friedman at gmail.com (Eli Friedman)
Date: Wed, 20 Jan 2010 10:16:28 -0800
Subject: [LLVMdev] updated code size comparison
In-Reply-To: <4B57274F.1020109@cs.utah.edu>
References: <4B57274F.1020109@cs.utah.edu>
Message-ID:

On Wed, Jan 20, 2010 at 7:54 AM, John Regehr wrote:
> Hi folks,
>
> I've posted an updated code size comparison between LLVM, GCC, and
> others here:
>
>   http://embed.cs.utah.edu/embarrassing/
>
> New in this version:
>
> - much larger collection of harvested functions: more than 360,000
>
> - bug fixes and UI improvements
>
> - added the x86 Open64 compiler

I started looking through the llvm-gcc vs. clang comparisons, and
noticed that in
http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/A9/A9AB5AE7.c
, size_t is declared incorrectly.
Any idea how that might have happened? -Eli From kuba at gcc.gnu.org Wed Jan 20 12:51:37 2010 From: kuba at gcc.gnu.org (Jakub Staszak) Date: Wed, 20 Jan 2010 15:51:37 -0300 Subject: [LLVMdev] Non-local DSE optimization In-Reply-To: References: <830819AC-2A14-435D-86FE-CB09F73F31F6@gcc.gnu.org> <4A9F86F4.3070609@free.fr> <273A970E-E76D-4BC0-8EF0-64E9C391DA6F@gcc.gnu.org> <4AA36E3C.7070507@mxc.ca> Message-ID: Hello, Patch was improved and adjusted to the trunk. I think that CFG Simplification would be useful before DSE. In other case some modification of "MergeBlockIntoPredecessor" would be needed (see TODO note). Please note also, that patch is disabled by default. Tests on test-suite haven't shown any failures. Regards -Jakub -------------- next part -------------- A non-text attachment was scrubbed... Name: dse_ssu-4.patch Type: application/octet-stream Size: 17858 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100120/f11caf20/attachment.obj -------------- next part -------------- On Sep 8, 2009, at 2:26 PM, Jakub Staszak wrote: > Hello, > > Bug is already fixed by Chris (see: http://llvm.org/bugs/show_bug.cgi?id=4915). > > I added getRootNode() == NULL condition to my patch. It's not a great solution, but it is enough for now I think. New patch attached. > > -Jakub > On Sep 6, 2009, at 6:09 AM, Nick Lewycky wrote: > >> Jakub Staszak wrote: >>> Hi, >>> It looks like PDT.getRootNode() returns NULL for: >>> define fastcc void @c974001__lengthy_calculation. 1736(%struct.FRAME.c974001* nocapture %CHAIN.185) noreturn { >>> entry: >>> br label %bb >>> bb: >>> br label %bb >>> } >>> Isn't it a bug in PostDominatorTree? >>> Please note that this crashes: >>> opt -postdomtree -debug dom_crash.bc >>> I think this should be reported as a bug, >> >> Yes, that's a bug. Please file it. >> >> The PDT root calculation is looking for all BBs with no successors, this won't work in the face of loops. 
Either we need to teach PDT users that there can be zero roots, or we need to synthesize a fake root. >> >> The latter is already supported (to handle multiple exits) so that's probably the easiest fix. >> >> Nick >> >>> -Jakub >>> On Sep 3, 2009, at 7:05 AM, Duncan Sands wrote: >>>> Hi Jakub, interesting patch. I ran it over the Ada testsuite and this >>>> picked up some problems even without enabling dse-ssu. For example, >>>> "opt -inline -dse -domtree" crashes on the attached testcase. >>>> >>>> Ciao, >>>> >>>> Duncan. >>>> ; ModuleID = 'dom_crash.bc' >>>> target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32- i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64- f80:32:32" >>>> target triple = "i386-pc-linux-gnu" >>>> >>>> %struct.FRAME.c974001 = type { i32, i8, void (%struct.c974001__timed_calculation*)* } >>>> %struct.FRAME.c974001__timed_calculationB = type { %struct.FRAME.c974001*, i32 } >>>> %struct.FRAME.c974001__timed_calculation__calculationA = type { %struct.system__tasking__async_delays__delay_block } >>>> %struct.RETURN = type { i32, i32 } >>>> %struct.ada__exceptions__exception_occurrence = type { %struct.system__standard_library__exception_data*, i32, [200 x i8], i8, i8, i32, i32, [50 x i32], i32 } >>>> %struct.c974001__timed_calculation = type { %struct.system__tasking__ada_task_control_block* } >>>> %struct.system__os_interface__pthread_mutex_t = type { i32, i32, i32, i32, %struct.RETURN } >>>> %struct.system__soft_links__tsd = type { %struct.system__stack_checking__stack_info, i32, i32, %struct.ada__exceptions__exception_occurrence } >>>> %struct.system__stack_checking__stack_info = type { i32, i32, i32 } >>>> %struct.system__stack_usage__stack_analyzer = type { [32 x i8], i32, i32, i32, i32, i32, i32, i32, i8, i32 } >>>> %struct.system__standard_library__exception_data = type { i8, i8, i32, i32, %struct.system__standard_library__exception_data*, i32, void ()* } >>>> %struct.system__task_primitives__private_data = type { 
i32, i32, [48 x i8], %struct.system__os_interface__pthread_mutex_t } >>>> %struct.system__tasking__accept_alternative = type { i8, i32 } >>>> %struct.system__tasking__accept_list_access = type { [0 x %struct.system__tasking__accept_alternative]*, %struct.RETURN* } >>>> %struct.system__tasking__ada_task_control_block = type { i32, %struct.system__tasking__common_atcb, [19 x %struct.system__tasking__entry_call_record], i32, %struct.system__tasking__accept_list_access, i32, i32, i32, i32, i32, i8, i8, i8, i8, i8, i8, i8, i8, i32, i32, i32, i64, i32, i32, [4 x i32], i8, i32*, [0 x %struct.system__tasking__entry_queue] } >>>> %struct.system__tasking__async_delays__delay_block = type { %struct.system__tasking__ada_task_control_block*, i32, i64, i8, %struct.system__tasking__async_delays__delay_block*, %struct.system__tasking__async_delays__delay_block* } >>>> %struct.system__tasking__common_atcb = type { i8, %struct.system__tasking__ada_task_control_block*, i32, i32, i32, [32 x i8], i32, %struct.system__tasking__entry_call_record*, %struct.system__task_primitives__private_data, i32, void (i32)*, %struct.system__soft_links__tsd, %struct.system__tasking__ada_task_control_block*, %struct.system__tasking__ada_task_control_block*, %struct.system__tasking__ada_task_control_block*, i32, i8*, i8, i8, %struct.system__stack_usage__stack_analyzer, i32, %struct.system__tasking__termination_handler, %struct.system__tasking__termination_handler } >>>> %struct.system__tasking__entry_call_record = type { %struct.system__tasking__ada_task_control_block*, i8, i8, i32, %struct.system__standard_library__exception_data*, %struct.system__tasking__entry_call_record*, %struct.system__tasking__entry_call_record*, i32, i32, i32, %struct.system__tasking__ada_task_control_block*, i32, %struct.system__tasking__entry_call_record*, i32, i8, i8, i8 } >>>> %struct.system__tasking__entry_queue = type { %struct.system__tasking__entry_call_record*, %struct.system__tasking__entry_call_record* } >>>> 
%struct.system__tasking__termination_handler = type { i32, void (i32, i8, %struct.system__tasking__ada_task_control_block*, %struct.ada__exceptions__exception_occurrence*)* } >>>> >>>> @C.168.1967 = external constant %struct.RETURN ; < %struct.RETURN*> [#uses=1] >>>> >>>> define void @system__tasking__activation_chainIP (%struct.c974001__timed_calculation* nocapture %_init) nounwind { >>>> entry: >>>> ret void >>>> } >>>> >>>> define void @_ada_c974001() { >>>> entry: >>>> %tramp = call i8* @llvm.init.trampoline(i8* undef, i8* bitcast (void (%struct.FRAME.c974001*, %struct.c974001__timed_calculation*)* @c974001__timed_calculationB.1770 to i8*), i8* undef) ; [#uses=0] >>>> unreachable >>>> } >>>> >>>> declare i8* @llvm.init.trampoline(i8*, i8*, i8*) nounwind >>>> >>>> define fastcc void @c974001__lengthy_calculation. 1736(%struct.FRAME.c974001* nocapture %CHAIN.185) noreturn { >>>> entry: >>>> br label %bb >>>> >>>> bb: ; preds = %bb, %entry >>>> br label %bb >>>> } >>>> >>>> define fastcc void @c974001__timed_calculation__calculation__B19b__B21b__A17b___clean. 1830(%struct.FRAME.c974001__timed_calculation__calculationA* %CHAIN. 188) { >>>> entry: >>>> ret void >>>> } >>>> >>>> define fastcc void @c974001__timed_calculation__calculationA. 1820(%struct.FRAME.c974001__timed_calculationB* nocapture %CHAIN. 
190) { >>>> entry: >>>> br i1 undef, label %bb, label %bb3 >>>> >>>> bb: ; preds = %entry >>>> unreachable >>>> >>>> bb3: ; preds = %entry >>>> br i1 undef, label %bb4, label %bb5 >>>> >>>> bb4: ; preds = %bb3 >>>> unreachable >>>> >>>> bb5: ; preds = %bb3 >>>> invoke void undef() >>>> to label %invcont unwind label %lpad >>>> >>>> invcont: ; preds = %bb5 >>>> %0 = invoke i8 @system__tasking__async_delays__enqueue_duration(i64 undef, %struct.system__tasking__async_delays__delay_block* undef) >>>> to label %bb8 unwind label %lpad ; [#uses=0] >>>> >>>> bb8: ; preds = %invcont >>>> invoke void undef() >>>> to label %invcont9 unwind label %lpad75 >>>> >>>> invcont9: ; preds = %bb8 >>>> invoke fastcc void @c974001__lengthy_calculation. 1736(%struct.FRAME.c974001* undef) >>>> to label %invcont10 unwind label %lpad75 >>>> >>>> invcont10: ; preds = %invcont9 >>>> invoke void @report__failed([0 x i8]* undef, %struct.RETURN* @C. 168.1967) >>>> to label %bb16 unwind label %lpad75 >>>> >>>> bb16: ; preds = %invcont10 >>>> invoke fastcc void @c974001__timed_calculation__calculation__B19b__B21b__A17b___clean. 
1830(%struct.FRAME.c974001__timed_calculation__calculationA* undef) >>>> to label %bb27 unwind label %lpad71 >>>> >>>> bb27: ; preds = %bb16 >>>> unreachable >>>> >>>> lpad: ; preds = %invcont, %bb5 >>>> unreachable >>>> >>>> lpad71: ; preds = %bb16 >>>> unreachable >>>> >>>> lpad75: ; preds = %invcont10, %invcont9, %bb8 >>>> unreachable >>>> } >>>> >>>> declare i8 @system__tasking__async_delays__enqueue_duration(i64, %struct.system__tasking__async_delays__delay_block*) >>>> >>>> declare void @report__failed([0 x i8]*, %struct.RETURN*) >>>> >>>> define void @c974001__timed_calculationB.1770(%struct.FRAME.c974001* nest %CHAIN.191, %struct.c974001__timed_calculation* nocapture %_task) { >>>> entry: >>>> invoke void undef() >>>> to label %invcont unwind label %lpad >>>> >>>> invcont: ; preds = %entry >>>> invoke void @system__tasking__stages__complete_activation() >>>> to label %bb unwind label %lpad >>>> >>>> bb: ; preds = %bb5, %invcont4, %invcont >>>> invoke void @system__tasking__rendezvous__selective_wait(%struct.RETURN* noalias sret undef, [0 x %struct.system__tasking__accept_alternative]* undef, %struct.RETURN* undef, i8 2) >>>> to label %invcont4 unwind label %lpad25 >>>> >>>> invcont4: ; preds = %bb >>>> br i1 undef, label %bb5, label %bb >>>> >>>> bb5: ; preds = %invcont4 >>>> invoke fastcc void @c974001__timed_calculation__calculationA. 
1820(%struct.FRAME.c974001__timed_calculationB* undef) >>>> to label %bb unwind label %lpad25 >>>> >>>> bb7: ; preds = %lpad25 >>>> unreachable >>>> >>>> lpad: ; preds = %invcont, %entry >>>> unreachable >>>> >>>> lpad25: ; preds = %bb5, %bb >>>> br i1 undef, label %bb7, label %ppad >>>> >>>> ppad: ; preds = %lpad25 >>>> unreachable >>>> } >>>> >>>> declare void @system__tasking__stages__complete_activation() >>>> >>>> declare void @system__tasking__rendezvous__selective_wait(%struct.RETURN* noalias sret, [0 x %struct.system__tasking__accept_alternative]*, %struct.RETURN*, i8) >>> _______________________________________________ >>> LLVM Developers mailing list >>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev From edwintorok at gmail.com Wed Jan 20 13:17:11 2010 From: edwintorok at gmail.com (=?ISO-8859-1?Q?T=F6r=F6k_Edwin?=) Date: Wed, 20 Jan 2010 21:17:11 +0200 Subject: [LLVMdev] updated code size comparison In-Reply-To: <4B57274F.1020109@cs.utah.edu> References: <4B57274F.1020109@cs.utah.edu> Message-ID: <4B5756B7.5030803@gmail.com> On 01/20/2010 05:54 PM, John Regehr wrote: > Hi folks, > > I've posted an updated code size comparison between LLVM, GCC, and > others here: > > http://embed.cs.utah.edu/embarrassing/ > > New in this version: > > - much larger collection of harvested functions: more than 360,000 > > - bug fixes and UI improvements > > - added the x86 Open64 compiler > Hi, Could you also add a main() for each of these files, and do a very simple test that the optimized functions actually work? At least for functions that take only integers and return integers this could be automated if you compare -O0 output with the optimized outputs. 
The neon_helper.c testcase is clearly misoptimized by gcc-head here:
http://embed.cs.utah.edu/embarrassing/jan_10/harvest/compare_clang-head_gcc-head/compare_23BD1620_disasm.shtml

Try calling it like this:

    int main()
    {
        printf("%d\n", helper_neon_rshl_s8(0x12345, 15));
        return 0;
    }

Prints 74496 here, and not 0 (gcc-head optimized it to a function
returning 0).

Best regards,
--Edwin

From regehr at cs.utah.edu Wed Jan 20 14:05:52 2010
From: regehr at cs.utah.edu (John Regehr)
Date: Wed, 20 Jan 2010 13:05:52 -0700 (MST)
Subject: [LLVMdev] updated code size comparison
In-Reply-To:
References: <4B57274F.1020109@cs.utah.edu>
Message-ID:

> I started looking through the llvm-gcc vs. clang comparisons, and
> noticed that in
> http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/A9/A9AB5AE7.c
> , size_t is declared incorrectly. Any idea how that might have
> happened?

Hi Eli,

Thanks for pointing this out, I'll look into this tonight.

However I can give you the quick generic answer right now (of course you
already know it) which is that real C code does just about anything that
can be parsed :).

If LLVM warns about this incorrect definition I can eliminate this kind of
test case, I'll look into this as well.

John

From regehr at cs.utah.edu Wed Jan 20 14:10:02 2010
From: regehr at cs.utah.edu (John Regehr)
Date: Wed, 20 Jan 2010 13:10:02 -0700 (MST)
Subject: [LLVMdev] updated code size comparison
In-Reply-To: <4B5756B7.5030803@gmail.com>
References: <4B57274F.1020109@cs.utah.edu> <4B5756B7.5030803@gmail.com>
Message-ID:

Hi Torok-

> Could you also add a main() for each of these files, and do
> a very simple test that the optimized functions actually work?

Unfortunately, testing isolated C functions is much harder than just
passing them random data!

Consider this function:

    int foo (int x, int y) { return x+y; }

The behavior of foo() is undefined when x+y overflows. Of course it is
trivial to come up with similar examples based on shifts, multiplies and
divides, etc.
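
To make the overflow point concrete, here is a minimal sketch (not part of the original message) of how a test harness might reject inputs for which x+y is undefined before ever calling foo(). The name checked_add is hypothetical; the check relies only on INT_MAX and INT_MIN from <limits.h>.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 and stores x+y in *out when the sum is
   representable in int; returns 0 and leaves *out untouched when
   computing x+y would overflow (undefined behavior for signed int). */
int checked_add(int x, int y, int *out)
{
    /* The comparisons are arranged so that they cannot overflow themselves. */
    if ((y > 0 && x > INT_MAX - y) || (y < 0 && x < INT_MIN - y))
        return 0;
    *out = x + y;
    return 1;
}
```

A harness built this way would call foo(x, y) only for pairs where checked_add returns 1; shifts, multiplies and divides would each need an analogous guard.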
A potential solution is "under-constrained execution":

  http://www.stanford.edu/~engler/issta07v-engler.pdf

I will bug Dawson and Daniel and see if I can get ahold of some code for
this.

John

From edwintorok at gmail.com Wed Jan 20 14:33:45 2010
From: edwintorok at gmail.com (Török Edwin)
Date: Wed, 20 Jan 2010 22:33:45 +0200
Subject: [LLVMdev] updated code size comparison
In-Reply-To:
References: <4B57274F.1020109@cs.utah.edu> <4B5756B7.5030803@gmail.com>
Message-ID: <4B5768A9.6020002@gmail.com>

On 01/20/2010 10:10 PM, John Regehr wrote:
> Hi Torok-
>
>> Could you also add a main() for each of these files, and do
>> a very simple test that the optimized functions actually work?
>
> Unfortunately, testing isolated C functions is much harder than just
> passing them random data!
>
> Consider this function:
>
> int foo (int x, int y) { return x+y; }
>
> The behavior of foo() is undefined when x+y overflows. Of course it
> is trivial to come up with similar examples based on shifts,
> multiplies and divides, etc.

Indeed, but can't an analysis find at least one value for each variable
where the behavior is not undefined?
Such a value must exist, or the entire function is useless if it always
has undefined behavior.

Sure, testing on one such value (or a random value) won't prove that the
result is correct, but it may help find trivial
miscompilations like the neon_helper case.

Alternatively a testcase could be manually constructed for the top 10
functions in the size comparison charts,
to see whether they are miscompiled. Repeat until the top 10 has no
miscompilations.

>
> A potential solution is "under-constrained execution":
>
> http://www.stanford.edu/~engler/issta07v-engler.pdf
>
> I will bug Dawson and Daniel and see if I can get ahold of some code
> for this.

Although EXE isn't, KLEE is publicly available.
Best regards,
--Edwin

From regehr at cs.utah.edu Wed Jan 20 14:49:22 2010
From: regehr at cs.utah.edu (John Regehr)
Date: Wed, 20 Jan 2010 13:49:22 -0700 (MST)
Subject: [LLVMdev] updated code size comparison
In-Reply-To: <4B5768A9.6020002@gmail.com>
References: <4B57274F.1020109@cs.utah.edu> <4B5756B7.5030803@gmail.com> <4B5768A9.6020002@gmail.com>
Message-ID:

> Indeed, but can't an analysis find at least one value for each variable
> where the behavior is not undefined?
> Such a value must exist, or the entire function is useless if it always
> has undefined behavior.

Good point :).

> Sure, testing on 1 such value (or a random) value won't prove that the
> result is correct, but may help finding trivial
> miscompilations like the neon_helper case.

Are you absolutely sure it's a miscompilation? I have already shot myself
in the foot a couple times on the GCC mailing list or bugzilla by pointing
out a bug that turned out to be code with subtle undefined behavior...

> Alternatively a testcase could be manually constructed for the top 10
> functions in the size comparison charts,
> and see whether they are miscompiled. Repeat until top 10 has no
> miscompilations.

Tell you what: if I get enough test cases like this, I'll write the test
harness supporting it. I don't have time to do this kind of code inspection
myself.

There has been talk (I don't remember where) about a Clang option for
detecting undefined behavior. Is there any progress on this? This could be
used to enable automated random testing.

John

From clattner at apple.com Wed Jan 20 14:54:29 2010
From: clattner at apple.com (Chris Lattner)
Date: Wed, 20 Jan 2010 12:54:29 -0800
Subject: [LLVMdev] updated code size comparison
In-Reply-To:
References: <4B57274F.1020109@cs.utah.edu> <4B5756B7.5030803@gmail.com> <4B5768A9.6020002@gmail.com>
Message-ID:

On Jan 20, 2010, at 12:49 PM, John Regehr wrote:
>
> There has been talk (I don't remember where) about a Clang option for
> detecting undefined behavior.
> Is there any progress on this? This could
> be used to enable automated random testing.

-fcatch-undefined-behavior:
http://clang.llvm.org/docs/UsersManual.html#codegen

Right now it only catches out of range shifts and simple array out of
bound issues, not all undefined behavior.

-Chris

From edwintorok at gmail.com Wed Jan 20 14:58:15 2010
From: edwintorok at gmail.com (Török Edwin)
Date: Wed, 20 Jan 2010 22:58:15 +0200
Subject: [LLVMdev] updated code size comparison
In-Reply-To:
References: <4B57274F.1020109@cs.utah.edu> <4B5756B7.5030803@gmail.com> <4B5768A9.6020002@gmail.com>
Message-ID: <4B576E67.4050500@gmail.com>

On 01/20/2010 10:49 PM, John Regehr wrote:
>> Indeed, but can't an analysis find at least one value for each variable
>> where the behavior is not undefined?
>> Such a value must exist, or the entire function is useless if it always
>> has undefined behavior.
>
> Good point :).
>
>> Sure, testing on 1 such value (or a random) value won't prove that the
>> result is correct, but may help finding trivial
>> miscompilations like the neon_helper case.
>
> Are you absolutely sure it's a miscompilation? I have already shot
> myself in the foot a couple times on the GCC mailing list or bugzilla
> by pointing out a bug that turned out to be code with subtle undefined
> behavior...

Well, if it is not, then it is a qemu bug, so it is a bug in either case;
you just have to report it to another bugzilla ;)

The code does conversions by assigning to one union member and reading
from another. AFAIK that was a GCC language extension; maybe they don't
support it in the latest release, or accidentally broke it. I don't know.
Someone should reduce a testcase for gcc-head to see exactly what it is
about. My gcc (4.4) doesn't miscompile it.

Either way I'd rather see a warning from gcc when it decides to optimize
the entire function away.
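
For readers unfamiliar with the idiom, the kind of conversion described above can be sketched as follows. This is an illustration only, not the qemu source: the type vec_s8 and both function names are hypothetical, and the sketch assumes the four-byte struct has no padding. The memcpy variant performs the same byte-level reinterpretation without depending on how a given compiler treats reads from a different union member.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical four-lane vector of signed bytes (assumed to be 4 bytes
   with no padding, which holds on common ABIs). */
typedef struct { int8_t v1, v2, v3, v4; } vec_s8;

/* Conversion by writing one union member and reading another. */
uint32_t pack_via_union(vec_s8 v)
{
    union { vec_s8 lanes; uint32_t word; } u;
    u.word = 0;
    u.lanes = v;
    return u.word;
}

/* Same reinterpretation expressed as a byte copy. */
uint32_t pack_via_memcpy(vec_s8 v)
{
    uint32_t word = 0;
    memcpy(&word, &v, sizeof v);
    return word;
}
```

Comparing the two functions on the same input, at -O0 and with optimization enabled, is one cheap way to check whether a compiler handles the union form as expected.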
>
>> Alternatively a testcase could be manually constructed for the top 10
>> functions in the size comparison charts,
>> and see whether they are miscompiled. Repeat until top 10 has no
>> miscompilations.
>
> Tell you what: if I get enough test cases like this, I'll write the
> test harness supporting it. I don't have time to do this kind of code
> inspection myself.

Makes sense.

>
> There has been talk (I don't remember where) about a Clang option for
> detecting undefined behavior. Is there any progress on this? This
> could be used to enable automated random testing.

Yes, -fcatch-undefined-behavior.
http://clang.llvm.org/docs/UsersManual.html#codegen

Best regards,
--Edwin

From regehr at cs.utah.edu Wed Jan 20 15:39:21 2010
From: regehr at cs.utah.edu (John Regehr)
Date: Wed, 20 Jan 2010 14:39:21 -0700 (MST)
Subject: [LLVMdev] updated code size comparison
In-Reply-To: <4B576E67.4050500@gmail.com>
References: <4B57274F.1020109@cs.utah.edu> <4B5756B7.5030803@gmail.com> <4B5768A9.6020002@gmail.com> <4B576E67.4050500@gmail.com>
Message-ID:

> Yes, -fcatch-undefined-behavior.
> http://clang.llvm.org/docs/UsersManual.html#codegen

Thanks guys.

My understanding of the situation is that for meaningful automated testing,
the protection from undefined behavior has to cover all problems that
actually occur in the code under test. But I'll keep checking on this...

John

From eli.friedman at gmail.com Wed Jan 20 16:10:14 2010
From: eli.friedman at gmail.com (Eli Friedman)
Date: Wed, 20 Jan 2010 14:10:14 -0800
Subject: [LLVMdev] updated code size comparison
In-Reply-To:
References: <4B57274F.1020109@cs.utah.edu>
Message-ID:

On Wed, Jan 20, 2010 at 12:05 PM, John Regehr wrote:
>> I started looking through the llvm-gcc vs. clang comparisons, and
>> noticed that in
>> http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/A9/A9AB5AE7.c
>> , size_t is declared incorrectly. Any idea how that might have
>> happened?
>
> Hi Eli,
>
> Thanks for pointing this out, I'll look into this tonight.
>
> However I can give you the quick generic answer right now (of course you
> already know it) which is that real C code does just about anything that can
> be parsed :).

Of course, but this looks like the declaration of memset came from a
system header.

> If LLVM warns about this incorrect definition I can eliminate this kind of
> test case, I'll look into this as well.

clang warns and doesn't treat the usual declaration of memset as the C
library memset if size_t is wrong; gcc apparently doesn't care.

-Eli

From minwook.ahn at gmail.com Wed Jan 20 18:38:42 2010
From: minwook.ahn at gmail.com (minwook Ahn)
Date: Thu, 21 Jan 2010 09:38:42 +0900
Subject: [LLVMdev] [LLVMDev] Does our own developed module and functions can go along with the future improved version of LLVM?
In-Reply-To: <9a9942201001120905p4d977226gf29ce9cd3121b066@mail.gmail.com>
References: <3d49ce701001112025v1801244wea5c19248d250796@mail.gmail.com> <4B4C3259.3010905@free.fr> <9a9942201001120905p4d977226gf29ce9cd3121b066@mail.gmail.com>
Message-ID: <3d49ce701001201638sf18ec99gec2bb4fc662c0a70@mail.gmail.com>

Thank you for your reply.

Minwook Ahn

2010/1/13 Reid Kleckner

> On Tue, Jan 12, 2010 at 3:27 AM, Duncan Sands wrote:
> > Hi Minwook Ahn,
> >
> >> We want to build our compiler based on LLVM by adding our own modules
> >> and functions
> >>
> >> which are specific to the features of our processor hardware.
> >
> > do you mean that you have files containing bitcode which contain useful
> > routines for your processor, and that you use like a library?
>
> I think the question was, can they write their own backend for LLVM (a
> new Target) and will their code automatically work with future
> releases of LLVM.
>
> In that case, the answer is yes, you can develop your own backend, but
> no, LLVM does not provide API stability.
> As new versions of LLVM are
> released you would have to update your code to the new API or stay
> with the old version of LLVM.
>
> Reid
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100121/406e9567/attachment.html

From minwook.ahn at gmail.com Wed Jan 20 19:14:21 2010
From: minwook.ahn at gmail.com (minwook Ahn)
Date: Thu, 21 Jan 2010 10:14:21 +0900
Subject: [LLVMdev] [LLVMDev] Is there any way to eliminate zero-extension instruction?
In-Reply-To:
References: <3d49ce701001192355p43cc28bak84cc66f322e7697b@mail.gmail.com>
Message-ID: <3d49ce701001201714h2ef7828jb16868e85d60288@mail.gmail.com>

Thank you for your reply.

In the setcc case, I saw that it was removed. But I could not delete the
extension instruction in other cases of zero extension from i1 to i32.
For example, in this case:

    int main( int argc, char *argv[] )
    {
        int i = ( argc > 0 ) & ( argv != NULL );
        return i;
    }

So can you let us know how to remove it?

Thank you in advance.

Minwook Ahn

p.s. Sorry for the duplicated message if you get this twice.

2010/1/20 Richard Osborne

>
> On 20 Jan 2010, at 07:55, minwook Ahn wrote:
>
> > Dear developers.
> >
> > We try to make our own backend of llvm for our target machine.
> >
> > Assume that we have the following code in our source code.
> >
> > int i = ( a < b );
> >
> > The code is translated into
> >
> > r0 <- gt r1 r2
> > r3 <- and r0 0x1
> >
> > We think that r3 is not necessary. Is there any way to eliminate it by
> just modifying
> >
> > our backend?
> >
> > Thank you in advance.
> >
> > Minwook Ahn
>
> Have you told LLVM the result of setcc operations is 0 or 1? Add the
> following to the constructor of your ISelLowering class:
>
> setBooleanContents(ZeroOrOneBooleanContent);
>
> --
> Richard Osborne | XMOS
> http://www.xmos.com
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100121/cc5c4aeb/attachment.html

From regehr at cs.utah.edu Wed Jan 20 19:29:19 2010
From: regehr at cs.utah.edu (John Regehr)
Date: Wed, 20 Jan 2010 18:29:19 -0700 (MST)
Subject: [LLVMdev] updated code size comparison
In-Reply-To:
References: <4B57274F.1020109@cs.utah.edu>
Message-ID:

> Of course, but this looks like the declaration of memset came from a
> system header.

Argh, my fault-- I let some files preprocessed on a 64-bit host sneak into
the harvesting run. I'll get rid of them for the next run.

John

From regehr at cs.utah.edu Wed Jan 20 21:17:14 2010
From: regehr at cs.utah.edu (John Regehr)
Date: Wed, 20 Jan 2010 20:17:14 -0700 (MST)
Subject: [LLVMdev] updated code size comparison
In-Reply-To:
References: <4B57274F.1020109@cs.utah.edu>
Message-ID:

> clang warns and doesn't treat the usual declaration of memset as the C
> library memset if size_t is wrong; gcc apparently doesn't care.

Eli--

I looked at this code a bit more closely and it seems to me that (in this
particular case, by luck) the gcc strategy of ignoring the problem is OK.

Clang wants size_t to be an unsigned int, whereas in these files, size_t is
an unsigned long. I can't think of any observable difference between these
two types on x86-clang.

Anyway this doesn't form an argument that clang should relax its rules, but
it does indicate that gcc is probably not doing anything too silly.

John

From regehr at cs.utah.edu Wed Jan 20 22:21:46 2010
From: regehr at cs.utah.edu (John Regehr)
Date: Wed, 20 Jan 2010 21:21:46 -0700 (MST)
Subject: [LLVMdev] updated code size comparison
In-Reply-To:
References: <4B57274F.1020109@cs.utah.edu> <4B5756B7.5030803@gmail.com> <4B5768A9.6020002@gmail.com>
Message-ID:

> Right now it only catches out of range shifts and simple array out of
> bound issues, not all undefined behavior.
Besides the obvious memory safety stuff, my list of top undefined behaviors
to catch would be:

- multiple updates to objects between sequence points

- integer overflows

- use-after-death of stack variables

- use of uninitialized stack variables

- const/volatile violations

Some of these will be no fun to implement. But the resulting tool would be
enormously valuable.

John

From Vasudev.Negi at microchip.com Wed Jan 20 23:59:42 2010
From: Vasudev.Negi at microchip.com (Vasudev.Negi at microchip.com)
Date: Wed, 20 Jan 2010 22:59:42 -0700
Subject: [LLVMdev] call graph not complete
Message-ID:

Consider C code

    void foo();

    int main(void) {
      foo();
      return 0;
    }

    void foo(void) {
      int a =10;
    }

Bitcode generated by clang with -O0 is :

    define i32 @main() nounwind {
    entry:
      %retval = alloca i32 ; [#uses=3]
      store i32 0, i32* %retval
      call void (...)* bitcast (void ()* @foo to void (...)*)()
      store i32 0, i32* %retval
      %0 = load i32* %retval ; [#uses=1]
      ret i32 %0
    }

    define void @foo() nounwind {
    entry:
      %a = alloca i32, align 4 ; [#uses=1]
      store i32 10, i32* %a
      ret void
    }

Bitcode generated by llvm-ld with -disable-opt and -basiccg options is:

    define i32 @main() nounwind {
    entry:
      %retval = alloca i32 ; [#uses=3]
      store i32 0, i32* %retval
      call void (...)* bitcast (void ()* @foo to void (...)*)()
      store i32 0, i32* %retval
      %0 = load i32* %retval ; [#uses=1]
      ret i32 %0
    }

    define void @foo() nounwind {
    entry:
      %a = alloca i32, align 4 ; [#uses=1]
      store i32 10, i32* %a
      ret void
    }

My point is why is the call to foo not resolved correctly in llvm-ld.
Here if I try to make use of the call graph pass to get the call graph, I
do not get the complete call graph.

Thanks
Vasudev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100120/cc2eb964/attachment-0001.html From rkephart at kns.com Wed Jan 20 16:16:05 2010 From: rkephart at kns.com (Kephart, Ryan) Date: Wed, 20 Jan 2010 17:16:05 -0500 Subject: [LLVMdev] Bullet Physics for WindRiver's vxWorks? Message-ID: <956B4073B6C7C94CB379EF2DF3C50E1706F4CBCF@ftwex1.corp.kns.com> Hi. I was wondering if anyone has compiled Bullet Physics for WindRiver's vxWorks (or knows of anyone who may have done so). Any insight / info / help would be very much appreciated. Thanks! -Ryan (aka keppy) From baldrick at free.fr Thu Jan 21 01:28:38 2010 From: baldrick at free.fr (Duncan Sands) Date: Thu, 21 Jan 2010 08:28:38 +0100 Subject: [LLVMdev] call graph not complete In-Reply-To: References: Message-ID: <4B580226.5000306@free.fr> Hi, > Bitcode generated by llvm-ld with -disable-opt and -basiccg options is: ... > My point is why is the call to foo not resolved correctly in llvm-ld. Resolving a call into a direct call is an optimization. But you turned all optimizations off. Ciao, Duncan. From sanjiv.gupta at microchip.com Thu Jan 21 02:47:42 2010 From: sanjiv.gupta at microchip.com (Sanjiv Gupta) Date: Thu, 21 Jan 2010 14:17:42 +0530 Subject: [LLVMdev] call graph not complete In-Reply-To: <4B580226.5000306@free.fr> References: <4B580226.5000306@free.fr> Message-ID: <4B5814AE.5000003@microchip.com> Duncan Sands wrote: > Hi, > > >> Bitcode generated by llvm-ld with -disable-opt and -basiccg options is: >> > > ... > > > >> My point is why is the call to foo not resolved correctly in llvm-ld. >> > > Resolving a call into a direct call is an optimization. But you turned all > optimizations off. > How can we just selectively turn only that optimization ON? 
We don't want to turn on a whole lot of other stuff that instcombine does, as it messes up debugging in our case. - Sanjiv > Ciao, > > Duncan. From baldrick at free.fr Thu Jan 21 02:52:05 2010 From: baldrick at free.fr (Duncan Sands) Date: Thu, 21 Jan 2010 09:52:05 +0100 Subject: [LLVMdev] call graph not complete In-Reply-To: <4B5814AE.5000003@microchip.com> References: <4B580226.5000306@free.fr> <4B5814AE.5000003@microchip.com> Message-ID: <4B5815B5.1000500@free.fr> Hi Sanjiv, > How can we just selectively turn only that optimization ON? you can't. That said, you could write your own pass that does it. > We don't want to turn on a whole lot of other stuff that instcombine > does as they mess up debugging in our case. If instcombine makes debug info useless, then that's rather bad. Can you please explain more about this. Ciao, Duncan. From sanjiv.gupta at microchip.com Thu Jan 21 02:58:09 2010 From: sanjiv.gupta at microchip.com (Sanjiv Gupta) Date: Thu, 21 Jan 2010 14:28:09 +0530 Subject: [LLVMdev] call graph not complete In-Reply-To: <4B5815B5.1000500@free.fr> References: <4B580226.5000306@free.fr> <4B5814AE.5000003@microchip.com> <4B5815B5.1000500@free.fr> Message-ID: <4B581721.4040906@microchip.com> Duncan Sands wrote: > Hi Sanjiv, > >> How can we just selectively turn only that optimization ON? > > you can't. That said, you could write your own pass that does it. > >> We don't want to turn on a whole lot of other stuff that instcombine >> does as they mess up debugging in our case. > > If instcombine makes debug info useless, then that's rather bad. Can > you please explain more about this. Multiple loads/stores to individual bitfields of a type are combined into a single load/store, and no code is generated for certain C statements. > > Ciao, > > Duncan. 
From grosser at fim.uni-passau.de Wed Jan 20 17:07:44 2010 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Thu, 21 Jan 2010 00:07:44 +0100 Subject: [LLVMdev] Make LoopBase inherit from "RegionBase"? In-Reply-To: <4367_1264007152_4B5737F0_4367_451_1_307129.59669.qm@web55608.mail.re4.yahoo.com> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> <4B473032.5080803@gmail.com> <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> <4B4CB891.5090902@fim.uni-passau.de> <25423_1263340620_4B4D0C4B_25423_3544_1_314301.28741.qm@web55601.mail.re4.yahoo.com> <4B4D1E79.8070807@fim.uni-passau.de> <25423_1263356804_4B4D4B83_25423_4709_1_457655.70882.qm@web55601.mail.re4.yahoo.com> <4B4D900D.6040601@fim.uni-passau.de> <4367_1264007152_4B5737F0_4367_451_1_307129.59669.qm@web55608.mail.re4.yahoo.com> Message-ID: <4B578CC0.5090501@fim.uni-passau.de> On 01/20/10 18:05, Jan Sjodin wrote: >>>> bbs, to generate these edges. As we do not have to modify the CFG other >>>> passes like dominance information are still valid, and we do not have to >>>> create a lot of auxiliary bbs, to be able to detect all regions. This >>>> saves memory and runtime. In general it is probably not too easy to >>>> decide where to insert these bbs either: >>> >>> The general rule is to split all blocks with multiple in-edges and multiple out-edges >>> into blocks with either multiple in-edges or multiple out-edges, but not both. >> This is not sufficient, as shown in the example below. It would allow >> only one region. > > Yes, but with the insertion of merge-blocks it will allow more (see below). Sure. If you insert the right merge blocks you could be optimal. 
In terms of finding all regions that could be created by inserting merge blocks. >>> One option is to keep this as an invariant throughout the compiler and make use >>> of the merge blocks (multiple in-edges) to contain only PHI-nodes, and all other code >>> in regular basic blocks. There are different variations on this that may or may not be >>> useful. >> This might be possible, however probably doubling the number of bbs. > > I don't know if it will be double, but there will be more basic blocks for sure. Sure. >>>>> CFG: >>>> 0 >>>> | >>>> 1 >>>> / | >>>> 2 | >>>> / \ 3 >>>> 4 5 | >>>> | | | >>>> 6 7 8 >>>> \ | / >>>> \ |/ region A: 1 -> 9 {1,2,3,4,5,6,7,8} >>>> 9 region B: 2 -> 9 {2,4,5,6,7} >>>> >>>> So we need one bb that joins 6 and 7 and one that joins the two regions >>>> >>>> CFG: 0 >>>> | >>>> 1 >>>> / | >>>> 2 | >>>> / \ 3 >>>> 4 5 | >>>> | | | >>>> 6 7 8 >>>> \ | | >>>> \ | | region A: (0,1) -> (9b,9) {1,2,3,4,5,6,7,8,9a,9b} >>>> 9a | region B: (1,2) -> (9a,9b) {2,4,5,6,7,9a} >>>> \ / >>>> 9b >>>> | >>>> 9 >>> >>> It is fairly simple to use the information from the algorithm to decide >>> where those merges should be inserted to get the expected regions. >>> This may be needed in the cases where a sub-region is too complicated >>> to be represented and must be abstracted into a "black box". >> From which algorithm? The program structure tree does not give this >> information, does it? > > The algorithm that computes the SESE-regions can be used to determine > where the merge-nodes should be inserted. There are a couple of ways > of doing it, but if the bracket sets on two edges can be intersected to > match a third edge (which dominates the first two), you can insert a > merge block for the two edges. You don't have to compute the > dominators, but it helps to explain the problem that way. Might be possible. I did not reason about this too much once I had found a reasonable algorithm that did not require these blocks. 
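For readers following along, the block-based region idea can be checked mechanically on the first CFG quoted above (blocks 0 to 9, before any merge blocks are added). What follows is only a minimal sketch under stated assumptions, not the region detection pass discussed in this thread. It assumes a region is named by an entry block and an exit block, that the entry is inside the region and dominates every block in it, that the exit is outside and postdominates every block in it, and that edges cross the region boundary only into the entry or to the exit. The names Cfg, BlockSet, setOf, dominators and isRegion are made up for this example and are not LLVM API.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <utility>
#include <vector>

// Illustrative sketch only: every name here is hypothetical, none is LLVM API.
using BlockSet = std::uint32_t;  // bit i is set when block i is in the set

struct Cfg {
  int numBlocks;                           // blocks are numbered 0..numBlocks-1
  std::vector<std::pair<int, int>> edges;  // (from, to)
};

BlockSet setOf(std::initializer_list<int> blocks) {
  BlockSet s = 0;
  for (int b : blocks) s |= BlockSet(1) << b;
  return s;
}

// The first CFG from the mail above (no merge blocks 9a/9b).
Cfg exampleCfg() {
  Cfg g;
  g.numBlocks = 10;
  g.edges = {{0, 1}, {1, 2}, {1, 3}, {2, 4}, {2, 5}, {3, 8},
             {4, 6}, {5, 7}, {6, 9}, {7, 9}, {8, 9}};
  return g;
}

// Simple iterative dominator sets (fine for tiny graphs with < 32 blocks).
// With reversed == true the edges are flipped, so passing the function's
// exit block as root yields postdominator sets. Assumes every block is
// reachable from the root in the chosen direction.
std::vector<BlockSet> dominators(const Cfg &g, int root, bool reversed) {
  const BlockSet all = (BlockSet(1) << g.numBlocks) - 1;
  std::vector<BlockSet> dom(g.numBlocks, all);
  dom[root] = BlockSet(1) << root;
  bool changed = true;
  while (changed) {
    changed = false;
    for (int b = 0; b < g.numBlocks; ++b) {
      if (b == root) continue;
      BlockSet meet = all;  // intersection over all predecessors of b
      for (const auto &e : g.edges) {
        int from = reversed ? e.second : e.first;
        int to = reversed ? e.first : e.second;
        if (to == b) meet &= dom[from];
      }
      BlockSet next = meet | (BlockSet(1) << b);
      if (next != dom[b]) {
        dom[b] = next;
        changed = true;
      }
    }
  }
  return dom;
}

// Checks the assumed region conditions for "entry -> exit, containing blocks".
bool isRegion(const Cfg &g, int entry, int exit, BlockSet blocks,
              int funcEntry, int funcExit) {
  auto in = [&](int b) { return ((blocks >> b) & 1u) != 0; };
  if (!in(entry) || in(exit)) return false;  // entry inside, exit outside
  std::vector<BlockSet> dom = dominators(g, funcEntry, false);
  std::vector<BlockSet> pdom = dominators(g, funcExit, true);
  for (int b = 0; b < g.numBlocks; ++b) {
    if (!in(b)) continue;
    if (((dom[b] >> entry) & 1u) == 0) return false;  // entry must dominate b
    if (((pdom[b] >> exit) & 1u) == 0) return false;  // exit must postdominate b
  }
  for (const auto &e : g.edges) {
    // Edges may only enter at the entry block and only leave to the exit block.
    if (!in(e.first) && in(e.second) && e.second != entry) return false;
    if (in(e.first) && !in(e.second) && e.second != exit) return false;
  }
  return true;
}
```

Under these assumptions, calling isRegion on exampleCfg() with entry 1, exit 9 and blocks {1,2,3,4,5,6,7,8} (region A), or with entry 2, exit 9 and blocks {2,4,5,6,7} (region B), accepts both without touching the CFG. A set such as {2,4,6} with the same entry and exit is rejected because the edge 2 -> 5 leaves the set without going to the exit.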
>>>>> My approach is comparable to this paper: >>>>> The Refined Process Structure Tree by Jussi Vanhatalo, Hagen Völzer, >>>>> Jana Koehler >>>> >>>> I was looking through some slides that described their algorithm. One case that >>>> seems to be legal is this: >>>> >>>> | >>>> Entry >>>> / \ >>>> R0 R1 >>>> \ / >>>> Exit >>>> | >>>> >>>> With two fragments: Entry->R0->Exit and Entry->R1->Exit, which means >>>> that a fragment cannot be identified using only the entry and exit blocks, but >>>> the internal blocks or edges will also need to be listed. I don't know if this is >>>> relevant to your implementation. >> >> No. The ideas are comparable, however I believe their implementation is >> a little complicated. ;-) > > Do you have the same definition of a region and entry/exit blocks as they do? No. They define a region based on all the edges in the region. This is a very verbose definition, as all edges have to be saved. Also it is quite expensive to compare regions, ... However, their description makes it possible to talk about all possible regions. >> I would mark the regions as: >> >> Region A: R0 -> Exit, containing {R0} >> Region B: R1 -> Exit, containing {R1} > > Is the entry always contained and is the exit never contained, or is that specified > per region? Depending on the restrictions of entry and exit blocks a loop with a single > basic block cannot be an entry or exit by itself. Example: The entry is always contained and dominates all bbs in the region. The exit is never contained, but it postdominates all bbs in the region. > > | > A > /| _ > / |/ \ > B R | > \ |\_/ > \| > C > | > If you only care about R in this case how is the region formed? This is: R -> C, containing {R} So R is the entry BB with the entry edge A->R and C is the exit BB with the exit edge R->C. R is in the SCoP, C is not. > > >>>> The implementation however takes advantage of the existence of >>>> Dominance/PostDominance information. Therefore it is simpler and >>>> hopefully faster. 
At the moment run time is comparable to dominance tree >>>> calculation. >>> >>> Both algorithms are linear so there is really no big difference in time imo. >> Sure. However in terms of maintainability it is nice to be able to reuse >> existing analysis instead of write another triconnected component >> analysis upfront. >> >>> I believe the biggest difference that you mention is that you can capture more >>> complicated regions without having to modify the CFG with the current >>> algorithm. >> Yes. >> >>>> If you want, have a look into some results I got with a pass extracting >>>> maximal non trivial regions that do not contain loops from libbz2: >>>> >>>> http://tobias.osaft.eu/llvm/region/bz2NoLoops/ >>>> >>>> Interesting ones: >>>> >>>> regions_without_loops.BZ2_bzBuffToBuffCompress.dot.png >>>> has a lot of exit edges. >>> >>> I think this example proves the strengths and weaknesses of both >>> approaches. Making that region into structured control flow would add a lot >>> of additional blocks. This will also happen after generating code >>> from the polyhedral model, so either way the cost is there if the optimization >>> is successful. >> Yes, but just in this case and not for regions where the model cannot be >> applied. If the regions pass is used for analysis purposes, nothing has >> to be touched. >> >>> The second case is where the optimization fails (no profitable >>> transformation found) and the CFG can remain untouched. >>> >>> The third case is if one of those blocks contains something complicated. >>> I believe the current algorithm simply fails and cannot detect the region. >> Which algorithm? The one in Graphite? Or the region detection I wrote >> here? This is just plain region detection, that does not even look at >> the content but builds a region tree (program structure tree). It just >> detects every possible region. >> The selection would be a later pass. > > My assumption was that there is a selection in there somewhere. 
> Do you plan to refine the regions in the selection phase in any way? Sure. The region tree can be processed and several sets of regions can be filtered out. E.g. the maximal regions that fulfill a certain restriction. The minimal regions that fulfill a certain restriction. Or some more complicated ones where some problems (like irregular control flow) can be hidden in SESE subcomponents. >>> If the >>> CFG is modified this would allow an internal SESE-region to become a black box, and >>> the outer regions could be optimized. >> This is an optimization, however I think it is orthogonal to the region >> detection problem. Say it works with any algorithm. > > I believe that creating a black-box will map a lot more cleanly to an edge-based > region definition, since block-based may include multiple entry/exit sub-regions > that will not encapsulate control flow in a reasonable way. The block-based definition works as well as any edge-based definition to define a SESE region. And it should be able to hide stuff in black boxes. > >>>> regions_without_loops.bzopen_or_bzdopen.dot.png >>>> the first region has two entry edges. One is the loop latch. >>>> (Keep in mind all regions have the same color, so if it seems there is >>>> an edge into a region, there are just two regions close by) >>>> >>>> Without a prepass that exposes the edges almost no region could be >>>> detected with the "standard" approach. >>> >>> Indeed the CFG will have to be modified for these cases. It seems to me that the trade-off >>> between the two approaches is that the algorithm that you currently have is cheaper up >>> front, but may be less capable in some cases, while the "standard" algorithm will be more >>> expensive, but can handle problematic regions better. Would you agree? >> >> I agree that the algorithm I have is cheaper upfront, but I do not yet >> see a case where the algorithm is less capable. 
Would you mind giving >> an example or highlighting the relevant part of the discussion? > > With the insertion of extra merge-blocks the code becomes more structured and the PST > can be refined further. A more fine-grained PST may allow more cases to be handled. That is actually the difference between my algorithm and the PST bracket algorithm. Mine should find the same regions that the PST one finds after inserting the best possible merge blocks, just without requiring CFG changes. I am writing some documentation about this that we could discuss later on. If you are interested, I ran my analysis on aermod from polyhedron. The results can be found here: http://tobias.osaft.eu/llvm/region/aermod/ There is a .regtree.txt file containing the region tree for every function of the binary. Furthermore there are the results of an example analysis that finds the biggest regions without loops (the .dot and .svg files). If you have the impression some regions are not SESE, keep in mind that I used the same color for all regions, so two of them may just be next to each other. Thanks for discussing this with me. It helped me to get a feeling for where the differences are. Hoping we can have these discussions more often. Tobi From grosser at fim.uni-passau.de Thu Jan 21 03:12:18 2010 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Thu, 21 Jan 2010 10:12:18 +0100 Subject: [LLVMdev] Make LoopBase inherit from "RegionBase"? 
In-Reply-To: <4367_1264007800_4B573A77_4367_537_1_913893.66561.qm@web55606.mail.re4.yahoo.com> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> <4B473032.5080803@gmail.com> <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> <4B4CB891.5090902@fim.uni-passau.de> <314301.28741.qm@web55601.mail.re4.yahoo.com> <6363_1263521041_4B4FCD10_6363_3705_1_4B4FCD05.60107@gmail.com> <4B502D92.5070704@fim.uni-passau.de> <4367_1264007800_4B573A77_4367_537_1_913893.66561.qm@web55606.mail.re4.yahoo.com> Message-ID: <4B581A72.5070109@fim.uni-passau.de> On 01/20/10 18:16, Jan Sjodin wrote: >> I think this should be implemented as a RegionFilter, that checks if a > >> region contains a loop, and that can be asked for further information. >> In general I do not think this kind of analysis belongs to a region, but >> as you proposed some kind of filter could be applied. In the short term >> the passes who need this information could get it on their own. >> >>> and a long term consideration about "region pass": >>> >>> if we want to integrate region analysis and optimization framework into >>> llvm, i think we can use an approach that similar to loop analysis and >>> optimization: write a class "regionpass" inherit from "pass", and the >>> corresponding pass manger "RegionPassManager". (a kind of function pass) >>> if we follow this approach, we need to push the region pass manager into >>> the llvm pass manager stack. >>> the first question of this approach is, whats the relationship between >>> "loop pass manager" and "region pass manager"? 
>>> >>> way 1: make region pass manager below loop pass manager in the stack >>> >>> pass manager stack: >>> >>> bb pass manager <---top >>> loop pass manager >>> region pass manager >>> function pass manager >>> ... <---bottom >>> >>> in this way the region hierarchy needs to be reconstructed when a loop >>> transform changes it. >>> >>> way 2: make region pass manager above loop pass manager in the stack >>> >>> pass manager stack: >>> >>> bb pass manager <---top >>> region pass manager >>> loop pass manager >>> function pass manager >>> ... <---bottom >>> >>> in this way the loop hierarchy needs to be reconstructed when a region pass >>> changes it. >>> >>> now we need to choose a way to minimize the loop reconstruction or >>> region reconstruction. I think that the chance that a region transform >>> affects the loop structure is smaller, so maybe way 2 is better. >> This would need some thought. Ideally I think we would not order them, but if >> a region changed, just reconstruct the loops that are in this region and >> if a >> loop changed just reconstruct the regions in this loop. > > Imo, a loop is simply a special kind of region, so a "filter" is perhaps the way to > go if you are interested in loops. Regions containing loops will have to be inspected > using the PST. Except for loops that have multiple exits. They are not necessarily (single entry single exit) regions if the exits do not jump to the same exit bb. From omerigener at gmail.com Thu Jan 21 04:00:20 2010 From: omerigener at gmail.com (Gener Omer) Date: Thu, 21 Jan 2010 12:00:20 +0200 Subject: [LLVMdev] Profile-Guided Optimization status In-Reply-To: References: Message-ID: Any suggestions? On Wed, Jan 20, 2010 at 11:27 AM, Gener Omer wrote: > Hi all, > > I would like to know the status of profile-guided optimization. What > enhancements could be done to the current implementation? Are the ideas for > profile-guided transformations from [0] still available? > > Many thanks. 
> Gener > > > [0] http://llvm.org/OpenProjects.html#profileguided > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100121/a5c7c541/attachment.html From sanjiv.gupta at microchip.com Thu Jan 21 04:20:25 2010 From: sanjiv.gupta at microchip.com (Sanjiv Gupta) Date: Thu, 21 Jan 2010 15:50:25 +0530 Subject: [LLVMdev] call graph not complete In-Reply-To: <4B581721.4040906@microchip.com> References: <4B580226.5000306@free.fr> <4B5814AE.5000003@microchip.com> <4B5815B5.1000500@free.fr> <4B581721.4040906@microchip.com> Message-ID: <4B582A69.5060908@microchip.com> Sanjiv Gupta wrote: > Duncan Sands wrote: > >> Hi Sanjiv, >> >> >>> How can we just selectively turn only that optimization ON? >>> >> you can't. That said, you could write your own pass that does it. >> >> >>> We don't want to turn on a whole lot of other stuff that instcombine >>> does as they mess up debugging in our case. >>> >> If instcombine makes debug info useless, then that's rather bad. Can >> you please explain more about this. >> > multiple loads/stores to individual bitfields of a type are combined > into single load/store and there are no generated code for certain C > statements. > > I think I was not clear enough in my previous email. Our requirement is a little peculiar. We want debugging to work better when no optimizations are specified but at the same time we want the call resolve to happen as we do function frame overlaying on top of the call graph. So, it does not necessarily mean that -instcombine is messing up the debug info. - Sanjiv >> Ciao, >> >> Duncan. 
>> > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From astifter-llvm at gmx.at Thu Jan 21 04:20:51 2010 From: astifter-llvm at gmx.at (Andreas Neustifter) Date: Thu, 21 Jan 2010 11:20:51 +0100 Subject: [LLVMdev] ProfileInfo Questions -- How to proceed? In-Reply-To: <201001202330.o0KNUTPZ023868@zion.cs.uiuc.edu> References: <201001202330.o0KNUTPZ023868@zion.cs.uiuc.edu> Message-ID: <4B582A83.50608@gmx.at> Hi all, I have some questions about the maintenance of the Profiling Code, since this is largely my contribution I still feel responsible for it. (I have not finished my master thesis, so it also is still work in progress... :-) When there are API changes in LLVM, I guess the person changing the API is responsible for changing all the occurrences in LLVM? (So I do not have to worry about that...) The other thing is bugs and problems: I'm watching the llvm-bugs and llvm-commits lists for profiling related posts, I hope that's enough to keep everyone happy. (I usually do not respond to build problems related to the runtime library libprofile since I really can not help in that cases, I hope this is okay too.) Chris and me talked a while ago that he would like to see the ProfileInfo implementation changed from an template-based one to a (void*)-based one (for a lack of better terms). I haven't wrapped my head around this, how important is this? Would it be a showstopper for 2.7 if this stuff is still template-based? And the (almost) last question: How shall I proceed with preserving the ProfileInfo throughout the various optimistation passes? I have a (huge) patch that adds support for the std-compile-opts passes, but it is still not preserving all of the ProfileInfo correctly. Is correctness a huge factor? Or is having the preservation at least in a somewhat working state (so that it is usable) more important? 
In general terms it works fine, there are just some quirks where the flow condition is not met. In my personal opinion the ProfileInfo is still of some use after that... And the last one: When passes are added or modified and the ProfileInfo preservation breaks, who is responsible for fixing it? I guess the ProfileInfo is the only analysis that cannot be recalculated after it is destroyed (as compared to e.g. the LoopInfo) and thus has to be preserved (ideally) everywhere, so this is quite an issue... Sorry for the lengthy post, thanks for the answers in advance... Andi From miaoyisz at gmail.com Thu Jan 21 08:51:37 2010 From: miaoyisz at gmail.com (Mary_nju) Date: Thu, 21 Jan 2010 06:51:37 -0800 (PST) Subject: [LLVMdev] How to create a CallInst that calls a standard c function like "printf" In-Reply-To: <27210247.post@talk.nabble.com> References: <27210247.post@talk.nabble.com> Message-ID: <27258957.post@talk.nabble.com> SOS!!! This is really an urgent problem for me. If anyone knows the answer, please let me know. I would appreciate it. Thank you very much! Mary_nju wrote: > > > I am working on a program based on LLVM. I want to modify the .bc file > through the C++ APIs provided by LLVM, but I don't know how to create a > CallInst that calls a standard c function like "printf", can anyone help > me with this problem? > > The file attached is the program I wrote, it can be compiled, however, the > result of the dump of the retrieved module is not correct (missing global > variable and will cause 'program use external function 'myprintf' which > could not be resolved') problem. > http://old.nabble.com/file/p27210247/test.cpp test.cpp > :-) 
From criswell at uiuc.edu Thu Jan 21 10:03:41 2010 From: criswell at uiuc.edu (John Criswell) Date: Thu, 21 Jan 2010 10:03:41 -0600 Subject: [LLVMdev] How to create a CallInst that calls a standard c function like "printf" In-Reply-To: <27258957.post@talk.nabble.com> References: <27210247.post@talk.nabble.com> <27258957.post@talk.nabble.com> Message-ID: <4B587ADD.1060304@uiuc.edu> Mary_nju wrote: > SOS!!! > It is really an emergency problem for me to resolve, if anyone knows the > answer, please let me know, I will > appreciate it, thank you very much! > The way to do this is to write code to do two things: 1) Write code that will insert a new function named printf that has no body. 2) Write code that will insert a call instruction (CallInst) that will call the printf function you created in step 1. After running your transform, you should generate native code and then link against the C library. The function with no body will be resolved during the final native code link. The JIT will automatically do this for you. If you're doing static compilation, you use llc to generate native assembly code from the bitcode (.bc file) and then use gcc to assemble the output and link it with the standard libraries. Looking at your code, you seem to have done this. The only problem is that you named the function "myprintf" instead of "printf". -- John T. > > > Mary_nju wrote: > >> I am working on a program based on LLVM. I want to modify the .bc file >> throught C++ APIs provided by LLVM, but I don't know how to create a >> CallInst that calls a standard c function like "printf", can anyone help >> me with this problem? >> >> The file attached is the program I wrote, it can be compiled, however, the >> result of the dump of the retrieved module is not correct(missing global >> variable and will cause 'program use external function 'myprintf' which >> could not be resolved') problem. 
>> http://old.nabble.com/file/p27210247/test.cpp test.cpp >> >> > :-) > From rich at rd.gen.nz Thu Jan 21 12:40:11 2010 From: rich at rd.gen.nz (Rich Dougherty) Date: Fri, 22 Jan 2010 07:40:11 +1300 Subject: [LLVMdev] How to create a CallInst that calls a standard c function like "printf" In-Reply-To: <27210247.post@talk.nabble.com> References: <27210247.post@talk.nabble.com> Message-ID: On Tue, Jan 19, 2010 at 2:08 AM, Mary_nju wrote: > I am working on a program based on LLVM. I want to modify the .bc file > through the C++ APIs provided by LLVM, but I don't know how to create a > CallInst that calls a standard c function like "printf", can anyone help me > with this problem? Hi A good way to work this kind of thing out is to use the online demo application and look at the C++ code that it prints out when you compile it. e.g. * Go to http://llvm.org/demo/ * Click "Show LLVM C++ API code" * Click "Compile Source Code" Search the code for "printf" and you should be able to easily see how it is used. Cheers Rich -- Rich Dougherty http://www.richdougherty.com/ From sanjiv.gupta at microchip.com Thu Jan 21 13:03:15 2010 From: sanjiv.gupta at microchip.com (Sanjiv Gupta) Date: Fri, 22 Jan 2010 00:33:15 +0530 Subject: [LLVMdev] call graph not complete In-Reply-To: <4B580226.5000306@free.fr> References: <4B580226.5000306@free.fr> Message-ID: <4B58A4F3.3010906@microchip.com> Duncan Sands wrote: > Hi, > > >> Bitcode generated by llvm-ld with -disable-opt and -basiccg options is: >> > > ... > > > >> My point is why is the call to foo not resolved correctly in llvm-ld. >> > > Resolving a call into a direct call is an optimization. But you turned all > optimizations off. > > BTW, why does clang generate an indirect call in the first place for the program given in the original post? Does C99 tell us to generate an indirect call when the prototype is like this: void foo(); ? - Sanjiv > Ciao, > > Duncan. 
> _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From rjmccall at apple.com Thu Jan 21 13:20:22 2010 From: rjmccall at apple.com (John McCall) Date: Thu, 21 Jan 2010 11:20:22 -0800 Subject: [LLVMdev] call graph not complete In-Reply-To: <4B58A4F3.3010906@microchip.com> References: <4B580226.5000306@free.fr> <4B58A4F3.3010906@microchip.com> Message-ID: <582CE241-EEF1-475B-AE52-E33D4C7431E6@apple.com> On Jan 21, 2010, at 11:03 AM, Sanjiv Gupta wrote: > Duncan Sands wrote: >> Hi, >> >> >>> Bitcode generated by llvm-ld with ?disable-opt and ?basiccg options is: >>> >> >> ... >> >> >> >>> My point is why is the call to foo not resolved correctly in llvm-ld. >>> >> >> Resolving an call to a direct call is an optimization. But you turned all >> optimizations off. >> >> > BTW, why do clang generates an indirect call in the first place for the > program given in the original post? Does c99 tell us to generate an > indirect call when the prototype is like this > void foo(); > > ? C99 doesn't tell us to generate an indirect call, but it does tell us that we can't make assumptions about the actual signature of foo (other than the return type). That matters here because it means we have no idea what that signature is when we're generating code for main(). You could easily write: void foo(); int main() { foo(); } void foo(int i) { } If we codegenned this as a direct call, we'd get: define i32 @main() nounwind { entry: call void (...)* @foo() ret i32 0 } define void @foo(i32) nounwind { entry: ret void } This is a type error, because @foo is not of type void (...)*. We could work around this specific problem by delaying code-generation for @main until we see the definition of foo, but that still wouldn't work for e.g. link-time optimization. John. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100121/1788b621/attachment.html From daniel at zuster.org Thu Jan 21 13:38:16 2010 From: daniel at zuster.org (Daniel Dunbar) Date: Thu, 21 Jan 2010 11:38:16 -0800 Subject: [LLVMdev] [cfe-dev] FindExecutable In-Reply-To: References: Message-ID: <6a8523d61001211138x4fd0bd3fx440072ef4b5685@mail.gmail.com> FYI, this is probably more appropriate to llvmdev. I think someone had a patch on the llvmdev/llvm-commits list to deal with this. Just appending .exe isn't really correct, it should also respect .bat and the user definable things. It imagine there is a Win32 function to help with this, but don't know what it is. The ideal approach is to make sure the function does exactly what cmd.exe does. - Daniel On Thu, Jan 21, 2010 at 11:24 AM, Jim Crafton wrote: > I was just playing with llvm-ld, and noticed that it keeps giving me > errors that it can't find "llc", even though it's in the same > directory. This is running on windows, built with visual studio 2008. > Stepping through the code, I notice that llvm::FindExecutable() is > looking for "llc" not "llc.exe", which means, on windows at least, > that it will *never* find the compiler. Would a simple fix be to just > append a ".exe" to the ExeName parameter if the search fails and > search again? Or is there a better approach? > > Cheers > > Jim > _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev > From clattner at apple.com Thu Jan 21 13:39:23 2010 From: clattner at apple.com (Chris Lattner) Date: Thu, 21 Jan 2010 11:39:23 -0800 Subject: [LLVMdev] ComplexPattern In-Reply-To: <4B55C866.5060809@cam.ac.uk> References: <4B55C866.5060809@cam.ac.uk> Message-ID: <51C0D89D-9546-49EB-A7F5-97F7E1DD42B3@apple.com> On Jan 19, 2010, at 6:57 AM, Greg Chadwick wrote: > Hi, > > I was wondering if someone could explain precisely what the > ComplexPattern tablegen class does? 
ComplexPattern allows you to define arbitrary C++ code that does pattern matching. This is useful for things like the X86 addressing mode matcher, which have lots of special cases. > > Here's the first line of the definition (from TargetSelectionDAG.td) > for > reference: > > class ComplexPattern<ValueType ty, int numops, string fn, list<SDNode> roots = [], > list<SDNodeProperty> props = [], > list<CPAttribute> attrs = []> > > As far as I can tell it gives the name of a selection function (fn) > that > will be called to match that particular ComplexPattern. If that > function returns true, the pattern has matched. The match function can > also fill in some operands that can be used later on (the number is > specified by numops); ty presumably specifies the type of node that > this > match can be attempted on. Is my understanding of this correct? Yes. > The thing I'm still unsure about is roots, what exactly does this do? > The comment above the definition specifies that 'RootNodes are the > list > of possible root nodes of the sub-dags to match' (RootNodes is > assigned > from roots so they're the same) but I can't make any sense of this. I don't recall offhand; the best advice is to find an existing target that does it and look at what it is accomplishing. -Chris From jan_sjodin at yahoo.com Thu Jan 21 13:42:26 2010 From: jan_sjodin at yahoo.com (Jan Sjodin) Date: Thu, 21 Jan 2010 11:42:26 -0800 (PST) Subject: [LLVMdev] Make LoopBase inherit from "RegionBase"?
In-Reply-To: <4B581A72.5070109@fim.uni-passau.de> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> <4B473032.5080803@gmail.com> <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> <4B4CB891.5090902@fim.uni-passau.de> <314301.28741.qm@web55601.mail.re4.yahoo.com> <6363_1263521041_4B4FCD10_6363_3705_1_4B4FCD05.60107@gmail.com> <4B502D92.5070704@fim.uni-passau.de> <4367_1264007800_4B573A77_4367_537_1_913893.66561.qm@web55606.mail.re4.yahoo.com> <4B581A72.5070109@fim.uni-passau.de> Message-ID: <754916.87861.qm@web55607.mail.re4.yahoo.com> >> Imo, a loop is simply a special kind of region, so a "filter" is perhaps the way to >> go if you are interested in loops. Regions containing loops will have to be inspected >> using the PST. > > Except loops that have multiple exits. They are not necessarily (single > entry single exit) regions, if the exits do not jump to the same exit bb. Yes, my assumption was to classify structured loops, but that does not exclude unstructured regions that contain unstructured loops. From jim.crafton at gmail.com Thu Jan 21 14:26:39 2010 From: jim.crafton at gmail.com (Jim Crafton) Date: Thu, 21 Jan 2010 15:26:39 -0500 Subject: [LLVMdev] how to compile asm output for x86 with Micorsoft's ML Message-ID: I was recently trying to compile some bitcode generated like so: llc --x86-asm-syntax=intel test.out it generates a test.out.s file, and while I'm not much of an assembly person, it looks OK. When I feed this to ML ml test.out.s I get tons of errors, most of which appear to be syntax errors. Stuff like: test.out.s(2) : error A2008:syntax error : . error A2008:syntax error : objc_class_name_Fraction etc.
Is it possible to generate X86 code that Microsoft's assembler will accept? Is there another option that I missed? I looked around a bit but couldn't find anything. The example on the llvm site mentions being able to compile to C code, but when I try llc -march=c test.out it errors out with: error: invalid target 'c'. Checking the code (I built LLVM 2.7svn) it looks like the C target isn't registered, and llc -version gives me: llc -version Low Level Virtual Machine (http://llvm.org/): llvm version 2.7svn DEBUG build with assertions. Built Jan 20 2010 (14:58:43). Host: i686-pc-win32 Host CPU: core2 Registered Targets: x86 - 32-bit X86: Pentium-Pro and above x86-64 - 64-bit X86: EM64T and AMD64 Cheers Jim From ofv at wanadoo.es Thu Jan 21 14:35:43 2010 From: ofv at wanadoo.es (=?utf-8?Q?=C3=93scar_Fuentes?=) Date: Thu, 21 Jan 2010 21:35:43 +0100 Subject: [LLVMdev] how to compile asm output for x86 with Micorsoft's ML References: Message-ID: <87iqav8acg.fsf@telefonica.net> Jim Crafton writes: [snip] > I looked around a bit but couldn't find anything. The example on the > llvm site mentions being able to compile to C code, but when I try llc > -march=c test.out > > it errors out with: > error: invalid target 'c'. [snip] > Registered Targets: > x86 - 32-bit X86: Pentium-Pro and above > x86-64 - 64-bit X86: EM64T and AMD64 By default, the cmake build generates Visual Studio project files for the X86 target only. Take a look at http://www.llvm.org/docs/CMake.html#llvmvars for learning how to build other targets. IIRC, the C target is named CBackend. From jan_sjodin at yahoo.com Thu Jan 21 14:38:38 2010 From: jan_sjodin at yahoo.com (Jan Sjodin) Date: Thu, 21 Jan 2010 12:38:38 -0800 (PST) Subject: [LLVMdev] Make LoopBase inherit from "RegionBase"? 
In-Reply-To: <4B578CC0.5090501@fim.uni-passau.de> References: <26944_1261830433_4B360121_26944_6925_1_5f72161f0912260406o55089883l3ea705c33484cf48@mail.gmail.com> <4B36837A.2020707@fim.uni-passau.de> <17380_1262789602_4B44A3E2_17380_3045_1_4B44A3F2.4000109@gmail.com> <4B44AD1A.30204@fim.uni-passau.de> <645d868c1001060811v5bdb4cedib032fa4a9c2c6a07@mail.gmail.com> <4B473032.5080803@gmail.com> <25196_1262956810_4B47310A_25196_885_1_4B473118.4090104@gmail.com> <4B4CB891.5090902@fim.uni-passau.de> <25423_1263340620_4B4D0C4B_25423_3544_1_314301.28741.qm@web55601.mail.re4.yahoo.com> <4B4D1E79.8070807@fim.uni-passau.de> <25423_1263356804_4B4D4B83_25423_4709_1_457655.70882.qm@web55601.mail.re4.yahoo.com> <4B4D900D.6040601@fim.uni-passau.de> <4367_1264007152_4B5737F0_4367_451_1_307129.59669.qm@web55608.mail.re4.yahoo.com> <4B578CC0.5090501@fim.uni-passau.de> Message-ID: <397148.38552.qm@web55606.mail.re4.yahoo.com> >>> No. The ideas are comparable, however I believe their implementation is >>> a little complicated. ;-) >> >> Do you have the same definition of a region and entry/exit blocks as they do? > > No. They define a region based on all the edges in the region. This is a > very verbose definition, as all edges have to be saved. Also it is quite > expensive to compare regions, ... > > However their description allows to talk about all possible regions. I agree, their definition did not seem practical for our purposes. >> Is the entry always contained and is the exit never contained, or is that specified >> per region? Depending on the restrictions of entry and exit blocks a loop with a single >> basic block cannot be an entry or exit by itself. Example: > > The entry is always contained and dominates all bbs in the region. The > exit is never contained, but it postdominates all bbs in the region. Okay, that makes sense. >> With the insertion of extra merge-blocks the code becomes more structured and the PST >> can be refined further. 
A more fine-grained PST may allow more cases to be handled. > > That is actually the difference between my algorithm and the PST bracket > algorithm. Mine should get the same regions, that the PST one gets after > inserting the best merge blocks possible. Just without requiring CFG > changes. With the definition of a SESE-region you mention above I can see how that would work. It was just not clear to me what the definition was to begin with. Back-edges are another matter, but I am sure they are handled one way or another. I will have to look more into the algorithm. > I am writing some documentation about this that we could discuss > later on. Sure. > Thanks for discussing this with me. It helped me to get a feeling where > the differences are. Hoping we can have these discussions more often. Yes, there are always interesting details to be discussed between different algorithms. - Jan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvmdev/attachments/20100121/49d1910d/attachment.html From jim.crafton at gmail.com Thu Jan 21 14:54:53 2010 From: jim.crafton at gmail.com (Jim Crafton) Date: Thu, 21 Jan 2010 15:54:53 -0500 Subject: [LLVMdev] how to compile asm output for x86 with Micorsoft's ML In-Reply-To: <87iqav8acg.fsf@telefonica.net> References: <87iqav8acg.fsf@telefonica.net> Message-ID: > By default, the cmake build generates Visual Studio project files for > the X86 target only. Take a look at > > http://www.llvm.org/docs/CMake.html#llvmvars > > for learning how to build other targets. OK thanks, I'll look at that. In the meantime, is it possible to get the assembly generated by llc to work with ML? That would probably be the ideal solution.
Cheers Jim From clattner at apple.com Thu Jan 21 15:07:21 2010 From: clattner at apple.com (Chris Lattner) Date: Thu, 21 Jan 2010 13:07:21 -0800 Subject: [LLVMdev] cygwin/mingw help Message-ID: Can someone with cygwin/mingw access tell me what .s file gcc compiles this .c file to: static int test[100]; void *x = &test; Thanks! -Chris From jim.crafton at gmail.com Thu Jan 21 15:14:16 2010 From: jim.crafton at gmail.com (Jim Crafton) Date: Thu, 21 Jan 2010 16:14:16 -0500 Subject: [LLVMdev] cygwin/mingw help In-Reply-To: References: Message-ID: This is what I got from gcc -S t.c .file "t.c" .globl _x .data .align 4 _x: .long _test .lcomm _test,400 On Thu, Jan 21, 2010 at 4:07 PM, Chris Lattner wrote: > Can someone with cygwin/mingw access tell me what .s file gcc compiles > this .c file to: > > static int test[100]; > void *x = &test; > > Thanks! > > -Chris > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From czhao at eecg.toronto.edu Thu Jan 21 15:16:00 2010 From: czhao at eecg.toronto.edu (Chuck Zhao) Date: Thu, 21 Jan 2010 16:16:00 -0500 Subject: [LLVMdev] cygwin/mingw help In-Reply-To: References: Message-ID: <4B58C410.1060301@eecg.toronto.edu> .file "test0.c" .globl _x .data .align 4 _x: .long _test .lcomm _test,400 Using gcc-4.2.4 on Cygwin/WinXP, with -c -O0 -S flags. Chuck On 1/21/2010 4:07 PM, Chris Lattner wrote: > Can someone with cygwin/mingw access tell me what .s file gcc compiles > this .c file to: > > > static int test[100]; > void *x =&test; > > Thanks!
> > -Chris > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > From clattner at apple.com Thu Jan 21 15:31:05 2010 From: clattner at apple.com (Chris Lattner) Date: Thu, 21 Jan 2010 13:31:05 -0800 Subject: [LLVMdev] how to compile asm output for x86 with Micorsoft's ML In-Reply-To: References: <87iqav8acg.fsf@telefonica.net> Message-ID: On Jan 21, 2010, at 12:54 PM, Jim Crafton wrote: >> By default, the cmake build generates Visual Studio project files for >> the X86 target only. Take a look at >> >> http://www.llvm.org/docs/CMake.html#llvmvars >> >> for learning how to build other targets. > > OK thanks, I'll look at that. > > In the meantime, is it possible to get the assembly generated by llc > to work with ML? That would probably be the ideal solution. Nope, llvm's .s output is only compatible with GAS and other AT&T syntax assemblers. It turns out that MASM syntax is highly ambiguous and MASM is not production quality for use by a compiler. This is why Visual Studio doesn't go through it. Long term, we'd like LLVM to be able to write out .o files directly; if you're interested in adding PECOFF support, that would be very nice :) -Chris From clattner at apple.com Thu Jan 21 15:31:29 2010 From: clattner at apple.com (Chris Lattner) Date: Thu, 21 Jan 2010 13:31:29 -0800 Subject: [LLVMdev] cygwin/mingw help In-Reply-To: References: Message-ID: thanks! On Jan 21, 2010, at 1:14 PM, Jim Crafton wrote: > This is what I got from > > gcc -S t.c > > > .file "t.c" > .globl _x > .data > .align 4 > _x: > .long _test > .lcomm _test,400 > > > > On Thu, Jan 21, 2010 at 4:07 PM, Chris Lattner > wrote: >> Can someone with cygwin/mingw access tell me what .s file gcc >> compiles >> this .c file to: >> >> static int test[100]; >> void *x = &test; >> >> Thanks!
>> >> -Chris >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> From jim.crafton at gmail.com Thu Jan 21 16:01:07 2010 From: jim.crafton at gmail.com (Jim Crafton) Date: Thu, 21 Jan 2010 17:01:07 -0500 Subject: [LLVMdev] how to compile asm output for x86 with Micorsoft's ML In-Reply-To: References: <87iqav8acg.fsf@telefonica.net> Message-ID: > Nope, llvm's .s output is only compatible with GAS and other at&t syntax > assemblers. It turns out that MASM syntax is highly ambiguous and MASM is > not production quality for use by a compiler. This is why visual studio > doesn't go through it. Long term, we'd like LLVM to be able to write out .o > files directly, if you're interested in adding PECOFF support, that would be > very nice :) Crapola. I was afraid that was going to be the case. This was originally something to do to have fun playing with Objective C, I'm not sure PECOFF support would fall under that :) Any idea how nasty that would be? Oh well, I guess the idea of doing this on windows isn't going to happen anytime soon. Cheers Jim > > -Chris > From clattner at apple.com Thu Jan 21 16:56:52 2010 From: clattner at apple.com (Chris Lattner) Date: Thu, 21 Jan 2010 14:56:52 -0800 Subject: [LLVMdev] how to compile asm output for x86 with Micorsoft's ML In-Reply-To: References: <87iqav8acg.fsf@telefonica.net> Message-ID: <07E4D6BA-7FDC-4866-AAA3-AE88E813E408@apple.com> On Jan 21, 2010, at 2:01 PM, Jim Crafton wrote: >> Nope, llvm's .s output is only compatible with GAS and other at&t >> syntax >> assemblers. It turns out that MASM syntax is highly ambiguous and >> MASM is >> not production quality for use by a compiler. This is why visual >> studio >> doesn't go through it.
Long term, we'd like LLVM to be able to >> write out .o >> files directly, if you're interested in adding PECOFF support, that >> would be >> very nice :) > > Crapola. I was afraid that was going to be the case. This was > originally something to do to have fun playing with Objective C, I'm > not sure PECOFF support would fall under that :) Any idea how nasty > that would be? > Oh well, I guess the idea of doing this on windows isn't going to > happen anytime soon. Why not just install the cygwin assembler? -Chris From junk at giantblob.com Thu Jan 21 17:50:54 2010 From: junk at giantblob.com (James Williams) Date: Thu, 21 Jan 2010 23:50:54 +0000 Subject: [LLVMdev] Exception handling question Message-ID: Hi, I'm trying to get exception handling working in my compiler targeting LLVM. I've been working from the LLVM exception handling documentation (including http://llvm.org/docs/ExceptionHandling.html and http://wiki.llvm.org/HowTo:_Build_JIT_based_Exception_mechanism) and looking at g++-llvm's output. I've been trying to get a minimal test function to work, which simply invokes _Unwind_RaiseException with a single clean-up landing pad. However, when I run it my personality function is not getting called - _Unwind_RaiseException simply returns, apparently doing nothing. Looking at the x86-64 assembly output from llc, I can see this is happening because the personality function is not getting into the DWARF eh table (the landing pad is there though). I'm stumped as to why not. I'd be grateful if anyone can point out what I'm doing wrong here: define i32 @_ZN4N0014Main5test5EN2IO6WriterEiA_l(%6*, %4*, i32, %33*) { entry: %err = alloca %4* ; <%4**> [#uses=1] %count = alloca i32 ; [#uses=1] %e = alloca %33* ; <%33**> [#uses=2] %this = alloca %6* ; <%6**> [#uses=1] %.ex_value = alloca i8* ; [#uses=1] %.ex_value_l = alloca i8* ; [#uses=0] %.ex_type = alloca i64 ; [#uses=1] br label %4 ;