[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
yabin.hwu at gmail.com
Sun Apr 29 09:26:55 CDT 2012
On 2012-4-29, at 9:37 PM, Tobias Grosser wrote:
> On 04/29/2012 01:21 AM, Justin Holewinski wrote:
>> On Sat, Apr 28, 2012 at 8:27 AM, Tobias Grosser <tobias at grosser.es
>> <mailto:tobias at grosser.es>> wrote:
>> regalloc= is different. It is global and consequently influences
>> both host and device code generation. However, to me it is rather a
>> debugging option. It is never set by clang and targets provide a
>> reasonable default based on the optimization level. I believe we can
>> assume that for our use case it is not set. In case it is really
>> necessary to explicitly set the register allocator, the right
>> solution would be to make regalloc a target option.
>> The regalloc= option was just an example of the types of flags that can
>> be passed to llc, which are handled as global options instead of target
> Yes, thanks for pointing this problem out. For now I think we can ignore these flags, as they are mostly debugging options, and they can be included in the target options if needed.
>> The implicit assumption seems to be that the host code wants the device
>> code as assembly text. What happens when you need to link the device
>> binary and upload it separately? Think automatic SPU codegen on Cell.
>> Is it up to the host program to invoke the other target's linker?
> OK, I get what you mean. The intrinsic currently targets the OpenCL/CUDA model, which is the most widely used. Something like Cell sounds interesting, but probably needs further thought. Even within the OpenCL/CUDA model,
> this intrinsic currently works only for PTX code generation, but I hope we can gain support for other GPU devices later on.
>> I agree that future work can be useful here. However, before
>> spending a large amount of time to engineer a complex solution, I
>> propose to start with the proposed light-weight approach. It is
>> sufficient for our needs and will allow us to get the experience and
>> infrastructure that can help us to choose and implement a more
>> complex one later on.
>> I agree that this approach is the best way to get short-term results,
>> especially for the GSoC project.
> OK, let's go ahead.
> Yabin, can you update the patch with the following changes:
> - Remove the Arch flag
> - Document that we require a triple
> - Add two new arguments that take a feature string and an mcpu
> flag (can be set to "", which means we use the default)
OK. I will do that.
Thanks for all your comments.