[LLVMdev] Compiling zlib to static bytecode archive
Maarten ter Huurne
maarten at treewalker.org
Thu Sep 27 21:08:29 CDT 2007
On Thursday 27 September 2007, Chris Lattner wrote:
> >> Sure, this would also work. Is there any reason not to merge them
> >> together?
> > Ease of maintenance, mainly. Having it in a separate file makes it
> > easier to migrate the code to new GCC releases. Also, collect2.c is
> > already 2658 lines, which is more than I typically like to have in a
> > single source file.
> My impression is that collect2 doesn't change very much. In any case,
> the idea here would be that collect2 only has minimally invasive hooks to
> call into liblto. It seems like this would be much simpler than handling
> all the command line argument swizzling needed for forking subprocesses,
> and having the LTO app have to read all the .o files and analyze them
> (which collect2 is already doing).
After studying collect2.c a bit more, I see that quite a lot of it is for
option parsing and signal handling, so maybe merging is indeed better.
As far as I can see, collect2.c does not read the object files though: it
only runs "nm" on them, which is not what we need to determine which files
are bitcode files.
One thing I'm wondering is how to merge the C code of collect2 with the C++
code that uses liblto:
- convert collect2.c to collect2.cpp?
- put the C++ code in a separate source file and link the C object file and
the C++ object file together into a single collect2 executable?
- expose more functionality from include/llvm-c/LinkTimeOptimizer.h?
(meaning the code using liblto would be C, not C++)
I currently have something that links the example without errors. It is not
pretty though: a Python script intercepts the invocation of collect2,
splits the list of object files into bitcode and native, calls a process I
named "precollect" to link the bitcode objects into a single native object
and then calls the real collect2 with only native objects. The precollect
tool is based on the llvm-ld source.
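
The splitting step described above could look roughly like the following.
This is only a sketch, not the actual script: the function names are
hypothetical, and it assumes bitcode objects are raw bitcode files that
start with the magic bytes 'B', 'C', 0xC0, 0xDE. Invoking precollect and
the real collect2 is left out because their command lines are not given
here.

```python
import os

# Assumption: raw LLVM bitcode files begin with 'B', 'C', 0xC0, 0xDE.
BITCODE_MAGIC = b"BC\xc0\xde"


def is_bitcode(path):
    """Return True if the file at `path` starts with the bitcode magic."""
    try:
        with open(path, "rb") as f:
            return f.read(len(BITCODE_MAGIC)) == BITCODE_MAGIC
    except OSError:
        # Unreadable files are treated as "not bitcode" and passed through.
        return False


def split_link_args(args):
    """Partition linker arguments into (bitcode_files, remaining_args).

    Every argument that names an existing file with the bitcode magic goes
    into the first list; everything else (options, native objects,
    libraries) keeps its original order in the second list.
    """
    bitcode, remaining = [], []
    for arg in args:
        if os.path.isfile(arg) and is_bitcode(arg):
            bitcode.append(arg)
        else:
            remaining.append(arg)
    return bitcode, remaining
```

A wrapper would then link the first list into a single native object and
append that object to the second list before calling the real collect2.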
What does not work yet is the actual optimization: precollect does not take
advantage of the fact that this is the final link step that will produce an
executable and that all unreferenced symbols are unused. Therefore the dead
code elimination from the example is not performed. To make that possible,
precollect would have to know about all object files, including the native
ones, to determine which symbols are unused. Also, I should figure out how
to tell liblto "there are no symbol references that you do not know about";
I assume that option already exists, but I haven't looked for it yet.
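
To illustrate why the native objects matter for dead code elimination,
here is a small sketch of the set computation involved. It is not liblto's
interface: the function names are hypothetical, and the symbol sets are
assumed to have been collected elsewhere (for example from "nm" output for
the native objects). The entry point name "main" is also an assumption.

```python
def symbols_to_preserve(bitcode_defined, native_undefined,
                        entry_points=("main",)):
    """Bitcode-defined symbols that must stay externally visible.

    A symbol has to be kept if a native object refers to it or if it is
    an entry point of the final executable.
    """
    needed = set(native_undefined) | set(entry_points)
    return set(bitcode_defined) & needed


def internalizable_symbols(bitcode_defined, native_undefined,
                           entry_points=("main",)):
    """Bitcode-defined symbols that nothing outside the bitcode refers to.

    These are only candidates: the optimizer can make them internal and
    then delete whichever ones are unreachable from the preserved symbols.
    """
    keep = symbols_to_preserve(bitcode_defined, native_undefined,
                               entry_points)
    return set(bitcode_defined) - keep
```

Without the undefined symbols of the native objects, the first set cannot
be computed safely, which is why every bitcode symbol has to be kept.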