[cfe-dev] clang building linux

pageexec at freemail.hu pageexec at freemail.hu
Tue Oct 26 08:46:20 CDT 2010

hello folks,

given the recent interest both on the list and elsewhere in building a working
linux kernel with clang, here's my 2 cents. i began this work some half a year ago
when clang 2.7 came out but got held up by other projects so i could only finish it
recently.

my approach is different from others who have been working on this in that i
went for patching linux itself in order to compile and link with clang properly.
it turns out that with a hundred or so lines patched in linux and a recent clang
(read: use svn HEAD) it's very easy to build a working kernel now. obviously some
of these patches are workarounds for features lacking in clang so the right
approach there is to change clang. other patches work around linux bugs that
clang can't (and shouldn't) do anything about, i think. here's a summary of the issues
i ran into in no particular order:

1. early boot code and .code16gcc/-mregparm

   i'm not sure if it's .code16gcc or not, but something makes clang ignore
   -mregparm when compiling the early linux boot code so there'll be a mismatch
   between how arguments are passed from C code and how assembly code expects
   them. the workaround is to explicitly annotate some functions with the
   regparm attribute.
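
for illustration, a minimal sketch of what such an annotation can look like.
demo_add3 and DEMO_REGPARM are made-up names, not taken from the patch, and since
regparm only exists on 32-bit x86 the macro expands to nothing elsewhere:

```c
#include <assert.h>

/* regparm is a 32-bit x86 attribute; on other targets this is a no-op */
#if defined(__i386__)
#define DEMO_REGPARM __attribute__((regparm(3)))	/* args in eax, edx, ecx */
#else
#define DEMO_REGPARM
#endif

/* hypothetical stand-in for a C function that assembly code also calls */
DEMO_REGPARM int demo_add3(int a, int b, int c);

DEMO_REGPARM int demo_add3(int a, int b, int c)
{
	return a + b + c;
}

/* C callers see the same declaration, so both sides agree on the convention */
void demo_add3_selftest(void)
{
	assert(demo_add3(1, 2, 3) == 6);
}
```

the attribute travels with the declaration, so the calling convention no longer
depends on the command line flag being honoured.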

2. probably related to the above, __builtin_memcpy and __builtin_memset also
   ignore -mregparm and cause the same kind of trouble at runtime so i worked it
   around by using explicit inline asm.
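
a rough sketch of what an inline asm replacement could look like on x86, assuming
gcc-style asm and a clear direction flag (which the ABI requires at function
boundaries). demo_memcpy and demo_memset are hypothetical names, and the non-x86
branches are only there so the example builds anywhere:

```c
#include <assert.h>
#include <stddef.h>

static inline void *demo_memcpy(void *dst, const void *src, size_t len)
{
	void *ret = dst;
#if defined(__i386__) || defined(__x86_64__)
	/* rep movsb copies cx bytes from [si] to [di], updating all three */
	__asm__ volatile("rep movsb"
			 : "+D"(dst), "+S"(src), "+c"(len)
			 :
			 : "memory");
#else
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len--)
		*d++ = *s++;
#endif
	return ret;
}

static inline void *demo_memset(void *dst, int c, size_t len)
{
	void *ret = dst;
#if defined(__i386__) || defined(__x86_64__)
	/* rep stosb stores al into [di], cx times */
	__asm__ volatile("rep stosb"
			 : "+D"(dst), "+c"(len)
			 : "a"(c)
			 : "memory");
#else
	unsigned char *d = dst;

	while (len--)
		*d++ = (unsigned char)c;
#endif
	return ret;
}

/* small usage check */
void demo_mem_selftest(void)
{
	char src[4] = { 'p', 'a', 'x', 0 };
	char dst[4];

	demo_memset(dst, 0x5a, sizeof(dst));
	assert(dst[0] == 0x5a && dst[3] == 0x5a);
	assert(demo_memcpy(dst, src, sizeof(src)) == dst);
	assert(dst[0] == 'p' && dst[2] == 'x' && dst[3] == 0);
}
```

since no call is emitted at all, there is no argument passing convention left to
get wrong.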

3. sse code in kernel

   in general linux is already built with -mno-sse and others but some Makefiles
   such as the x86 boot code forget to use it with bad consequences for early boot
   (read: the kernel doesn't even decompress ;).

4. unused variable/function elimination

   it seems that clang is more aggressive than gcc and eliminates data/code that
   is actually required. the earliest casualty is the boot code as usual but there're
   also some module parameter related structures affected. the fix is needed on the
   linux side of course.
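
one way to keep such unreferenced data alive is the used attribute (linux has a
__used macro for it). a sketch under assumptions: the struct, the section name
demo_params and the variables are made up for the demo, and the __start_/__stop_
symbols rely on GNU-style linkers generating them for ELF sections whose name is
a valid C identifier:

```c
#include <assert.h>
#include <stddef.h>

struct demo_param {
	const char *name;
	int value;
};

#if defined(__ELF__)
#define DEMO_PARAM_ATTR __attribute__((used, section("demo_params")))
#else
#define DEMO_PARAM_ATTR __attribute__((used))
#endif

/* nothing in C refers to these by name; only the section walk finds them */
static const struct demo_param demo_param_debug DEMO_PARAM_ATTR = { "debug", 1 };
static const struct demo_param demo_param_level DEMO_PARAM_ATTR = { "level", 3 };

#if defined(__ELF__)
/* provided by the linker, not defined anywhere in C */
extern const struct demo_param __start_demo_params[];
extern const struct demo_param __stop_demo_params[];

size_t demo_param_count(void)
{
	return (size_t)(__stop_demo_params - __start_demo_params);
}
#else
/* no section walking outside ELF in this sketch */
size_t demo_param_count(void)
{
	return 2;
}
#endif

void demo_param_selftest(void)
{
	assert(demo_param_count() == 2);
}
```

without the used attribute the compiler is free to drop both variables, since as
far as it can tell nothing ever reads them.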

5. asm 'p' constraint

   this was fixed last week in subversion, so i'm omitting the patch for it, but
   if someone really wants to use an earlier clang (such as the 2.8 release), then
   just duplicate the percpu_read macro into percpu_read_stable.

6. .gnu.linkonce.d.* section usage

   it seems that clang can emit code/data into sections that the linux linker
   scripts were not aware of.

7. extern and __attribute__((visibility("hidden"))) usage in the vdso

   it seems that this construct doesn't work with clang so i worked it around for
   now by abusing the weak attribute and the linker's ability to merge such symbols.

8. const merging in the vdso

   possibly related to the above, the linker(?) merges const variables when their
   value is the same which, while technically correct, defeats some self-checking
   code in the vdso so i had to deconstify the affected variables.

9. lack of __label__ support

   linux needs this for implementing an arch-independent way to acquire the current
   program counter, or something close to it at least. for now the workaround is an
   arch specific inline asm block.
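
a sketch of what such an arch specific block can look like, assuming gcc-style
inline asm with AT&T syntax. demo_current_ip is a made-up name, and the fallback
for other architectures returns the caller's return address, which is only
"something close to it":

```c
#include <assert.h>
#include <stdint.h>

static inline uintptr_t demo_current_ip(void)
{
	uintptr_t ip;

#if defined(__x86_64__)
	/* rip-relative lea of the local label right after the instruction */
	__asm__ volatile("lea 1f(%%rip), %0\n1:" : "=r"(ip));
#elif defined(__i386__)
	/* no rip-relative addressing on i386: call the next insn, pop the address */
	__asm__ volatile("call 1f\n1:\tpopl %0" : "=r"(ip));
#else
	ip = (uintptr_t)__builtin_return_address(0);
#endif
	return ip;
}

void demo_current_ip_selftest(void)
{
	assert(demo_current_ip() != 0);
}
```

numeric local labels (1: and 1f) are used so the block can be inlined more than
once into the same function without label clashes.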

10. clang crash on __verify_pcpu_ptr use

    when compiling (i think) init/main.c, clang crashes on the above macro. i tried to
    extract a minimal example but that failed to produce any errors, so probably there
    is more context needed to trigger the segfault. interestingly, the workaround for
    getting this compiled was to turn the body of the macro into a statement expression
    but otherwise it's the same code inside.

11. excessive inlining and stack usage

    while apparently gcc and clang make different inlining decisions, they're both
    bad at reusing the stack for the local variables of the inlined functions and
    sometimes produce high stack usage. linux already has an explicit way to prevent
    such undesired inlining, i just had to annotate a few more functions (but it's
    not meant to be exhaustive, it's based on my own config only).
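
for reference, the annotation boils down to the noinline attribute (linux wraps it
in macros such as noinline and noinline_for_stack). a small sketch with made-up
function names, where the helper's 512 byte buffer stays in its own frame instead
of being merged into the caller's:

```c
#include <assert.h>
#include <string.h>

#define demo_noinline __attribute__((noinline))

/* the scratch buffer only exists while this function runs */
static demo_noinline unsigned demo_checksum(const char *s)
{
	char scratch[512];
	size_t n = strlen(s);
	unsigned sum = 0;

	if (n >= sizeof(scratch))
		n = sizeof(scratch) - 1;
	memcpy(scratch, s, n);
	for (size_t i = 0; i < n; i++)
		sum += (unsigned char)scratch[i];
	return sum;
}

/* without noinline, both calls could be inlined here, each with its own buffer */
unsigned demo_caller(const char *a, const char *b)
{
	return demo_checksum(a) + demo_checksum(b);
}

void demo_noinline_selftest(void)
{
	assert(demo_caller("a", "b") == 195u);	/* 'a' + 'b' = 97 + 98 */
}
```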

11. uninitialized variable handling

    this one was a fun one to debug (no :P). apparently the getdents code computes
    a structure offset by computing a pointer difference - where the pointer in
    question is uninitialized. gcc seemingly manages to produce the desired offset
    whereas clang produces a 0 for the uninitialized pointers and hence for their
    difference as well, resulting in getdents not returning any entries in this
    particular case. very funny when you enter a directory but cannot list its
    content, although initramfs scripts tend not to appreciate it :). fortunately
    clang --analyze warns about such problems but then it crashes on a few more
    constructs so it's not an entirely painless exercise to go through the whole
    tree looking for such uninitialized variable usage (i checked most things but
    drivers/ and the non-x86 arch subtrees).
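
the general pattern, sketched with a made-up struct rather than the real getdents
code: computing a member offset through a pointer only works if the pointer holds
a valid value, whereas offsetof needs no pointer at all:

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical layout, loosely modelled on a directory entry */
struct demo_dirent {
	unsigned long d_ino;
	unsigned long d_off;
	unsigned short d_reclen;
	char d_name[1];
};

/* fragile: reads 'de', so an uninitialized pointer makes the result undefined */
#define DEMO_NAME_OFFSET_PTR(de) ((int)((de)->d_name - (char *)(de)))

/* robust: a compile time constant, no pointer involved */
#define DEMO_NAME_OFFSET offsetof(struct demo_dirent, d_name)

void demo_offset_selftest(void)
{
	struct demo_dirent ent;
	struct demo_dirent *de = &ent;	/* initialized, so both forms agree */

	assert(DEMO_NAME_OFFSET_PTR(de) == (int)DEMO_NAME_OFFSET);
}
```

with an uninitialized de the first macro is at the compiler's mercy, which matches
the "gcc happens to produce the offset, clang produces 0" behaviour described
above.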

12. variable length arrays in crypto/netfilter/crc

    this is an already known issue (in that clang is not going to support this
    gcc extension), so the workaround/fix was to rewrite the linux code.

13. ignoring -fcall-saved-xxx

    it seems that clang for some reason ignores -fcall-saved-xxx and miscompiles some
    code relying on it (lib/hweight.c) so as a workaround i removed this optimization
    from linux but obviously clang should be fixed instead.

beyond the above fixes here and there, there're some opportunities to make better
use of clang specific features as well, so if anyone feels inclined... ;)

14. clang's address_space attribute extension

    this would probably allow simplifying all the x86 per-cpu accessors (ditto
    for the userland accessors btw).

15. fix analyzer crashes

    as i mentioned above, there're a few constructs that make the analyzer crash
    on the linux tree, it'd probably be easy to fix them for someone familiar with
    the internals. the easiest way to run the analyzer (and to reproduce the problems)
    is to issue make CC=.../clang C=2 CHECK="clang --analyze" .

16. fix issues found by clang --analyze

    this is a bigger undertaking as the false positive ratio is quite low in my
    experience and it finds many issues (mostly unused variables or useless
    variable writes that sometimes can point at deeper issues such as not doing
    anything with error return values, but i also saw potential NULL derefs).

17. extend the analyzer to understand the sparse defines

    sparse is a standalone static analyzer built for linux and several important
    subsystems have already been properly marked up for sparse analysis so it'd be
    nice if clang could make use of this information (in fact, some analysis could
    probably be done at normal compile time already since the checks are cheap).

  PaX Team

-------------- next part --------------
   ---- File information -----------
     File:  pax-linux-
     Date:  25 Oct 2010, 22:17
     Size:  29747 bytes.
     Type:  Unknown
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pax-linux-
Type: application/octet-stream
Size: 29747 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20101026/99a12afa/attachment-0001.obj 