From Sanjiv.Gupta at microchip.com Thu May 1 00:20:13 2008 From: Sanjiv.Gupta at microchip.com (Sanjiv.Gupta at microchip.com) Date: Wed, 30 Apr 2008 22:20:13 -0700 Subject: [cfe-dev] Debug Info Generation in Clang. In-Reply-To: <65F77D39-55C8-44A0-A2B1-EE1E7B8D1184@apple.com> Message-ID: > > I do not know how to retrieve last two pieces of information here. > > DirectoryEntry.getName() gives relative path but not absolute. > > Ok, use SourceMgr.getFileEntryForLoc(Loc), which returns a > FileEntry. > A FileEntry has a 'getName()' accessor for the file name, and > 'getDir()' which returns a directory entry. DirectoryEntry > has 'getName()' to get the name of the dir. > Chris, I meant the same thing when I said "DirectoryEntry.getName()." But the problem is that it doesn't give absolute path. llvm-gcc keeps absolute path in compile_unit. > > What is the API to retrive version string? > > I would just set it to 'clang' or something. > Fine. - Sanjiv From clattner at apple.com Thu May 1 00:27:45 2008 From: clattner at apple.com (Chris Lattner) Date: Wed, 30 Apr 2008 22:27:45 -0700 Subject: [cfe-dev] Debug Info Generation in Clang. In-Reply-To: References: Message-ID: On Apr 30, 2008, at 10:20 PM, Sanjiv.Gupta at microchip.com wrote: >>> I do not know how to retrieve last two pieces of information here. >>> DirectoryEntry.getName() gives relative path but not absolute. >> >> Ok, use SourceMgr.getFileEntryForLoc(Loc), which returns a >> FileEntry. >> A FileEntry has a 'getName()' accessor for the file name, and >> 'getDir()' which returns a directory entry. DirectoryEntry >> has 'getName()' to get the name of the dir. >> > > Chris, > I meant the same thing when I said "DirectoryEntry.getName()." > But the problem is that it doesn't give absolute path. > llvm-gcc keeps absolute path in compile_unit. I don't think you need an absolute path. If you really want it, you can prefix 'pwd' onto it. 
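Prefixing the working directory onto a relative name can be sketched as below. This is only a minimal illustration using std::filesystem from C++17, not anything that existed in Clang at the time; `makeAbsolutePath` is a hypothetical helper name, and the input is assumed to be the file name as reported by the FileEntry.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper (not a Clang API): returns FileName unchanged if it is
// already absolute, otherwise prefixes the current working directory.
std::string makeAbsolutePath(const std::string &FileName) {
  fs::path P(FileName);
  if (P.is_absolute())
    return P.string();
  return (fs::current_path() / P).string();
}
```

The result depends on the process's working directory at the time of the call, so it matches what the user sees only if the compiler has not changed directory since it was invoked.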
-Chris

From Sanjiv.Gupta at microchip.com Thu May 1 06:31:41 2008
From: Sanjiv.Gupta at microchip.com (Sanjiv.Gupta at microchip.com)
Date: Thu, 1 May 2008 04:31:41 -0700
Subject: [cfe-dev] Debug Info Generation in Clang.
In-Reply-To:
Message-ID:

> -----Original Message-----
> From: Chris Lattner [mailto:clattner at apple.com]
> Sent: Tuesday, April 29, 2008 10:16 PM
> To: Sanjiv Kumar Gupta - I00171
> Cc: cedric.venet at laposte.net; cfe-dev at cs.uiuc.edu
> Subject: Re: [cfe-dev] Debug Info Generation in Clang.
>
> On Apr 29, 2008, at 3:54 AM, Sanjiv.Gupta at microchip.com wrote:
> > I was thinking that putting it in CodeGenFunction::EmitStmt may result
> > in redundant stoppoint being emitted. But that is taken care of by
> > EmitStopPoint function itself, which checks to see if we have changed
> > from the previous line number. So EmitStmt looks the correct place.
>
> Yep, that makes sense to me too.
>

Well, putting it into EmitStmt still results in unnecessary stoppoints
being generated.

For a piece of code like
1: foo ()
2: {
3:   int I = 5;
4: }

two stoppoints will be generated, one for line 2 and one for line 3.

We do not want to generate a stoppoint for a '{' (CompoundStmt).
I am thinking of going back to my earlier idea and putting it into the
functions below:
EmitScalarExpr
EmitComplexExpr
EmitAggExpr

- Sanjiv

From Sanjiv.Gupta at microchip.com Thu May 1 06:37:38 2008
From: Sanjiv.Gupta at microchip.com (Sanjiv.Gupta at microchip.com)
Date: Thu, 1 May 2008 04:37:38 -0700
Subject: [cfe-dev] LLVM-GCC does not generate llvm.dbg.region.start
In-Reply-To: <9C02E711-E52A-4BA4-8FA9-13CF610BC630@apple.com>
Message-ID:

> On Apr 30, 2008, at 12:16 AM, Sanjiv.Gupta at microchip.com wrote:
>
> > I do not see llvm-gcc generating llvm.dbg.region.start/end pairs for
> > blocks.
> > Why?
>
> I'm not sure off-hand. I don't think that the dwarf emitter
> supports them yet, so llvm-gcc probably just doesn't bother
> producing them.
> > -Chris > Well, its quite easy for clang to produce them. We have EmitCompoundStmt for a {...}. So I am going ahead and emit them too. -Sanjiv From clattner at apple.com Thu May 1 11:42:46 2008 From: clattner at apple.com (Chris Lattner) Date: Thu, 1 May 2008 09:42:46 -0700 Subject: [cfe-dev] LLVM-GCC does not generate llvm.dbg.region.start In-Reply-To: References: Message-ID: <88946846-4536-4ECC-AD4C-28697F77EA5B@apple.com> On May 1, 2008, at 4:37 AM, Sanjiv.Gupta at microchip.com wrote: >> I'm not sure off-hand. I don't think that the dwarf emitter >> supports them yet, so llvm-gcc probably just doesn't bother >> producing them. >> >> -Chris >> > > Well, its quite easy for clang to produce them. We have > EmitCompoundStmt > for a {...}. > So I am going ahead and emit them too. Ok! -Chris From clattner at apple.com Thu May 1 11:45:06 2008 From: clattner at apple.com (Chris Lattner) Date: Thu, 1 May 2008 09:45:06 -0700 Subject: [cfe-dev] Debug Info Generation in Clang. In-Reply-To: References: Message-ID: <1DE9A33E-3F73-43AD-9261-EE9D0DE51A1F@apple.com> On May 1, 2008, at 4:31 AM, Sanjiv.Gupta at microchip.com wrote: >> On Apr 29, 2008, at 3:54 AM, Sanjiv.Gupta at microchip.com wrote: >>> I was thinking that putting it in CodeGenFunction::EmitStmt >> may result >>> in redundant stoppoint being emitted. But that is taken care of by >>> EmitStopPoint function itself, which checks to see if we >> have changed >>> from the previous line number. So EmitStmt looks the correct place. >> >> Yep, that makes sense to me too. >> > > Well, putting it into EmitStmt still results into unnecessay > stoppoints > being generated. > > For a piece of code like > 1: foo () > 2: { > 3: int I = 5; > 4: } > > Two stoppoints will be generated for line 2 and line 3; > > We do not want to generate a stoppoint for a '{' (CompoundStmt). I think it is probably best to put it into EmitStmt and filter out stmts that you don't want stoppoints for. 
EmitStopPoint itself should avoid emitting multiple stoppoints on the same line. For example, in "x = 4; y = 1; z = 12;" we only want one stoppoint. > I am thinking to go back to my earlier thoughts and put it into below > functions: > EmitScalarExpr > EmitComplexExpr > EmitAggExpr The problem with this is that it means that you'll get tons of duplicated stoppoints for every subexpression "2*a + 1" would have 5 stoppoints, most of which would get filtered out. Are you sure about this? -Chris From akyrtzi at gmail.com Thu May 1 16:10:47 2008 From: akyrtzi at gmail.com (Argiris Kirtzidis) Date: Thu, 01 May 2008 14:10:47 -0700 Subject: [cfe-dev] -emit-html example In-Reply-To: <734C263F-54CF-4B54-9C7F-A744BEF545A2@apple.com> References: <63E11935-E280-4C58-8D7D-0F50A407D223@apple.com> <480BCFBC.6090703@gmail.com> <734C263F-54CF-4B54-9C7F-A744BEF545A2@apple.com> Message-ID: <481A31D7.7000601@gmail.com> Hi Ted, Ted Kremenek wrote: > > I took a look at this patch. I like the low-level refactorings to the > HTML rewriter API (e.g., adding HighlightKeyword). This provides some > nice cleanups that simplify the conceptual complexity of the code > (particularly in SyntaxHighlight). These aren't strictly necessary, > but do but some structure into how we want to name HTML classes for > span tags, etc. > > While I appreciate it's clean design, I have to be honest that I'm not > really sold (yet) on the Annotator class. While I can envision that > we will have multiple clients of the HTML rewriter (e.g., the HTML > pretty-printer, the HTMLDiagnostics used by the static analysis > engine, a doxygen-like documentation generator, and so on) these > different clients will not necessarily fall into an ASTConsumer model, > nor will this interface necessarily be the one they want. > > Basically I'm not certain if it really solves a problem at this point, > and right now adds an extra abstraction layer to implement the > HTMLPrinter (something at its heart is very, very simple). 
Right now > we have two clients of the HTML Rewriter: one is an ASTConsumer, and > the other is not. I don't believe that an IDE would be an ASTConsumer > (in the clang driver sense) either, but would rather interact with the > clang libraries interactively to regenerate ASTs on-the-fly. > > The nice thing about the "low-level" APIs in HTMLRewrite.h is that > they make little assumption about the target application, but do the > lion's share of the work when pretty-printing code to HTML without > introducing an abstraction layer. The result is that for the current > clients of the HTML Rewrite API (HTMLPrinter and HTMLDiagnostics) the > amount of code they do to perform HTML "tweaking" is small. The > HTMLPrinter has about 20-30 lines of code (which includes opening > files and comments) and HTMLDiagnostics contains a little code for > doing HTML work but this is proportional to the extra stuff that it > outputs. > > Don't get me wrong; I'm a big believer in refactoring and modular > design. I don't think the Annotator has a bad design, I just don't > think it's necessary at this point, and I'd rather not add more > abstraction unless its a clear benefit. My motivation to propose the Annotator lib wasn't specifically to apply it for HTMLPrinter, that was more like an example. The Annotator's purpose would be to verify clang's suitability for an IDE, at least from the aspect of syntax/semantic colorizing. For example it would answer questions like: -Can I colorize all variable names ? (with exclusive color) -Can I colorize all type names ? -Can I associate opening/closing braces for all kinds of blocks (namespaces, functions etc.) ? -Does the AST carry enough information for doing [insert task] ? Now, assuming that you have a working Annotator lib, the best way to put it to use (without messing with some IDE) would be to make a HTMLAnnotator. HTMLAnnotator would be a client of Annotator and HTML Rewrite API. What do you think about the above ? 
> I don't believe that an IDE would be an ASTConsumer (in the clang
> driver sense) either, but would rather interact with the clang
> libraries interactively to regenerate ASTs on-the-fly.

I was thinking that in the specific task of semantic colorizing, you
would have to utilize Preprocessor+Parser+Sema for a particular source
file, so making the Annotator an ASTConsumer that handles the
declarations the parser gives it seemed reasonable. Do you have
something else in mind?

-Argiris

From doug.gregor at gmail.com Thu May 1 20:57:33 2008
From: doug.gregor at gmail.com (Doug Gregor)
Date: Thu, 1 May 2008 21:57:33 -0400
Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations
In-Reply-To: <4811998B.1020000@gmail.com>
References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com>
    <48104F39.4010409@gmail.com>
    <24b520d20804240610l53f6d54ar203f853a30a16584@mail.gmail.com>
    <4810940C.6010600@gmail.com>
    <4810A596.8080005@gmail.com>
    <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com>
    <4810E5DC.2080600@gmail.com>
    <4810EF57.3030803@gmail.com>
    <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com>
    <4811998B.1020000@gmail.com>
Message-ID: <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com>

Chris - there's a patch in here that you probably want to take a look at.

On Fri, Apr 25, 2008 at 4:42 AM, Argiris Kirtzidis wrote:
> Doug Gregor wrote:
> How about
>
> (1) -#2 is checked for error diagnostics by considering default args from
> both #1 and #2
>
>
> And in the end, only #1 gets the merged args. This is the same as only #2
> gets the merged args.

I went a slightly different way... both #1 and #2 get the merged args
(so each shows the complete state at that time), but we use
CXXDefaultArgExpr for default arguments that were the result of
merging default arguments. That way, we always know exactly which
parameter (in which declaration) the default argument originally came
from.
> Actually, I'm thinking that maybe there could be a common convention that, > not only functions, but vars and namespaces should follow. > An extended namespace definition could be regarded as a redeclaration. Yes, agreed. I've moved addRedeclaration (and its brethren) into ScopedDecl, so the same approach can apply to extern variables, functions, namespaces, classes, etc. > I agree that it is philosophical, thus I don't really care much about the > name pointing to the first or the last decl. > What bothers me is the "swapping content" bit. It seems to me that feeding > to the consumer the same pointer for, essentially, different > declaration objects is a source of complexity and possible subtle bugs, > without much benefit. I was never quite thrilled with the swapping bit, either. > I hope I'm not coming off as too pedantic :) It's a C++ front end, so it pays to be pedantic. Anyway, you've convinced me. The attached patch makes a few changes to the way redeclarations are handled: 1) The swapping behavior is gone; instead, we merge into the redeclaration and into the original declaration. 2) The addRedeclaration/getNextRedeclaration function has moved into ScopedDecl, so it can be used for other kinds of declarations 3) We use a DenseMap to store the redeclaration links, since so few actual ScopedDecls will have redeclarations 4) We keep track of where merged default arguments came from, so that we can provide better diagnostics for them. - Doug -------------- next part -------------- A non-text attachment was scrubbed... 
Name: clang-redecl-noswap.patch Type: text/x-patch Size: 15023 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080501/e6e46b3b/attachment-0001.bin From doug.gregor at gmail.com Thu May 1 21:51:30 2008 From: doug.gregor at gmail.com (Doug Gregor) Date: Thu, 1 May 2008 22:51:30 -0400 Subject: [cfe-dev] PATCH: Diagnosing use of C++ default arguments outside of a function declaration Message-ID: <24b520d20805011951x6c6c1913sd361cc344497bf4d@mail.gmail.com> The attached patch diagnoses attempts to use C++ default arguments outside of a parameter-declaration of a function declaration, e.g., void foo(int (*p)(int x = 5)); // ill-formed: p's parameters are not allowed to have default arguments I believe that this wraps up support for default arguments until Clang gets templates or member functions. Otherwise, this is a pretty boring patch. - Doug -------------- next part -------------- A non-text attachment was scrubbed... Name: clang-defarg-nondecl.patch Type: text/x-patch Size: 7018 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080501/f7795d79/attachment.bin From clattner at apple.com Fri May 2 12:42:19 2008 From: clattner at apple.com (Chris Lattner) Date: Fri, 2 May 2008 10:42:19 -0700 Subject: [cfe-dev] stack-less model on small devices (patch) In-Reply-To: References: <5DEED6CB-7673-481F-922B-8B7D47A1D6C2@apple.com> <0C080190-B0E1-4DA4-BE80-D0AA27F91764@apple.com> <2CC37DFD-CFEC-436C-AF74-C5F24AFA2020@apple.com> Message-ID: On Apr 30, 2008, at 11:31 AM, Alireza.Moshtaghi at microchip.com wrote: > I created the patch for my target specific modifications and sent it > to > cfe-commits. Since this is my first time to send a patch I don't > know if > I have submitted my changes to the right place or not and of course > what > is the turnaround time. You did exactly the right thing. 
I've been bogged down with other things lately and haven't had
much time to stay on top of clang; this will hopefully be fixed next
week. I apologize for the delay.

> I also have attached it to this email just in case.
> Please let me know if I have to do it differently.

The patch looks great. Some specific comments:

   /// getPointerWidth - Return the width of pointers on this target, for the
   /// specified address space. FIXME: implement correctly.
-  uint64_t getPointerWidth(unsigned AddrSpace) const { return 32; }
-  uint64_t getPointerAlign(unsigned AddrSpace) const { return 32; }
+  virtual uint64_t getPointerWidth(unsigned AddrSpace) const { return 32; }
+  virtual uint64_t getPointerAlign(unsigned AddrSpace) const { return 32; }

   /// getIntWidth/Align - Return the size of 'signed int' and 'unsigned int' for
   /// this target, in bits.
-  unsigned getIntWidth() const { return 32; } // FIXME
-  unsigned getIntAlign() const { return 32; } // FIXME
+  virtual unsigned getIntWidth() const { return 32; } // FIXME
+  virtual unsigned getIntAlign() const { return 32; } // FIXME

Instead of making these virtual, please add instance variables for
these, the same way double and wchar are handled. You can also remove
the FIXMEs. Thanks for doing this.

+++ lib/Basic/Targets.cpp (working copy)
@@ -863,6 +863,28 @@
+class PIC16TargetInfo : public TargetInfo{
+public:
+  virtual const char *getVAListDeclaration() const { return "";}
+  virtual const char *getClobbers() const {return "";}
+  virtual const char *getTargetPrefix() const {return "";}
+  virtual void getGCCRegNames(const char * const *&Names, unsigned &NumNames) const {}
+  virtual bool validateAsmConstraint(char c, TargetInfo::ConstraintInfo &info) const {return true;}
+  virtual void getGCCRegAliases(const GCCRegAlias *&Aliases, unsigned &NumAliases) const {}
+};
+}

Please make sure the code fits in 80 columns.
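The instance-variable suggestion for getIntWidth and getPointerWidth could look roughly like the following. This is a simplified sketch, not the real clang::TargetInfo: the class names `TargetInfoSketch` and `SmallTargetSketch` are hypothetical, and the 16-bit values are illustrative assumptions, since the thread does not state the widths PIC16 needs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for a target description class.
class TargetInfoSketch {
protected:
  // Widths and alignments in bits; the defaults mirror the hard-coded 32.
  unsigned IntWidth = 32, IntAlign = 32;
  unsigned PointerWidth = 32, PointerAlign = 32;

public:
  // Non-virtual accessors: a target customises the data, not the functions.
  unsigned getIntWidth() const { return IntWidth; }
  unsigned getIntAlign() const { return IntAlign; }
  std::uint64_t getPointerWidth(unsigned /*AddrSpace*/) const {
    return PointerWidth;
  }
  std::uint64_t getPointerAlign(unsigned /*AddrSpace*/) const {
    return PointerAlign;
  }
};

// A target with narrower types overrides the values in its constructor.
class SmallTargetSketch : public TargetInfoSketch {
public:
  SmallTargetSketch() {
    IntWidth = IntAlign = 16;         // illustrative value, an assumption
    PointerWidth = PointerAlign = 16; // illustrative value, an assumption
  }
};
```

Keeping the accessors non-virtual means the common case stays a plain field load, and a new target only has to set fields rather than override a growing list of functions.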
+++ lib/CodeGen/CGDecl.cpp (working copy)
@@ -15,6 +15,7 @@
+  if (strncmp (this->Target.getTargetTriple(), "pic16-", 6) == 0) {
+    const llvm::Type *LTy = CGM.getTypes().ConvertTypeForMem(Ty);

The preferred way to do a target check like this is to add some new
property to TargetInfo with an accessor like
"Target.useGlobalsForAutomaticVariables()" or something like that.
PIC16 can return true; all other targets return false.

Can you just use the code path for static variables to handle the LLVM
IR emission? That would avoid duplicating the code.

-Chris

From csdavec at swansea.ac.uk Fri May 2 13:00:44 2008
From: csdavec at swansea.ac.uk (David Chisnall)
Date: Fri, 2 May 2008 19:00:44 +0100
Subject: [cfe-dev] Objective-C Class, Protocol and Category generation
Message-ID: <68F0AF7D-2EF4-4A7C-961E-07FA65F59753@swan.ac.uk>

Hi Everyone,

Here is the next in the series of ObjC code generation diffs. I've
also attached the little Objective-C program I've been using to test
it.

Still to do:

- Constant strings (this will be quite major, since constant
Objective-C strings are being lowered to CF strings far too early at
the moment and a lot of code is likely to need to be moved around).

- Message sends to super (not difficult, I just haven't got around to
it yet)

- The corresponding implementation for the Étoilé runtime (and,
presumably, someone wants to write one for the Apple ones too)

- Objective-C 2.0 stuff (much of ObjC 2 is syntactic sugar - or salt,
depending on your perspective - and should be handled nearer the
front, but some things require runtime support).

David
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clang.diff
Type: application/octet-stream
Size: 46563 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080502/8556e335/attachment-0002.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.m
Type: application/octet-stream
Size: 1593 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080502/8556e335/attachment-0003.obj

From csdavec at swansea.ac.uk Fri May 2 13:07:44 2008
From: csdavec at swansea.ac.uk (David Chisnall)
Date: Fri, 2 May 2008 19:07:44 +0100
Subject: [cfe-dev] Objective-C Class, Protocol and Category generation
In-Reply-To: <68F0AF7D-2EF4-4A7C-961E-07FA65F59753@swan.ac.uk>
References: <68F0AF7D-2EF4-4A7C-961E-07FA65F59753@swan.ac.uk>
Message-ID:

On 2 May 2008, at 19:00, David Chisnall wrote:

> Hi Everyone,
>
> Here is the next in the series of ObjC code generation diffs. I've
> also attached the little Objective-C program I've been using to test
> it.
>
> Still to do:
> - Constant strings (this will be quite major, since constant
> objective-c strings are being lowered to CF strings far too early at
> the moment and a lot of code is likely to need to be moved around).
>
> - Message sends to super (not difficult, I just haven't got around
> to it yet)
>
> - The corresponding implementation for the Étoilé runtime (and,
> presumably, someone wants to write one for the Apple ones too)
>
> - Objective-C 2.0 stuff (much of ObjC 2 is syntactic sugar - or
> salt, depending on your perspective - and should be handled nearer
> the front, but some things require runtime support).

Oh, and currently selectors are looked up every message send, while
they should be looked up at module load time, but this is a relatively
small change now that the Module_t is generated.

David

From csdavec at swansea.ac.uk Fri May 2 19:15:57 2008
From: csdavec at swansea.ac.uk (David Chisnall)
Date: Sat, 3 May 2008 01:15:57 +0100
Subject: [cfe-dev] Message send to super
Message-ID: <0FABC18D-581A-4A95-96CE-2BF6E41C117F@swan.ac.uk>

Hi,

It appears that, in generating the AST, the following expression:

[super msg];

is being translated to:

[(superclass*)self msg];

Running clang -ast-print confirms this.
These two expressions have completely different semantics in
Objective-C, and there appears to be no way of determining whether
the user actually performed a cast on self (which is uncommon, but
does happen in real code) or sent a message to super (which happens a
lot more frequently).

Can anyone suggest a way of distinguishing these two? If not, is it
possible to introduce a separate kind of AST node, or a flag in the
ObjCMessageExpr indicating if super is the receiver?

I have now written the code to produce the correct output for the GNU
runtime with this kind of expression, but am currently unable to
connect it to anything.

David

From doug.gregor at gmail.com Fri May 2 20:08:55 2008
From: doug.gregor at gmail.com (Doug Gregor)
Date: Fri, 2 May 2008 21:08:55 -0400
Subject: [cfe-dev] PATCH: Semantic analysis and representation of C++ base classes
Message-ID: <24b520d20805021808v5347ae0do7108797caf127b79@mail.gmail.com>

This patch adds the missing semantic analysis and representation for
C++ base classes. This does a couple of related things:

  - BaseClassDecl represents a base class of a RecordDecl
  - Sema::ActOnBaseSpecifiers attaches the base specifiers to a
    RecordDecl and checks that there are no redundant direct base
    classes.
  - Action::isTypeName now has a new argument, IgnoreNonTypes, which
    instructs name lookup to do exactly that. It's needed when looking
    up base class names (see C++ [class.derived]p2), e.g.,

      class A { };
      void foo(int A) {
        class B : A { }; // okay: we find class A, not the parameter A.
      }

As part of this, there's a new IdentifierNamespace enumerator called
IDNS_Ignore_nontypes, which tells the IdentifierResolver to ignore
non-type names. This means that IdentifierNamespace is becoming much
more like a set of flags dictating name lookup rules rather than the 4
C namespaces it started at. Expect more movement in this direction as
Clang gets more C++-specific name lookup behavior.
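The [class.derived]p2 rule can be checked with a small compilable variant of the example above. This is only a sketch; `derivesFromClassA` is a hypothetical name, and std::is_base_of is used purely to make the result of the lookup observable.

```cpp
#include <cassert>
#include <type_traits>

class A { };

// Inside this function the parameter 'A' hides the class name for ordinary
// lookup, but lookup of a base-class name ignores non-type names, so the
// local class B still derives from the global class A.
bool derivesFromClassA(int A) {
  class B : A { };
  (void)A; // here 'A' names the parameter, not the class
  return std::is_base_of< ::A, B>::value;
}
```

If base-class lookup did not skip non-type names, the declaration of B would be rejected, because `A` would resolve to an `int` parameter rather than a class.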
- Doug
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clang-inherit.patch
Type: application/octet-stream
Size: 27373 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080502/dde7c7ba/attachment-0001.obj

From csdavec at swansea.ac.uk Sat May 3 11:13:17 2008
From: csdavec at swansea.ac.uk (David Chisnall)
Date: Sat, 3 May 2008 17:13:17 +0100
Subject: [cfe-dev] CFRefCount compile failure
Message-ID: <11451AB3-0BFB-41D9-A9B3-D6F4F3F90050@swan.ac.uk>

As of the latest svn, CFRefCount no longer compiles, complaining that
the left side of a -> is a non-pointer type. This patch appears to
fix it (although I only checked it compiles, not that it works, so
please review carefully):

Index: lib/Analysis/CFRefCount.cpp
===================================================================
--- lib/Analysis/CFRefCount.cpp (revision 50606)
+++ lib/Analysis/CFRefCount.cpp (working copy)
@@ -1731,8 +1731,8 @@
   // Determine if there is an LVal binding to the symbol.
   for (ValueState::vb_iterator I=St->vb_begin(), E=St->vb_end(); I!=E; ++I) {
-    if (!isa(I->second) // Is the value a symbol?
-        || cast(I->second).getSymbol() != Sym)
+    if (!isa((*I).second) // Is the value a symbol?
+        || cast((*I).second).getSymbol() != Sym)
       continue;
     if (VD) { // Multiple decls map to this symbol.
@@ -1740,7 +1740,7 @@
       break;
     }
-    VD = I->first;
+    VD = (*I).first;
   }
   if (VD) FirstDecl = VD;

From doug.gregor at gmail.com Sat May 3 11:44:09 2008
From: doug.gregor at gmail.com (Doug Gregor)
Date: Sat, 3 May 2008 09:44:09 -0700
Subject: [cfe-dev] CFRefCount compile failure
In-Reply-To: <11451AB3-0BFB-41D9-A9B3-D6F4F3F90050@swan.ac.uk>
References: <11451AB3-0BFB-41D9-A9B3-D6F4F3F90050@swan.ac.uk>
Message-ID: <24b520d20805030944i6ffdbc98p276ec7133e0d47fb@mail.gmail.com>

On Sat, May 3, 2008 at 9:13 AM, David Chisnall wrote:
> As of the latest svn, CFRefCount no longer compiles, complaining that
> the left side of a -> is a non-pointer type.
This patch appears to > fix it (although I only checked it compiles, not that it works, so > please review carefully): I have the same patch locally to work around this problem, but it's not the right long-term solution. Someone, somewhere has operator-> implemented for that iterator but forgot to commit the change to the LLVM repository. - Doug From kremenek at apple.com Sat May 3 13:23:29 2008 From: kremenek at apple.com (Ted Kremenek) Date: Sat, 3 May 2008 11:23:29 -0700 Subject: [cfe-dev] CFRefCount compile failure In-Reply-To: <24b520d20805030944i6ffdbc98p276ec7133e0d47fb@mail.gmail.com> References: <11451AB3-0BFB-41D9-A9B3-D6F4F3F90050@swan.ac.uk> <24b520d20805030944i6ffdbc98p276ec7133e0d47fb@mail.gmail.com> Message-ID: Hi guys, You need to update the LLVM tree (not the clang tree). The following patch (committed yesterday) added operator-> to ImmutableMap::iterator: http://llvm.org/viewvc/llvm-project?rev=50603&view=rev Ted On May 3, 2008, at 9:44 AM, Doug Gregor wrote: > On Sat, May 3, 2008 at 9:13 AM, David Chisnall > wrote: >> As of the latest svn, CFRefCount no longer compiles, complaining that >> the left side of a -> is a non-pointer type. This patch appears to >> fix it (although I only checked it compiles, not that it works, so >> please review carefully): > > I have the same patch locally to work around this problem, but it's > not the right long-term solution. Someone, somewhere has operator-> > implemented for that iterator but forgot to commit the change to the > LLVM repository. > > - Doug > _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080503/d7965267/attachment.html From clattner at apple.com Sun May 4 01:01:47 2008 From: clattner at apple.com (Chris Lattner) Date: Sat, 3 May 2008 23:01:47 -0700 Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations In-Reply-To: <24b520d20804201904s7221bcb3l9cb75e46213698ec@mail.gmail.com> References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <49B6BAF7-96FC-40E1-8BFA-216E046D39A5@apple.com> <24b520d20804201904s7221bcb3l9cb75e46213698ec@mail.gmail.com> Message-ID: <79FB4385-0C0F-43A7-8320-9C34128D4CB2@apple.com> ... catching up on clang, sorry for the delay... On Apr 20, 2008, at 7:04 PM, Doug Gregor wrote: >> Some minor thoughts: >> >> What is the intended ownership model of the previous declarations? >> I see >> that your patch has functiondecl delete other definitions when the >> functiondecl is deleted, is this intended, or do you think >> TranslationUnit >> should own the previous declarations? Is the intent that a client >> would >> walk a declaration list for the translation unit and only see each >> function >> once... walking the prev declaration list to see other >> declarations? This >> model makes sense to me, but it would be nice to explicitly say this >> somewhere in a prominent comment. > > Yes, this was the intent. Since all of the function declarations > represent the same function, I think it makes the most sense for that > function to only have one entry in the translation unit's list of > declarations, because there really is only one function being > declared. Those (probably few) clients that want to know about *all* > declarations of a function can walk the previous-declaration chains. Makes sense. >> It is a little strange to me that getBody() can return non-null if >> isDefinition() return false. How about renaming getBody() -> >> findBody() and >> having getBody() just return the local function? 
> > I think perhaps the name of isDefinition is really the cause of the > problem here, and that's the name we should change. I've gone with the > *very* explicit isThisDeclarationADefinition, because this is a > special-purpose question about (essentially) the ordering of the > re-declarations. I only expect it to be used in a few diagnostics, > where we want to say "f is (declared|defined) here". Sounds good. > I think splitting getBody into findBody() (which walks the tree) and > getBody() will end up being confusing. Most of the time, all we care > about is "is there a definition for this function?" Having two > functions to answer that question (which produce the same answer > except in very weird cases; see below) is going to cause clients > trouble. Right. >> Also, is there ever a case >> where getBody could ever return a FunctionDecl other than 'this'? >> I thought >> definitions were always at the front of the list. > > Surprisingly, no. One can actually write, e.g., > > void f(int x) { } > void f(int x = 5); Yuck, ok. 
:) -Chris From clattner at apple.com Sun May 4 01:59:03 2008 From: clattner at apple.com (Chris Lattner) Date: Sat, 3 May 2008 23:59:03 -0700 Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations In-Reply-To: <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <48104F39.4010409@gmail.com> <24b520d20804240610l53f6d54ar203f853a30a16584@mail.gmail.com> <4810940C.6010600@gmail.com> <4810A596.8080005@gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> Message-ID: <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> On May 1, 2008, at 6:57 PM, Doug Gregor wrote: > Chris - there's a patch in here that you probably want to take a > look at. Reading the backthread, it is a very interesting and nasty topic. I strongly agree with the meta points that Argiris is making: It is bad for clients if parsing new decls causes old decls to change (e.g. due to swapping). Looking forward, this will be a significant hindrance if we want to do things like incremental reparsing. It also breaks a number of useful invariants for clients: the codegen assertion in the testsuite is basically due to the argument list for a functiondecl changing between the first time the client (codegen) sees a decl and the second time it sees it [the first time it sees it, the function is "void()" the second time it is "void(int)"]. Decls changing is surprising and weird. :) It is also nice to be able to capture (as Argiris says) information that is reasonably close to the way the user wrote it. This would argue for representing: void foo(int x, int y = 4); void foo(int x = 12, int y); ... as two functiondecls, each of which that have *one* default argument specified. 
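For reference, the language behaviour behind this pair of declarations can be shown in a few lines. This is a minimal sketch: the return type is changed from void to int, and a body is added, purely so that the effect of the accumulated defaults is observable.

```cpp
#include <cassert>

// First declaration: only y has a default argument.
int foo(int x, int y = 4);

// Redeclaration: adds a default for x. y's default is inherited from the
// declaration above; repeating it here would be an error.
int foo(int x = 12, int y);

// The definition repeats neither default, yet calls see both of them.
int foo(int x, int y) { return x * 100 + y; }
```

After the second declaration, a call such as `foo()` is well-formed and behaves like `foo(12, 4)`, even though no single declaration in the source spells out both defaults.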
> On Fri, Apr 25, 2008 at 4:42 AM, Argiris Kirtzidis > wrote: >> Doug Gregor wrote: >> How about >> >> (1) -#2 is checked for error diagnostics by considering default >> args from >> both #1 and #2 >> >> >> And in the end, only #1 gets the merged args. This is the same as >> only #2 >> gets the merged args. > > I went a slightly different way... both #1 and #2 get the merged args > (so each shows the complete state at that time), but we use > CXXDefauiltArgExpr for default arguments that were the result of > merging default arguments. That way, we always know exactly which > parameter (in which declaration) the default argument originally came > from. Ok, so this is effectively handle the above as: void foo(int x, int y = 4); void foo(int x = 12, int y = ); Where points to the '4' in the previous foo prototype? This way you can distinguish between arguments that are explicitly specified from those that were inherited from previous declarations? How does this handle cases like this: void foo() void foo(int x); void foo() case? >> Actually, I'm thinking that maybe there could be a common >> convention that, >> not only functions, but vars and namespaces should follow. >> An extended namespace definition could be regarded as a >> redeclaration. > > Yes, agreed. I've moved addRedeclaration (and its brethren) into > ScopedDecl, so the same approach can apply to extern variables, > functions, namespaces, classes, etc. Nice. I think it is very important to pick a model and be consistent. >> I agree that it is philosophical, thus I don't really care much >> about the >> name pointing to the first or the last decl. >> What bothers me is the "swapping content" bit. It seems to me that >> feeding >> to the consumer the same pointer for, essentially, different >> declaration objects is a source of complexity and possible subtle >> bugs, >> without much benefit. > > I was never quite thrilled with the swapping bit, either. 

Unfortunately, I did a minor cleanup of the swapping code before
reading the whole thread. Maybe that code can just vanish :)

>> I hope I'm not coming off as too pedantic :)
>
> It's a C++ front end, so it pays to be pedantic.

:)

> Anyway, you've convinced me. The attached patch makes a few changes to
> the way redeclarations are handled:
>
> 1) The swapping behavior is gone; instead, we merge into the
> redeclaration and into the original declaration.

I'm concerned about this. It violates the sanctity of the original
declaration :)

> 2) The addRedeclaration/getNextRedeclaration function has moved into
> ScopedDecl, so it can be used for other kinds of declarations
> 3) We use a DenseMap to store the redeclaration links, since so few
> actual ScopedDecls will have redeclarations
> 4) We keep track of where merged default arguments came from, so
> that we can provide better diagnostics for them.

I like this approach much better than the 'swapping' approach.
However, it still has some funny effects.

Here is a crazy idea which may be completely impractical :). It seems
that we have two different desires here:

1) represent declarations as the user wrote them so that clients can
   have a simple model where decls are just streaming by. Some of
   these clients also want the ability to walk the list of
   redefinitions.
2) provide sema (and some other clients) an 'aggregate' view of the
   decl where all the info is merged together into one place.

In the previous example:

  void foo(int x, int y = 4);
  void foo(int x = 12, int y);

We want the clients to be able to see those two decls, but also one
aggregate decl of:

  void foo(int x = 12, int y = 4);

This saves each client from having to walk the entire list of
redeclarations to pull together all the default arguments.

Let's assume that functions are usually not redefined, and when they
are that the redefinitions are often exactly the same (plus perhaps a
body). Given this, maybe a spin on the 'canonical type' idea would
work.

Ignoring efficiency, imagine if the first time we parsed a decl we
actually created two versions of the decl: one version (the parsed
version) that represents the decl as written, and a second version
(the aggregate version) that is updated as other redecls are parsed.
After parsing the decl the first time, the two decls are exactly the
same:

  parsed version(s)               aggregate version
  void foo(int x, int y = 4);     void foo(int x, int y = 4);

When parsing the redeclaration, we create another 'parse decl' for the
redeclaration:

  parsed version(s)               aggregate version
  void foo(int x, int y = 4);     void foo(int x, int y = 4);
  void foo(int x = 12, int y);

Then we call mergedecl, and it updates the aggregate version to
contain all the goop [this is a technical term] merged together for
the two declarations:

  parsed version(s)               aggregate version
  void foo(int x, int y = 4);
  void foo(int x = 12, int y);    void foo(int x = 12, int y = 4);

If we later see a definition (e.g. "void foo(int x, int y) {}"), we
parse it, but add the actual body to the aggregate version:

  parsed version(s)               aggregate version
  void foo(int x, int y = 4);
  void foo(int x = 12, int y);
  void foo(int x, int y);         void foo(int x = 12, int y = 4) {}

Instead of adding the body to the "parsed version" of the proto, we
add it to the aggregate version, and use a flag on the 'parsed
version' to say whether it was responsible for providing the
definition.

Again, if we ignore efficiency, I think this provides an interesting
sweet spot: all of these [re]decls would be linked together, so
clients could walk the list. Also, parsing a redefinition does not
cause the previous definition to change. Clients who care about the
aggregate semantics of a decl could just always look at the 'aggregate
version' of the decl (similar to the 'canonical' version of a type),
which would always have the union of information known about the
decl. This is also reasonably easy to implement: merge decls just
handles the construction and updating of the aggregate version, and it
has all the information it could ever want to provide really accurate
diagnostics.
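
The parsed/aggregate split can be sketched in a few lines of
standalone C++. This is only a toy model under stated assumptions, not
Clang code: ToyFunctionDecl, DefaultArgs, mergeIntoAggregate and
buildAggregateForFoo are hypothetical names, default arguments are
modelled as plain ints, and the "diagnostics" are reduced to a bool
return value.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// One optional default value per parameter (toy model: ints only).
using DefaultArgs = std::vector<std::optional<int>>;

// Hypothetical stand-in for a function declaration.
struct ToyFunctionDecl {
  DefaultArgs Defaults;
  bool HasBody;
};

// Hypothetical merge step: folds what 'Parsed' adds into 'Aggregate'.
// 'Parsed' is const, so the decl as written is never modified. Returns false
// and leaves 'Aggregate' untouched if the redeclaration is inconsistent.
bool mergeIntoAggregate(ToyFunctionDecl &Aggregate,
                        const ToyFunctionDecl &Parsed) {
  if (Parsed.Defaults.size() != Aggregate.Defaults.size())
    return false;  // parameter count mismatch
  if (Parsed.HasBody && Aggregate.HasBody)
    return false;  // second definition
  for (std::size_t I = 0; I != Parsed.Defaults.size(); ++I)
    if (Parsed.Defaults[I] && Aggregate.Defaults[I])
      return false;  // default argument given twice
  for (std::size_t I = 0; I != Parsed.Defaults.size(); ++I)
    if (Parsed.Defaults[I])
      Aggregate.Defaults[I] = Parsed.Defaults[I];
  Aggregate.HasBody = Aggregate.HasBody || Parsed.HasBody;
  return true;
}

// Replays the three declarations of 'foo' from the example above.
ToyFunctionDecl buildAggregateForFoo() {
  const ToyFunctionDecl First{DefaultArgs{std::nullopt, 4}, false};
  const ToyFunctionDecl Second{DefaultArgs{12, std::nullopt}, false};
  const ToyFunctionDecl Def{DefaultArgs{std::nullopt, std::nullopt}, true};
  ToyFunctionDecl Aggregate = First;  // starts out identical to the first decl
  mergeIntoAggregate(Aggregate, Second);
  mergeIntoAggregate(Aggregate, Def);
  return Aggregate;
}
```

The point of the const reference is the invariant argued for above:
seeing a later declaration never changes an earlier parsed one, only
the aggregate.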

To me, the only bad thing about this approach is that it is
(incredibly!) memory inefficient. Allocating twice the number of
decls in the common case (where there is only one decl for each
object) is badness. However, this case is also the trivial case where
the aggregate and the parsed decl are exactly the same. :) As such, I
propose that we add two fields to ScopedDecl (?): a 'next
redeclaration' pointer and an 'is aggregate redecl' bool.

It would actually be implemented as:

I. Sema handles the ActOnFooDecl Action by creating a new decl object,
verifying that it is self consistent, and then calling mergedecls
unconditionally.

II. Mergedecls gets the decl, sees that it is the first declaration
for the object. Since the 'parsed' and 'aggregate' declarations are
the same, set the 'nextredecl' pointer to point to itself, and set
'isaggregate' to true. This saves allocation of a second decl in the
common case. For example, imagine we just parsed:

  void foo(int x, int y); // #1

we now get a "foo" decl, which I'll write schematically as (numbering
redeclarations, really the numbers would be pointers of course):

  1: foo [nextredecl=1, isaggregate=true]  void foo(int x, int y);

  Scope[foo] = 1 // This is the scope chain for foo

III. The next time MergeDecls is called, there are two possibilities:
first, if the redecl is exactly the same (no information is added)
then the existing redecl is set as the next pointer and isaggregate is
set to false:

  void foo(int x, int y); // #2

  1: foo [nextredecl=1, isaggregate=true]  void foo(int x, int y);
  2: foo [nextredecl=1, isaggregate=false] void foo(int x, int y);

  Scope[foo] = 2

At this point, the scope object says that "#2" is the last
redeclaration of 'foo'. From that, we can walk the 'nextredecl' list
to see #1. When we get to it, we see that the next link is the
aggregate version, so clients can decide whether they want to get the
aggregate version, or just stop if they want to see decls
corresponding to what the user wrote.

IV.
When mergedecls is called and there *is* new information, two
things can happen. So far, there is not an explicit 'aggregate'
version. As such, we now create a new aggregate version of it ("A"),
and update things:

  void foo(int x, int y = 0); // #3

  A: foo [nextredecl=A, isaggregate=true]  void foo(int x, int y = 0);
  1: foo [nextredecl=A, isaggregate=true]  void foo(int x, int y);
  2: foo [nextredecl=1, isaggregate=false] void foo(int x, int y);
  3: foo [nextredecl=2, isaggregate=false] void foo(int x, int y = 0);

  Scope[foo] = 3

At this point, users can walk the redefn list and see all the 'parsed'
definitions 1/2/3 and other clients can also get the full aggregate
version (A). We can distinguish between "recycled" aggregate versions
and explicit aggregate versions because 'nextredecl' points to 'this'
in the reuse case.

V. The second case when adding information is that the explicit
aggregate version already exists. In that case, we just update (A):

  void foo(int x = 12, int y); // #4

  A: foo [nextredecl=A, isaggregate=true]  void foo(int x = 12, int y = 0);
  1: foo [nextredecl=A, isaggregate=true]  void foo(int x, int y);
  2: foo [nextredecl=1, isaggregate=false] void foo(int x, int y);
  3: foo [nextredecl=2, isaggregate=false] void foo(int x, int y = 0);
  4: foo [nextredecl=3, isaggregate=false] void foo(int x = 12, int y);

  Scope[foo] = 4

etc. Of course, the bool and pointer can be swizzled together so the
bool goes in the low bit of the pointer.

I think this is a fairly reasonable sweet spot, what do you guys think?
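
Steps II through V can be replayed with a small standalone C++ model.
This is a sketch under assumptions, not Clang code: ToyDecl,
ToyRedeclChain and their members are hypothetical names, default
arguments are plain ints, every redeclaration is assumed to have the
same parameter count and never to repeat a default, and the
'isaggregate' bit is read as "my next link is the aggregate", which is
how the tables above use it.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>
#include <utility>
#include <vector>

using DefaultArgs = std::vector<std::optional<int>>;

struct ToyDecl {
  DefaultArgs Defaults;           // as written (merged, for an explicit aggregate)
  ToyDecl *NextRedecl = nullptr;  // the 'nextredecl' pointer
  bool IsAggregate = false;       // the 'isaggregate' bit
};

class ToyRedeclChain {
  std::deque<ToyDecl> Storage;  // deque: push_back keeps element addresses valid
  ToyDecl *Last = nullptr;      // plays the role of Scope[foo]

  ToyDecl *create(DefaultArgs D) {
    Storage.push_back(ToyDecl{std::move(D), nullptr, false});
    return &Storage.back();
  }

public:
  ToyDecl *last() const { return Last; }

  // Oldest parsed decl: the one whose 'isaggregate' bit is set.
  ToyDecl *first() const {
    ToyDecl *D = Last;
    while (D && !D->IsAggregate)
      D = D->NextRedecl;
    return D;
  }

  // Aggregate view: whatever the oldest parsed decl links to (possibly itself).
  ToyDecl *aggregate() const {
    ToyDecl *F = first();
    return F ? F->NextRedecl : nullptr;
  }

  ToyDecl *addDecl(DefaultArgs Written) {
    ToyDecl *New = create(std::move(Written));
    if (!Last) {  // step II: the first decl doubles as the aggregate
      New->NextRedecl = New;
      New->IsAggregate = true;
      return Last = New;
    }
    ToyDecl *First = first();
    ToyDecl *Agg = First->NextRedecl;
    assert(New->Defaults.size() == Agg->Defaults.size());
    bool AddsInfo = false;
    for (std::size_t I = 0; I != New->Defaults.size(); ++I)
      AddsInfo |= New->Defaults[I] && !Agg->Defaults[I];
    if (AddsInfo && Agg == First) {  // step IV: create the explicit aggregate "A"
      Agg = create(First->Defaults);
      Agg->NextRedecl = Agg;
      Agg->IsAggregate = true;
      First->NextRedecl = Agg;
    }
    if (AddsInfo)  // steps IV and V: only the explicit aggregate is updated
      for (std::size_t I = 0; I != New->Defaults.size(); ++I)
        if (New->Defaults[I])
          Agg->Defaults[I] = New->Defaults[I];
    New->NextRedecl = Last;  // step III: parsed decls chain back toward #1
    return Last = New;
  }
};
```

In the common one-declaration case this allocates a single node, which
is the memory saving the scheme is after.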

-Chris

From akyrtzi at gmail.com Sun May 4 03:42:22 2008
From: akyrtzi at gmail.com (Argiris Kirtzidis)
Date: Sun, 04 May 2008 01:42:22 -0700
Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations
In-Reply-To: <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com>
References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com>
 <48104F39.4010409@gmail.com>
 <24b520d20804240610l53f6d54ar203f853a30a16584@mail.gmail.com>
 <4810940C.6010600@gmail.com>
 <4810A596.8080005@gmail.com>
 <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com>
 <4810E5DC.2080600@gmail.com>
 <4810EF57.3030803@gmail.com>
 <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com>
 <4811998B.1020000@gmail.com>
 <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com>
 <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com>
Message-ID: <481D76EE.4010106@gmail.com>

Hi Chris,

I really like the idea of keeping the decls clean and having a
separate 'aggregate' decl. I have two suggestions for consideration:

--The 'isaggregate' bool is not needed if we assume that, when we walk
the list of redecls, the last one is the 'aggregate' one. Here's how
it would look (using the 5 cases you describe):

II.
  void foo(int x, int y); // #1

  1: foo [nextredecl=null] void foo(int x, int y);

  Last one = #1 = aggregate

III.
  void foo(int x, int y); // #2

  1: foo [nextredecl=null] void foo(int x, int y);
  2: foo [nextredecl=1]    void foo(int x, int y);

  Last one = #1 = aggregate

IV.
  void foo(int x, int y = 0); // #3

  A: foo [nextredecl=null] void foo(int x, int y = 0);
  1: foo [nextredecl=A]    void foo(int x, int y);
  2: foo [nextredecl=1]    void foo(int x, int y);
  3: foo [nextredecl=2]    void foo(int x, int y = 0);

  Last one = #A = aggregate

V.
  void foo(int x = 12, int y); // #4

  A: foo [nextredecl=null] void foo(int x = 12, int y = 0);
  1: foo [nextredecl=A]    void foo(int x, int y);
  2: foo [nextredecl=1]    void foo(int x, int y);
  3: foo [nextredecl=2]    void foo(int x, int y = 0);
  4: foo [nextredecl=3]    void foo(int x = 12, int y);

  Last one = #A = aggregate

--Another suggestion is to not add the redeclarations to the scope
chain (in all 5 cases, Scope[foo] = 1), thus all calls to 'foo' will
refer to the same FunctionDecl node.
One way to go about this would be to have a doubly linked list of
redeclarations, and the (V) case would be:

  A: foo [prevdecl=null, nextdecl=null] void foo(int x = 12, int y = 0); // does the 'aggregate' need to point to #1 as nextdecl ?
  1: foo [prevdecl=A, nextdecl=2]       void foo(int x, int y); // this is what all calls to 'foo' refer to
  2: foo [prevdecl=1, nextdecl=3]       void foo(int x, int y);
  3: foo [prevdecl=2, nextdecl=4]       void foo(int x, int y = 0);
  4: foo [prevdecl=3, nextdecl=null]    void foo(int x = 12, int y);

  following prevdecl, last one = #A = aggregate

We could use two DenseMaps to store the redeclaration links, similar
to the way Doug is using one in his patch, thus not bloating
ScopedDecls.
(just to be pedantic again, in the above case, if #A doesn't link to
#1 as nextdecl, we save a link :) )

-Argiris

Chris Lattner wrote:
> On May 1, 2008, at 6:57 PM, Doug Gregor wrote:
>> Chris - there's a patch in here that you probably want to take a look
>> at.
>
> Reading the backthread, it is a very interesting and nasty topic. I
> strongly agree with the meta points that Argiris is making:
>
> It is bad for clients if parsing new decls causes old decls to change
> (e.g. due to swapping). Looking forward, this will be a significant
> hindrance if we want to do things like incremental reparsing.
> [...]
>
> -Chris
>

From clattner at apple.com Sun May 4 13:19:33 2008
From: clattner at apple.com (Chris Lattner)
Date: Sun, 4 May 2008 11:19:33 -0700
Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations
In-Reply-To: <481D76EE.4010106@gmail.com>
References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com>
 <48104F39.4010409@gmail.com>
 <24b520d20804240610l53f6d54ar203f853a30a16584@mail.gmail.com>
 <4810940C.6010600@gmail.com>
 <4810A596.8080005@gmail.com>
 <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com>
 <4810E5DC.2080600@gmail.com>
 <4810EF57.3030803@gmail.com>
 <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com>
 <4811998B.1020000@gmail.com>
 <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com>
 <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com>
 <481D76EE.4010106@gmail.com>
Message-ID: <58DC189F-C64B-4AF0-9BB9-0E3C490B9AE1@apple.com>

On May 4, 2008, at 1:42 AM, Argiris Kirtzidis wrote:
> Hi Chris,
>
> I really like the idea of keeping the decls clean and having a
> separate 'aggregate' decl. I have two suggestions for consideration:
>
> --The 'isaggregate' bool is not needed if we assume that, when we
> walk the list of redecls, the last one is the 'aggregate' one.

Makes sense to me. Alternatively...

> --Another suggestion is to not add to the scope chain the
> redeclarations (at all 5 cases, Scope[foo] = 1), thus all calls to
> 'foo' will refer to the same FunctionDecl node.
> One way to go about this would be to have a double linked list of
> redeclarations, and the (V) case would be:
>
> A: foo [prevdecl=null, nextdecl=null] void foo(int x = 12, int y = 0); // does the 'aggregate' need to point to #1 as nextdecl ?
> 1: foo [prevdecl=A, nextdecl=2] void foo(int x, int y); // this is what all calls to 'foo' refer to
> 2: foo [prevdecl=1, nextdecl=3] void foo(int x, int y);
> 3: foo [prevdecl=2, nextdecl=4] void foo(int x, int y = 0);
> 4: foo [prevdecl=3, nextdecl=null] void foo(int x = 12, int y);
>
> following prevdecl, last one = #A = aggregate
>
> We could use two DenseMaps to store the redeclaration links, similar
> to the way Doug is using one in his patch, thus not bloating
> ScopeDecls.
> (just to be pedantic again, on the above case, if #A doesn't link to
> #1 as nextdecl, we save a link :) )

How about just storing the list in reverse order? This means that
adding a new redecl would have to walk the list (to add the redecl to
the end of the singly linked list), but that the aggregate version
would always be at the head of the list. This would make redecls
slower but would make references to the function constant time. Given
that there are usually not hundreds of redecls of functions, I think
it would be ok.

If we find that walking the list *is* a problem, there are more
complex/clever solutions possible too.
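
The trade-off can be seen in a tiny standalone C++ sketch (a toy under
assumptions, not Clang code: ToyRedecl, appendRedecl and chainLength
are hypothetical names, and an int Id stands in for the decl itself).
The head is the aggregate, so name lookup reaches it directly, while
each new redeclaration pays for a walk to the tail:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node: Id 0 stands for the aggregate, 1..n for parsed redecls.
struct ToyRedecl {
  int Id;
  ToyRedecl *Next = nullptr;
};

// The head is the aggregate and is what name lookup would hand out, so
// reaching it costs nothing extra. Appending walks to the tail, which is
// linear in the number of redeclarations already on the list.
void appendRedecl(ToyRedecl &Head, ToyRedecl &New) {
  ToyRedecl *Tail = &Head;
  while (Tail->Next)
    Tail = Tail->Next;
  Tail->Next = &New;
}

// Counts the aggregate plus every parsed redeclaration behind it.
std::size_t chainLength(const ToyRedecl &Head) {
  std::size_t N = 0;
  for (const ToyRedecl *D = &Head; D; D = D->Next)
    ++N;
  return N;
}
```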

-Chris

From doug.gregor at gmail.com Sun May 4 13:42:15 2008
From: doug.gregor at gmail.com (Doug Gregor)
Date: Sun, 4 May 2008 14:42:15 -0400
Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations
In-Reply-To: <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com>
References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com>
 <4810940C.6010600@gmail.com>
 <4810A596.8080005@gmail.com>
 <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com>
 <4810E5DC.2080600@gmail.com>
 <4810EF57.3030803@gmail.com>
 <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com>
 <4811998B.1020000@gmail.com>
 <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com>
 <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com>
Message-ID: <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com>

On Sun, May 4, 2008 at 2:59 AM, Chris Lattner wrote:
> On May 1, 2008, at 6:57 PM, Doug Gregor wrote:
>
> > Chris - there's a patch in here that you probably want to take a look at.
>
> It is also nice to be able to capture (as Argiris says) information that is
> reasonably close to the way the user wrote it. This would argue for
> representing:
>
> void foo(int x, int y = 4);
> void foo(int x = 12, int y);
>
> ... as two functiondecls, each of which has *one* default argument
> specified.

It might be more accurate to represent the second "foo" as having a
default argument on "y", with a way to tell that the default argument
was actually merged from another declaration. Otherwise, the second
"foo" in isolation is an ill-formed function declaration.

> > I went a slightly different way... both #1 and #2 get the merged args
> > (so each shows the complete state at that time), but we use
> > CXXDefaultArgExpr for default arguments that were the result of
> > merging default arguments. That way, we always know exactly which
> > parameter (in which declaration) the default argument originally came
> > from.
>
> Ok, so this effectively handles the above as:
>
> void foo(int x, int y = 4);
> void foo(int x = 12, int y = );
>
> Where points to the '4' in the previous foo prototype?

would be a little more accurate, where y refers to the "y" parameter
of the first declaration of "foo".

> This way you can distinguish between arguments that are explicitly specified
> from those that were inherited from previous declarations? How does this
> handle cases like this:
>
> void foo()
> void foo(int x);
> void foo()
>
> case?

It's keeping these foo's exactly as described in the source, which
isn't quite accurate. The last declaration should certainly know that
we do have a prototype for "foo".

> > 1) The swapping behavior is gone; instead, we merge into the
> > redeclaration and into the original declaration.
>
> I'm concerned about this. It violates the sanctity of the original
> declaration :)

Yep, that's the trade-off in that patch.

> void foo(int x, int y = 4);
> void foo(int x = 12, int y);
>
> We want the clients to be able to see those two decls, but also one
> aggregate decl of:
> void foo(int x = 12, int y = 4);
>
> This saves each client from having to walk the entire list of
> redeclarations to pull together all the default arguments.

Okay.

> Lets assume that functions are usually not redefined, and when they are
> that the redefinitions are often exactly the same (plus perhaps a body).
> Given this, maybe a spin on the 'canonical type' idea would work.

Even if the definition is the same as the original declaration, the
source locations for the parameters, function name, and so on will be
different... so we'll want a second FunctionDecl. I'm guessing that
the majority of (non-template) functions will have 2 FunctionDecl
nodes associated with them.

> Ignoring efficiency, imagine if the first time we parsed a decl that we
> actually created two versions of the decl: one version (the parsed version)
> that represents the decl as written, and a second version (the aggregate
> version) that is updated as other redecls are parsed. After parsing the
> decl the first time, the two decls are exactly the same:
>
> void foo(int x, int y = 4);     void foo(int x, int y = 4);
>
> When parsing the redeclaration, we create another 'parse decl' for the
> redeclaration:
>
> void foo(int x, int y = 4);     void foo(int x, int y = 4);
> void foo(int x = 12, int y);
>
> Then we call mergedecl, and it updates the aggregate version to contain all
> the goop [this is a technical term] merged together for the two
> declarations:
>
> void foo(int x, int y = 4);
> void foo(int x = 12, int y);    void foo(int x = 12, int y = 4);
>
> If we later see a definition (e.g. "void foo(int x, int y) {}"), we parse
> it, but add the actual body to the aggregate version:
>
> void foo(int x, int y = 4);
> void foo(int x = 12, int y);
> void foo(int x, int y);         void foo(int x = 12, int y = 4) {}
>
> Instead of adding the body to the "parsed version" of the proto, we add it
> to the aggregate version, and use a flag on the 'parsed version' to say that
> it was responsible for providing the definition or not.
>
> Again, if we ignore efficiency, I think this provides an interesting sweet
> spot: all of these [re]decls would be linked together, so clients could walk
> the list. Also, parsing a redefinition does not cause the previous
> definition to change. Clients who care about the aggregate semantics of a
> decl could just always look at the 'aggregate version' of the decl (similar
> to the 'canonical' version of a type), which would always have the union of
> information known about the decl.
> This is also reasonably easy to
> implement: merge decls just handles the construction and updating of the
> aggregate version, and it has all the information it could ever want to
> provide really accurate diagnostics.

My gut tells me that most clients that are doing any kind of analysis
will care only about the "aggregate" version of the decl.

> 1: foo [nextredecl=1, isaggregate=true] void foo(int x, int y);
>
> Scope[foo] = 1 // This is the scope chain for foo
>
>
> III. The next time MergeDecls is called, there are two possibilities: first
> if the redecl is exactly the same (no information is added) then existing
> redecl is set as the next pointer and isaggregate is set to false:
>
> void foo(int x, int y); // #2
>
> 1: foo [nextredecl=1, isaggregate=true] void foo(int x, int y);
> 2: foo [nextredecl=1, isaggregate=false] void foo(int x, int y);
>
>
> Scope[foo] = 2

Like Argiris, I don't think we should be adding redeclarations to the
scope chain. The only time the scope chain should contain multiple
declarations of the same name is if those declarations actually refer
to different entities.

> I think this is a fairly reasonable sweet spot, what do you guys think?

With Argiris' tweak, I almost like it.

I'm concerned about the programmability of this system, where the
FunctionDecl node that is found by name lookup is not the 'aggregate'
node. As with the canonical type system, clients will have to be very
careful to always map into the 'aggregate' node before querying any
properties. With the canonical type system, we need to do the mapping
because we need to preserve typedefs in the AST. With declarations,
however, the complexity isn't coming directly from the language...
it's coming from the representation.

It will take a lot of discipline to use FunctionDecls properly if name
lookup doesn't find the 'aggregate' FunctionDecl. (Among other things,
we'll have to audit the whole Clang code-base to see where we need to
add GetAggregateDecl calls.
Errors here are likely to be subtle.)

It occurs to me that using FunctionDecl for each of the redeclarations
isn't even as efficient as we could be. For example, the
redeclarations don't need to have scope or name information, because
we know they're the same as the aggregate FunctionDecl, nor do they
need information about the body of the function. On the other hand,
they do need some additional flags, such as "am I the definition?"

One could imagine that each function only has a single FunctionDecl,
which represents the aggregate declaration. That FunctionDecl contains
a SmallVector of FunctionRedecl entries, where each FunctionRedecl
contains minimal information about a redeclaration... the exact types
used in the parameter-type-list, the attributes, whether it was the
definition, etc. All of semantic analysis sees just that one
FunctionDecl, and those clients interested in dealing with
redeclarations can walk that redeclaration list. One could also
consider adding a callback that is invoked on each redeclaration.

Time to catch a flight...
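
A rough standalone C++ sketch of the single-FunctionDecl layout
described above, under stated assumptions: ToyFunctionDecl and
ToyFunctionRedecl are hypothetical names, parameter types are kept as
strings, attributes are left out, and std::vector stands in for LLVM's
SmallVector so the example has no LLVM dependency.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical lightweight record of one redeclaration as written. It has no
// name, scope or body of its own; those live on the owning function node.
struct ToyFunctionRedecl {
  std::vector<std::string> ParamTypesAsWritten;
  bool IsDefinition;
};

// Hypothetical single per-function node holding the aggregate view.
struct ToyFunctionDecl {
  std::string Name;
  std::vector<ToyFunctionRedecl> Redecls;  // std::vector stands in for SmallVector

  void addRedecl(ToyFunctionRedecl R) { Redecls.push_back(std::move(R)); }

  bool hasDefinition() const {
    for (const ToyFunctionRedecl &R : Redecls)
      if (R.IsDefinition)
        return true;
    return false;
  }

  // The "callback on each redeclaration" idea: visit them in source order.
  template <typename Callback>
  void forEachRedecl(Callback CB) const {
    for (const ToyFunctionRedecl &R : Redecls)
      CB(R);
  }
};
```

Name lookup would only ever hand out the ToyFunctionDecl, so there is
no aggregate-mapping step for clients to forget.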
- Doug From mrs at apple.com Sun May 4 13:53:13 2008 From: mrs at apple.com (Mike Stump) Date: Sun, 4 May 2008 11:53:13 -0700 Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations In-Reply-To: <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com> References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <4810940C.6010600@gmail.com> <4810A596.8080005@gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com> Message-ID: On May 4, 2008, at 11:42 AM, Doug Gregor wrote: > My gut tells me that most clients that are doing any kind of > analysis will care only about the "aggregate" version of the decl. Yeah, I wonder if the memory saving of having the client declare up front what they want would outweigh the downside of having the code only collect what the client is interested in. Having two decls when a client only really needs one would be unfortunate. If the downside isn't a big deal, having the client essentially say "I only need the merged decls" would be best. 
From clattner at apple.com Sun May 4 15:58:52 2008 From: clattner at apple.com (Chris Lattner) Date: Sun, 4 May 2008 13:58:52 -0700 Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations In-Reply-To: References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <4810940C.6010600@gmail.com> <4810A596.8080005@gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com> Message-ID: <88C324AD-406D-475A-9D34-133FD1BBCE28@apple.com> On May 4, 2008, at 11:53 AM, Mike Stump wrote: > On May 4, 2008, at 11:42 AM, Doug Gregor wrote: >> My gut tells me that most clients that are doing any kind of >> analysis will care only about the "aggregate" version of the decl. > > Yeah, I wonder if the memory saving of having the client declare up > front what they want would outweigh the downside of having the code > only collects what the client is interested in. Having two decls > when a client only really needs one would be unfortunate. If > downside isn't a big deal, having the client essentially say, I only > need the merged decls would be best. This doesn't work if you want clang to stream out ASTs to disk as a side effect of compilation. Because you don't know the ultimate client of those ASTs, you end up having to save everything anyway. 
-Chris From clattner at apple.com Sun May 4 16:30:26 2008 From: clattner at apple.com (Chris Lattner) Date: Sun, 4 May 2008 14:30:26 -0700 Subject: [cfe-dev] Message send to super In-Reply-To: <0FABC18D-581A-4A95-96CE-2BF6E41C117F@swan.ac.uk> References: <0FABC18D-581A-4A95-96CE-2BF6E41C117F@swan.ac.uk> Message-ID: <2F286328-B45F-48D8-B04E-A83487A143B8@apple.com> On May 2, 2008, at 5:15 PM, David Chisnall wrote: > Hi, > > It appears that, in generating the AST, the following expression: > > [super msg]; > > is being translated to: > > [(superclass*)self msg]; This is bad. 'super' 'self' '_cmd' as well as 'this' in C++ etc should all be handled with PreDefinedExpr, just like __func__ is. > Running clang -ast-print confirms this. These two expressions have > completely different semantics in Objective-C, and since there appears > to be no way of determining whether the user actually performed a cast > on self (which is uncommon, but does happen in real code) or sent a > message to super (which happens a lot more frequently). Right. Use of PreDefinedExpr would be much more explicit and better in general for clients. 
-Chris From akyrtzi at gmail.com Sun May 4 16:37:11 2008 From: akyrtzi at gmail.com (Argiris Kirtzidis) Date: Sun, 04 May 2008 14:37:11 -0700 Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations In-Reply-To: <58DC189F-C64B-4AF0-9BB9-0E3C490B9AE1@apple.com> References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <48104F39.4010409@gmail.com> <24b520d20804240610l53f6d54ar203f853a30a16584@mail.gmail.com> <4810940C.6010600@gmail.com> <4810A596.8080005@gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <481D76EE.4010106@gmail.com> <58DC189F-C64B-4AF0-9BB9-0E3C490B9AE1@apple.com> Message-ID: <481E2C87.9040005@gmail.com> Chris Lattner wrote: >> --Another suggestion is to not add to the scope chain the >> redeclarations (at all 5 cases, Scope[foo] = 1), thus all calls to >> 'foo' will refer to the same FunctionDecl node. >> One way to go about this would be to have a double linked list of >> redeclarations, and the (V) case would be: >> >> A: foo [prevdecl=null, nextdecl=null] void foo(int x = 12, int y = >> 0); // does the 'aggregate' need to point to #1 as nextdecl ? >> 1: foo [prevdecl=A, nextdecl=2] void foo(int x, int y); // this is >> what all calls to 'foo' refer to >> 2: foo [prevdecl=1, nextdecl=3] void foo(int x, int y); >> 3: foo [prevdecl=2, nextdecl=4] void foo(int x, int y = 0); >> 4: foo [prevdecl=3, nextdecl=null] void foo(int x = 12, int y); >> >> following prevdecl, last one = #A = aggregate >> >> We could use two DenseMaps to store the redeclaration links, similar >> to the way Doug is using one in his patch, thus not bloating ScopeDecls. 
>> (just to be pedantic again, on the above case, if #A doesn't link to >> #1 as nextdecl, we save a link :) ) > > How about just storing the list in reverse order. This means that > adding a new redecl would have to walk the list (to add the redecl to > the end of the singly linked list), but that the aggregate version > would always be at the head of the list. This would make redecls > slower but would make references to the function constant time. Given > that there are usually not hundreds of redecls of functions, I think > it would be ok. If we find that walking the list *is* a problem, > there are more complex/clever solutions possible too. The issue with the reverse order single list is that when the consumer receives a redecl it cannot know if it's a redecl (the last decl doesn't point to another decl), and even if we add a flag to indicate it, the consumer will know that it's a redecl but it will not be able to get the original decl and/or the aggregate one from the latest redecl. 
-Argiris From clattner at apple.com Sun May 4 16:42:12 2008 From: clattner at apple.com (Chris Lattner) Date: Sun, 4 May 2008 14:42:12 -0700 Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations In-Reply-To: <481E2C87.9040005@gmail.com> References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <48104F39.4010409@gmail.com> <24b520d20804240610l53f6d54ar203f853a30a16584@mail.gmail.com> <4810940C.6010600@gmail.com> <4810A596.8080005@gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <481D76EE.4010106@gmail.com> <58DC189F-C64B-4AF0-9BB9-0E3C490B9AE1@apple.com> <481E2C87.9040005@gmail.com> Message-ID: <0A863B90-8A05-437A-AED5-9F96366B32B4@apple.com> On May 4, 2008, at 2:37 PM, Argiris Kirtzidis wrote: > Chris Lattner wrote: >>> --Another suggestion is to not add to the scope chain the >>> redeclarations (at all 5 cases, Scope[foo] = 1), thus all calls to >>> 'foo' will refer to the same FunctionDecl node. >>> One way to go about this would be to have a double linked list of >>> redeclarations, and the (V) case would be: >>> >>> A: foo [prevdecl=null, nextdecl=null] void foo(int x = 12, int y >>> = 0); // does the 'aggregate' need to point to #1 as nextdecl ? >>> 1: foo [prevdecl=A, nextdecl=2] void foo(int x, int y); // this >>> is what all calls to 'foo' refer to >>> 2: foo [prevdecl=1, nextdecl=3] void foo(int x, int y); >>> 3: foo [prevdecl=2, nextdecl=4] void foo(int x, int y = 0); >>> 4: foo [prevdecl=3, nextdecl=null] void foo(int x = 12, int y); >>> >>> following prevdecl, last one = #A = aggregate >>> >>> We could use two DenseMaps to store the redeclaration links, >>> similar to the way Doug is using one in his patch, thus not >>> bloating ScopeDecls. 
>>> (just to be pedantic again, on the above case, if #A doesn't link >>> to #1 as nextdecl, we save a link :) ) >> >> How about just storing the list in reverse order. This means that >> adding a new redecl would have to walk the list (to add the redecl >> to the end of the singly linked list), but that the aggregate >> version would always be at the head of the list. This would make >> redecls slower but would make references to the function constant >> time. Given that there are usually not hundreds of redecls of >> functions, I think it would be ok. If we find that walking the >> list *is* a problem, there are more complex/clever solutions >> possible too. > > The issue with the reverse order single list is that when the > consumer receives a redecl it cannot know if it's a redecl (the last > decl doesn't point to another decl), and even if we add a flag to > indicate it, the consumer will know that it's a redecl but it will > not be able to get the original decl and/or the aggregate one from > the latest redecl. Do you mean that you can't tell if you are looking at the aggregate decl or a real decl? If so, the aggregate decl could just have a null source location, to indicate that it is synthetic. 
-Chris From akyrtzi at gmail.com Sun May 4 16:55:57 2008 From: akyrtzi at gmail.com (Argiris Kirtzidis) Date: Sun, 04 May 2008 14:55:57 -0700 Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations In-Reply-To: <0A863B90-8A05-437A-AED5-9F96366B32B4@apple.com> References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <48104F39.4010409@gmail.com> <24b520d20804240610l53f6d54ar203f853a30a16584@mail.gmail.com> <4810940C.6010600@gmail.com> <4810A596.8080005@gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <481D76EE.4010106@gmail.com> <58DC189F-C64B-4AF0-9BB9-0E3C490B9AE1@apple.com> <481E2C87.9040005@gmail.com> <0A863B90-8A05-437A-AED5-9F96366B32B4@apple.com> Message-ID: <481E30ED.60209@gmail.com> Chris Lattner wrote: > > On May 4, 2008, at 2:37 PM, Argiris Kirtzidis wrote: > >> Chris Lattner wrote: >>>> --Another suggestion is to not add to the scope chain the >>>> redeclarations (at all 5 cases, Scope[foo] = 1), thus all calls to >>>> 'foo' will refer to the same FunctionDecl node. >>>> One way to go about this would be to have a double linked list of >>>> redeclarations, and the (V) case would be: >>>> >>>> A: foo [prevdecl=null, nextdecl=null] void foo(int x = 12, int y = >>>> 0); // does the 'aggregate' need to point to #1 as nextdecl ? 
>>>> 1: foo [prevdecl=A, nextdecl=2] void foo(int x, int y); // this >>>> is what all calls to 'foo' refer to >>>> 2: foo [prevdecl=1, nextdecl=3] void foo(int x, int y); >>>> 3: foo [prevdecl=2, nextdecl=4] void foo(int x, int y = 0); >>>> 4: foo [prevdecl=3, nextdecl=null] void foo(int x = 12, int y); >>>> >>>> following prevdecl, last one = #A = aggregate >>>> >>>> We could use two DenseMaps to store the redeclaration links, >>>> similar to the way Doug is using one in his patch, thus not >>>> bloating ScopeDecls. >>>> (just to be pedantic again, on the above case, if #A doesn't link >>>> to #1 as nextdecl, we save a link :) ) >>> >>> How about just storing the list in reverse order. This means that >>> adding a new redecl would have to walk the list (to add the redecl >>> to the end of the singly linked list), but that the aggregate >>> version would always be at the head of the list. This would make >>> redecls slower but would make references to the function constant >>> time. Given that there are usually not hundreds of redecls of >>> functions, I think it would be ok. If we find that walking the list >>> *is* a problem, there are more complex/clever solutions possible too. >> >> The issue with the reverse order single list is that when the >> consumer receives a redecl it cannot know if it's a redecl (the last >> decl doesn't point to another decl), and even if we add a flag to >> indicate it, the consumer will know that it's a redecl but it will >> not be able to get the original decl and/or the aggregate one from >> the latest redecl. > > Do you mean that you can't tell if you are looking at the aggregate > decl or a real decl? If so, the aggregate decl could just have a null > source location, to indicate that it is synthetic. 
Following your example, at first there is: void foo(int x, int y); // #1 1: foo [nextredecl=null] void foo(int x, int y); the consumer receives #1 (through HandleTopLevelDecl() ) Then: void foo(int x, int y=2); // #2 A: foo [nextredecl=1] void foo(int x, int y=2); 1: foo [nextredecl=2] void foo(int x, int y); 2: foo [nextredecl=null] void foo(int x, int y=2); The consumer receives #2 : 2: foo [nextredecl=null] void foo(int x, int y=2); How can it find out whether #2 is a redecl and that #A is the aggregate ? -Argiris From clattner at apple.com Sun May 4 17:09:56 2008 From: clattner at apple.com (Chris Lattner) Date: Sun, 4 May 2008 15:09:56 -0700 Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations In-Reply-To: <481E30ED.60209@gmail.com> References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <48104F39.4010409@gmail.com> <24b520d20804240610l53f6d54ar203f853a30a16584@mail.gmail.com> <4810940C.6010600@gmail.com> <4810A596.8080005@gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <481D76EE.4010106@gmail.com> <58DC189F-C64B-4AF0-9BB9-0E3C490B9AE1@apple.com> <481E2C87.9040005@gmail.com> <0A863B90-8A05-437A-AED5-9F96366B32B4@apple.com> <481E30ED.60209@gmail.com> Message-ID: <6ECDD1B0-25DB-4EF0-92A1-E8341D219819@apple.com> On May 4, 2008, at 2:55 PM, Argiris Kirtzidis wrote: > Chris Lattner wrote: >> On May 4, 2008, at 2:37 PM, Argiris Kirtzidis wrote: >> Do you mean that you can't tell if you are looking at the aggregate >> decl or a real decl? If so, the aggregate decl could just have a >> null source location, to indicate that it is synthetic. 
> > Following your example, at first there is: > > void foo(int x, int y); // #1 > 1: foo [nextredecl=null] void foo(int x, int y); > > the consumer receives #1 (through HandleTopLevelDecl() ) > > Then: > > void foo(int x, int y=2); // #2 > > A: foo [nextredecl=1] void foo(int x, int y=2); > 1: foo [nextredecl=2] void foo(int x, int y); > 2: foo [nextredecl=null] void foo(int x, int y=2); > > The consumer receives #2 : > 2: foo [nextredecl=null] void foo(int x, int y=2); > > How can it find out whether #2 is a redecl and that #A is the > aggregate ? Ah, you're saying that because you are given the end of the list, you can't just walk it. Great point :). Maybe the right solution is to build it so that the list is in the current order but that the aggregate version (if present) is always at the start of the list. Subsequent redecls would be added right after the aggregate? In fact, if the list was circular, clients could then walk all of them. In this case we would end up with: void foo(int x, int y); // #1 1: foo [nextredecl=null] void foo(int x, int y); the consumer receives #1 (through HandleTopLevelDecl() ) Then: void foo(int x, int y=2); // #2 1: foo [nextredecl=A] void foo(int x, int y); 2: foo [nextredecl=1] void foo(int x, int y=2); A: foo [nextredecl=2] void foo(int x, int y=2); The consumer receives #2 : 2: foo [nextredecl=null] void foo(int x, int y=2); This way they can get #1 from #2, and they can even get A from 2... but that Sema can poke A efficiently. 
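
A minimal sketch of the circular list described above, under stated assumptions: `FnDecl`, `startRing`, `addRedecl`, `getAggregate`, and `redecls` are hypothetical names (not Clang's API), and a plain `IsAggregate` flag is assumed as the way to recognize the synthetic node. The link order matches the example: #1 points to A, #2 points to #1, and A points to #2.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a function declaration node.
struct FnDecl {
  std::string Text;              // the declaration as written (or as merged)
  bool IsAggregate;              // assumed marker for the synthetic node
  FnDecl *NextRedecl;            // null while there is only one declaration

  explicit FnDecl(std::string T)
      : Text(std::move(T)), IsAggregate(false), NextRedecl(nullptr) {}
};

// Create the ring when the first redeclaration shows up: A -> First -> A.
inline void startRing(FnDecl &Aggregate, FnDecl &First) {
  Aggregate.IsAggregate = true;
  Aggregate.NextRedecl = &First;
  First.NextRedecl = &Aggregate;
}

// Insert a new redeclaration right after the aggregate, in constant time.
inline void addRedecl(FnDecl &Aggregate, FnDecl &Redecl) {
  Redecl.NextRedecl = Aggregate.NextRedecl;
  Aggregate.NextRedecl = &Redecl;
}

// From any node, walk the ring until the aggregate is found.
// A lone declaration (no ring yet) acts as its own aggregate.
inline FnDecl *getAggregate(FnDecl *D) {
  if (!D->NextRedecl)
    return D;
  FnDecl *Cur = D;
  do {
    if (Cur->IsAggregate)
      return Cur;
    Cur = Cur->NextRedecl;
  } while (Cur != D);
  return nullptr;  // malformed ring: no node is marked as the aggregate
}

// Collect every written declaration reachable from D, skipping the aggregate.
inline std::vector<FnDecl *> redecls(FnDecl *D) {
  std::vector<FnDecl *> Out;
  FnDecl *Cur = D;
  do {
    if (!Cur->IsAggregate)
      Out.push_back(Cur);
    Cur = Cur->NextRedecl;
  } while (Cur && Cur != D);
  return Out;
}
```

A consumer handed #2 can reach both #1 and A by following `NextRedecl`, while Sema can keep a direct pointer to A and insert after it without walking anything.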
-Chris From akyrtzi at gmail.com Sun May 4 17:55:58 2008 From: akyrtzi at gmail.com (Argiris Kirtzidis) Date: Sun, 04 May 2008 15:55:58 -0700 Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations In-Reply-To: <6ECDD1B0-25DB-4EF0-92A1-E8341D219819@apple.com> References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <48104F39.4010409@gmail.com> <24b520d20804240610l53f6d54ar203f853a30a16584@mail.gmail.com> <4810940C.6010600@gmail.com> <4810A596.8080005@gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <481D76EE.4010106@gmail.com> <58DC189F-C64B-4AF0-9BB9-0E3C490B9AE1@apple.com> <481E2C87.9040005@gmail.com> <0A863B90-8A05-437A-AED5-9F96366B32B4@apple.com> <481E30ED.60209@gmail.com> <6ECDD1B0-25DB-4EF0-92A1-E8341D219819@apple.com> Message-ID: <481E3EFE.4090203@gmail.com> Chris Lattner wrote: > > Ah, you're saying that because you are given the end of the list, you > can't just walk it. Great point :). Maybe the right solution is to > build it so that the list is in the current order but that the > aggregate version (if present) is always at the start of the list. > Subsequent redecls would be added right after the aggregate? In fact, > if the list was circular, clients could then walk all of them. 
In > this case we would end up with: > > void foo(int x, int y); // #1 > 1: foo [nextredecl=null] void foo(int x, int y); > > the consumer receives #1 (through HandleTopLevelDecl() ) > > > Then: > > void foo(int x, int y=2); // #2 > > 1: foo [nextredecl=A] void foo(int x, int y); > 2: foo [nextredecl=1] void foo(int x, int y=2); > A: foo [nextredecl=2] void foo(int x, int y=2); > > The consumer receives #2 : > 2: foo [nextredecl=null] void foo(int x, int y=2); > > This way they can get #1 from #2, and they can even get A from 2... > but that Sema can poke A efficiently. Nice, the circular list idea sounds good! -Argiris From akyrtzi at gmail.com Sun May 4 17:57:06 2008 From: akyrtzi at gmail.com (Argiris Kirtzidis) Date: Sun, 04 May 2008 15:57:06 -0700 Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations In-Reply-To: <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com> References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <4810940C.6010600@gmail.com> <4810A596.8080005@gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com> Message-ID: <481E3F42.2050103@gmail.com> Doug Gregor wrote: > With Argiris' tweak, I almost like it. I'm concerned about the > programmability of this system, where the FunctionDecl node that is > found by name lookup is not the 'aggregate' node. As with the > canonical type system, clients will have to be very careful to always > map into the 'aggregate' node before querying any properties. With the > canonical type system, we need to do the mapping because we need to > preserve typedefs in the AST. 
With declarations, however, the > complexity isn't coming directly from the language... it's coming from > the representation. It will take a lot of discipline to use > FunctionDecls properly if name lookup doesn't find the 'aggregate' > FunctionDecl. (Among other things, we'll have to audit the whole Clang > code-base to see where we need to add GetAggregateDecl calls. Errors > here are likely to be subtle.). > That's a good point. I don't have a strong opinion on whether there should be a separate 'aggregate' node or not, but assuming that there is, can the semantic info for a function decl refer to the aggregate one for all redecls but for decl specific info you'll use separate methods ? I mean, getbody() would return the body of the aggregate for all redecls, but you would also have a isThisDefinition() to probe the specific decl. Same for default arguments. Is this practical ? -Argiris From doug.gregor at gmail.com Sun May 4 18:16:39 2008 From: doug.gregor at gmail.com (Doug Gregor) Date: Sun, 4 May 2008 17:16:39 -0600 Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations In-Reply-To: <481E3F42.2050103@gmail.com> References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com> <481E3F42.2050103@gmail.com> Message-ID: <24b520d20805041616l70077a69l6c3d319a3b5ff551@mail.gmail.com> On Sun, May 4, 2008 at 4:57 PM, Argiris Kirtzidis wrote: > Doug Gregor wrote: > > > With Argiris' tweak, I almost like it. I'm concerned about the > > programmability of this system, where the FunctionDecl node that is > > found by name lookup is not the 'aggregate' node. 
As with the > > canonical type system, clients will have to be very careful to always > > map into the 'aggregate' node before querying any properties. With the > > canonical type system, we need to do the mapping because we need to > > preserve typedefs in the AST. With declarations, however, the > > complexity isn't coming directly from the language... it's coming from > > the representation. It will take a lot of discipline to use > > FunctionDecls properly if name lookup doesn't find the 'aggregate' > > FunctionDecl. (Among other things, we'll have to audit the whole Clang > > code-base to see where we need to add GetAggregateDecl calls. Errors > > here are likely to be subtle.). > > > > > > That's a good point. I don't have a strong opinion on whether there should > be a separate 'aggregate' node or not, but assuming that there is, can the > semantic info for a function decl refer to the aggregate one for all redecls > but for decl specific info you'll use separate methods ? I think, from the programmer's standpoint, it doesn't matter so much whether we have an 'aggregate' decl or not, so long as it looks like we have an aggregate declaration. However, if we have to reconstruct the contents of the aggregate declaration every time we query the FunctionDecl, it's going to get expensive because we'll be traversing the list of redeclarations and accumulating information. (For example, think of merging attributes every time we query an attribute!) > I mean, getbody() would return the body of the aggregate for all redecls, > but you would also have a isThisDefinition() to probe the specific decl. > Same for default arguments. > Is this practical ? It's a little trickier for default arguments, because the default argument is stored as part of ParmVarDecl rather than as part of the FunctionDecl. The DefaultArgExpr trick can get around that specific issue, of course. Aside from that... 
it effectively doubles the size of the interface to FunctionDecl (isInline and isThisInline, getAttrs and getThisAttrs, etc.), but it does solve the issue without losing any source information. It's harder to write bad code in this case, but one is going to have to be very methodical when maintaining source information to avoid writing isFoo rather than isThisFoo. Overall, I think this approach is an improvement. - Doug From akyrtzi at gmail.com Sun May 4 18:17:11 2008 From: akyrtzi at gmail.com (Argiris Kirtzidis) Date: Sun, 04 May 2008 16:17:11 -0700 Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations In-Reply-To: <6ECDD1B0-25DB-4EF0-92A1-E8341D219819@apple.com> References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <48104F39.4010409@gmail.com> <24b520d20804240610l53f6d54ar203f853a30a16584@mail.gmail.com> <4810940C.6010600@gmail.com> <4810A596.8080005@gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <481D76EE.4010106@gmail.com> <58DC189F-C64B-4AF0-9BB9-0E3C490B9AE1@apple.com> <481E2C87.9040005@gmail.com> <0A863B90-8A05-437A-AED5-9F96366B32B4@apple.com> <481E30ED.60209@gmail.com> <6ECDD1B0-25DB-4EF0-92A1-E8341D219819@apple.com> Message-ID: <481E43F7.4060407@gmail.com> Chris Lattner wrote: > > Ah, you're saying that because you are given the end of the list, you > can't just walk it. Great point :). Maybe the right solution is to > build it so that the list is in the current order but that the > aggregate version (if present) is always at the start of the list. > Subsequent redecls would be added right after the aggregate? In fact, > if the list was circular, clients could then walk all of them. 
In > this case we would end up with: > > void foo(int x, int y); // #1 > 1: foo [nextredecl=null] void foo(int x, int y); > > the consumer receives #1 (through HandleTopLevelDecl() ) > > > Then: > > void foo(int x, int y=2); // #2 > > 1: foo [nextredecl=A] void foo(int x, int y); > 2: foo [nextredecl=1] void foo(int x, int y=2); > A: foo [nextredecl=2] void foo(int x, int y=2); > > The consumer receives #2 : > 2: foo [nextredecl=null] void foo(int x, int y=2); > > This way they can get #1 from #2, and they can even get A from 2... > but that Sema can poke A efficiently. Oh, if a separate aggregate is not created because #1 is the aggregate: 1: foo [nextredecl=2] void foo(int x, int y); 2: foo [nextredecl=1] void foo(int x, int y); You need a 'isAggregate' bool then; this seems to negate the benefit over the double linked list, is this correct ? From doug.gregor at gmail.com Sun May 4 18:19:08 2008 From: doug.gregor at gmail.com (Doug Gregor) Date: Sun, 4 May 2008 17:19:08 -0600 Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations In-Reply-To: References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com> Message-ID: <24b520d20805041619g92225a9ja48203a82276cf6c@mail.gmail.com> On Sun, May 4, 2008 at 12:53 PM, Mike Stump wrote: > On May 4, 2008, at 11:42 AM, Doug Gregor wrote: > > > My gut tells me that most clients that are doing any kind of analysis will > care only about the "aggregate" version of the decl. 
> > > > Yeah, I wonder if the memory saving of having the client declare up front > what they want would outweigh the downside of having the code only collects > what the client is interested in. Having two decls when a client only > really needs one would be unfortunate. If downside isn't a big deal, having > the client essentially say, I only need the merged decls would be best. I think the interface would be simpler if we could avoid such modes. Every mode setting like this increases the testing burden quite a bit. - Doug From doug.gregor at gmail.com Sun May 4 18:23:42 2008 From: doug.gregor at gmail.com (Doug Gregor) Date: Sun, 4 May 2008 17:23:42 -0600 Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations In-Reply-To: <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com> References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <4810A596.8080005@gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com> Message-ID: <24b520d20805041623g258b1b75h5385c513e32f1773@mail.gmail.com> On Sun, May 4, 2008 at 12:42 PM, Doug Gregor wrote: > One > could imagine that each function only has a single FunctionDecl, which > represents the aggregate declaration. That FunctionDecl contains a > SmallVector, where each FunctionRedecl contains > minimal information about a redeclaration... The poster child for this kind of approach is RecordDecl. Right now, we completely ignore re-declarations of tag types, e.g., "class foo;" However, if we're to represent the source accurately, we should have redeclaration chains for these, too. 
Unfortunately, RecordDecl is pretty large, and most of that information only applies to the definition---not to the redeclarations. This could be addressed in a few different ways, e.g., by moving the definition into a separate structure or by making redeclarations use a smaller decl node. - Doug From clattner at apple.com Sun May 4 18:23:57 2008 From: clattner at apple.com (Chris Lattner) Date: Sun, 4 May 2008 16:23:57 -0700 Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations In-Reply-To: <481E43F7.4060407@gmail.com> References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <48104F39.4010409@gmail.com> <24b520d20804240610l53f6d54ar203f853a30a16584@mail.gmail.com> <4810940C.6010600@gmail.com> <4810A596.8080005@gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <481D76EE.4010106@gmail.com> <58DC189F-C64B-4AF0-9BB9-0E3C490B9AE1@apple.com> <481E2C87.9040005@gmail.com> <0A863B90-8A05-437A-AED5-9F96366B32B4@apple.com> <481E30ED.60209@gmail.com> <6ECDD1B0-25DB-4EF0-92A1-E8341D219819@apple.com> <481E43F7.4060407@gmail.com> Message-ID: <703EA9E8-6E34-4054-8963-C35870F2836A@apple.com> On May 4, 2008, at 4:17 PM, Argiris Kirtzidis wrote: >> The consumer receives #2 : >> 2: foo [nextredecl=null] void foo(int x, int y=2); >> >> This way they can get #1 from #2, and they can even get A from 2... >> but that Sema can poke A efficiently. > > Oh, if a separate aggregate is not created because #1 is the > aggregate: > > 1: foo [nextredecl=2] void foo(int x, int y); > 2: foo [nextredecl=1] void foo(int x, int y); > > You need a 'isAggregate' bool then; this seems to negate the benefit > over the double linked list, is this correct ? Makes sense to me! 
An isAggregate bool is cheaper than the memory cost of a doubly linked list.

-Chris

From clattner at apple.com Sun May 4 18:29:23 2008
From: clattner at apple.com (Chris Lattner)
Date: Sun, 4 May 2008 16:29:23 -0700
Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations
In-Reply-To: <24b520d20805041616l70077a69l6c3d319a3b5ff551@mail.gmail.com>
References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com> <481E3F42.2050103@gmail.com> <24b520d20805041616l70077a69l6c3d319a3b5ff551@mail.gmail.com>
Message-ID:

On May 4, 2008, at 4:16 PM, Doug Gregor wrote:
>> That's a good point. I don't have a strong opinion on whether there should
>> be a separate 'aggregate' node or not, but assuming that there is, can the
>> semantic info for a function decl refer to the aggregate one for all redecls
>> but for decl specific info you'll use separate methods ?
>
> I think, from the programmer's standpoint, it doesn't matter so much
> whether we have an 'aggregate' decl or not, so long as it looks like
> we have an aggregate declaration.

Right. I look at the aggregate decl as just a cache. When it comes to serialization/deserialization, we may choose to not even write it out, because it can be fully reconstructed from the information available in the other decls.

> However, if we have to reconstruct the contents of the aggregate
> declaration every time we query the FunctionDecl, it's going to get
> expensive because we'll be traversing the list of redeclarations and
> accumulating information.
> (For example, think of merging attributes
> every time we query an attribute!)

Exactly!

>> I mean, getbody() would return the body of the aggregate for all redecls,
>> but you would also have a isThisDefinition() to probe the specific decl.
>> Same for default arguments.
>> Is this practical ?
>
> It's a little trickier for default arguments, because the default
> argument is stored as part of ParmVarDecl rather than as part of the
> FunctionDecl. The DefaultArgExpr trick can get around that specific
> issue, of course.

Incidentally, I think all 'semantic' clients will really just jump to the aggregate decl and use it directly. This would encourage an idiom like:

  D->getAggregateDecl()->isInline()

vs D->isInline(), etc. This is similar to the canonical type system.

For clients that really do want to see what the user wrote, I think it is reasonable to expect them to handle the cases that can actually occur in code. For example, if you want to rename a function f -> g, you have to walk the list of all redeclarations and rename each one of them. This does mean that any code that does this will need to tolerate things like "int foo(int x = 4, int y)" though.

> Aside from that... it effectively doubles the size of the interface to
> FunctionDecl (isInline and isThisInline, getAttrs and getThisAttrs,
> etc.), but it does solve the issue without losing any source
> information. It's harder to write bad code in this case, but one is
> going to have to be very methodical when maintaining source
> information to avoid writing isFoo rather than isThisFoo.

To avoid doubling the interface, we can just require clients to get the aggregate version first. Do you like this style, or do you prefer to double the interface?

> Overall, I think this approach is an improvement.

Great! Incidentally, is there a better name than 'aggregate' for these? :) How about Accumulated or something else?
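
The idiom discussed above (semantic clients go through the aggregate, source-level clients walk every redeclaration) could be sketched roughly as follows. This is only an illustration under stated assumptions: `FuncDecl`, `addRedeclaration` and `renameFunction` are hypothetical names for this sketch and are not Clang's actual classes or API.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a function declaration node.
// It is not Clang's FunctionDecl; it only models the fields under discussion.
struct FuncDecl {
  std::string Name;
  bool Inline;          // source decl: as written; aggregate: merged value
  FuncDecl *Aggregate;  // merged node for this function, or null if unlinked
  FuncDecl *NextRedecl; // next declaration in source order, or null

  explicit FuncDecl(std::string N, bool IsInline = false)
      : Name(std::move(N)), Inline(IsInline), Aggregate(nullptr),
        NextRedecl(nullptr) {}

  // Semantic clients are expected to jump here before querying properties.
  FuncDecl *getAggregateDecl() {
    assert(Aggregate && "declaration was never linked to an aggregate");
    return Aggregate;
  }
};

// Append D to the chain headed by Agg and fold D's properties into Agg, so
// that the merged answer is cached instead of recomputed on every query.
void addRedeclaration(FuncDecl &Agg, FuncDecl &D) {
  assert(!D.Aggregate && !D.NextRedecl && "declaration already linked");
  Agg.Aggregate = &Agg;
  D.Aggregate = &Agg;
  Agg.Inline = Agg.Inline || D.Inline;
  FuncDecl *Tail = &Agg;
  while (Tail->NextRedecl)
    Tail = Tail->NextRedecl;
  Tail->NextRedecl = &D;
}

// A source-level client (for example a rename f -> g) has to visit the
// aggregate and every source declaration in the chain.
void renameFunction(FuncDecl &Agg, const std::string &NewName) {
  for (FuncDecl *D = &Agg; D; D = D->NextRedecl)
    D->Name = NewName;
}
```

With this layout, `D->getAggregateDecl()->Inline` gives the merged answer while `D->Inline` gives what was written on that one declaration, which mirrors the isInline / isThisInline split mentioned in the quoted text.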

-Chris

From clattner at apple.com Sun May 4 18:30:44 2008
From: clattner at apple.com (Chris Lattner)
Date: Sun, 4 May 2008 16:30:44 -0700
Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations
In-Reply-To: <24b520d20805041623g258b1b75h5385c513e32f1773@mail.gmail.com>
References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <4810A596.8080005@gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com> <24b520d20805041623g258b1b75h5385c513e32f1773@mail.gmail.com>
Message-ID: <3C0B9C99-296D-4147-8E89-9ED01B60BA80@apple.com>

On May 4, 2008, at 4:23 PM, Doug Gregor wrote:
> On Sun, May 4, 2008 at 12:42 PM, Doug Gregor wrote:
>> One
>> could imagine that each function only has a single FunctionDecl, which
>> represents the aggregate declaration. That FunctionDecl contains a
>> SmallVector, where each FunctionRedecl contains
>> minimal information about a redeclaration...
>
> The poster child for this kind of approach is RecordDecl. Right now,
> we completely ignore re-declarations of tag types, e.g., "class foo;"
> However, if we're to represent the source accurately, we should have
> redeclaration chains for these, too. Unfortunately, RecordDecl is
> pretty large, and most of that information only applies to the
> definition---not to the redeclarations. This could be addressed in a
> few different ways, e.g., by moving the definition into a separate
> structure or by making redeclarations use a smaller decl node.

Yep, this is also an issue we'll have to solve someday.
I'm happy to defer fighting that battle until it comes up, though :)

-Chris

From clattner at apple.com Sun May 4 18:34:02 2008
From: clattner at apple.com (Chris Lattner)
Date: Sun, 4 May 2008 16:34:02 -0700
Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations
In-Reply-To: <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com>
References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <4810940C.6010600@gmail.com> <4810A596.8080005@gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com>
Message-ID:

On May 4, 2008, at 11:42 AM, Doug Gregor wrote:
>> I think this is a fairly reasonable sweet spot, what do you guys
>> think?
>
> With Argiris' tweak, I almost like it. I'm concerned about the
> programmability of this system, where the FunctionDecl node that is
> found by name lookup is not the 'aggregate' node. As with the
> canonical type system, clients will have to be very careful to always
> map into the 'aggregate' node before querying any properties. With the
> canonical type system, we need to do the mapping because we need to
> preserve typedefs in the AST. With declarations, however, the
> complexity isn't coming directly from the language... it's coming from
> the representation. It will take a lot of discipline to use
> FunctionDecls properly if name lookup doesn't find the 'aggregate'
> FunctionDecl.

That is a great point. Would it be reasonable to just always make the aggregate version be the one returned by scope lookups?

> It occurs to me that using FunctionDecl for each of the redeclarations
> isn't even as efficient as we could be.
> For example, the
> redeclarations don't need to have scope or name information, because
> we know they're the same as the aggregate FunctionDecl, nor do they
> need information about the body of the function. On the other hand,
> they do need some additional flags, such as "am I the definition?" One
> could imagine that each function only has a single FunctionDecl, which
> represents the aggregate declaration. That FunctionDecl contains a
> SmallVector, where each FunctionRedecl contains
> minimal information about a redeclaration... the exact types used in
> the parameter-type-list, the attributes, whether it was the
> definition, etc. All of semantic analysis sees just that one
> FunctionDecl, and those clients interested in dealing with
> redeclarations can walk that redeclaration list; One could also
> consider adding a callback that is invoked on each redeclaration.

That is an interesting idea. I think that it could be done as a refinement on the first step of getting the current plan in place, do you agree?
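
The quoted proposal (one FunctionDecl per function, plus a list of lightweight per-redeclaration records and an optional callback) might look something like the sketch below. All names here are assumptions made for illustration, `std::vector` stands in for LLVM's SmallVector, and types and attributes are modelled as plain strings rather than real AST nodes.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal record for one declaration as written in the source.
struct FunctionRedecl {
  std::vector<std::string> ParamTypes; // parameter types exactly as spelled
  std::vector<std::string> Attrs;      // attributes on this declaration only
  bool IsDefinition;                   // "am I the definition?"
};

// Hypothetical single node that semantic analysis would see for a function.
class FunctionDecl {
  std::string Name;
  std::vector<FunctionRedecl> Redecls;  // stand-in for a SmallVector
  std::vector<std::string> MergedAttrs; // accumulated once, at insertion time
  bool HasBody;

public:
  explicit FunctionDecl(std::string N) : Name(std::move(N)), HasBody(false) {}

  // Record a new source declaration and update the merged view.
  // Duplicate attributes are not filtered in this sketch.
  void addRedecl(FunctionRedecl R) {
    assert(!(HasBody && R.IsDefinition) && "function defined twice");
    for (const std::string &A : R.Attrs)
      MergedAttrs.push_back(A);
    HasBody = HasBody || R.IsDefinition;
    Redecls.push_back(std::move(R));
  }

  const std::string &getName() const { return Name; }
  bool hasBody() const { return HasBody; }
  const std::vector<std::string> &getAttrs() const { return MergedAttrs; }

  // Clients that care about what the user wrote walk the redeclarations.
  void forEachRedecl(
      const std::function<void(const FunctionRedecl &)> &Callback) const {
    for (const FunctionRedecl &R : Redecls)
      Callback(R);
  }
};
```

The design choice being illustrated is that merging happens once per redeclaration rather than on every query, which is the cost concern raised earlier in the thread.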

-Chris

From doug.gregor at gmail.com Sun May 4 19:10:00 2008
From: doug.gregor at gmail.com (Doug Gregor)
Date: Sun, 4 May 2008 18:10:00 -0600
Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations
In-Reply-To:
References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com>
Message-ID: <24b520d20805041710j7c1de8b4n16d2e49939db1893@mail.gmail.com>

On Sun, May 4, 2008 at 5:34 PM, Chris Lattner wrote:
> On May 4, 2008, at 11:42 AM, Doug Gregor wrote:
> That is a great point, would it be reasonable to just always make the
> aggregate version by the one returned by scope lookups?

That would make me very, very happy, if that also implies that everything in the AST will refer to the aggregate version (except, of course, the list of redeclarations). Can we do this without always having an aggregate node? I'm not sure we can, because we run into trouble like this:

  void foo(int); // #1: no aggregate
  void bar(int i) { foo(i); }
  void foo(int x) { ... } // #2: now we create the aggregate, but bar() points at #2.

This is why I ended up with the swapping mess in the first place. Argh!

> > One
> > could imagine that each function only has a single FunctionDecl, which
> > represents the aggregate declaration. That FunctionDecl contains a
> > SmallVector, where each FunctionRedecl contains
> > minimal information about a redeclaration... the exact types used in
> > the parameter-type-list, the attributes, whether it was the
> > definition, etc.
> >
>
> That is an interesting idea.
> I think that it could be done as a refinement
> on the first step of getting the current plan in place, do you agree?

Yes, it could.

> Great! Incidentally, is there a better name than 'aggregate' for these? :) How about
> Accumulated or something else?

Well, there's always "Canonical" :)

- Doug

From clattner at apple.com Sun May 4 19:25:23 2008
From: clattner at apple.com (Chris Lattner)
Date: Sun, 4 May 2008 17:25:23 -0700
Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations
In-Reply-To: <24b520d20805041710j7c1de8b4n16d2e49939db1893@mail.gmail.com>
References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com> <24b520d20805041710j7c1de8b4n16d2e49939db1893@mail.gmail.com>
Message-ID: <8240301A-BC4E-44A8-844C-FE74ABC973ED@apple.com>

On May 4, 2008, at 5:10 PM, Doug Gregor wrote:
> On Sun, May 4, 2008 at 5:34 PM, Chris Lattner wrote:
>> On May 4, 2008, at 11:42 AM, Doug Gregor wrote:
>> That is a great point, would it be reasonable to just always make the
>> aggregate version by the one returned by scope lookups?
>
> That would make me very, very happy

Ok :)

> , if that also implies that
> everything in the AST will refer to the aggregate version (except, of
> course, the list of redeclarations). Can we do this without always
> having an aggregate node? I'm not sure we can, because we run into
> trouble like this:
>
> void foo(int); // #1: no aggregate
> void bar(int i) { foo(i); }
> void foo(int x) { ... } // #2: now we create the aggregate, but
> bar() points at #2.
>
> This is why I ended up with the swapping mess in the first place.
> Argh!

I don't think there is a really easy way to do that. Is there any harm in having the call to foo in bar pointing to the old decl? What client would be affected by this, sema wouldn't be, right?

>> Great! Incidentally, is there a better name than 'aggregate' for
>> these? :) How about
>> Accumulated or something else?
>
> Well, there's always "Canonical" :)

Good point, I like that better than 'aggregate' :)

-Chris

From akyrtzi at gmail.com Sun May 4 19:33:53 2008
From: akyrtzi at gmail.com (Argiris Kirtzidis)
Date: Sun, 04 May 2008 17:33:53 -0700
Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations
In-Reply-To: <24b520d20805041710j7c1de8b4n16d2e49939db1893@mail.gmail.com>
References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <24b520d20804240848r559ae7fdlac448f8937487ff6@mail.gmail.com> <4810E5DC.2080600@gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com> <24b520d20805041710j7c1de8b4n16d2e49939db1893@mail.gmail.com>
Message-ID: <481E55F1.609@gmail.com>

Doug Gregor wrote:
> On Sun, May 4, 2008 at 5:34 PM, Chris Lattner wrote:
>
>> On May 4, 2008, at 11:42 AM, Doug Gregor wrote:
>> That is a great point, would it be reasonable to just always make the
>> aggregate version by the one returned by scope lookups?
>>
>
> That would make me very, very happy, if that also implies that
> everything in the AST will refer to the aggregate version (except, of
> course, the list of redeclarations). Can we do this without always
> having an aggregate node? I'm not sure we can, because we run into
> trouble like this:
>
> void foo(int); // #1: no aggregate
> void bar(int i) { foo(i); }
> void foo(int x) { ... } // #2: now we create the aggregate, but
> bar() points at #2.
>
> This is why I ended up with the swapping mess in the first place. Argh!
>

Another idea to throw around..
If there is a need, instead of creating a separate aggregate we create a separate decl to store the source info of the first decl, and have the first decl be the aggregate one.
I mean, the first decl will always be the aggregate and if the client wants the source info for it he would do:

  aggregate_decl->getSourceDecl()->isInline()

-Argiris

From doug.gregor at gmail.com Sun May 4 19:35:59 2008
From: doug.gregor at gmail.com (Doug Gregor)
Date: Sun, 4 May 2008 18:35:59 -0600
Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations
In-Reply-To: <8240301A-BC4E-44A8-844C-FE74ABC973ED@apple.com>
References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com> <24b520d20805041710j7c1de8b4n16d2e49939db1893@mail.gmail.com> <8240301A-BC4E-44A8-844C-FE74ABC973ED@apple.com>
Message-ID: <24b520d20805041735g39f4d802ob6978311f18f0631@mail.gmail.com>

On Sun, May 4, 2008 at 6:25 PM, Chris Lattner wrote:
> On May 4, 2008, at 5:10 PM, Doug Gregor wrote:
> > , if that also implies that
> > everything in the AST will refer to the aggregate version (except, of
> > course, the list of redeclarations). Can we do this without always
> > having an aggregate node? I'm not sure we can, because we run into
> > trouble like this:
> >
> > void foo(int); // #1: no aggregate
> > void bar(int i) { foo(i); }
> > void foo(int x) { ... } // #2: now we create the aggregate, but
> > bar() points at #2.
> >
> > This is why I ended up with the swapping mess in the first place. Argh!
> >
>
> I don't think there is a really easy way to do that.
> Is there any harm in
> having the call to foo in bar pointing to the old decl? What client would
> be affected by this, sema wouldn't be, right?

The harm in having bar's call to foo point to the old decl is that it forces the D->getCanonicalDecl() convention onto every use of the AST that is interested in getting at the semantic interface, including all of Sema. That kind of interface convention invites a lot of programming errors; we've already seen some confusion with canonical types, but there at least people expect to have a different representation when typedefs are preserved.

- Doug

From doug.gregor at gmail.com Sun May 4 19:42:02 2008
From: doug.gregor at gmail.com (Doug Gregor)
Date: Sun, 4 May 2008 18:42:02 -0600
Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations
In-Reply-To: <481E55F1.609@gmail.com>
References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com> <24b520d20805041710j7c1de8b4n16d2e49939db1893@mail.gmail.com> <481E55F1.609@gmail.com>
Message-ID: <24b520d20805041742u5284ab7bgb60c96b2312b629f@mail.gmail.com>

On Sun, May 4, 2008 at 6:33 PM, Argiris Kirtzidis wrote:
> Another idea to throw around..
> If there is a need, instead of creating a separate aggregate we create a
> separate decl to store the source info of the first decl, and have the first
> decl be the aggregate one.
> I mean, the first decl will always be the aggregate and if the client wants
> the source info for it he would do:
>
> aggregate_decl->getSourceDecl()->isInline()

Hmmm, I like that. Actually, why not replace the term "redeclaration" with "source declaration"? The 'aggregate' ('canonical'?)
node will be the first node created and the one used throughout the AST. If one is interested in source information, browse the source-declarations list.

For functions that have only one declaration (ever), we would just have the one FunctionDecl. However, once we see a second declaration, we create two source declarations: one for the original declaration (basically, a clone of the aggregate decl) and one for the redeclaration.

- Doug

From neil at daikokuya.co.uk Sun May 4 19:56:57 2008
From: neil at daikokuya.co.uk (Neil Booth)
Date: Mon, 5 May 2008 09:56:57 +0900
Subject: [cfe-dev] [cfe-commits] r50538 - /cfe/trunk/lib/Sema/SemaDecl.cpp
In-Reply-To: <58BF496A-2B87-405E-AEEC-F66C6D5BC57C@apple.com>
References: <200805012104.m41L4HkI019130@zion.cs.uiuc.edu> <20080501221941.GE22431@daikokuya.co.uk> <58BF496A-2B87-405E-AEEC-F66C6D5BC57C@apple.com>
Message-ID: <20080505005657.GI22431@daikokuya.co.uk>

Chris Lattner wrote:-
> On May 1, 2008, at 3:19 PM, Neil Booth wrote:
>
>> Does this give them local or global scope? They should only have local
>> scope; it sounds like you're making them global.
>
> 'context' is related but different than 'scope'. I think Argiris' fix is
> the right one for this, but we still have the scope issue.
>
> What do you suggest here Neil? Right now we walk up the scope chain and
> insert the implicit definition at global scope, which prevents it from
> getting popped off the scope chain. Is it best to mark it somehow as being
> locally defined or something, or is it best to remove it from the global
> scope chain when popped?

I honestly don't know what's best; this is a messy area as the ongoing other thread makes clear.

What I did, which was OK for C but I'm not so sure about C++, is separate decls from what they declare. So I had a function entry in the IR, say, of which only one exists for any given function declaration; it has the composite type.
Each declaration gets a decl at the correct scope which points to the function, but also has its own type override according to what the type was when that decl was complete. This means you can always see the correct type of any particular decl, and obtaining the full composite type is also easy. But the type override is a pain. And in C++ the problem is harder because it's not just type - default args are not part of the type I believe?

The way I did my symbol table meant that finding a decl even when not in scope wasn't a problem, so they were looked up and diagnosed regardless of visibility.

I don't think there is an ideal solution to this issue; I wish the language standards didn't permit block-scope declarations of things with external linkage. But in C++ you have the default args issue so it still wouldn't be solved there.

> Alternatively, perhaps it shouldn't be added to global scope. This would
> make diagnosing the undefined cases harder, but they could be handled with
> other hacks.

Right. If you do it at global scope it should be fine, but needs to be not visible to language name lookup when it goes out of scope.

Neil.

From clattner at apple.com Sun May 4 23:35:46 2008
From: clattner at apple.com (Chris Lattner)
Date: Sun, 4 May 2008 21:35:46 -0700
Subject: [cfe-dev] PATCH: Diagnosing use of C++ default arguments outside of a function declaration
In-Reply-To: <24b520d20805011951x6c6c1913sd361cc344497bf4d@mail.gmail.com>
References: <24b520d20805011951x6c6c1913sd361cc344497bf4d@mail.gmail.com>
Message-ID: <7F30970B-EA08-4A91-8C71-C8CE3F85810D@apple.com>

On May 1, 2008, at 7:51 PM, Doug Gregor wrote:
> The attached patch diagnoses attempts to use C++ default arguments
> outside of a parameter-declaration of a function declaration, e.g.,
>
> void foo(int (*p)(int x = 5)); // ill-formed: p's parameters are not
> allowed to have default arguments
>
> I believe that this wraps up support for default arguments until Clang
> gets templates or member functions.
> Otherwise, this is a pretty boring
> patch.

Looks great to me, please apply!

-Chris

From clattner at apple.com Mon May 5 00:20:23 2008
From: clattner at apple.com (Chris Lattner)
Date: Sun, 4 May 2008 22:20:23 -0700
Subject: [cfe-dev] PATCH: Semantic analysis and representation of C++ base classes
In-Reply-To: <24b520d20805021808v5347ae0do7108797caf127b79@mail.gmail.com>
References: <24b520d20805021808v5347ae0do7108797caf127b79@mail.gmail.com>
Message-ID: <71AA0B13-77F6-4DC1-A196-9A3B553EE767@apple.com>

On May 2, 2008, at 6:08 PM, Doug Gregor wrote:
> This patch adds the missing semantic analysis and representation for
> C++ base classes.

Nice!

> As part of this, there's a new IdentifierNamespace enumerator
> called IDNS_Ignore_nontypes, which tells the IdentifierResolver to
> ignore non-type names. This means that IdentifierNamespace is becoming
> much more like a set of flags dictating name lookup rules rather than
> the 4 C namespaces it started at. Expect more movement in this
> direction as Clang gets more C++-specific name lookup behavior.

Ok, that makes sense to me. As its capabilities grow, we can always refactor it (again!) in the future.

Some comments:

+++ include/clang/AST/Decl.h (working copy)
@@ -15,6 +15,7 @@
 #define LLVM_CLANG_AST_DECL_H
 #include "clang/AST/DeclBase.h"
+#include "clang/Parse/AccessSpecifier.h"

This is a layering violation. The AST library should not depend on the Parser specific structures. It is up to Sema to translate parser versions of these data structures into something permanent that lives in the AST. The logic here is that we want [de]serialization to be able to construct ASTs without going through the parser, and enforcing the layering helps ensure there is a clean way to produce the ASTs.

Also, the obvious solution of moving the enum to the AST headers won't work either, as the parser is not allowed to know about ASTs. This comes up in a variety of places, with different levels of weirdness.
For example, attributes and decl specs are parsed into a temporary structure and then sema moves them to a more permanent one. ObjC has a similar issue with its ObjCDeclSpec enums. I think the best way to handle this is to have a switch statement that translates the enums from the parser version to the ast version.

+/// BaseClassDecl - Represents a base class of a C++ class.
+class BaseClassDecl : public Decl {
+protected:
+  QualType Type;
+  SourceRange Range;
+  bool Virtual;
+  AccessSpecifier Access;
+

As with the ObjC-specific decl stuff, please expand the class comment to be more detailed and give an example. Also, for each member, please add a short doxygen comment explaining what it is.

Also, does it make sense for this to inherit from Decl? My understanding of base classes is that they qualify a class, but they aren't themselves decls. This may just be my misunderstanding, but what benefit does this get from deriving from Decl?

+  /// BaseClasses/NumBaseClasses - This is a new[]'d array of points to
+  /// base class decls. (C++ only).
+  BaseClassDecl **BaseClasses; // Null if not defined.
+  int NumBaseClasses; // -1 if not defined.
+

Please make this be 0 if not defined. I'm trying to stamp out use of -1 as a sentinel, as it makes definitions of iterators awkward. Also, instead of (or in addition to) these members:

+  /// getNumBaseClasses - Return the number of base classes, or -1 if
+  /// this is a forward declaration.
+  int getNumBaseClasses() const { return NumBaseClasses; }
+  const BaseClassDecl *getBaseClass(unsigned i) const { return BaseClasses[i]; }
+  BaseClassDecl *getBaseClass(unsigned i) { return BaseClasses[i]; }

Please add iterator definitions. For example, ObjCCategoryImplDecl defines instmeth_* and classmeth_*.

   /// isTypeName - Return non-null if the specified identifier is a typedef name
+  /// in the current scope.
+  /// If IgnoreNonTypes is true, then name lookup will
+  /// completely ignore any declarations it finds that are not types, and
+  /// continue to search for the name of a type in outer scopes (used in C++).
+  virtual DeclTy *isTypeName(const IdentifierInfo &II, Scope *S,
+                             bool IgnoreNonTypes = false) = 0;

This makes sense to me, but please drop the "(used in C++)" part from the comment.

+++ lib/Sema/SemaDeclCXX.cpp (working copy)
...
+  // Base class types cannot be cv-qualified (because the type is a
+  // class-name).
+  if (BaseType.getCanonicalType().getCVRQualifiers() != 0) {
+    Diag(BaseLoc, diag::err_cv_qualified_base_class, BaseType.getAsString(),
+         SpecifierRange);
+    BaseType = BaseType.getUnqualifiedType();
+  }
+

Does this also need to check for address-space qualifiers? I think something like "if (BaseType.getCanonicalType().getAddressSpace() != 0)" should be sufficient. 'getUnqualifiedType' strips off addrspace qualifiers also.

+void Sema::ActOnBaseSpecifiers(DeclTy *classdecl,
+                               DeclTy **basedecls, unsigned NumBaseDecls) {
+  unsigned NextBaseDecl = 0;
+  llvm::DenseMap DirectBases;

Using a densemap to detect duplicates is overkill here. DenseMap is best when you plan to insert lots of stuff. What we really want is some sort of "SmallDenseMap" which avoids the heap in the common case where there are only a few cases. We don't have that, though, so three options:

1) Use SmallPtrSet, which is efficient when below a threshold, and also scales very well. The problem is that you lose the ability to map to an index.
2) Use a vector and just sort the vector (by record) and scan the vector in order to detect duplicates
3) Just don't worry about it :), add a fixme and move on.

+  for (unsigned i = 0; i < NumBaseDecls; ++i) {

It's a very minor issue, and not a big deal, but I tend to prefer using 'i != NumBaseDecls' just as a general form of strength reduction and because it is more idiomatic when faced with lots of iterator comparisons.
Feel free to ignore me if you like.

@@ -201,8 +201,11 @@ NamedDecl *IdentifierResolver::Lookup(co
   for (IdDeclInfo::ShadowedIter SI = IDI->shadowed_end();
        SI != IDI->shadowed_begin(); --SI) {
     NamedDecl *D = *(SI-1);
-    if (D->getIdentifierNamespace() & NS)
-      return D;
+    if (D->getIdentifierNamespace() & NS) {
+      if (!(NS & Decl::IDNS_Ignore_nontypes)
+          || dyn_cast_or_null(D))
+        return D;

D can't be null here if you got 'D->getIdentifierNamespace()' successfully. Also, if you don't use the pointer result, please use isa instead of dyn_cast. Thus the check can be "|| isa(D)".

+++ lib/Parse/ParseDeclCXX.cpp (working copy)
+  Actions.ActOnBaseSpecifiers(ClassDecl,
+                              BaseDecls.empty()? 0 : &BaseDecls[0],
+                              BaseDecls.size());

SmallVector (unlike std::vector) does allow you to take the address of element 0 when the array is empty, so you can drop the check if you prefer.

Overall, the patch looks great. Please update it and commit when you're happy with it. We can continue any remaining discussion after the patch goes in.

Thanks Doug,

-Chris

From Sanjiv.Gupta at microchip.com Mon May 5 00:55:40 2008
From: Sanjiv.Gupta at microchip.com (Sanjiv.Gupta at microchip.com)
Date: Sun, 4 May 2008 22:55:40 -0700
Subject: [cfe-dev] PATCH: Clang DebugInfo for StopPoints.
Message-ID:

Posting to cfe-dev as well.

- Sanjiv

-----Original Message-----
From: cfe-commits-bounces at cs.uiuc.edu [mailto:cfe-commits-bounces at cs.uiuc.edu] On Behalf Of Sanjiv.Gupta at microchip.com
Sent: Friday, May 02, 2008 5:09 PM
To: cfe-commits at cs.uiuc.edu
Subject: [cfe-commits] Clang DebugInfo Patch for StopPoints.

Hi,
Please find the patch attached for stoppoint debug info generation in clang. It also includes a command line switch called -g.
There are two new files to be placed in lib/CodeGen , those files are attached here.

Thanks,
Sanjiv

-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.stoppoint.50575
Type: application/octet-stream
Size: 7266 bytes
Desc: patch.stoppoint.50575
Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080504/ac80e697/attachment.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CodeGenDebugInfo.cpp
Type: application/octet-stream
Size: 4993 bytes
Desc: CodeGenDebugInfo.cpp
Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080504/ac80e697/attachment-0001.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CodeGenDebugInfo.h
Type: application/octet-stream
Size: 2606 bytes
Desc: CodeGenDebugInfo.h
Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080504/ac80e697/attachment-0002.obj
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ATT1063913.txt
Url: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080504/ac80e697/attachment.txt

From Alireza.Moshtaghi at microchip.com Mon May 5 11:33:45 2008
From: Alireza.Moshtaghi at microchip.com (Alireza.Moshtaghi at microchip.com)
Date: Mon, 5 May 2008 09:33:45 -0700
Subject: [cfe-dev] stack-less model on small devices (patch)
In-Reply-To:
References: <5DEED6CB-7673-481F-922B-8B7D47A1D6C2@apple.com> <0C080190-B0E1-4DA4-BE80-D0AA27F91764@apple.com> <2CC37DFD-CFEC-436C-AF74-C5F24AFA2020@apple.com>
Message-ID:

Thank you Chris,
I'll apply the changes and resubmit a new patch.

Regards.
Ali

-----Original Message-----
From: Chris Lattner [mailto:clattner at apple.com]
Sent: Friday, May 02, 2008 10:42 AM
To: Alireza Moshtaghi - C13012
Cc: cfe-dev at cs.uiuc.edu
Subject: Re: [cfe-dev] stack-less model on small devices (patch)

On Apr 30, 2008, at 11:31 AM, Alireza.Moshtaghi at microchip.com wrote:
> I created the patch for my target specific modifications and sent it to
> cfe-commits.
> Since this is my first time to send a patch I don't know if
> I have submitted my changes to the right place or not and of course what
> is the turnaround time.

You did exactly the right thing. I've been bogged down with other things lately and haven't had much time to stay on top of clang, this will hopefully be fixed next week, I apologize for the delay.

> I also have attached it to this email just in case.
> Please let me know if I have to do it differently.

The patch looks great. Some specific comments:

   /// getPointerWidth - Return the width of pointers on this target, for the
   /// specified address space. FIXME: implement correctly.
-  uint64_t getPointerWidth(unsigned AddrSpace) const { return 32; }
-  uint64_t getPointerAlign(unsigned AddrSpace) const { return 32; }
+  virtual uint64_t getPointerWidth(unsigned AddrSpace) const { return 32; }
+  virtual uint64_t getPointerAlign(unsigned AddrSpace) const { return 32; }

   /// getIntWidth/Align - Return the size of 'signed int' and 'unsigned int' for
   /// this target, in bits.
-  unsigned getIntWidth() const { return 32; } // FIXME
-  unsigned getIntAlign() const { return 32; } // FIXME
+  virtual unsigned getIntWidth() const { return 32; } // FIXME
+  virtual unsigned getIntAlign() const { return 32; } // FIXME

Instead of making these virtual, please add instance variables for these like double and wchar are handled. You can also remove the FIXMEs. Thanks for doing this.
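
As a rough illustration of the "instance variables instead of virtual methods" suggestion above, a minimal sketch might look like this. The class names and the 16/8-bit values are assumptions chosen for the example; they are not the real TargetInfo interface, and they are not claimed to be PIC16's actual sizes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical base class: the accessors are plain (non-virtual) functions
// that return per-target instance variables.
class TargetInfoSketch {
protected:
  unsigned PointerWidth;
  unsigned PointerAlign;
  unsigned IntWidth;
  unsigned IntAlign;

public:
  // Defaults match the hard-coded 32-bit values in the quoted hunk.
  TargetInfoSketch()
      : PointerWidth(32), PointerAlign(32), IntWidth(32), IntAlign(32) {}

  // The address space is ignored in this sketch.
  std::uint64_t getPointerWidth(unsigned /*AddrSpace*/) const {
    return PointerWidth;
  }
  std::uint64_t getPointerAlign(unsigned /*AddrSpace*/) const {
    return PointerAlign;
  }
  unsigned getIntWidth() const { return IntWidth; }
  unsigned getIntAlign() const { return IntAlign; }
};

// Hypothetical small-device target: it overrides the values in its
// constructor, so no virtual dispatch is needed for the getters.
class SmallTargetSketch : public TargetInfoSketch {
public:
  SmallTargetSketch() {
    PointerWidth = 16; // illustrative values only
    PointerAlign = 8;
    IntWidth = 16;
    IntAlign = 8;
    assert(IntAlign <= IntWidth && "alignment wider than the type");
  }
};
```

Because the getters read data members, a reference to the base class still reports the derived target's sizes without any virtual call.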
+++ lib/Basic/Targets.cpp (working copy) @@ -863,6 +863,28 @@ +class PIC16TargetInfo : public TargetInfo{ +public: + virtual const char *getVAListDeclaration() const { return "";} + virtual const char *getClobbers() const {return "";} + virtual const char *getTargetPrefix() const {return "";} + virtual void getGCCRegNames(const char * const *&Names, unsigned &NumNames) const {} + virtual bool validateAsmConstraint(char c, TargetInfo::ConstraintInfo &info) const {return true;} + virtual void getGCCRegAliases(const GCCRegAlias *&Aliases, unsigned &NumAliases) const {} +}; +} Please make sure the code fits in 80 columns. +++ lib/CodeGen/CGDecl.cpp (working copy) @@ -15,6 +15,7 @@ + if (strncmp (this->Target.getTargetTriple(), "pic16-", 6) == 0) { + const llvm::Type *LTy = CGM.getTypes().ConvertTypeForMem(Ty); The preferred way to do a target check like this is to add some new property to TargetInfo with an accessor like "Target.useGlobalsForAutomaticVariables()" or something like that. PIC16 can return true, all other targets return false. Can you just use the code path for static variables to handle the LLVM IR emission? That would avoid duplicating the code. -Chris From csdavec at swansea.ac.uk Mon May 5 12:05:18 2008 From: csdavec at swansea.ac.uk (David Chisnall) Date: Mon, 5 May 2008 18:05:18 +0100 Subject: [cfe-dev] @defs() question Message-ID: <661EA660-3A82-40BF-940E-33D0CDE8A46B@swan.ac.uk> Hi, Aside from one outstanding bug with the premature insertion of implicit types, the last remaining thing I need to get clang compiling real Objective-C programs is support for the @defs() declaration. This is used like this: struct { @defs(ObjCInterface); }; It ought to find the ObjCInterfaceDecl and insert a FieldDecl for every ObjCIvarDecl in this interface and any superclasses. I've had a little look at the parser code, but can't quite work out where this should go. Can anyone give me a hint? 
David From snaroff at apple.com Mon May 5 12:30:09 2008 From: snaroff at apple.com (Steve Naroff) Date: Mon, 5 May 2008 10:30:09 -0700 Subject: [cfe-dev] @defs() question In-Reply-To: <661EA660-3A82-40BF-940E-33D0CDE8A46B@swan.ac.uk> References: <661EA660-3A82-40BF-940E-33D0CDE8A46B@swan.ac.uk> Message-ID: <2498684C-603A-4BC4-A861-C6CB1D13F245@apple.com> See Parser::ParseStructUnionBody()...there is a rule/todo there. snaroff On May 5, 2008, at 10:05 AM, David Chisnall wrote: > Hi, > > Aside from one outstanding bug with the premature insertion of > implicit types, the last remaining thing I need to get clang compiling > real Objective-C programs is support for the @defs() declaration. > This is used like this: > > struct { > @defs(ObjCInterface); > }; > > It ought to find the ObjCInterfaceDecl and insert a FieldDecl for > every ObjCIvarDecl in this interface and any superclasses. I've had a > little look at the parser code, but can't quite work out where this > should go. Can anyone give me a hint? > > David > _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev From snaroff at apple.com Mon May 5 13:22:40 2008 From: snaroff at apple.com (Steve Naroff) Date: Mon, 5 May 2008 11:22:40 -0700 Subject: [cfe-dev] Message send to super In-Reply-To: <0FABC18D-581A-4A95-96CE-2BF6E41C117F@swan.ac.uk> References: <0FABC18D-581A-4A95-96CE-2BF6E41C117F@swan.ac.uk> Message-ID: Hi David, As Chris said, this isn't great. The AST reflects a "quick hack" to avoid making API changes (which required more thought). I should have put a FIXME in the code. Are you going to clean this up? If not, I am happy to put it on my todo list. Let me know, snaroff On May 2, 2008, at 5:15 PM, David Chisnall wrote: > Hi, > > It appears that, in generating the AST, the following expression: > > [super msg]; > > is being translated to: > > [(superclass*)self msg]; > > Running clang -ast-print confirms this. 
These two expressions have > completely different semantics in Objective-C, and since there appears > to be no way of determining whether the user actually performed a cast > on self (which is uncommon, but does happen in real code) or sent a > message to super (which happens a lot more frequently). > > Can anyone suggest a way of distinguishing these two? If not, is it > possible to introduce a separate kind of AST node, or a flag in the > ObjCMessageExpr indicating if super is the receiver? I have now > written the code to produce the correct output for the GNU runtime > with this kind of expression, but am currently unable to connect it to > anything. > > David > _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev From clattner at apple.com Mon May 5 16:15:35 2008 From: clattner at apple.com (Chris Lattner) Date: Mon, 5 May 2008 14:15:35 -0700 Subject: [cfe-dev] PATCH: Cleanup function redeclaration representations In-Reply-To: <24b520d20805041742u5284ab7bgb60c96b2312b629f@mail.gmail.com> References: <24b520d20804200915h1ce36fa4y1eff6941a1748252@mail.gmail.com> <4810EF57.3030803@gmail.com> <24b520d20804241658j2d9fc131t72422d4ea62146b0@mail.gmail.com> <4811998B.1020000@gmail.com> <24b520d20805011857t42c7a3bakd421b015d70e5a2c@mail.gmail.com> <399B2FAD-F7E9-4074-90A0-D525AF2234AD@apple.com> <24b520d20805041142u343ed1afqb828640b4cb6bf4f@mail.gmail.com> <24b520d20805041710j7c1de8b4n16d2e49939db1893@mail.gmail.com> <481E55F1.609@gmail.com> <24b520d20805041742u5284ab7bgb60c96b2312b629f@mail.gmail.com> Message-ID: On May 4, 2008, at 5:42 PM, Doug Gregor wrote: > On Sun, May 4, 2008 at 6:33 PM, Argiris Kirtzidis > wrote: >> Another idea to throw around.. >> If there is a need, instead of creating a separate aggregate we >> create a >> separate decl to store the source info of the first decl, and have >> the first >> decl be the aggregate one. 
>> I mean, the first decl will always be the aggregate and if the >> client wants >> the source info for it he would do: >> >> aggregate_decl->getSourceDecl()->isInline() > > Hmmm, I like that. Actually, why not replace the term "redeclaration" > with "source declaration"? The 'aggregate' ('canonical'?) node will be > the first node created and the one used throughout the AST. If one is > interested in source information, browse the source-declarations list. > > For functions that have only one declaration (ever), we would just > have the one FunctionDecl. However, once we see a second declaration, > we create two source declarations: one for the original declaration > (basically, a clone of the aggregate decl) and one for the > redeclaration. This is a great idea, sounds good to me. -Chris From csdavec at swansea.ac.uk Mon May 5 17:20:55 2008 From: csdavec at swansea.ac.uk (David Chisnall) Date: Mon, 5 May 2008 23:20:55 +0100 Subject: [cfe-dev] Objective-C patches Message-ID: Hi Everyone, Sorry for the patch size... Please find attached: sema.diff includes small changes required to make constant strings and message sends to super work. It fixes the AST so super is now a built in decl and it eliminates the warning if NSConstantString is not declared (which, it turns out, it isn't in most real code by the time the first constant string is found). This is not a completely sensible check anyway, since most people cast them to NSString* at creation time. If NSConstantString is declared, it is used, otherwise constant strings are of type id. ccc.diff includes improved option parsing for a couple of things in ccc that apparently are need to build a lot of real ObjC programs. isa.diff fixes a small bug where the implicit isa pointer was added to instance variable lists twice (I think this is caused by someone fixing the bug where it wasn't added to the AST at all, but it might just be that I'm incompetent...) objc.diff is a big diff (sorry!) 
which tidies up most of Objective-C code generation, provides working implementations for the GNU runtime and mostly-working implementations for the Étoilé runtime. Feel free to edit out anything from GCObjCEtoile.cpp (there are a few changes to match the updated interfaces, but the implementation can all be left out for now since it isn't very heavily tested). This includes support for classes, categories, protocols, selector caching, message sends to the superclass, and constant Objective-C strings. David -------------- next part -------------- A non-text attachment was scrubbed... Name: sema.diff Type: application/octet-stream Size: 2688 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080505/95990af4/attachment-0004.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: ccc.diff Type: application/octet-stream Size: 676 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080505/95990af4/attachment-0005.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: isa.diff Type: application/octet-stream Size: 713 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080505/95990af4/attachment-0006.obj -------------- next part -------------- A non-text attachment was scrubbed...
Name: objc.diff Type: application/octet-stream Size: 85671 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080505/95990af4/attachment-0007.obj From csdavec at swansea.ac.uk Mon May 5 17:52:33 2008 From: csdavec at swansea.ac.uk (David Chisnall) Date: Mon, 5 May 2008 23:52:33 +0100 Subject: [cfe-dev] Objective-C patches In-Reply-To: References: Message-ID: <0189AC64-571B-488D-91D0-3137E040378C@swan.ac.uk> As a result of off-list comments, I have fixed a number of 'if(' formatting issues and a couple of line-wrap problems, typedef'd llvm::SmallVector to ConstantVector, and removed the StructGEP helper function which is left over from before that functionality was added to IRBuilder. I don't want to spam everyone, so I'll hold off sending an updated patch until I've received more comments. David On 5 May 2008, at 23:20, David Chisnall wrote: > Hi Everyone, > > Sorry for the patch size... > > Please find attached: > > sema.diff includes small changes required to make constant strings > and message sends to super work. It fixes the AST so super is now a > built in decl and it eliminates the warning if NSConstantString is > not declared (which, it turns out, it isn't in most real code by the > time the first constant string is found). This is not a completely > sensible check anyway, since most people cast them to NSString* at > creation time. If NSConstantString is declared, it is used, > otherwise constant strings are of type id. > > ccc.diff includes improved option parsing for a couple of things in > ccc that apparently are need to build a lot of real ObjC programs. > > isa.diff fixes a small bug where the implicit isa pointer was added > to instance variable lists twice (I think this is caused by someone > fixing the bug where it wasn't added to the AST at all, but it might > just be that I'm incompetent...) > > objc.diff is a big diff (sorry!) 
which tidies up most of Objective-C > code generation, provides working implementations for the GNU > runtime and mostly-working implementations for the Étoilé runtime. > Feel free to edit out anything from GCObjCEtoile.cpp (there are a > few changes to match the updated interfaces, but the implementation > can all be left out for now since it isn't very heavily tested). > This includes support for classes, categories, protocols, selector > caching, message sends to the superclass, and constant Objective-C > strings. > > David > > <sema.diff> > <ccc.diff> > _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev From clattner at apple.com Mon May 5 20:11:09 2008 From: clattner at apple.com (Chris Lattner) Date: Mon, 5 May 2008 18:11:09 -0700 Subject: [cfe-dev] Objective-C patches In-Reply-To: <0189AC64-571B-488D-91D0-3137E040378C@swan.ac.uk> References: <0189AC64-571B-488D-91D0-3137E040378C@swan.ac.uk> Message-ID: <10F2F842-BCFB-4D85-B119-E004AD0892AB@apple.com> On May 5, 2008, at 3:52 PM, David Chisnall wrote: > As a result of off-list comments, I have fixed a number of 'if(' > formatting issues and a couple of line-wrap problems, typedef'd > llvm::SmallVector to ConstantVector, and removed > the StructGEP helper function which is left over from before that > functionality was added to IRBuilder. I don't want to spam everyone, > so I'll hold off sending an updated patch until I've received more > comments. Please don't name it ConstantVector. llvm::ConstantVector is something very different. What's wrong with using SmallVector explicitly? The sema, ccc, and isa patches look great; please apply them, but remove this line: + //IdentifierInfo *NSIdent = &Context.Idents.get("NSConstantString"); and indent this properly: + ExprResult ReceiverExpr = new PreDefinedExpr(SourceLocation(), superTy, + PreDefinedExpr::ObjCSuper); Please resend the updated sema patch.
Code review is kinda hard if not reviewing the actual code :) Thanks David, -Chris From clattner at apple.com Mon May 5 20:13:57 2008 From: clattner at apple.com (Chris Lattner) Date: Mon, 5 May 2008 18:13:57 -0700 Subject: [cfe-dev] [cfe-commits] Clang DebugInfo Patch for StopPoints. In-Reply-To: References: Message-ID: <0FEC804C-0B06-4ED4-A4F8-8EE336E4B75C@apple.com> On May 5, 2008, at 6:07 AM, Sanjiv.Gupta at microchip.com wrote: >> It looks like clang is currently inconsistent here, but >> option descriptions should not end with a ".". >> > I will cleanup other things here as well to remove the ending "." Thanks! This can be a separate patch obviously :) >> + CodeGenDebugInfo *TheDebugInfo; >> >> Please make TheDebugInfo private and add an accessor, >> following the lead of other instance vars in CodeGenModule. >> > > I am renaming the class to CGDebugInfo and files to CGDebugInfo. > [h,cpp] Ok, sounds great. >> #include "llvm/CodeGen/MachineModuleInfo.h" >> >> Is there any way to avoid bringing in MachineModuleInfo.h? >> It is a really gross header that needs a lot of cleanup. Can >> you avoid #including it by forward declaring what ever you >> need and then only #including it into CodeGenDebugInfo.cpp? >> I think it is worth it, even if it means making 'SR' by a >> pointer instead of holding the DISerializer by value. >> > Holding the 'SR' by pointer works. Great >> // Don't bother if things are the same as last time. >> if (CurLoc == PrevLoc && PrevBB == CurBB) >> return; >> >> Why are you checking for BB equality here? > > This is on the lines of llvm-gcc. > I think the reasoning behind is that a source line may result into > multiple basic blocks and we may need to put the stoppoint in each > basic > block. Is that really desired? It seems strange to stop multiple times even if there is a ?: for example. I don't think GDB will do something useful. >> Why emit a stop point on "}"? >> > Again, this is on the lines of llvm-gcc.
> Also, the example http://www.llvm.org/docs/SourceLevelDebugging.html > mentions that. > > 1. void foo() { > 2. int X = ...; > 3. int Y = ...; > 4. { > 5. int Z = ...; > 6. ... > 7. } > 8. ... > 9. } > > call void %llvm.dbg.stoppoint( uint 7, uint 2, %llvm.dbg.compile_unit* > %llvm.dbg.compile_unit ) > call void %llvm.region.end() I'd suggest just not doing that :) Thanks for working on this. Debug info generation is incredibly important! -Chris From cworce at 126.com Mon May 5 21:01:08 2008 From: cworce at 126.com (caiwei) Date: Tue, 6 May 2008 10:01:08 +0800 (CST) Subject: [cfe-dev] add source file to clang project?? Message-ID: <1169707.550221210039268593.JavaMail.coremail@bj126app66.126.com> Hi all, I am new to the list and have been lurking a few days. I want to add some source file(.h,cpp) to the clang. How do I modify the project file in clang.xcodeproj when I work on linux. thanks for you help. wei -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080506/acd094e8/attachment.html From clattner at apple.com Mon May 5 21:32:20 2008 From: clattner at apple.com (Chris Lattner) Date: Mon, 5 May 2008 19:32:20 -0700 Subject: [cfe-dev] add source file to clang project?? In-Reply-To: <1169707.550221210039268593.JavaMail.coremail@bj126app66.126.com> References: <1169707.550221210039268593.JavaMail.coremail@bj126app66.126.com> Message-ID: On May 5, 2008, at 7:01 PM, caiwei wrote: > > Hi all, > I am new to the list and have been lurking a few days. I want to > add some source file(.h,cpp) to the clang. > How do I modify the project file in clang.xcodeproj when I work on > linux. I just wouldn't worry about it. People using MSVC or Xcode are expected to update their projects as the source base changes. 
-Chris From matthijs at stdin.nl Tue May 6 02:22:14 2008 From: matthijs at stdin.nl (Matthijs Kooijman) Date: Tue, 6 May 2008 09:22:14 +0200 Subject: [cfe-dev] [cfe-commits] Clang DebugInfo Patch for StopPoints. In-Reply-To: <0FEC804C-0B06-4ED4-A4F8-8EE336E4B75C@apple.com> References: <0FEC804C-0B06-4ED4-A4F8-8EE336E4B75C@apple.com> Message-ID: <20080506072214.GC27575@katherina.student.utwente.nl> Hi, > >> Why emit a stop point on "}"? > > Again, this is on the lines of llvm-gcc. > > Also, the example http://www.llvm.org/docs/SourceLevelDebugging.html > > mentions that. > I'd suggest just not doing that :) It's probably a good idea to still stop on the closing } of a function. Normal gcc + gdb also does that IIRC and it helps to see when you're exiting a function. Not sure if this is also under discussion, though. Gr. Matthijs -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080506/1ac3cbd7/attachment.bin From csdavec at swansea.ac.uk Tue May 6 10:37:13 2008 From: csdavec at swansea.ac.uk (David Chisnall) Date: Tue, 6 May 2008 16:37:13 +0100 Subject: [cfe-dev] @defs() support Message-ID: <039F53F7-57B2-4C2B-A0EC-7D5438809CE8@swan.ac.uk> Here is a little patch for adding support for @defs(). It works, but doesn't generate very friendly errors yet. David -------------- next part -------------- A non-text attachment was scrubbed... 
Name: defs.diff Type: application/octet-stream Size: 4409 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080506/1ea489b8/attachment-0001.obj -------------- next part -------------- From kremenek at apple.com Tue May 6 14:05:35 2008 From: kremenek at apple.com (Ted Kremenek) Date: Tue, 6 May 2008 12:05:35 -0700 Subject: [cfe-dev] -emit-html example In-Reply-To: <481A31D7.7000601@gmail.com> References: <63E11935-E280-4C58-8D7D-0F50A407D223@apple.com> <480BCFBC.6090703@gmail.com> <734C263F-54CF-4B54-9C7F-A744BEF545A2@apple.com> <481A31D7.7000601@gmail.com> Message-ID: <64885ADB-0C57-4E40-8EC4-931AB2AEFC03@apple.com> On May 1, 2008, at 2:10 PM, Argiris Kirtzidis wrote: > My motivation to propose the Annotator lib wasn't specifically to > apply it for HTMLPrinter, that was more like an example. > The Annotator's purpose would be to verify clang's suitability for > an IDE, at least from the aspect of syntax/semantic colorizing. For > example it would answer questions like: > -Can I colorize all variable names ? (with exclusive color) > -Can I colorize all type names ? > -Can I associate opening/closing braces for all kinds of blocks > (namespaces, functions etc.) ? > -Does the AST carry enough information for doing [insert task] ? > > Now, assuming that you have a working Annotator lib, the best way to > put it to use (without messing with some IDE) would be to make a > HTMLAnnotator. > HTMLAnnotator would be a client of Annotator and HTML Rewrite API. I think having a playground for such things is useful, but I don't know if we need a separate library at this point. Probably just adding the Annotator class to the Driver would be sufficient for now. We can then easily move it out. I also don't know if the extra layer of indirection is needed until we have another Annotator in mind besides HTMLAnnotator (i.e., can we just use the HTMLPrinter directly to explore your above questions?).
I'm not strongly objecting to adding Annotator; it's just not clear to me that there are other clients that would use it. > What do you think about the above ? > >> I don't believe that an IDE would be an ASTConsumer (in the clang >> driver sense) either, but would rather interact with the clang >> libraries interactively to regenerate ASTs on-the-fly. > > I was thinking that in the specific task of semantic colorizing, you > would have to utilize Preprocessor+Parser+Sema for a particular > source file, > so the Annotator being an ASTConsumer, that handles the declarations > that the parser gives it, seemed reasonable, do you have something > else in mind ? I think this pipeline works fine for playing around with things; an IDE would interactively parse different parts of a file, incrementally rebuilding ASTs, etc., and thus the ASTConsumer/Annotator interface would probably not be ideal. This pipeline also basically assumes the workflow in the Driver, which is why I think putting the Annotator class in the Driver makes more sense than creating a separate library. From akyrtzi at gmail.com Tue May 6 21:06:15 2008 From: akyrtzi at gmail.com (Argiris Kirtzidis) Date: Tue, 06 May 2008 19:06:15 -0700 Subject: [cfe-dev] [PATCH] : Proper name lookup for namespaces Message-ID: <48210E97.3040808@gmail.com> Hi, The attached patch contains changes to support proper name lookup for namespaces. It is mostly an overhaul of the IdentifierResolver: -It exposes an iterator interface to get all decls through the scope chain: for (IdentifierResolver::iterator I = IdResolver.begin(II, CurContext), E = IdResolver.end(II); I != E; ++I) if ((*I)->getIdentifierNamespace() & NS) return *I; -The semantic stuff (checking IdentifierNamespace and Doug's checking for shadowed tags) was moved out of IdentifierResolver and back into Sema. IdentifierResolver just gives an iterator for iterating over all reachable decls of an identifier.
-Fixes bug: http://llvm.org/bugs/show_bug.cgi?id=2275 -Argiris -------------- next part -------------- A non-text attachment was scrubbed... Name: cxx-namespaces-2.patch Type: text/x-diff Size: 31299 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080506/94bf15bb/attachment-0001.bin From jhwoodyatt at mac.com Wed May 7 10:58:31 2008 From: jhwoodyatt at mac.com (james woodyatt) Date: Wed, 7 May 2008 08:58:31 -0700 Subject: [cfe-dev] architecture endianness and preprocessor defines Message-ID: <56FE94E9-AAD2-49D7-A0F6-6F86E3303E43@mac.com> everyone-- One of the things I've long disliked about how GCC works is that its developers have still not really sorted out how to handle architectures that can operate in either big- or little-endian mode. I'd like to know if the LLVM CFE developers have any thoughts on how to improve matters here. Here's what GCC does today, and how that situation produces consequences downstream: + The various architecture configurations define built-in preprocessor definitions like __BIG_ENDIAN__ and __LITTLE_ENDIAN__. + These are hard-coded for architectures that don't have any choice, e.g. IA32, but they're switched by the -mbig-endian and -mlittle-endian on architectures that can be configured to run in either mode. + These built-in definitions aren't consistently defined across all the architectures either, so on some architectures you get __BIG_ENDIAN and on others you get __BIG_ENDIAN__. Isn't that wonderful? One of the additional hassles with GCC is that its "multilib" feature doesn't consistently build the C runtime environment, i.e. crtstuff.c, for both big- and little-endian modes. This is why there are all those GCC target triples that look like "armeb-netbsd-elf" and "mipsel-wrs-vxworks" and "armle-linux-gnu" in the configure script. Notice that the suffixes aren't used consistently across operating system platforms?
The suffix on the architecture name ends up getting translated into the endianness of the C runtime environment modules used by the linker (except when -nostdlib is used... sigh). If it weren't for this, you'd be able to build GCC for ARM or MIPS or whatever, without adding that suffix to the architecture part of the triple, and the -mbig-endian and -mlittle-endian switches would select the proper C runtime environment. Sadly, that doesn't happen like it should. I'm not sure how much Clang should need to know about the C runtime environment that will eventually get linked up with final executable machine objects, but it would be nice if you didn't have to apply this horrible corruption to the architecture part of the target triple. I'd rather the command driver were responsible for sorting out which runtime environments to link into what executables, and it should be able to do the right thing with just the command line switches. That still leaves the C preprocessor built-ins, which are clearly in Clang's domain to manage. Here's what I propose: Clang should define a small set of general preprocessor built-ins that identify the CPU architecture family specified in the target triple, e.g. __ia32__, __x86_64__, __arm__, __powerpc__, __mips__, etc; it should also define __LITTLE_ENDIAN__ and __BIG_ENDIAN__ as appropriate, and it should offer the -mbig-endian and -mlittle-endian switches for explicitly specifying the endianness on architectures that can execute in either mode. The command driver can then do the right thing (or the wrong thing) as necessary. I'd like to know if the Clang developers are interested in resisting the endianness suffixes on the architecture parts of the target triple specification. I hope the answer is yes.
j h woodyatt http://jhw.vox.com/ -- j h woodyatt From echristo at apple.com Wed May 7 11:30:25 2008 From: echristo at apple.com (Eric Christopher) Date: Wed, 7 May 2008 09:30:25 -0700 Subject: [cfe-dev] architecture endianness and preprocessor defines In-Reply-To: <56FE94E9-AAD2-49D7-A0F6-6F86E3303E43@mac.com> References: <56FE94E9-AAD2-49D7-A0F6-6F86E3303E43@mac.com> Message-ID: I'll apologize that you've had such a hard time with this, but I've also never seen an email from you asking about these things. That said, many things you mentioned are just wrong. > One of the additional hassles with GCC is that its "multilib" feature > doesn't consistently build the C runtime environment, i.e. crtstuff.c, > for both big- and little-endian modes. This is why there are all > those GCC target triples that look like "armeb-netbsd-elf" and > "mipsel- > wrs-vxworks" and "armle-linux-gnu" in the configure script. Notice > that the suffixes aren't used consistently across operating system > platforms? > This is true, however, the suffixes are usually created by the people doing the work for the platform for the specific target triple... > The suffix on the architecture name ends up getting translated into > the endianness of the C runtime environment modules used by the linker > (except when -nostdlib is used... sigh). If it weren't for this, > you'd be able to build GCC for ARM or MIPS or whatever, without adding > that suffix to the architecture part of the triple, and the -mbig- > endian and -mlittle-endian switches would select the proper C runtime > environment. Sadly, that doesn't happen like it should. Interestingly enough you're absolutely wrong here. The suffixes merely allow you to select a different default. They are an alias. Nothing else. The switches allow you, if the target has support for it, to change mode. Many of the OS targets aren't bi-endian and don't support changing. 
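[Illustration only: a minimal sketch of how portable code can cope with the inconsistent endianness built-ins discussed in this thread. It checks the __BIG_ENDIAN__/__LITTLE_ENDIAN__ spellings named above, plus __BYTE_ORDER__, which GCC and Clang gained later and which is not mentioned by anyone here. The enum and function names are hypothetical. The single-underscore-suffix forms (__BIG_ENDIAN) are deliberately not tested, because some C library headers define both of them as plain constants.]

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper type for this sketch.
enum class Endian { Little, Big, Unknown };

// Compile-time answer, derived only from compiler built-in macros.
// Returns Unknown when the compiler provides none of the checked spellings.
constexpr Endian compileTimeEndian() {
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
  return Endian::Big;
#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
  return Endian::Little;
#elif defined(__BIG_ENDIAN__)
  return Endian::Big;
#elif defined(__LITTLE_ENDIAN__)
  return Endian::Little;
#else
  return Endian::Unknown;
#endif
}

// Run-time fallback: inspect the first byte in memory of the value 1.
inline Endian runtimeEndian() {
  const std::uint32_t Probe = 1;
  unsigned char First = 0;
  std::memcpy(&First, &Probe, 1);
  return First == 1 ? Endian::Little : Endian::Big;
}
```

[Code that must build with many compilers would typically prefer the compile-time answer and fall back to the run-time probe only when the result is Unknown.]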
For the last 5 years at least almost all of the preprocessor builtins are done using a standard method that will have them in 3 different canonical forms, 2 of which you mentioned in your mail for big endian. This is also mostly a complaint about configure and not gcc and mostly not applicable to cfe either (though it may be a general llvm discussion). If you want to respond I suggest we take this discussion to private email or the llvm list (or the gcc list if you want to change how gcc does it). -eric From clattner at apple.com Wed May 7 12:44:58 2008 From: clattner at apple.com (Chris Lattner) Date: Wed, 7 May 2008 10:44:58 -0700 Subject: [cfe-dev] architecture endianness and preprocessor defines In-Reply-To: <56FE94E9-AAD2-49D7-A0F6-6F86E3303E43@mac.com> References: <56FE94E9-AAD2-49D7-A0F6-6F86E3303E43@mac.com> Message-ID: <17D04025-F862-4B56-8DBA-DBC7D0925A9C@apple.com> On May 7, 2008, at 8:58 AM, james woodyatt wrote: > One of the things I've long disliked about how GCC works is that its > developers have still not really sorted out how to handle > architectures that can operate in either big- or little-endian mode. > I'd like to know if the LLVM CFE developers have any thoughts on how > to improve matters here. You bring up a lot of interesting issues. Some meta answers :) > > Here's what GCC does today, and how that situation produces > consequences downstream: > > + The various architecture configurations define built-in preprocessor > definitions like __BIG_ENDIAN__ and __LITTLE_ENDIAN__. We aim to be GCC compatible with preprocessor directives. This is important for compatibility with existing code. > One of the additional hassles with GCC is that its "multilib" feature > doesn't consistently build the C runtime environment, i.e. crtstuff.c, > for both big- and little-endian modes. This is why there are all > those GCC target triples that look like "armeb-netbsd-elf" and > "mipsel- > wrs-vxworks" and "armle-linux-gnu" in the configure script. 
Notice > that the suffixes aren't used consistently across operating system > platforms? I agree that this is irritating. Two issues: 1) we will support the GCC target triples, at least when/if people contribute support for them. 2) clang is explicitly designed to support building a single tool chain in place that supports multiple targets. The ultimate goal is that you should be able to configure clang with "-- targets='armeb-netbsd-elf mipsel-wrs-vxworks armle-linux-gnu'" and get support in the toolchain for all of them. We already have support for handling this (-arch option and friends). When we bring up the "libgcc" runtime library stuff, we'll make sure it can be built for multiple targets. > The suffix on the architecture name ends up getting translated into > the endianness of the C runtime environment modules used by the linker > (except when -nostdlib is used... sigh). If it weren't for this, > you'd be able to build GCC for ARM or MIPS or whatever, without adding > that suffix to the architecture part of the triple, and the -mbig- > endian and -mlittle-endian switches would select the proper C runtime > environment. Sadly, that doesn't happen like it should. Just because we will support the existing GCC target triples (again, when/if people contribute support for them) it doesn't mean we can't support simplified triples also. > That still leaves the C preprocessor built-ins, which are clearly in > Clang's domain to manage. Here's what I propose: Clang should define > a small set of general preprocessor built-ins that identify the CPU > architecture family specified in the target triple, e.g. __ia32__, > __x86_64__, __arm__, __powerpc__, __mips__, etc; it should also define > __LITTLE_ENDIAN__ and __BIG_ENDIAN__ as appropriate, and it should > offer the -mbig-endian and -mlittle-endian switches for explicitly > specifying the endianness on architectures that can execute in either > mode. 
The command driver can then do the right thing (or the wrong > thing) as necessary. We have to support the existing ones. Requiring people to 'port' their code to clang from GCC is not desirable. That said, we *can* support nicer and cleaner interfaces as well for feature queries. Over time, we can encourage people (who don't care about writing portable code (?)) to use these and/or try to get the GCC folks to adopt similar features. -Chris From kremenek at apple.com Wed May 7 13:59:38 2008 From: kremenek at apple.com (Ted Kremenek) Date: Wed, 7 May 2008 11:59:38 -0700 Subject: [cfe-dev] new failures in test/Sema: Message-ID: I'm seeing these when doing "make test:" ---- Sema/objc-property-1.m failed ---- ---- Sema/objc-property-2.m failed ---- I don't think I saw these an hour ago. Fariborz: Could this be related to your recent patch? http://llvm.org/viewvc/llvm-project?rev=50818&view=rev -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080507/0017e2f9/attachment.html From fjahanian at apple.com Wed May 7 15:03:16 2008 From: fjahanian at apple.com (Fariborz Jahanian) Date: Wed, 7 May 2008 13:03:16 -0700 Subject: [cfe-dev] new failures in test/Sema: In-Reply-To: References: Message-ID: On May 7, 2008, at 11:59 AM, Ted Kremenek wrote: > I'm seeing these when doing "make test:" > > ---- Sema/objc-property-1.m failed ---- > ---- Sema/objc-property-2.m failed ---- > > I don't think I saw these an hour ago. > > Fariborz: Could this be related to your recent patch? Author: fjahanian Date: Wed May 7 12:43:59 2008 New Revision: 50818 URL: http://llvm.org/viewvc/llvm-project?rev=50818&view=rev Log: This patch introduces declaration of getter methods for ObjC2's properties. Couple of property tests will fail with this patch. Will fix them next. > - fariborz > http://llvm.org/viewvc/llvm-project?rev=50818&view=rev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080507/5d36a036/attachment.html From kremenek at apple.com Wed May 7 15:07:37 2008 From: kremenek at apple.com (Ted Kremenek) Date: Wed, 7 May 2008 13:07:37 -0700 Subject: [cfe-dev] new failures in test/Sema: In-Reply-To: References: Message-ID: <4C3D722C-3967-4911-9C0D-EAB30395DDA9@apple.com> OK. If only I could read. On May 7, 2008, at 1:03 PM, Fariborz Jahanian wrote: > > On May 7, 2008, at 11:59 AM, Ted Kremenek wrote: >> I'm seeing these when doing "make test:" >> >> ---- Sema/objc-property-1.m failed ---- >> ---- Sema/objc-property-2.m failed ---- >> >> I don't think I saw these an hour ago. >> >> Fariborz: Could this be related to your recent patch? > > Author: fjahanian > Date: Wed May 7 12:43:59 2008 > New Revision: 50818 > > URL: http://llvm.org/viewvc/llvm-project?rev=50818&view=rev > Log: > This patch introduces declaration of getter methods for ObjC2's > properties. Couple of property tests will fail with this patch. > Will fix them next. >> > > - fariborz > >> http://llvm.org/viewvc/llvm-project?rev=50818&view=rev > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080507/22264cb9/attachment.html From clattner at apple.com Thu May 8 01:08:56 2008 From: clattner at apple.com (Chris Lattner) Date: Wed, 7 May 2008 23:08:56 -0700 Subject: [cfe-dev] [cfe-commits] Clang DebugInfo Patch for StopPoints. In-Reply-To: <20080506072214.GC27575@katherina.student.utwente.nl> References: <0FEC804C-0B06-4ED4-A4F8-8EE336E4B75C@apple.com> <20080506072214.GC27575@katherina.student.utwente.nl> Message-ID: On May 6, 2008, at 12:22 AM, Matthijs Kooijman wrote: > Hi, > >>>> Why emit a stop point on "}"? >>> Again, this is on the lines of llvm-gcc. >>> Also, the example http://www.llvm.org/docs/SourceLevelDebugging.html >>> mentions that. 
>> I'd suggest just not doing that :) > It's probably a good idea to still stop on the closing } of a > function. Normal > gcc + gdb also does that IIRC and it helps to see when you're > exiting a > function. Not sure if this is also under discussion, though. > Stopping at the end of function makes a lot of sense to me. Random '}'s in the middle of the function seem less useful :) -Chris From fjahanian at apple.com Thu May 8 11:15:46 2008 From: fjahanian at apple.com (Fariborz Jahanian) Date: Thu, 8 May 2008 09:15:46 -0700 Subject: [cfe-dev] @defs() support In-Reply-To: <039F53F7-57B2-4C2B-A0EC-7D5438809CE8@swan.ac.uk> References: <039F53F7-57B2-4C2B-A0EC-7D5438809CE8@swan.ac.uk> Message-ID: <92EB8422-50A3-44CA-845E-34926734142A@apple.com> You are returning a SmallVector in ActOnDefs by value. Please pass a reference to SmallVector as argument to collect ivars. Also, we would like to have a test case with each new feature. Run the result through rewriter and make sure that it works. Also -ast-print should do the pretty-printing of the result as a test. - Fariborz On May 6, 2008, at 8:37 AM, David Chisnall wrote: > Here is a little patch for adding support for @defs(). It works, > but doesn't generate very friendly errors yet. > > David > > _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev From cworce at 126.com Thu May 8 22:26:26 2008 From: cworce at 126.com (caiwei) Date: Fri, 9 May 2008 11:26:26 +0800 (CST) Subject: [cfe-dev] SourceLocation TODO Message-ID: <7983222.163791210303586842.JavaMail.coremail@bj126app13.126.com> Taking a look at the ObjC Rewriter for examples, I want to change the operator * to << (source to source) . I create a new BinaryOperator and replace the Expr . 
    rhs = new IntegerLiteral( Res, BO->getRHS()->getType(),
                              BO->getRHS()->getExprLoc() );
    // change * to <<
    Replacement = new BinaryOperator(BO->getLHS(), rhs, BO->Shl, BO->getType(),
                                     BO->getLHS()->getExprLoc());
    // replace the expression
    assert( !Rewrite.ReplaceStmt(BO, Replacement) );

but the SourceLocation in the resulting source code is wrong when there are
two or more * operators.

==================================
Stmt *RewriteCai::RewriteBinaryOperator(BinaryOperator *BO) {
  Expr *Replacement, *rhs, *temp;
  temp = BO->getRHS();
  if( BO->isMultiplicativeOp() ) {
    llvm::APSInt Res(32);
    SourceLocation ll = BO->getRHS()->getExprLoc();
    if( (BO->getRHS())->isIntegerConstantExpr( Res, *Context, &ll, true) ) {
      int a = Res.getZExtValue();
      if((a&(a-1))==0)
        Res = Res.logBase2();
      // new right-child subExpr
      rhs = new IntegerLiteral( Res, BO->getRHS()->getType(),
                                BO->getRHS()->getExprLoc() );
      assert(rhs);
    } else {
      rhs = temp;
    }
    // change * to <<
    Replacement = new BinaryOperator(BO->getLHS(), rhs, BO->Shl, BO->getType(),
                                     BO->getLHS()->getExprLoc());
    assert(!Rewrite.ReplaceStmt(BO, Replacement));
    delete BO;
    return Replacement;
  }
  return 0;
}
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080509/ae8393e4/attachment.html

From clattner at apple.com Thu May 8 23:37:28 2008
From: clattner at apple.com (Chris Lattner)
Date: Thu, 8 May 2008 21:37:28 -0700
Subject: [cfe-dev] SourceLocation TODO
In-Reply-To: <7983222.163791210303586842.JavaMail.coremail@bj126app13.126.com>
References: <7983222.163791210303586842.JavaMail.coremail@bj126app13.126.com>
Message-ID: <42449183-2D33-4930-8FBA-D3C031232383@apple.com>

On May 8, 2008, at 8:26 PM, caiwei wrote:
> Taking a look at the ObjC Rewriter for examples, I want to change
> the operator * to << (source to source) .
>
> I create a new BinaryOperator and replace the Expr .
>
>
>     rhs = new IntegerLiteral( Res, BO->getRHS()->getType(),
>                               BO->getRHS()->getExprLoc() );
>     // change * to <<
>     Replacement = new BinaryOperator(BO->getLHS(), rhs,
>                                      BO->Shl, BO->getType(),
>                                      BO->getLHS()->getExprLoc());
>     // replace the expression
>     assert( !Rewrite.ReplaceStmt(BO, Replacement) );
>
>
> but the SourceLocation in the resulting source code is wrong when there
> are two or more * operators.

I assume that you mean that you have problems rewriting things like
"(x*2)*2". This is because you're deleting the AST:

>     // change * to <<
>     Replacement = new BinaryOperator(BO->getLHS(), rhs,
>                                      BO->Shl, BO->getType(),
>                                      BO->getLHS()->getExprLoc());

Here the old multiply node and the << both point to the LHS AST.

>
>     delete BO;

This deletes the "*" AST and the LHS/RHS. To avoid having it delete the
LHS, set the LHS of the multiply to null before you delete it.

-Chris

From clattner at apple.com Fri May 9 00:39:56 2008
From: clattner at apple.com (Chris Lattner)
Date: Thu, 8 May 2008 22:39:56 -0700
Subject: [cfe-dev] [PATCH] : Proper name lookup for namespaces
In-Reply-To: <48210E97.3040808@gmail.com>
References: <48210E97.3040808@gmail.com>
Message-ID: <2D58517B-EC4E-4C3D-B2C7-4D8523BE8E7F@apple.com>

On May 6, 2008, at 7:06 PM, Argiris Kirtzidis wrote:
> Hi,
>
> The attached patch contains changes to support proper name lookup
> for namespaces.
> It is mostly an overhaul of the IdentifierResolver:
>
> -It exposes an iterator interface to get all decls through the scope
> chain:
>   for (IdentifierResolver::iterator
>        I = IdResolver.begin(II, CurContext), E = IdResolver.end(II);
>        I != E; ++I)
>     if ((*I)->getIdentifierNamespace() & NS)
>       return *I;
>
> -The semantic stuff (checking IdentifierNamespace and Doug's
> checking for shadowed tags) was moved out of IdentifierResolver and
> back into Sema. IdentifierResolver just gives an iterator for
> iterating over all reachable decls of an identifier.
>
> -Fixes bug: http://llvm.org/bugs/show_bug.cgi?id=2275

Looks great to me, thanks Argiris!

-Chris

From akyrtzi at gmail.com Fri May 9 18:48:53 2008
From: akyrtzi at gmail.com (Argiris Kirtzidis)
Date: Fri, 09 May 2008 16:48:53 -0700
Subject: [cfe-dev] [PATCH] : Proper name lookup for namespaces
In-Reply-To: <2D58517B-EC4E-4C3D-B2C7-4D8523BE8E7F@apple.com>
References: <48210E97.3040808@gmail.com> <2D58517B-EC4E-4C3D-B2C7-4D8523BE8E7F@apple.com>
Message-ID: <4824E2E5.7040904@gmail.com>

>>
>> The attached patch contains changes to support proper name lookup for
>> namespaces.
>> It is mostly an overhaul of the IdentifierResolver:
>>
>> -It exposes an iterator interface to get all decls through the scope
>> chain:
>>   for (IdentifierResolver::iterator
>>        I = IdResolver.begin(II, CurContext), E = IdResolver.end(II);
>>        I != E; ++I)
>>     if ((*I)->getIdentifierNamespace() & NS)
>>       return *I;
>>
>> -The semantic stuff (checking IdentifierNamespace and Doug's checking
>> for shadowed tags) was moved out of IdentifierResolver and back into
>> Sema. IdentifierResolver just gives an iterator for iterating over
>> all reachable decls of an identifier.
>>
>> -Fixes bug: http://llvm.org/bugs/show_bug.cgi?id=2275
>
> Looks great to me, thanks Argiris!

Applied here:
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20080505/005668.html
with only a couple of changes:

-IdentifierResolver::AddShadowedDecl is passed a decl instead of a
ctx_iterator
-Moved Sema::isDeclInScope to IdentifierResolver::isDeclInScope, so that
it can use the LookupContext class to check whether the decl belongs to
the decl context (LookupContext, when encountering an EnumConstantDecl,
uses the decl context that the EnumConstantDecl's EnumDecl belongs to).
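
As an illustration of the lookup loop quoted above, here is a minimal, self-contained sketch. It assumes hypothetical stand-in types (`ToyDecl`, `DeclChain`, `lookupDecl`); none of them are clang's real classes, and the real IdentifierResolver interface differs. It only tries to show the division of labour the patch describes: the resolver hands back the reachable decls innermost-first, and the caller does the identifier-namespace filtering.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT clang's classes.
enum IdNamespace : unsigned { IDNS_Ordinary = 1u << 0, IDNS_Tag = 1u << 1 };

struct ToyDecl {
  std::string Name;
  unsigned Namespaces;  // bitmask of IdNamespace values
  int ScopeDepth;       // 0 = translation unit, larger = more deeply nested
};

// Declarations reachable from the current context, ordered innermost
// scope first, the way a resolver would yield them while walking the
// scope chain.
using DeclChain = std::vector<ToyDecl>;

// Return the first reachable declaration of Name whose namespace bits
// intersect NS, or nullptr if there is none. The chain only provides
// iteration; the namespace check stays with the caller.
const ToyDecl *lookupDecl(const DeclChain &Chain, const std::string &Name,
                          unsigned NS) {
  for (DeclChain::const_iterator I = Chain.begin(), E = Chain.end();
       I != E; ++I)
    if (I->Name == Name && (I->Namespaces & NS))
      return &*I;
  return nullptr;
}
```

With a chain holding a tag `x` in an inner scope and an ordinary `x` further out, asking for `IDNS_Ordinary` skips the tag and returns the outer declaration, which is the behaviour the `& NS` test in the quoted loop is after.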
-Argiris From akyrtzi at gmail.com Fri May 9 18:49:06 2008 From: akyrtzi at gmail.com (Argiris Kirtzidis) Date: Fri, 09 May 2008 16:49:06 -0700 Subject: [cfe-dev] -emit-html example In-Reply-To: <64885ADB-0C57-4E40-8EC4-931AB2AEFC03@apple.com> References: <63E11935-E280-4C58-8D7D-0F50A407D223@apple.com> <480BCFBC.6090703@gmail.com> <734C263F-54CF-4B54-9C7F-A744BEF545A2@apple.com> <481A31D7.7000601@gmail.com> <64885ADB-0C57-4E40-8EC4-931AB2AEFC03@apple.com> Message-ID: <4824E2F2.4040208@gmail.com> Hi Ted, Ted Kremenek wrote: > > On May 1, 2008, at 2:10 PM, Argiris Kirtzidis wrote: > >> My motivation to propose the Annotator lib wasn't specifically to >> apply it for HTMLPrinter, that was more like an example. >> The Annotator's purpose would be to verify clang's suitability for an >> IDE, at least from the aspect of syntax/semantic colorizing. For >> example it would answer questions like: >> -Can I colorize all variable names ? (with exclusive color) >> -Can I colorize all type names ? >> -Can I associate opening/closing braces for all kinds of blocks >> (namespaces, functions etc.) ? >> -Does the AST carry enough information for doing [insert task] ? >> >> Now, assuming that you have a working Annotator lib, the best way to >> put it to use (without messing with some IDE) would be to make a >> HTMLAnnotator. >> HTMLAnnotator would be a client of Annotator and HTML Rewrite API. > > I think have a playground for such things is useful, but I know if we > need a separate library at this point. Probably just adding the > Annotator class to the Driver would be sufficient for now. We can > then easily move it out. I also don't know if the extra layer of > indirection is needed until we have another Annotator in mind besides > HTMLAnnotator (i.e., can we just use the HTMLPrinter directly to > explore your above questions?). I'm not strongly objecting against > adding Annotator; it's just not clear to me that there are other > clients that would use it. 
Yes, I see your point, not much need for a separate library at the
moment.

-Argiris

From akyrtzi at gmail.com Fri May 9 20:25:22 2008
From: akyrtzi at gmail.com (Argiris Kirtzidis)
Date: Fri, 09 May 2008 18:25:22 -0700
Subject: [cfe-dev] Suggestion for Scope class
Message-ID: <4824F982.2080408@gmail.com>

Hi,

The Parser is passing the current scope to many of Sema's methods,
while, at the same time, Sema is managing the current decl context state
through the CurFunctionDecl, CurMethodDecl, and CurContext variables.

I suggest adding a decl context member to the Scope class like this:

+  /// ParentDecl - The declaration that this scope is created for.
+  /// It is up to the current Action implementation to implement the semantics.
+  Action::DeclTy *ParentDecl;

[...]

+  /// getParentDecl - Return the decl that this scope is created for.
+  /// It is for use by the Action implementation.
+  ///
+  Action::DeclTy *getParentDecl() const { return ParentDecl; }
+
+  /// setParentDecl - Set the decl that this scope is created for.
+  /// It is for use by the Action implementation.
+  ///
+  void setParentDecl(Action::DeclTy *D) { ParentDecl = D; }

That way the code gets simplified a bit in a few places:

-CurFunctionDecl, CurMethodDecl, and CurContext will be removed; they
will be retrieved from the scope passed by the parser
-Calls to IdentifierResolver::isDeclInScope will pass only a scope, not
a scope and a decl context
-Sema::LookupDecl will actually take into account the scope passed to
it, and not assume doing a name lookup from the current decl context
-The code in Sema::ImplicitlyDefineFunction will change from:

>   // Insert this function into translation-unit scope.
>
>   DeclContext *PrevDC = CurContext;
>   CurContext = Context.getTranslationUnitDecl();
>
>   FunctionDecl *FD =
>     dyn_cast<FunctionDecl>(static_cast<Decl*>(ActOnDeclarator(TUScope,
>     D, 0)));
>   FD->setImplicit();
>
>   CurContext = PrevDC;

to

>   // Insert this function into translation-unit scope.
>
>   FunctionDecl *FD =
>     dyn_cast<FunctionDecl>(static_cast<Decl*>(ActOnDeclarator(TUScope,
>     D, 0)));
>   FD->setImplicit();

Any thoughts?

-Argiris

From Sanjiv.Gupta at microchip.com Mon May 12 06:15:45 2008
From: Sanjiv.Gupta at microchip.com (Sanjiv.Gupta at microchip.com)
Date: Mon, 12 May 2008 04:15:45 -0700
Subject: [cfe-dev] Missing API Type::getAsTypedefType
Message-ID:

We have APIs like getAsBuiltinType(), getAsFunctionType() in the Type
class defined in AST/Type.h.
Should we also have getAsTypedefType() in there?
I need it for handling typedefs during debug info generation.

- Sanjiv

From pingu219 at gmail.com Mon May 12 08:59:46 2008
From: pingu219 at gmail.com (pingu219 at gmail.com)
Date: Mon, 12 May 2008 21:59:46 +0800
Subject: [cfe-dev] Clang's Semantic Analysis
Message-ID: <528b9ee80805120659j382ea904r10e16708fb5e19a5@mail.gmail.com>

Sorry if this question's a little inane but does the Clang frontend
currently perform sufficient semantic analysis to build a symbol table
for cross-referencing?

From snaroff at apple.com Mon May 12 11:01:45 2008
From: snaroff at apple.com (Steve Naroff)
Date: Mon, 12 May 2008 09:01:45 -0700
Subject: [cfe-dev] Clang's Semantic Analysis
In-Reply-To: <528b9ee80805120659j382ea904r10e16708fb5e19a5@mail.gmail.com>
References: <528b9ee80805120659j382ea904r10e16708fb5e19a5@mail.gmail.com>
Message-ID:

It does sufficient analysis for building a full cross-reference; however,
it doesn't build one (by default).

snaroff

On May 12, 2008, at 6:59 AM, pingu219 at gmail.com wrote:
> Sorry if this question's a little inane but does the Clang frontend
> currently perform sufficient semantic analysis to build a symbol table
> for cross-referencing?
> _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev From kremenek at apple.com Mon May 12 11:23:23 2008 From: kremenek at apple.com (Ted Kremenek) Date: Mon, 12 May 2008 09:23:23 -0700 Subject: [cfe-dev] Missing API Type::getAsTypedefType In-Reply-To: References: Message-ID: On May 12, 2008, at 4:15 AM, Sanjiv.Gupta at microchip.com wrote: > We have APIs like getAsBuiltintype(), getAsFunctionType() in the Type > class defined in AST/Type.h. > Should we also have getAsTypedefType() in there? > I need it for handling typedefs during debug info generation. > > - Sanjiv > > _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev Done: http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20080512/005675.html From Sanjiv.Gupta at microchip.com Mon May 12 12:09:21 2008 From: Sanjiv.Gupta at microchip.com (Sanjiv.Gupta at microchip.com) Date: Mon, 12 May 2008 10:09:21 -0700 Subject: [cfe-dev] Missing API Type::getAsTypedefType References: Message-ID: Cool. -Sanjiv ________________________________ From: Ted Kremenek [mailto:kremenek at apple.com] Sent: Mon 5/12/2008 9:53 PM To: Sanjiv Kumar Gupta - I00171 Cc: cfe-dev at cs.uiuc.edu Subject: Re: [cfe-dev] Missing API Type::getAsTypedefType On May 12, 2008, at 4:15 AM, Sanjiv.Gupta at microchip.com wrote: > We have APIs like getAsBuiltintype(), getAsFunctionType() in the Type > class defined in AST/Type.h. > Should we also have getAsTypedefType() in there? > I need it for handling typedefs during debug info generation. 
> > - Sanjiv > > _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev Done: http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20080512/005675.html -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080512/7fe1c65a/attachment.html From clattner at apple.com Mon May 12 14:05:27 2008 From: clattner at apple.com (Chris Lattner) Date: Mon, 12 May 2008 12:05:27 -0700 Subject: [cfe-dev] Continuing Adventures with Objective-C In-Reply-To: References: Message-ID: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> On May 10, 2008, at 10:49 AM, David Chisnall wrote: > Hi Chris, > > Here's the latest diff - it's getting quite big, but you can ignore > most of the stuff in CGObjCEtoile.cpp (it's bitrotted a bit, since > I've changed some interfaces). Hi David, This is huge. I can't review this. I see a whole bunch of unrelated changes, please split this out into one patch per change. This is important for review, but it is also important for revision control purposes (you can revert one patch without all the others). Also, this allows patch review to be done by different people which might be specialized in certain areas. Example pieces: 1) the changes for your runtime should be split out 2) the self/_cmd/super changes should be split out. 3) the @defs support should be split out 4) the ccc changes should be split out 5) the codegen refactoring (ConvertReturnType etc) should be split out. 6) the various random codegen improvements should be split out. Questions: What is the idea with ImplicitParamInfo in ObjCMethodDecl? Should isObjCPointerType be a method in codegen, or something in the AST? 

+++ lib/AST/StmtPrinter.cpp (working copy)
@@ -496,6 +496,9 @@
      case PreDefinedExpr::PrettyFunction:
        OS << "__PRETTY_FUNCTION__";
        break;
+     case PreDefinedExpr::ObjCSuper:
+       OS << "super";
+       break;

This should handle self/_cmd also.

    // typedef struct objc_class *Class;
    const PointerType *ptr = TD->getUnderlyingType()->getAsPointerType();
-   assert(ptr && "'Class' incorrectly typed");
+   //assert(ptr && "'Class' incorrectly typed");
    const RecordType *rec = ptr->getPointeeType()->getAsStructureType();
-   assert(rec && "'Class' incorrectly typed");
+   //assert(rec && "'Class' incorrectly typed");
    ClassStructType = rec;

Why are you commenting out code?

@@ -721,16 +721,32 @@
    // Parse all the comma separated declarators.
    DeclSpec DS;
    FieldDeclarators.clear();
+   if(Tok.is(tok::at)) {
+     ConsumeToken();
+     //FIXME: Turn these into helpful errors
+     assert(Tok.isObjCAtKeyword(tok::objc_defs) && "defs expected");
+     ConsumeToken();
+     assert(Tok.is(tok::l_paren) && "( expected");

Please do the fixme: checking in code that is known broken is badness.
Also please move the @ handling part to the "else" clause of the if, so
that the normal struct case comes before @defs handling.

This is great work and I'm thrilled that you're making such huge
enhancements to clang, but I would also really like to get it checked
into clang... and we can't do that until it is split up a bit more.

-Chris

>
>
> There is lots of nasty stuff in there to do with casting at the
> moment. In Objective-C, you can implicitly cast object pointers to
> other object pointers with some very lax type checking (any object
> pointer can be mapped to or from id without an explicit cast), which
> doesn't really map well onto LLVM. I've added special cases to all
> of the bits of code where asserts were failing when building
> GNUstep.
Some of these can be done cleanly - when we still have > QualTypes around we can check if they casting between ObjC pointer - > but in a couple I've had to just allow any pointer conversion (the > one that comes to mind is constructing Phi nodes after a shorthand- > if statement, where both sides return some kind of object) and hope > that Sema has already caught mismatched types. > > Casting a scalar to a union containing that scalar is partially > working, but not in all cases. The remaining big issue with > compiling GNUstep is the lack of support for generating l-values as > a result of cast expressions. I haven't tackled this at all because > I am not sure what the semantics of a taking the address of the > result of a cast are meant to be. > > I've tried to comment everything, but let me know if you have any of > it doesn't make sense. > > David > From eli.friedman at gmail.com Mon May 12 14:15:34 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Mon, 12 May 2008 12:15:34 -0700 Subject: [cfe-dev] isIntegerType vs. isIntegralType? Message-ID: isIntegerType and isIntegralType are very confusingly named, and I think most of the places that currently call isIntegerType actually want to be calling isIntegralType. Some cleanup here is definitely needed. I have a patch that changes most of the uses of isIntegerType to isIntegralType when it is appropriate; should I commit that? Or should one of these methods be renamed? Since it isn't entirely obvious, the difference is that isIntegerType includes vector types, which aren't appropriate in a lot of places, like array indexing and case statements. -Eli From stephan.creutz at inf.tu-dresden.de Tue May 13 07:55:10 2008 From: stephan.creutz at inf.tu-dresden.de (Stephan Creutz) Date: Tue, 13 May 2008 14:55:10 +0200 Subject: [cfe-dev] Rewriting Types Message-ID: <20080513125510.GA3502@mars> Hi, I'm trying to rewrite types in declaration statements, e.g. "int a, b, c" to "long a; long b; long c;". 
To do this I use the rewriter as follows:

void TypeVisitor::VisitDeclStmt(DeclStmt *ds)
{
  ScopedDecl *sd = ds->getDecl();
  while (sd) {
    if (VarDecl *vd = dyn_cast<VarDecl>(sd)) {
      vd->setType(context.LongTy);
    }
    sd = sd->getNextDeclarator();
  }

  if (rewriter.ReplaceStmt(ds, ds))
    llvm::cerr << "cannot rewrite\n";
}

In my resulting file I get something like this: "long a; long b; long c;
a, b, c;". That is almost the expected output except for the extra
"a, b, c;".

What is wrong in my code? Or is it in general the wrong approach to
rewrite types?

Thank you very much in advance,
Stephan

From eli.friedman at gmail.com Tue May 13 09:13:11 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Tue, 13 May 2008 07:13:11 -0700
Subject: [cfe-dev] Rewriting Types
In-Reply-To: <20080513125510.GA3502@mars>
References: <20080513125510.GA3502@mars>
Message-ID:

(Resend; I forgot to cc the list.)

In clang/lib/Parse/ParseStmt.cpp, I see a comment "// FIXME: Pass in the
right location for the end of the declstmt.". That seems likely to be
causing your issue.

-Eli

On Tue, May 13, 2008 at 5:55 AM, Stephan Creutz wrote:
> Hi,
>
> I'm trying to rewrite types in declaration statements, e.g. "int a, b,
> c" to "long a; long b; long c;". To do this I use the rewriter as
> follows:
>
> void TypeVisitor::VisitDeclStmt(DeclStmt *ds)
> {
>   ScopedDecl *sd = ds->getDecl();
>   while (sd) {
>     if (VarDecl *vd = dyn_cast<VarDecl>(sd)) {
>       vd->setType(context.LongTy);
>     }
>     sd = sd->getNextDeclarator();
>   }
>
>   if (rewriter.ReplaceStmt(ds, ds))
>     llvm::cerr << "cannot rewrite\n";
> }
>
> In my resulting file I get something like this: "long a; long b; long c;
> a, b, c;". That is almost the expected output except for the extra
> "a, b, c;".
>
> What is wrong in my code? Or is it in general the wrong approach to
> rewrite types?
>
> Thank you very much in advance,
> Stephan
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>

From csdavec at swansea.ac.uk Tue May 13 11:22:36 2008
From: csdavec at swansea.ac.uk (David Chisnall)
Date: Tue, 13 May 2008 17:22:36 +0100
Subject: [cfe-dev] Continuing Adventures with Objective-C
In-Reply-To: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com>
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com>
Message-ID: <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>

Hi Chris,

On 12 May 2008, at 20:05, Chris Lattner wrote:
> This is huge. I can't review this. I see a whole bunch of
> unrelated changes, please split this out into one patch per change.
> This is important for review, but it is also important for revision
> control purposes (you can revert one patch without all the others).
> Also, this allows patch review to be done by different people which
> might be specialized in certain areas.

Sorry about the size. It's a bit hard to split it up, since many of the
changes depend on others. I'm now using svk for clang so I can split
future diffs up more easily once my tree is a bit better sync'd with
trunk.

> Example pieces:
> 1) the changes for your runtime should be split out

The runtime-specific code is all in CGObjCGNU.cpp and CGObjCEtoile.cpp.
It's hard to split these out as separate diffs because these changes
depend on changes to the interface declared in CGObjCRuntime.h, which in
turn depend on changes in other bits of the code that call it.

The Étoilé runtime stuff is still in a state of flux, so I have enclosed
a simplified version of the diff that only changes the interfaces to
match the new versions (etoile.diff).

The GNU runtime-specific stuff is all in gnu.diff.

The new runtime interface is in runtimeif.diff. This cannot be committed
separately from the changes to the runtime implementations and the
things in CGObjC.cpp that call it.

> 2) the self/_cmd/super changes should be split out.

Super is in super.diff. Implicit parameters, which include self and
_cmd, are in implicit.diff.

> 3) the @defs support should be split out

This is in defs.diff.

> 4) the ccc changes should be split out

ccc.diff (These are unchanged from the ccc.diff in the earlier email).

> 5) the codegen refactoring (ConvertReturnType etc) should be split
> out.

types.diff contains the changes to CodeGenTypes which allow converting a
return type. This is required for the changes to
CodeGenFunction::GenerateObjCMethod().

function.diff pulls shared code from CodeGenFunction::GenerateCode and
GenerateObjCMethod out. It also removes the method for generating ObjC
methods from CodeGenFunction.

objc.diff puts this method and other ObjC-specific parts of
CodeGenFunction into CGObjC.cpp (where I should have put them to start
with).

module.diff contains the runtime-agnostic code for generating
module-level ObjC constructs (classes, categories and protocols). This
depends on the new runtime interface since the old one did not have
methods for doing any of this.

static.diff fixes static variables in ObjC methods.

override.diff allows the types of id and Class to be redefined. This
happens whenever a runtime-specific header file is included. I think GCC
avoids this problem by including these headers itself and picking up the
declarations from there.

union.diff adds support for GCC's cast-to-union extension, which is used
in a depressing number of places in the GNUstep code.

cast.diff fixes a number of cases where implicit casts are not correctly
codegen'd, and allows Objective-C const id to have messages sent to it.

expr.diff contains a load of small tidies (i.e. everything I couldn't
think of which diff to put it in).

> 6) the various random codegen improvements should be split out.

const_str.diff adds a call to the runtime-specific method for generating
constant ObjC strings. It also relaxes the string class type checking.
If NSConstantString has not been declared when a constant string is
encountered (which happens quite often) then the constant string is an
id. If it has, then it is an NSConstantString.

aggmsg.diff generates message sends that return aggregate types by
calling runtime-specific methods. There are corresponding changes in the
runtime-specific code to make this actually work.

> Questions:
> What is the idea with ImplicitParamInfo in ObjCMethodDecl?

It allows enumeration of things like self and _cmd. For the Étoilé
runtime it will also provide support for _call (which is mainly used by
prototype-based languages such as Io and JavaScript, but might be useful
to export to Objective-C programmers). In a future patch I will factor
the code that sets these out into a runtime-specific class. These are
just function parameters, so they do not need any special handling (the
existing code for accessing local variables works on them in the
codegen).

This is basically a generalisation of the existing selfDecl stuff. At
the moment, self and _cmd are hard-coded. When this is committed I will
allow other implicit parameters to be defined per-runtime.

> Should isObjCPointerType be a method in codegen, or something in the
> AST?

It is in CodeGen because it is only used in CodeGen. It is used to
decide when two pointers which point to different LLVM types can be
implicitly bitcast - AST is unaware of LLVM types so it doesn't make
this distinction. This is probably not the ideal solution, so
suggestions are welcome.

> +++ lib/AST/StmtPrinter.cpp (working copy)
> @@ -496,6 +496,9 @@
>       case PreDefinedExpr::PrettyFunction:
>         OS << "__PRETTY_FUNCTION__";
>         break;
> +     case PreDefinedExpr::ObjCSuper:
> +       OS << "super";
> +       break;
>
> this should handle self/_cmd also.

self and _cmd are handled by the implicit parameter code now. They work
as any other variable references, both when accessing them in codegen
and when doing an AST print.
super is different because it is an alias for self with special
semantics. These should not have been left in the Expr.h diff, since
they are never used.

>     // typedef struct objc_class *Class;
>     const PointerType *ptr = TD->getUnderlyingType()->getAsPointerType();
> -   assert(ptr && "'Class' incorrectly typed");
> +   //assert(ptr && "'Class' incorrectly typed");
>     const RecordType *rec = ptr->getPointeeType()->getAsStructureType();
> -   assert(rec && "'Class' incorrectly typed");
> +   //assert(rec && "'Class' incorrectly typed");
>     ClassStructType = rec;
>
> Why are you commenting out code?

Because it was breaking something else which I have now fixed, and I
forgot to uncomment them. The asserts requiring SEL to be a pointer type
will have to be changed at some point, since SEL is an i32 in the Étoilé
runtime. (SEL is an opaque type with respect to the language. The fact
that it is a pointer in the GNU and NeXT runtimes is an implementation
detail and doesn't belong in the AST.)

> @@ -721,16 +721,32 @@
>     // Parse all the comma separated declarators.
>     DeclSpec DS;
>     FieldDeclarators.clear();
> +   if(Tok.is(tok::at)) {
> +     ConsumeToken();
> +     //FIXME: Turn these into helpful errors
> +     assert(Tok.isObjCAtKeyword(tok::objc_defs) && "defs expected");
> +     ConsumeToken();
> +     assert(Tok.is(tok::l_paren) && "( expected");
>
> Please do the fixme: checking in code that is known broken is badness.

Sorry, I hadn't read the diagnostics code when I wrote this. Fixed now.
I've also replaced the corresponding assert in SemaDecl with an error.

> Also please move the @ handling part to the "else" clause of the if,
> so that the normal struct case comes before @defs handling.

Done.

> This is great work and I'm thrilled that you're making such huge
> enhancements to clang, but I would also really like to get it
> checked into clang... and we can't do that until it is split up a
> bit more.

I've stopped working on new things to focus on getting the current changes merged so I don't diverge any further. Let me know what needs changing to get it committed.

David

-------------- next part --------------
Non-text attachments were scrubbed (all of type application/octet-stream):
cast.diff (4440 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0019.obj
ccc.diff (676 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0020.obj
const_str.diff (1807 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0021.obj
defs.diff (5643 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0022.obj
etoile.diff (6101 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0023.obj
expr.diff (9726 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0024.obj
function.diff (6428 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0025.obj
gnu.diff (39856 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0026.obj
implicit.diff (4231 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0027.obj
module.diff (10431 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0028.obj
objc.diff (8706 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0029.obj
override.diff (5106 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0030.obj
runtimeif.diff (5574 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0031.obj
static.diff (616 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0032.obj
super.diff (1948 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0033.obj
types.diff (3207 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0034.obj
union.diff (1652 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0035.obj
vla.diff (587 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0036.obj
aggmesg.diff (1856 bytes): http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/d6020a91/attachment-0037.obj

From eli.friedman at gmail.com Tue May 13 13:15:07 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Tue, 13 May 2008 11:15:07 -0700
Subject: [cfe-dev] Continuing Adventures with Objective-C
In-Reply-To: <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
Message-ID:

On Tue, May 13, 2008 at 9:22 AM, David Chisnall wrote:
> Hi Chris,
>
> On 12 May 2008, at 20:05, Chris Lattner wrote:
>
>> This is huge. I can't review this. I see a whole bunch of unrelated
>> changes, please split this out into one patch per change. This is important
>> for review, but it is also important for revision control purposes (you can
>> revert one patch without all the others). Also, this allows patch review to
>> be done by different people which might be specialized in certain areas.
>
> Sorry about the size. It's a bit hard to split it up, since many of the
> changes depend on others. I'm now using svk for clang so I can split future
> diffs up more easily once my tree is a bit better sync'd with trunk.

Please include testcases with all of the patches; it makes things easier to review, and they're necessary to commit fixes. Please put fixes that don't have any dependencies into separate emails; it makes them easier to track.

A quick review of some of the diffs:

vla.diff: Nowhere near a complete implementation of VLAs; better to error out than generate bad code. I think I sent a partial patch to this list at one point, but I never finished it.

aggmesg.diff: Please split this patch into its independent parts. The VisitObjCMessageExpr implementation looks correct, although the FIXME doesn't seem relevant, and there's no point to constructing the RValue.
The VisitCastExpr implementation is clearly wrong; you're throwing out the emitted value. Also, you should add some assertions to make sure it's a scalar-to-union cast. (Also, don't resubmit this until the Sema changes this depends on are committed.)

union.diff: You definitely need to implement that FIXME before the patch goes in. Also, you want to be checking for type compatibility, not pointer equality.

types.diff: Please split this patch into its independent parts. The change for "case Type::ObjCQualifiedId" looks fine. I'm a bit concerned that the implementation of ConvertReturnType might not be appropriate to call from within ConvertNewType; that code tends to be fragile.

super.diff: Please make a new expression type instead of overloading PreDefinedExpr.

static.diff: Please put the logic in ObjCMethodDecl, alongside getSynthesizedMethodSize(). (Or is that logic already elsewhere?)

ccc.diff: Send this patch in a separate email, so that the ccc maintainer sees it.

cast.diff: This is mixing fixes. Please separate. If two types are compatible, they should have the same LLVM type, I think; is the ObjC code abusing type compatibility? Maybe we need to be emitting more ImplicitCasts into the AST?

-Eli

From eli.friedman at gmail.com Tue May 13 19:22:49 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Tue, 13 May 2008 17:22:49 -0700
Subject: [cfe-dev] Continuing Adventures with Objective-C
In-Reply-To:
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
Message-ID:

On Tue, May 13, 2008 at 11:15 AM, Eli Friedman wrote:
> Please include testcases with all of the patches; it makes things
> easier to review, and they're necessary to commit fixes. Please put
> fixes that don't have any dependencies into separate emails; it makes
> them easier to track.

Hmm, I guess I really ought to go into more detail on this bit.
First off, I'm sure this work will be very useful for people using ObjC, so keep up the good work. If the review comments seem a bit harsh, it's just the natural writing style for reviews, and if you disagree with a comment, feel free to say so. The key to making things move as quickly as possible is to make small, independent patches which can be committed separately without breaking the build. For example, take const_str.diff. The Sema changes are independent of the other changes. So send one email, with one patch, including a testcase, which can be committed separately from the others. That should be reviewed and committed quickly. Then, send one email, with one patch, which allows the codegen of ObjC string literals. This is more work than submitting multiple fixes together, but once things start moving, you'll find that everything runs a lot smoother. If you haven't read http://llvm.org/docs/DeveloperPolicy.html, you might also find that useful. -Eli From eli.friedman at gmail.com Tue May 13 20:48:55 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Tue, 13 May 2008 18:48:55 -0700 Subject: [cfe-dev] Patch to add __builtin_shufflevector In-Reply-To: References: <876D763D-69D8-4282-9E85-0A9FC2B584B5@apple.com> Message-ID: Updated version of __builtin_shufflevector patch; should address review comments. (Sorry this took so long; you probably forgot what your review comments were.) -Eli -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: clangshufflevector.txt
Url: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080513/31b9a35a/attachment.txt

From Sanjiv.Gupta at microchip.com Wed May 14 01:41:54 2008
From: Sanjiv.Gupta at microchip.com (Sanjiv.Gupta at microchip.com)
Date: Tue, 13 May 2008 23:41:54 -0700
Subject: [cfe-dev] [cfe-commits] PATCH : Function Start Debug Info
In-Reply-To:
References:
Message-ID:

> -----Original Message-----
> From: cfe-commits-bounces at cs.uiuc.edu
> [mailto:cfe-commits-bounces at cs.uiuc.edu] On Behalf Of
> Sanjiv.Gupta at microchip.com
> Sent: Monday, May 12, 2008 5:56 PM
> To: cfe-commits at cs.uiuc.edu
> Subject: [cfe-commits] PATCH : Function Start Debug Info
>
> Please find the patch attached for generating function start
> debug info.
>
> -Sanjiv
>

Just being a little impatient here :)

- Sanjiv

From eli.friedman at gmail.com Wed May 14 19:04:24 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Wed, 14 May 2008 17:04:24 -0700
Subject: [cfe-dev] Implementation of stddef.h
Message-ID:

Per subject, implementation of stddef.h. This implementation is correct per C99, but glibc does some funny stuff including certain headers, so I'm not sure if this is completely correct.

-Eli
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: stddef.txt
Url: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080514/59307d06/attachment.txt

From dpatel at apple.com Wed May 14 19:55:33 2008
From: dpatel at apple.com (Devang Patel)
Date: Wed, 14 May 2008 17:55:33 -0700
Subject: [cfe-dev] vla.diff [Re: Continuing Adventures with Objective-C]
In-Reply-To: <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
Message-ID:

David,

Is this patch sufficient to support VLAs? Your patch removes an assert but somehow it is not visible in the diffs.

- Devang

[I'm intentionally using a separate email for each diff. I may not reply to all diffs. Please do not mix discussion of multiple diffs in replies. Thanks!]

From dpatel at apple.com Wed May 14 20:01:01 2008
From: dpatel at apple.com (Devang Patel)
Date: Wed, 14 May 2008 18:01:01 -0700
Subject: [cfe-dev] union.diff [Re: Continuing Adventures with Objective-C]
In-Reply-To: <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
Message-ID: <20D6CDFA-D2C9-4F7D-86D1-43FC98F00D28@apple.com>

David,

On May 13, 2008, at 9:22 AM, David Chisnall wrote:
> union.diff adds support for GCC's cast-to-union extension, which is
> used in a depressing number of places in the GNUstep code.

It is a good idea to include a patch here to add the switch to enable this extension. I did not know about this extension until now! Do you need a codegen patch to complete this support?

- Devang

From dpatel at apple.com Wed May 14 20:05:51 2008
From: dpatel at apple.com (Devang Patel)
Date: Wed, 14 May 2008 18:05:51 -0700
Subject: [cfe-dev] Continuing Adventures with Objective-C
In-Reply-To: <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
Message-ID: <2B2ADDF8-D7E8-43E5-8E19-944ACAB95285@apple.com>

David,

On May 13, 2008, at 9:22 AM, David Chisnall wrote:
> types.diff contains the changes to CodeGenTypes which allow
> converting a return type. This is required for the changes to
> CodeGenFunction::GenerateObjCMethod().

> @@ -139,7 +144,7 @@
>    /// memory representation is usually i8 or i32, depending on the
> target.
> const llvm::Type *ConvertTypeForMem(QualType T); > > - void CollectObjCIvarTypes(ObjCInterfaceDecl *ObjCClass, > + void CollectObjCIvarTypes(const ObjCInterfaceDecl *ObjCClass, > std::vector &IvarTypes); > > const CGRecordLayout *getCGRecordLayout(const TagDecl*) const; > Index: lib/CodeGen/CodeGenTypes.cpp > =================================================================== > --- lib/CodeGen/CodeGenTypes.cpp (revision 51026) > +++ lib/CodeGen/CodeGenTypes.cpp (working copy) > @@ -168,7 +168,7 @@ > /// Produces a vector containing the all of the instance variables > in an > /// Objective-C object, in the order that they appear. Used to > create LLVM > /// structures corresponding to Objective-C objects. > -void CodeGenTypes::CollectObjCIvarTypes(ObjCInterfaceDecl *ObjCClass, > +void CodeGenTypes::CollectObjCIvarTypes(const ObjCInterfaceDecl > *ObjCClass, > std::vector > &IvarTypes) { > ObjCInterfaceDecl *SuperClass = ObjCClass->getSuperClass(); > if (SuperClass) This is a separate and obvious patch. > +const llvm::Type *CodeGenTypes::ConvertReturnType(QualType T) { > + if (T->isVoidType()) > + { > + return llvm::Type::VoidTy; // Result of function uses llvm > void. > + } > + else > + return ConvertType(T); > +} > + > + Please add doxygen style comment and drop extra { and } > @@ -320,8 +325,8 @@ > break; > > case Type::ObjCQualifiedId: > - assert(0 && "FIXME: add missing functionality here"); > - break; > + // For CodeGen purposes, any id type is an opque pointer > + return ConvertTypeRecursive(Context.getObjCIdType()); > > case Type::Tagged: { > const TagDecl *TD = cast(Ty).getDecl(); This is independent and looks ok. 
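A minimal sketch of the shape asked for above (doxygen-style comment, no redundant braces or else branch). It assumes that the general type conversion deliberately maps void to something other than the IR void type, which is why a separate return-type helper is wanted. The AstType enum and both functions are hypothetical stand-ins so that the fragment compiles on its own; they are not clang or LLVM APIs.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for clang's QualType; not a real clang type.
enum class AstType { Void, Int, Pointer };

/// convertType - Map an AST type to the spelling of an IR value type.
/// Assumption for this sketch: void gets a byte-sized placeholder, because
/// IR void is only meaningful as a function result.
std::string convertType(AstType T) {
  switch (T) {
  case AstType::Void:
    return "i8";
  case AstType::Int:
    return "i32";
  case AstType::Pointer:
    return "i8*";
  }
  return "i8"; // Not reached for valid enumerators; keeps compilers quiet.
}

/// convertReturnType - Like convertType, except that a void result maps to
/// IR void.
std::string convertReturnType(AstType T) {
  if (T == AstType::Void)
    return "void"; // Result of function uses IR void.
  return convertType(T);
}
```

With the early return in place, the else branch and the extra braces from the quoted hunk are no longer needed, and every non-void type still goes through the single general conversion path.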
- Devang From dpatel at apple.com Wed May 14 20:13:57 2008 From: dpatel at apple.com (Devang Patel) Date: Wed, 14 May 2008 18:13:57 -0700 Subject: [cfe-dev] static.diff [ Re: Continuing Adventures with Objective-C ] In-Reply-To: <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk> References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk> Message-ID: <241D37B4-BE57-4A9F-ABB3-E8A4F966B273@apple.com> David, On May 13, 2008, at 9:22 AM, David Chisnall wrote: > static.diff fixes static variables in ObjC methods. You don't need separator between class and selector names ? Otherwise, this looks obvious. - Devang From eli.friedman at gmail.com Thu May 15 16:57:51 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Thu, 15 May 2008 14:57:51 -0700 Subject: [cfe-dev] Possible fix for codegen bug with cond ? func1 : func2 Message-ID: Potential patch attached. Testcase (currently crashes with clang -emit-llvm): int a(); int b(int); int c() {return 1 ? a : b;} Anyone have any better suggestions for how to fix this? -Eli -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: tttt.txt Url: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080515/429547d2/attachment.txt From csdavec at swansea.ac.uk Fri May 16 07:36:56 2008 From: csdavec at swansea.ac.uk (David Chisnall) Date: Fri, 16 May 2008 13:36:56 +0100 Subject: [cfe-dev] Possible fix for codegen bug with cond ? func1 : func2 In-Reply-To: References: Message-ID: <279AC2BD-DCE6-4199-9433-012952E1A965@swan.ac.uk> I encountered a very similar bug caused by the same bit of code. In my case the issue was the LHS and RHS having different Objective-C pointer types (e.g. id and NSString*) which AST is treating as equivalent but LLVM regards as incompatible. 
Your solution seems cleaner than mine (which only handled mismatched pointer types correctly), so I'll revert mine before I send updated versions of the diff to this list (hopefully soon...). David On 15 May 2008, at 22:57, Eli Friedman wrote: > Potential patch attached. > > Testcase (currently crashes with clang -emit-llvm): > int a(); > int b(int); > int c() {return 1 ? a : b;} > > Anyone have any better suggestions for how to fix this? > > -Eli > _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev From mymlreader at gmail.com Fri May 16 09:32:54 2008 From: mymlreader at gmail.com (Zhongxing Xu) Date: Fri, 16 May 2008 22:32:54 +0800 Subject: [cfe-dev] How to know which edge the path is traversing? Message-ID: <4619993f0805160732s59f465b5t28ad21d23873c743@mail.gmail.com> I use GRCoreEngine to do a path sensitive analysis. For example, I can get two paths for the program below: int f(int n) { if (n > 0) ... else ... } I can use the nodes in EndNodes to get these two paths (by backtracking from endnodes). There are two BlockEdgeDst nodes after the block containing the IfStmt "if (n>0)". How can I know which path is led by the condition n > 0 or n <= 0? That is, how can I know which is the "true"/"false" branch edge? -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080516/fde795d8/attachment.html From kremenek at apple.com Fri May 16 11:10:40 2008 From: kremenek at apple.com (Ted Kremenek) Date: Fri, 16 May 2008 09:10:40 -0700 Subject: [cfe-dev] How to know which edge the path is traversing? 
In-Reply-To: <4619993f0805160732s59f465b5t28ad21d23873c743@mail.gmail.com>
References: <4619993f0805160732s59f465b5t28ad21d23873c743@mail.gmail.com>
Message-ID: <140B3744-F7C1-4335-8823-5F52AA26A5C4@apple.com>

On May 16, 2008, at 7:32 AM, Zhongxing Xu wrote:
> I use GRCoreEngine to do a path sensitive analysis. For example, I
> can get two paths for the program below:
>
> int f(int n) {
>   if (n > 0)
>     ...
>   else
>     ...
> }
>
> I can use the nodes in EndNodes to get these two paths (by
> backtracking from endnodes).
> There are two BlockEdgeDst nodes after the block containing the
> IfStmt "if (n>0)".
> How can I know which path is led by the condition n > 0 or n <= 0?
> That is, how can I know which is the "true"/"false" branch edge?

Hi Zhongxing,

You will want to inspect the successor CFGBlocks of the "source" in the BlockEdge, and compare them against the destination. For CFGBlocks whose terminator is a branch (if statements, loops, etc.), the first successor block is the true branch, and the second successor block is the false branch.

For example:

  ProgramPoint P = N->getLocation(); // N is an ExplodedNode<...>

  if (BlockEdge* BE = dyn_cast<BlockEdge>(&P)) {
    CFGBlock* Src = BE->getSrc();
    CFGBlock* Dst = BE->getDst();

    // Test if we are at a (binary) branch.
    if (Src->hasBinaryBranchTerminator()) {
      if (*Src->succ_begin() == Dst) {
        // We took the true branch.
      }
      else {
        assert (*(Src->succ_begin()+1) == Dst);
        // We took the false branch.
      }
    }
  }

Note that "hasBinaryBranchTerminator" only returns true for terminators that are "ForStmt", "WhileStmt", "DoStmt", "IfStmt", "ChooseExpr", "ConditionalOperator", and "BinaryOperator" (for '&&' and '||'). IndirectGotoStmt and SwitchStmt work differently. IndirectGotoStmt always branches to a special block where the actual indirect goto takes place (do a CFG dump of code with a labeled goto to see what I mean). Blocks that have a SwitchStmt terminator have as their successor blocks the targets of the switch.
In that case, each successor block should have "getLabel()" return a SwitchStmt, with the exception of the last successor. The last successor is always the "default" branch, which may be explicit (with a "default:" label) or implicit (in the case of fall-through to the code after the switch block). BTW, "hasBinaryBranchTerminator" was a recently added predicate method. Most code that inspects terminators actually uses a switch statement on the statement class of the terminator to handle both binary branches and other terminator types. Ted From dpatel at apple.com Fri May 16 11:23:20 2008 From: dpatel at apple.com (Devang Patel) Date: Fri, 16 May 2008 09:23:20 -0700 Subject: [cfe-dev] Possible fix for codegen bug with cond ? func1 : func2 In-Reply-To: References: Message-ID: <449438F7-C0B3-42C5-9EBD-C70E286C7142@apple.com> On May 15, 2008, at 2:57 PM, Eli Friedman wrote: > Potential patch attached. > > Testcase (currently crashes with clang -emit-llvm): > int a(); > int b(int); > int c() {return 1 ? a : b;} > > Anyone have any better suggestions for how to fix this? Your patch looks good. Pl. apply. Thanks! - Devang From dpatel at apple.com Fri May 16 11:54:57 2008 From: dpatel at apple.com (Devang Patel) Date: Fri, 16 May 2008 09:54:57 -0700 Subject: [cfe-dev] cast.diff [ Re: Continuing Adventures with Objective-C] In-Reply-To: <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk> References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk> Message-ID: <60C58801-352B-4703-9543-D6925A3C94A5@apple.com> On May 13, 2008, at 9:22 AM, David Chisnall wrote: > cast.diff fixes a number of cases where implicit casts are not > correctly codegen'd, and allows Objective-C const id to have > messages sent to it. 
> Index: lib/CodeGen/CodeGenFunction.h > =================================================================== > --- lib/CodeGen/CodeGenFunction.h (revision 51026) > +++ lib/CodeGen/CodeGenFunction.h (working copy) > @@ -67,7 +67,9 @@ > class ChooseExpr; > class PreDefinedExpr; > class ObjCStringLiteral; > + class ObjCSelectorExpr; > class ObjCIvarRefExpr; > + class ObjCMessageExpr; > class MemberExpr; > > class VarDecl; > @@ -296,11 +298,15 @@ Do you need this ? I don't see any use in this patch. > > > void GenerateObjCMethod(const ObjCMethodDecl *OMD); > void GenerateCode(const FunctionDecl *FD); > > const llvm::Type *ConvertType(QualType T); > > llvm::Value *LoadObjCSelf(); > > + /// isObjCPointerType - Return true if the specificed AST type > will map onto > + /// some Objective-C pointer type. > + static bool isObjCPointerType(QualType T); > /// hasAggregateLLVMType - Return true if the specified AST type > will map into > /// an aggregate LLVM type or is void. > static bool hasAggregateLLVMType(QualType T); > Index: lib/CodeGen/CodeGenFunction.cpp > =================================================================== > --- lib/CodeGen/CodeGenFunction.cpp (revision 51026) > +++ lib/CodeGen/CodeGenFunction.cpp (working copy) > @@ -50,70 +54,20 @@ > return CGM.getTypes().ConvertType(T); > } > > +bool CodeGenFunction::isObjCPointerType(QualType T) { > + // All Objective-C types are pointers. > + return T->isObjCInterfaceType() || > + T->isObjCQualifiedInterfaceType() || T->isObjCQualifiedIdType(); > +} > + > bool CodeGenFunction::hasAggregateLLVMType(QualType T) { > - return !T->isRealType() && !T->isPointerLikeType() && > - !T->isVoidType() && !T->isVectorType() && !T- > >isFunctionType(); > + return !isObjCPointerType(T) &&!T->isRealType() && !T- > >isPointerLikeType() && > + !T->isVoidType() && !T->isVectorType() && !T->isFunctionType(); > } This is part of patch OK. I think, Eli addressed other part. 
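The predicate change quoted above can be sketched in isolation. This is a minimal model, assuming a flat classification of types; the Kind enum and both functions are hypothetical and only mirror the structure of the patch, in which an Objective-C object pointer is treated as a scalar and so has to be excluded from the aggregate case.

```cpp
#include <cassert>

// Hypothetical flat classification standing in for clang's QualType queries.
enum class Kind {
  Real,
  Pointer,
  Void,
  Vector,
  Function,
  ObjCInterface,
  ObjCQualifiedInterface,
  ObjCQualifiedId,
  Record,
  Complex
};

/// isObjCPointerType - True for the kinds that this model lowers as an
/// Objective-C object pointer.
bool isObjCPointerType(Kind K) {
  return K == Kind::ObjCInterface || K == Kind::ObjCQualifiedInterface ||
         K == Kind::ObjCQualifiedId;
}

/// hasAggregateType - True when none of the scalar-like cases apply, so the
/// value has to be handled through memory.
bool hasAggregateType(Kind K) {
  if (isObjCPointerType(K))
    return false;
  switch (K) {
  case Kind::Real:
  case Kind::Pointer:
  case Kind::Void:
  case Kind::Vector:
  case Kind::Function:
    return false;
  default:
    return true;
  }
}
```

In this model, leaving out the isObjCPointerType check would classify an id-typed value as an aggregate, and it would then take the temporary-alloca path instead of being passed as a plain pointer.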
- Devang From dpatel at apple.com Fri May 16 11:57:49 2008 From: dpatel at apple.com (Devang Patel) Date: Fri, 16 May 2008 09:57:49 -0700 Subject: [cfe-dev] const_str.diff [ Re: Continuing Adventures with Objective-C ] In-Reply-To: <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk> References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk> Message-ID: On May 13, 2008, at 9:22 AM, David Chisnall wrote: > const_str.diff adds a call to the runtime-specific method for > generating constant ObjC strings. It also relaxes the string class > type checking. If NSConstantString has not been declared when a > constant string is encountered (which happens quite often) then the > constant string is an id. If it has, then it is an NSConstantString. This looks OK. Thanks! - Devang From csdavec at swansea.ac.uk Fri May 16 11:59:42 2008 From: csdavec at swansea.ac.uk (David Chisnall) Date: Fri, 16 May 2008 17:59:42 +0100 Subject: [cfe-dev] cast.diff [ Re: Continuing Adventures with Objective-C] In-Reply-To: <60C58801-352B-4703-9543-D6925A3C94A5@apple.com> References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk> <60C58801-352B-4703-9543-D6925A3C94A5@apple.com> Message-ID: <87B174D3-2452-402B-8F95-BFDBA4CA31D0@swan.ac.uk> On 16 May 2008, at 17:54, Devang Patel wrote: >> >> Index: lib/CodeGen/CodeGenFunction.h >> =================================================================== >> --- lib/CodeGen/CodeGenFunction.h (revision 51026) >> +++ lib/CodeGen/CodeGenFunction.h (working copy) >> @@ -67,7 +67,9 @@ >> class ChooseExpr; >> class PreDefinedExpr; >> class ObjCStringLiteral; >> + class ObjCSelectorExpr; >> class ObjCIvarRefExpr; >> + class ObjCMessageExpr; >> class MemberExpr; >> >> class VarDecl; >> @@ -296,11 +298,15 @@ > > Do you need this ? I don't see any use in this patch. 
Ooops, those should have been in a different diff (it's hard to keep track of which bits depend on which other bits). These are needed here for the declaration of EmitObjCSelectorExpr() and EmitObjCMessageExpr(). It seems I messed up quite badly splitting the diff into small parts, since these are used in the code in expr.diff and defined in objc.diff. Sorry. David From dpatel at apple.com Fri May 16 12:03:37 2008 From: dpatel at apple.com (Devang Patel) Date: Fri, 16 May 2008 10:03:37 -0700 Subject: [cfe-dev] aggmsg.diff [Re: Continuing Adventures with Objective-C] In-Reply-To: <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk> References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk> Message-ID: <9E9E8F81-FEF7-4EB1-AEA2-D710C97186B3@apple.com> On May 13, 2008, at 9:22 AM, David Chisnall wrote: > aggmsg.diff generates message sends that return aggregate types by > calling runtime-specific methods. There are corresponding changes > in the runtime-specific code to make this actually work. 
> Index: lib/CodeGen/CGExprAgg.cpp
> ===================================================================
> --- lib/CodeGen/CGExprAgg.cpp (revision 51026)
> +++ lib/CodeGen/CGExprAgg.cpp (working copy)
> @@ -77,12 +77,15 @@
>    // case Expr::UnaryOperatorClass:
>    // case Expr::CastExprClass:
>    void VisitImplicitCastExpr(ImplicitCastExpr *E);
> +  void VisitCastExpr(const CastExpr *E);
>    void VisitCallExpr(const CallExpr *E);
>    void VisitStmtExpr(const StmtExpr *E);
>    void VisitBinaryOperator(const BinaryOperator *BO);
>    void VisitBinAssign(const BinaryOperator *E);
>    void VisitOverloadExpr(const OverloadExpr *E);
>
> +  void VisitObjCMessageExpr(ObjCMessageExpr *E);
> +
>

Avoid extra white space.

>    void VisitConditionalOperator(const ConditionalOperator *CO);
>    void VisitInitListExpr(InitListExpr *E);
> @@ -182,9 +185,13 @@
>    assert(CGF.getContext().typesAreCompatible(
>           STy.getUnqualifiedType(), Ty.getUnqualifiedType())
>           && "Implicit cast types must be compatible");
> -
>    Visit(E->getSubExpr());
>  }
> +// This should only be used when constructing a cast to a union type
> +void AggExprEmitter::VisitCastExpr(const CastExpr *E) {
> +  // Does this work on big-endian archs?

Why not?

The patch is OK.

- Devang

From dpatel at apple.com Fri May 16 12:30:39 2008
From: dpatel at apple.com (Devang Patel)
Date: Fri, 16 May 2008 10:30:39 -0700
Subject: [cfe-dev] Continuing Adventures with Objective-C
In-Reply-To: <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
Message-ID: <36B8E273-5ED4-459F-B6A4-6283A2BB829B@apple.com>

On May 13, 2008, at 9:22 AM, David Chisnall wrote:
> expr.diff contains a load of small tidies (i.e. everything I
> couldn't think of which diff to put it in)

It is easier for everyone if you get such patches in as soon as you run into them.
> Index: lib/CodeGen/CGExprScalar.cpp > =================================================================== > --- lib/CodeGen/CGExprScalar.cpp (revision 51026) > +++ lib/CodeGen/CGExprScalar.cpp (working copy) > @@ -125,6 +125,7 @@ > return EmitLoadOfLValue(E); > } > Value *VisitObjCMessageExpr(ObjCMessageExpr *E); > + Value *VisitObjCProtocolExpr(ObjCProtocolExpr *E); > Value *VisitObjCIvarRefExpr(ObjCIvarRefExpr *E) { return > EmitLoadOfLValue(E);} > Value *VisitArraySubscriptExpr(ArraySubscriptExpr *E); > Value *VisitMemberExpr(Expr *E) { return > EmitLoadOfLValue(E); } > @@ -298,6 +299,9 @@ > Value *VisitChooseExpr(ChooseExpr *CE); > Value *VisitOverloadExpr(OverloadExpr *OE); > Value *VisitVAArgExpr(VAArgExpr *VE); > + Value *VisitObjCSelectorExpr(const ObjCSelectorExpr *E) { > + return CGF.EmitObjCSelectorExpr(E); > + } > Value *VisitObjCStringLiteral(const ObjCStringLiteral *E) { > return CGF.EmitObjCStringLiteral(E); > } ok > @@ -374,6 +378,9 @@ > } > > if (isa(SrcType)) { > + if(CGF.isObjCPointerType(DstType)) { > + return Builder.CreateBitCast(Src, DstTy, "conv"); > + } ok > // Must be an ptr to int cast. > assert(isa(DstTy) && "not ptr->int?"); > return Builder.CreatePtrToInt(Src, DstTy, "conv"); > @@ -452,46 +459,7 @@ > return llvm::UndefValue::get(CGF.ConvertType(E->getType())); > } > > -Value *ScalarExprEmitter::VisitObjCMessageExpr(ObjCMessageExpr *E) { > - // Only the lookup mechanism and first two arguments of the method > - // implementation vary between runtimes. We can get the receiver > and > - // arguments in generic code. > - > - // Find the receiver > - llvm::Value *Receiver = CGF.EmitScalarExpr(E->getReceiver()); > > - // Process the arguments > - unsigned ArgC = E->getNumArgs(); > - llvm::SmallVector Args; > - for (unsigned i = 0; i != ArgC; ++i) { > - Expr *ArgExpr = E->getArg(i); > - QualType ArgTy = ArgExpr->getType(); > - if (!CGF.hasAggregateLLVMType(ArgTy)) { > - // Scalar argument is passed by-value. 
> - Args.push_back(CGF.EmitScalarExpr(ArgExpr)); > - } else if (ArgTy->isAnyComplexType()) { > - // Make a temporary alloca to pass the argument. > - llvm::Value *DestMem = > CGF.CreateTempAlloca(ConvertType(ArgTy)); > - CGF.EmitComplexExprIntoAddr(ArgExpr, DestMem, false); > - Args.push_back(DestMem); > - } else { > - llvm::Value *DestMem = > CGF.CreateTempAlloca(ConvertType(ArgTy)); > - CGF.EmitAggExpr(ArgExpr, DestMem, false); > - Args.push_back(DestMem); > - } > - } > - > - // Get the selector string > - std::string SelStr = E->getSelector().getName(); > - llvm::Constant *Selector = CGF.CGM.GetAddrOfConstantString(SelStr); > - > - llvm::Value *SelPtr = Builder.CreateStructGEP(Selector, 0); > - return Runtime->generateMessageSend(Builder, ConvertType(E- > >getType()), > - CGF.LoadObjCSelf(), > - Receiver, SelPtr, > - &Args[0], Args.size()); > -} > - I guess this is moved somewhere else ? > Value > *ScalarExprEmitter::VisitArraySubscriptExpr(ArraySubscriptExpr *E) { > // Emit subscript expressions in rvalue context's. For most > cases, this just > // loads the lvalue formed by the subscript expr. However, we > have to be > @@ -522,11 +490,11 @@ > // will not true when we add support for VLAs. > Value *V = EmitLValue(Op).getAddress(); // Bitfields can't be > arrays. > > - assert(isa(V->getType()) && > - isa(cast(V->getType()) > - ->getElementType()) && > - "Doesn't support VLAs yet!"); > - V = Builder.CreateStructGEP(V, 0, "arraydecay"); > + if (isa(V->getType()) && > + isa(cast(V->getType()) > + ->getElementType())) { > + V = Builder.CreateStructGEP(V, 0, "arraydecay"); > + } > > // The resultant pointer type can be implicitly casted to other > pointer > // types as well, for example void*. 
> @@ -660,6 +628,13 @@ > if (TypeToSize->isVoidType()) > return llvm::ConstantInt::get(llvm::APInt(ResultWidth, 1)); > > + // Get the size of VLAs > + if (TypeToSize->isVariablyModifiedType() && isSizeOf) { > + const VariableArrayType *VLA = TypeToSize- > >getAsVariableArrayType(); > + Value *ElementSize = EmitSizeAlignOf(VLA->getElementType(), > RetType, true); > + Value *Elements = CGF.EmitScalarExpr(VLA->getSizeExpr()); > + return Builder.CreateMul(Elements, ElementSize); > + } > /// FIXME: This doesn't handle VLAs yet! > std::pair Info = > CGF.getContext().getTypeInfo(TypeToSize); Please include these with VLA patch so that it is possible to review VLA support. > > @@ -908,6 +883,11 @@ > LHS, RHS, "cmp"); > } else { > // Signed integers and pointers. > + const llvm::Type *LHSTy = LHS->getType(); > + // Are we comparing pointers to different types? > + if (RHS->getType() != LHSTy) { > + RHS = Builder.CreateBitCast(RHS, LHSTy); createPointerCast ? > @@ -629,9 +637,11 @@ > > // Handle struct-return functions by passing a pointer to the > location that > // we would like to return into. > + int RealArgStart = 0; > if (hasAggregateLLVMType(ResultType)) { > // Create a temporary alloca to hold the result of the call. :( > Args.push_back(CreateTempAlloca(ConvertType(ResultType))); > + RealArgStart++; > // FIXME: set the stret attribute on the argument. > } > > @@ -640,7 +650,16 @@ > > if (!hasAggregateLLVMType(ArgTy)) { > // Scalar argument is passed by-value. > - Args.push_back(EmitScalarExpr(ArgExprs[i])); > + llvm::Value *ArgVal = EmitScalarExpr(ArgExprs[i]); > + const llvm::FunctionType *CalleeTy = cast( > + cast(Callee->getType())- > >getElementType()); > + if (i < CalleeTy->getNumParams()) { use assert instead of this check. Otherwise this part is OK. 
> +        const llvm::Type *ArgTy = CalleeTy->getParamType(i + RealArgStart);
> +        if (ArgTy && (ArgVal->getType() != ArgTy)) {
> +          ArgVal = Builder.CreateBitCast(ArgVal, ArgTy);
> +        }
> +      }
> +      Args.push_back(ArgVal);
>      } else if (ArgTy->isAnyComplexType()) {
>        // Make a temporary alloca to pass the argument.
>        llvm::Value *DestMem = CreateTempAlloca(ConvertType(ArgTy));

LLVM and clang's development style prefers small and incremental
changes. We are able to introduce big features and huge improvements
using this style. If you adhere to this style, it'll be easier
for you to make progress faster.

- Devang

From dpatel at apple.com  Fri May 16 12:40:37 2008
From: dpatel at apple.com (Devang Patel)
Date: Fri, 16 May 2008 10:40:37 -0700
Subject: [cfe-dev] function.diff [Re: Continuing Adventures with Objective-C]
In-Reply-To: <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
Message-ID: <90853F5A-ED3E-4FA0-BB38-849B1150B8F4@apple.com>

On May 13, 2008, at 9:22 AM, David Chisnall wrote:

> function.diff pulls shared code from CodeGenFunction::GenerateCode
> and GenerateObjCMethod out.

Yay! Pl. co-ordinate with Sanjiv who is adding debug info support and
updating this part in a recent patch.

> It also removes the method for generating ObjC methods from
> CodeGenFunction.

> -llvm::Value *CodeGenFunction::LoadObjCSelf(void)
> -{
> -  if(const ObjCMethodDecl *OMD = dyn_cast<ObjCMethodDecl>(CurFuncDecl)) {
> -    llvm::Value *SelfPtr = LocalDeclMap[&(*OMD->getSelfDecl())];
> -    return Builder.CreateLoad(SelfPtr, "self");
> -  }
> -  return NULL;
> -}

What about LoadObjCSelf uses ?

Otherwise, patch is OK. Thanks!
- Devang

From theraven at sucs.org  Fri May 16 12:44:05 2008
From: theraven at sucs.org (David Chisnall)
Date: Fri, 16 May 2008 18:44:05 +0100
Subject: [cfe-dev] Continuing Adventures with Objective-C
In-Reply-To: <36B8E273-5ED4-459F-B6A4-6283A2BB829B@apple.com>
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk> <36B8E273-5ED4-459F-B6A4-6283A2BB829B@apple.com>
Message-ID: <6F6BF83C-CC1E-457D-B5C2-4CEFF091A6CB@sucs.org>

On 16 May 2008, at 18:30, Devang Patel wrote:

> I guess this is moved somewhere else ?

I've moved the Objective-C-specific stuff into CGObjC.cpp. This
should make future ObjC-related diffs a bit easier to read.

> Please include these with VLA patch so that it is possible to review
> VLA support.

Sorry - copy-and-paste error.

>>
>> @@ -908,6 +883,11 @@
>>                                   LHS, RHS, "cmp");
>>      } else {
>>        // Signed integers and pointers.
>> +      const llvm::Type *LHSTy = LHS->getType();
>> +      // Are we comparing pointers to different types?
>> +      if (RHS->getType() != LHSTy) {
>> +        RHS = Builder.CreateBitCast(RHS, LHSTy);
>
> createPointerCast ?

Makes sense.

>> @@ -629,9 +637,11 @@
>>
>>    // Handle struct-return functions by passing a pointer to the
>>    // location that we would like to return into.
>> +  int RealArgStart = 0;
>>    if (hasAggregateLLVMType(ResultType)) {
>>      // Create a temporary alloca to hold the result of the call. :(
>>      Args.push_back(CreateTempAlloca(ConvertType(ResultType)));
>> +    RealArgStart++;
>>      // FIXME: set the stret attribute on the argument.
>>    }
>>
>> @@ -640,7 +650,16 @@
>>
>>      if (!hasAggregateLLVMType(ArgTy)) {
>>        // Scalar argument is passed by-value.
>> -      Args.push_back(EmitScalarExpr(ArgExprs[i]));
>> +      llvm::Value *ArgVal = EmitScalarExpr(ArgExprs[i]);
>> +      const llvm::FunctionType *CalleeTy = cast<llvm::FunctionType>(
>> +          cast<llvm::PointerType>(Callee->getType())->getElementType());
>> +      if (i < CalleeTy->getNumParams()) {
>
> use assert instead of this check. Otherwise this part is OK.
I don't believe this should be an assert - i can be greater than the
number of parameters in a variadic function and this is not an error.
Variadic parameters do not need implicit casting because they are not
type-checked.

>> +        const llvm::Type *ArgTy = CalleeTy->getParamType(i + RealArgStart);
>> +        if (ArgTy && (ArgVal->getType() != ArgTy)) {
>> +          ArgVal = Builder.CreateBitCast(ArgVal, ArgTy);
>> +        }
>> +      }
>> +      Args.push_back(ArgVal);
>>      } else if (ArgTy->isAnyComplexType()) {
>>        // Make a temporary alloca to pass the argument.
>>        llvm::Value *DestMem = CreateTempAlloca(ConvertType(ArgTy));
>
> LLVM and clang's development style prefers small and incremental
> changes. We are able to introduce big features and huge improvements
> using this style. If you adhere to this style then it'll be easier
> for you to make progress faster.

Sorry - I lost track of how big my out-of-tree changes were getting.

David

From csdavec at swansea.ac.uk  Fri May 16 12:46:16 2008
From: csdavec at swansea.ac.uk (David Chisnall)
Date: Fri, 16 May 2008 18:46:16 +0100
Subject: [cfe-dev] function.diff [Re: Continuing Adventures with Objective-C]
In-Reply-To: <90853F5A-ED3E-4FA0-BB38-849B1150B8F4@apple.com>
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk> <90853F5A-ED3E-4FA0-BB38-849B1150B8F4@apple.com>
Message-ID:

On 16 May 2008, at 18:40, Devang Patel wrote:

>
> On May 13, 2008, at 9:22 AM, David Chisnall wrote:
>
>> function.diff pulls shared code from CodeGenFunction::GenerateCode
>> and GenerateObjCMethod out.
>
> Yay! Pl. co-ordinate with Sanjiv who is adding debug info support
> and updating this part in recent patch.
>
>> It also removes the method for generating ObjC methods from
>> CodeGenFunction.
>
>> -llvm::Value *CodeGenFunction::LoadObjCSelf(void)
>> -{
>> -  if(const ObjCMethodDecl *OMD = dyn_cast<ObjCMethodDecl>(CurFuncDecl)) {
>> -    llvm::Value *SelfPtr = LocalDeclMap[&(*OMD->getSelfDecl())];
>> -    return Builder.CreateLoad(SelfPtr, "self");
>> -  }
>> -  return NULL;
>> -}
>
> What about LoadObjCSelf uses ?

This method is moved into CGObjC.cpp. (objc.diff)

David

From csdavec at swansea.ac.uk  Fri May 16 12:49:11 2008
From: csdavec at swansea.ac.uk (David Chisnall)
Date: Fri, 16 May 2008 18:49:11 +0100
Subject: [cfe-dev] union.diff [Re: Continuing Adventures with Objective-C]
In-Reply-To: <20D6CDFA-D2C9-4F7D-86D1-43FC98F00D28@apple.com>
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk> <20D6CDFA-D2C9-4F7D-86D1-43FC98F00D28@apple.com>
Message-ID: <21F07E51-37C6-4DFB-9F02-B1E68EB12B0B@swan.ac.uk>

On 15 May 2008, at 02:01, Devang Patel wrote:

> David,
>
> On May 13, 2008, at 9:22 AM, David Chisnall wrote:
>
>> union.diff adds support for GCC's cast-to-union extension, which is
>> used in a depressing number of places in the GNUstep code.
>
> It is a good idea to include a patch here to add the switch to
> enable this extension. I did not know about this extension until now!

I've not looked at the dialect options code at all yet. Is there an
existing flag I should test to see if we are in GNU-compatible mode?

> Do you need a codegen patch to complete this support ?

I think I accidentally put the codegen part of this in expr.diff.
(VisitCastExpr)

David

From csdavec at swansea.ac.uk  Fri May 16 12:53:12 2008
From: csdavec at swansea.ac.uk (David Chisnall)
Date: Fri, 16 May 2008 18:53:12 +0100
Subject: [cfe-dev] vla.diff [Re: Continuing Adventures with Objective-C]
In-Reply-To:
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
Message-ID:

On 15 May 2008, at 01:55, Devang Patel wrote:

> David,
>
> Is this patch sufficient to support VLAs ?
> Your patch removes an assert but somehow it is not visible in diffs.

No. I accidentally put the implementation of sizeof() for VLAs into a
different patch. I think the included patch now includes both parts.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: vla.diff
Type: application/octet-stream
Size: 2174 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080516/d935ffb0/attachment.obj
-------------- next part --------------

From eli.friedman at gmail.com  Fri May 16 13:15:47 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Fri, 16 May 2008 11:15:47 -0700
Subject: [cfe-dev] union.diff [Re: Continuing Adventures with Objective-C]
In-Reply-To: <21F07E51-37C6-4DFB-9F02-B1E68EB12B0B@swan.ac.uk>
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk> <20D6CDFA-D2C9-4F7D-86D1-43FC98F00D28@apple.com> <21F07E51-37C6-4DFB-9F02-B1E68EB12B0B@swan.ac.uk>
Message-ID:

On Fri, May 16, 2008 at 10:49 AM, David Chisnall wrote:
> On 15 May 2008, at 02:01, Devang Patel wrote:
>
>> David,
>>
>> On May 13, 2008, at 9:22 AM, David Chisnall wrote:
>>
>>> union.diff adds support for GCC's cast-to-union extension, which is
>>> used in a depressing number of places in the GNUstep code.
>>
>> It is a good idea to include a patch here to add the switch to
>> enable this extension. I did not know about this extension until now!.
>
> I've not looked at the dialect options code at all yet. Is there an
> existing flag I should test to see if we are in GNU-compatible mode?

Just make the diagnostic of type EXTENSION; those are automatically
hidden outside of strict compliance mode. clang is basically always
in GNU-compatible mode, at least at the moment.
-Eli

From eli.friedman at gmail.com  Fri May 16 13:21:23 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Fri, 16 May 2008 11:21:23 -0700
Subject: [cfe-dev] vla.diff [Re: Continuing Adventures with Objective-C]
In-Reply-To:
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
Message-ID:

On Fri, May 16, 2008 at 10:53 AM, David Chisnall wrote:
> On 15 May 2008, at 01:55, Devang Patel wrote:
>
>> David,
>>
>> Is this patch sufficient to support VLAs ? Your patch removes an assert
>> but somehow it is not visible in diffs.
>
> No. I accidentally put the implementation of sizeof() for VLAs into a
> different patch. I think the included patch now includes both parts.
>

I haven't looked at your patch closely, but you might want to take a
look at http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-February/001069.html,
and also the C99 standard for some of the edge cases, like typedefs.

-Eli

From eli.friedman at gmail.com  Fri May 16 13:22:03 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Fri, 16 May 2008 11:22:03 -0700
Subject: [cfe-dev] Possible fix for codegen bug with cond ? func1 : func2
In-Reply-To: <449438F7-C0B3-42C5-9EBD-C70E286C7142@apple.com>
References: <449438F7-C0B3-42C5-9EBD-C70E286C7142@apple.com>
Message-ID:

On Fri, May 16, 2008 at 9:23 AM, Devang Patel wrote:
> Your patch looks good. Pl. apply.
>

Thanks! Applied.
-Eli

From eli.friedman at gmail.com  Fri May 16 13:32:34 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Fri, 16 May 2008 11:32:34 -0700
Subject: [cfe-dev] vla.diff [Re: Continuing Adventures with Objective-C]
In-Reply-To: <387804B9-B23D-4100-AA9C-6CCE3637D688@swan.ac.uk>
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk> <387804B9-B23D-4100-AA9C-6CCE3637D688@swan.ac.uk>
Message-ID:

On Fri, May 16, 2008 at 11:26 AM, David Chisnall wrote:
> This looks more complete than mine - is there a reason it hasn't made it
> into trunk yet?

I didn't get around to finishing it; it needs some cleanup, and it
doesn't implement typedefs correctly.

-Eli

From clattner at apple.com  Fri May 16 14:18:14 2008
From: clattner at apple.com (Chris Lattner)
Date: Fri, 16 May 2008 12:18:14 -0700
Subject: [cfe-dev] Possible fix for codegen bug with cond ? func1 : func2
In-Reply-To:
References:
Message-ID: <22D07B10-8BEB-47C1-983A-6DD0DAE0CA46@apple.com>

On May 15, 2008, at 2:57 PM, Eli Friedman wrote:

> Potential patch attached.
>
> Testcase (currently crashes with clang -emit-llvm):
> int a();
> int b(int);
> int c() {return 1 ? a : b;}
>
> Anyone have any better suggestions for how to fix this?

Should sema be inserting an implicit cast to the common type?

-Chris

From eli.friedman at gmail.com  Fri May 16 14:22:12 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Fri, 16 May 2008 12:22:12 -0700
Subject: [cfe-dev] Possible fix for codegen bug with cond ? func1 : func2
In-Reply-To: <22D07B10-8BEB-47C1-983A-6DD0DAE0CA46@apple.com>
References: <22D07B10-8BEB-47C1-983A-6DD0DAE0CA46@apple.com>
Message-ID:

On Fri, May 16, 2008 at 12:18 PM, Chris Lattner wrote:
>> Testcase (currently crashes with clang -emit-llvm):
>> int a();
>> int b(int);
>> int c() {return 1 ? a : b;}
>
> Should sema be inserting an implicit cast to the common type?

I suppose that's a possibility; which way are users outside of codegen
likely to prefer?
-Eli

From clattner at apple.com  Fri May 16 14:26:34 2008
From: clattner at apple.com (Chris Lattner)
Date: Fri, 16 May 2008 12:26:34 -0700
Subject: [cfe-dev] Possible fix for codegen bug with cond ? func1 : func2
In-Reply-To:
References: <22D07B10-8BEB-47C1-983A-6DD0DAE0CA46@apple.com>
Message-ID:

On May 16, 2008, at 12:22 PM, Eli Friedman wrote:

> On Fri, May 16, 2008 at 12:18 PM, Chris Lattner wrote:
>>> Testcase (currently crashes with clang -emit-llvm):
>>> int a();
>>> int b(int);
>>> int c() {return 1 ? a : b;}
>>
>> Should sema be inserting an implicit cast to the common type?
>
> I suppose that's a possibility; which way are users outside of codegen
> likely to prefer?

It would be a nice invariant for the LHS/RHS of ?: to have the same
type as the result of the ?:

-Chris

From kremenek at apple.com  Fri May 16 14:30:24 2008
From: kremenek at apple.com (Ted Kremenek)
Date: Fri, 16 May 2008 12:30:24 -0700
Subject: [cfe-dev] Possible fix for codegen bug with cond ? func1 : func2
In-Reply-To:
References: <22D07B10-8BEB-47C1-983A-6DD0DAE0CA46@apple.com>
Message-ID: <009E6D41-37A5-4654-897D-A7E2F0832853@apple.com>

On May 16, 2008, at 12:22 PM, Eli Friedman wrote:

> On Fri, May 16, 2008 at 12:18 PM, Chris Lattner wrote:
>>> Testcase (currently crashes with clang -emit-llvm):
>>> int a();
>>> int b(int);
>>> int c() {return 1 ? a : b;}
>>
>> Should sema be inserting an implicit cast to the common type?
>
> I suppose that's a possibility; which way are users outside of codegen
> likely to prefer?

An implicit cast is preferable to other clients (the static analyzer
being one of them). It puts the logic of such type munging into a
single place, and makes the invariants of the ASTs more logically
consistent.
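
The invariant discussed above (both arms of ?: carrying the type of the
whole expression) mirrors how the language itself assigns one result
type to a conditional expression. A minimal C11 sketch, assuming a
compiler with _Generic support; the macro and function names are
hypothetical and not part of clang, and the sketch covers only the
arithmetic and void-pointer cases, not the function-designator testcase
from the thread:

```c
#include <assert.h>

/* Each macro reports whether the whole conditional expression has the
   named type. _Generic selects on the type only; the operand is never
   evaluated. */
#define IS_DOUBLE(e)   _Generic((e), double: 1, default: 0)
#define IS_VOID_PTR(e) _Generic((e), void *: 1, default: 0)

/* int and double arms: the usual arithmetic conversions make the
   result type double, even though the selected arm is an int. */
static int arithmetic_arms_yield_double(void) {
    return IS_DOUBLE(1 ? 1 : 2.0);
}

/* int* and void* arms: the result type is void*. */
static int pointer_arms_yield_void_ptr(void) {
    int value = 0;
    int *ip = &value;
    void *vp = &value;
    return IS_VOID_PTR(1 ? ip : vp);
}
```

An AST that stores an explicit implicit-cast node on each arm would let
a client read the converted type directly from the arm instead of
recomputing these rules.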

From mrs at apple.com  Fri May 16 15:23:46 2008
From: mrs at apple.com (Mike Stump)
Date: Fri, 16 May 2008 13:23:46 -0700
Subject: [cfe-dev] vla.diff [Re: Continuing Adventures with Objective-C]
In-Reply-To:
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
Message-ID:

On May 16, 2008, at 11:21 AM, Eli Friedman wrote:
> On Fri, May 16, 2008 at 10:53 AM, David Chisnall wrote:
>>
>> No. I accidentally put the implementation of sizeof() for VLAs
>> into a different patch. I think the included patch now includes
>> both parts.
>>
>
> I haven't looked at your patch closely, but you might want to take a
> look at http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-February/001069.html,
> and also the C99 standard for some of the edge cases, like typedefs.

vla-*.c from the gcc testsuite also contains some nice testcases.

From eli.friedman at gmail.com  Fri May 16 15:28:12 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Fri, 16 May 2008 13:28:12 -0700
Subject: [cfe-dev] Possible fix for codegen bug with cond ? func1 : func2
In-Reply-To: <009E6D41-37A5-4654-897D-A7E2F0832853@apple.com>
References: <22D07B10-8BEB-47C1-983A-6DD0DAE0CA46@apple.com> <009E6D41-37A5-4654-897D-A7E2F0832853@apple.com>
Message-ID:

On Fri, May 16, 2008 at 12:30 PM, Ted Kremenek wrote:
>>>> Testcase (currently crashes with clang -emit-llvm):
>>>> int a();
>>>> int b(int);
>>>> int c() {return 1 ? a : b;}
>>>
>>> Should sema be inserting an implicit cast to the common type?
>
> An implicit cast is preferable to other clients (the static analyzer being
> one of them). It puts the logic of such type munging into a single place,
> and makes the invariants of the ASTs more logically consistent.
>

Okay, then I'll back out my codegen patch and re-fix this in Sema.
-Eli

From dpatel at apple.com  Fri May 16 15:39:18 2008
From: dpatel at apple.com (Devang Patel)
Date: Fri, 16 May 2008 13:39:18 -0700
Subject: [cfe-dev] module.diff [Re: Continuing Adventures with Objective-C]
In-Reply-To: <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
References: <946BCDF2-56DA-45EC-9972-FF1CEFC0C1AE@apple.com> <1AC91899-5645-49F9-9830-8E85FF864493@swan.ac.uk>
Message-ID:

On May 13, 2008, at 9:22 AM, David Chisnall wrote:

> module.diff contains the runtime-agnostic code for generating
> module-level ObjC constructs (classes, categories and protocols). This
> depends on the new runtime interface since the old one did not have
> methods for doing any of this.

> +void CodeGenModule::EmitObjCProtocolImplementation(const ObjCProtocolDecl *PD){
> +  llvm::SmallVector Protocols;
> +  for(unsigned i=0 ; i<PD->getNumReferencedProtocols() ; i++)
> +  {
> +    Protocols.push_back(PD->getReferencedProtocols()[i]->getName());
> +  }

Preferred coding style here is

  for (unsigned i = 0, e = PD->getNumReferencedProtocols(); i != e; i++)
    Protocols.push_back(PD->getReferencedProtocols()[i]->getName());

> +  for(ObjCProtocolDecl::instmeth_iterator iter = PD->instmeth_begin() ;
> +      iter != PD->instmeth_end() ; iter++) {

  for(ObjCProtocolDecl::instmeth_iterator iter = PD->instmeth_begin(),
      iterEnd = PD->instmeth_end() ; iter != iterEnd ; iter++) {

This avoids invoking end() after each iteration.

- Devang

From eli.friedman at gmail.com  Fri May 16 16:20:08 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Fri, 16 May 2008 14:20:08 -0700
Subject: [cfe-dev] Constant expression checking rewrite
Message-ID:

The other day, I ran into an annoying false positive in the constant
expression-checking code; that code has actually been annoying me for
a while, so I decided that the code needed to be replaced with some
more accurate checking. Attached patch does this; it passes make
test, so it's at least mostly right.
It's also significantly stricter about what it allows through as a
constant expression, so it shouldn't let through expressions that
can't be codegen'ed properly. The implementation is complete in the
sense that it follows all the C99 rules for constant expressions.
I'll put together some additional tests to cover some of the new
cases I'm checking before I commit.

I haven't really carefully considered the diagnostics yet; we probably
want more than just the generic "expression isn't constant" error, but
I'm not sure exactly what. I've also marked some places that we might
want to warn in -pedantic mode with FIXMEs. That said, the section
about constant expressions in C99 is a bit messy, and I'm not sure
what warnings are necessary/useful.

If Expr::isConstantExpr in the sense that it is currently implemented
is useful for outside code to query, I can refactor the code to
calculate it. However, that will make things more complicated, and I
don't think we need it. Being a constant expression in the C99 sense
is not really an interesting property for any purpose I can think of.
Besides Sema, there are only a couple of other users in the current
codebase, and neither of them really wants precisely what
Expr::isConstantExpr returns.

-Eli
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: t.txt
Url: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080516/db35f83f/attachment-0001.txt

From kremenek at apple.com  Fri May 16 16:27:21 2008
From: kremenek at apple.com (Ted Kremenek)
Date: Fri, 16 May 2008 14:27:21 -0700
Subject: [cfe-dev] Constant expression checking rewrite
In-Reply-To:
References:
Message-ID:

On May 16, 2008, at 2:20 PM, Eli Friedman wrote:

> If Expr::isConstantExpr in the sense that it is currently implemented
> is useful for outside code to query, I can refactor the code to
> calculate it. However, that will make things more complicated, and I
> don't think we need it.
> Being a constant expression in the C99 sense
> is not really an interesting property for any purpose I can think of.
> Besides Sema, there are only a couple of other users in the current
> codebase, and neither of them really want precisely what
> Expr::isConstantExpr returns.

Hi Eli,

I'm not parsing the first couple sentences of this paragraph very
well. Can you please elaborate? I'm also not certain what you mean
by "Being a constant expression in the C99 sense is not really an
interesting property for any purpose I can think of."

It is very useful for isConstantExpr to evaluate the actual value of
the expression as an APSInt constant.

Ted

From eli.friedman at gmail.com  Fri May 16 16:43:56 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Fri, 16 May 2008 14:43:56 -0700
Subject: [cfe-dev] Constant expression checking rewrite
In-Reply-To:
References:
Message-ID:

On Fri, May 16, 2008 at 2:27 PM, Ted Kremenek wrote:
> It is very useful for isConstantExpr to evaluate the actual value of the
> expression as an APSInt constant.

I think you're confusing Expr::isConstantExpr and
Expr::isIntegerConstantExpr.

Okay, I'll try to rephrase the paragraph more simply: I don't think we
need Expr::isConstantExpr outside of Sema; the current users don't
really want it, and I can't think of any other interesting uses.

-Eli

From kremenek at apple.com  Fri May 16 17:25:47 2008
From: kremenek at apple.com (Ted Kremenek)
Date: Fri, 16 May 2008 15:25:47 -0700
Subject: [cfe-dev] Constant expression checking rewrite
In-Reply-To:
References:
Message-ID: <8083A3C2-4602-44EA-B31F-54CC01D36579@apple.com>

On May 16, 2008, at 2:43 PM, Eli Friedman wrote:

> On Fri, May 16, 2008 at 2:27 PM, Ted Kremenek wrote:
>> It is very useful for isConstantExpr to evaluate the actual value
>> of the expression as an APSInt constant.
>
> I think you're confusing Expr::isConstantExpr and
> Expr::isIntegerConstantExpr
>
> Okay, I'll try to rephrase the paragraph more simply: I don't think we
> need Expr::isConstantExpr outside of Sema; the current users don't
> really want it, and I can't think of any other interesting uses.
>
> -Eli
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

You have indeed identified my confusion. Thanks for the clarification!

One potential group of clients outside of Sema for Expr::isConstantExpr
is refactoring clients that wish to determine if a transformed AST is a
constant expression in the C99 sense. Even this isn't strictly needed;
transformed code can be checked by just running the parser + semantic
analyzer on it.

The clients of Expr::isConstantExpr that I see outside of Sema are the
DeadStores analysis and CGExprAgg. What interface do you suggest that
these clients use instead?

Ted

From akyrtzi at gmail.com  Fri May 16 17:47:00 2008
From: akyrtzi at gmail.com (Argiris Kirtzidis)
Date: Fri, 16 May 2008 15:47:00 -0700
Subject: [cfe-dev] [PATCH]: C++ decl classes for the AST
Message-ID: <482E0EE4.7080105@gmail.com>

Hi,

The attached patch introduces new Decl subclasses that accommodate C++
class members (there are changes only to the AST library):

-'CXXRecordDecl' (inherits from RecordDecl) is for C++
 struct/union/classes that are not simple C structs (i.e. they contain
 methods, nested types etc.)
-'ClassMember' serves as a base class for members of a CXXRecord. It
 provides the access specifier and the parent CXXRecord.
 Decls that inherit ClassMember:
    CXXField - for instance fields (inherits FieldDecl)
    CXXMethod - for static and instance methods (inherits FunctionDecl)
    NestedTypedef - for nested typedefs (inherits TypedefDecl)
    NestedRecordDecl - for nested struct/union/classes (inherits CXXRecordDecl)
    ClassVar - for static data members (inherits VarDecl)
-I also moved the 'Decl' implementation to a separate 'DeclBase.cpp' file

The instance fields of CXXRecord are stored in the members array of
RecordDecl, thus the data layout of CXXRecord is calculated through the
Record. All the other members (including the static fields), are
ScopedDecls with the CXXRecord as declaration context, so they can be
iterated through a general DeclContext member iterator (not implemented
yet). Name lookup for class members will be efficient through the use
of the IdentifierResolver.

-Argiris
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ast-cxxdecl.patch
Type: text/x-diff
Size: 55566 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080516/ae8653fe/attachment-0001.bin

From eli.friedman at gmail.com  Fri May 16 21:26:49 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Fri, 16 May 2008 19:26:49 -0700
Subject: [cfe-dev] Constant expression checking rewrite
In-Reply-To: <8083A3C2-4602-44EA-B31F-54CC01D36579@apple.com>
References: <8083A3C2-4602-44EA-B31F-54CC01D36579@apple.com>
Message-ID:

On Fri, May 16, 2008 at 3:25 PM, Ted Kremenek wrote:
> The clients of Expr::isConstantExpr that I see outside of Sema are the
> DeadStores analysis and CGExprAgg. What interface do you suggest that these
> clients use instead?

CGExprAgg isn't really using it (it's an if around a FIXME), and the
code that might go there should really be checking the LLVM values,
not the AST value.
I'm not exactly sure why the DeadStores pass cares if the value is
constant; maybe it should just be checking for zero and other common
"dummy" initializations?

-Eli

From mymlreader at gmail.com  Fri May 16 21:35:33 2008
From: mymlreader at gmail.com (Zhongxing Xu)
Date: Sat, 17 May 2008 10:35:33 +0800
Subject: [cfe-dev] How to know which edge the path is traversing?
In-Reply-To: <140B3744-F7C1-4335-8823-5F52AA26A5C4@apple.com>
References: <4619993f0805160732s59f465b5t28ad21d23873c743@mail.gmail.com> <140B3744-F7C1-4335-8823-5F52AA26A5C4@apple.com>
Message-ID: <4619993f0805161935q5ade7f73pacaefb4ffeb5ce4d@mail.gmail.com>

Thank you!

On Sat, May 17, 2008 at 12:10 AM, Ted Kremenek wrote:

>
> On May 16, 2008, at 7:32 AM, Zhongxing Xu wrote:
>
>> I use GRCoreEngine to do a path sensitive analysis. For example, I can get
>> two paths for the program below:
>>
>> int f(int n) {
>>   if (n > 0)
>>     ...
>>   else
>>     ...
>> }
>>
>> I can use the nodes in EndNodes to get these two paths (by backtracking
>> from endnodes).
>> There are two BlockEdgeDst nodes after the block containing the IfStmt "if
>> (n>0)".
>> How can I know which path is led by the condition n > 0 or n <= 0? That
>> is, how can I know which is the "true"/"false" branch edge?
>>
>
> Hi Zhongxing,
>
> You will want to inspect the successor CFGBlocks of the "source" in the
> BlockEdge, and compare them against the destination. For CFGBlocks whose
> terminator is a branch (if statements, loops, etc), the first successor
> block is the true branch, and the second successor block is the false
> branch. For example:
>
>
> ProgramPoint P = N->getLocation();  // N is an ExplodedNode<...>
>
> if (BlockEdge* BE = dyn_cast<BlockEdge>(&P)) {
>
>   CFGBlock* Src = BE->getSrc();
>   CFGBlock* Dst = BE->getDst();
>
>   // Test if we are at a (binary) branch.
>   if (Src->hasBinaryBranchTerminator()) {
>
>     if (*Src->succ_begin() == Dst) {
>       // We took the true branch.
>     }
>     else {
>       assert (*(Src->succ_begin()+1) == Dst);
>       // We took the false branch.
>     }
>   }
> }
>
> Note that "hasBinaryBranchTerminator" only returns true for terminators
> that are "ForStmt", "WhileStmt", "DoStmt", "IfStmt", "ChooseExpr",
> "ConditionalOperator", and "BinaryOperator" (for '&&' and '||').
> IndirectGotoStmt and SwitchStmt work differently. IndirectGotoStmt always
> branches to a special block where the actual indirect goto takes place (do a
> CFG dump of code with a labeled goto to see what I mean). Blocks that have
> a SwitchStmt terminator have as their successor blocks the targets of the
> switch. In that case, each successor block should have "getLabel()" return
> a SwitchStmt, with the exception of the last successor. The last successor
> is always the "default" branch, which may be explicit (with a "default:"
> label) or implicit (in the case of fall-through to the code after the switch
> block).
>
> BTW, "hasBinaryBranchTerminator" was a recently added predicate method.
> Most code that inspects terminators actually uses a switch statement on the
> statement class of the terminator to handle both binary branches and other
> terminator types.
>
> Ted
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080517/b0ed5eee/attachment.html

From kremenek at apple.com  Sat May 17 23:30:38 2008
From: kremenek at apple.com (Ted Kremenek)
Date: Sat, 17 May 2008 21:30:38 -0700
Subject: [cfe-dev] Constant expression checking rewrite
In-Reply-To:
References: <8083A3C2-4602-44EA-B31F-54CC01D36579@apple.com>
Message-ID:

On May 16, 2008, at 7:26 PM, Eli Friedman wrote:

> I'm not exactly sure why the DeadStores pass cares if the value is
> constant; maybe it should just be checking for zero and other common
> "dummy" initializations?

I found that defining the scope of dummy initializations is not really
tractable. Simply checking for a constant assignment was 99% accurate
in pruning out dead stores resulting from defensive programming.

Aside from the dead store checker, I can see other cases in the static
analyzer (and other clients such as refactoring) where knowing whether
or not an expression is a constant expression is very useful.

From kremenek at apple.com  Sat May 17 23:44:36 2008
From: kremenek at apple.com (Ted Kremenek)
Date: Sat, 17 May 2008 21:44:36 -0700
Subject: [cfe-dev] Constant expression checking rewrite
In-Reply-To:
References: <8083A3C2-4602-44EA-B31F-54CC01D36579@apple.com>
Message-ID: <4F1E8ACD-49D2-448B-B024-F896956FE632@apple.com>

On May 17, 2008, at 9:30 PM, Ted Kremenek wrote:

> On May 16, 2008, at 7:26 PM, Eli Friedman wrote:
>
>> I'm not exactly sure why the DeadStores pass cares if the value is
>> constant; maybe it should just be checking for zero and other common
>> "dummy" initializations?
>
> I found that defining the scope of dummy initializations is not really
> tractable. Simply checking for a constant assignment was 99% accurate
> in pruning out dead stores resulting from defensive programming.
>
> Aside from the dead store checker, I can see other cases in the static
> analyzer (and other clients such as refactoring) where knowing whether
> or not an expression is a constant expression is very useful.

I apologize for digressing in this thread, so I want to return to the
original point.

I think having Sema more accurately reflect what is a C99 constant
expression is a good thing (as done by your patch). My question now is
what is the harm in Expr::isConstantExpr? This hasn't been made clear
in this thread. Is the name just misleading? Does it return true when
something isn't a constant expression? I don't have a problem
removing it, but I'm trying to understand what the problem is,
especially since there are clients of it outside of Sema.

For the clients outside of Sema (including potential future clients
who want to do similar queries on expressions), I want to understand
what is fundamentally wrong or limited about Expr::isConstantExpr as
it is now.
I understand that there is an issue of cleanly implementing the
checking of constant expressions in Sema so that we can report
diagnostics (as done in your patch), and that keeping isConstantExpr
in Expr makes this a little more difficult to do cleanly. Is this the
main issue?

From eli.friedman at gmail.com  Sun May 18 12:09:49 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Sun, 18 May 2008 10:09:49 -0700
Subject: [cfe-dev] Constant expression checking rewrite
In-Reply-To:
References: <8083A3C2-4602-44EA-B31F-54CC01D36579@apple.com>
Message-ID:

On Sat, May 17, 2008 at 9:44 PM, Ted Kremenek wrote:
> I understand that there is an issue of cleanly implementing the checking of
> constant expressions in Sema so that we can report diagnostics (as done in
> your patch), and that keeping isConstantExpr in Expr makes this a little
> more difficult to do cleanly. Is this the main issue?

Essentially, yes; that's the biggest issue. Implementing this stuff
outside of Sema would require a complicated return convention for
reasonable diagnostics. It might end up being the best approach, but
it's a lot more complicated.

Also, there are really a few different kinds of constant expressions.
First, there are integer constant expressions, which are precisely
defined in the C99 standard to be essentially integers and integer
arithmetic. Then, there are general constant expressions, also
defined in the C99 standard, which are essentially either arithmetic
constants (computed with integer/floating-point arithmetic), or an
easily computable address (the details are slightly more complicated).
This is what my patch implements. Then, there is what
Expr::isConstantExpr returns, which isn't exactly clear; it's
currently buggy, but it apparently tries to compute whether an
expression is constant in the sense that it doesn't depend on the
values of any variables/globals.
This could potentially be useful, but it's not acceptable for the Sema checking; we cannot accept all expressions in this category as constant expressions because some of them might not be computable by the linker. One issue with my patch I haven't mentioned: it doesn't try to check whether an expression has a defined result when doing constant-expression checking. Unfortunately, this will significantly complicate the code, and I'm not sure what the best approach is yet (maybe leaving it to an analysis pass would be best). I wouldn't normally worry about that sort of thing, but division by zero in a global currently crashes llc, so it would be best if we emitted a hard error. -Eli From eli.friedman at gmail.com Sun May 18 13:04:54 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Sun, 18 May 2008 11:04:54 -0700 Subject: [cfe-dev] Implementation of stddef.h In-Reply-To: References: Message-ID: Should I take the fact that there have been no comments as meaning the patch is bad, the patch is okay, or just that nobody with the appropriate expertise had the time to take a look at it? -Eli On Wed, May 14, 2008 at 5:04 PM, Eli Friedman wrote: > Per subject, implementation of stddef.h. > > This implementation is correct per C99, but glibc does some funny > stuff including certain headers, so I'm not sure if this is completely > correct. > > -Eli > From eli.friedman at gmail.com Sun May 18 14:09:31 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Sun, 18 May 2008 12:09:31 -0700 Subject: [cfe-dev] Possible fix for issue with computation type for compound assignment Message-ID: Take the following C snippet: void a(unsigned char* a, unsigned b) {*a <<= b;} clang currently compiles this down to a shift of width i8. This is incorrect; per the C standard, the shift should occur in the type int. 
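An editorial sketch of the promotion rule described above, assuming a platform with 8-bit unsigned char and an int wider than 8 bits. The function names are illustrative and do not come from the patch. The compound assignment `*a <<= b` behaves like `*a = (unsigned char)((int)*a << b)`, so a shift count of 8 is well defined in C even though it equals the width of the stored type.

```c
/* Compound shift as written in the snippet above. */
void shl_compound(unsigned char *a, unsigned b) { *a <<= b; }

/* The same operation spelled out: promote to int, shift, then truncate. */
unsigned char shl_explicit(unsigned char a, unsigned b) {
    int promoted = a;              /* integer promotion of the left operand */
    int shifted = promoted << b;   /* the shift happens at int width */
    return (unsigned char)shifted; /* conversion back to the stored type */
}

/* Returns 1 when both spellings agree for the given inputs. */
int shl_agrees(unsigned char value, unsigned count) {
    unsigned char x = value;
    shl_compound(&x, count);
    return x == shl_explicit(value, count);
}
```

Calling `shl_explicit(1, 8)` computes 256 at int width and then truncates to 0, whereas an 8-bit shift by 8 has no defined result in LLVM.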
The difference doesn't really matter for codegen of a lot of operations: for example, if a and b are of type unsigned char, a+b can be computed in the width of unsigned char without affecting the result. However, it does matter for some operations: in LLVM, shifts by an amount greater than or equal to the width of the left operand are undefined. Another case where this matters: void a(signed char* a, signed char b) {*a /= b;} In this case, if *a is -128 and b is -1, the correct result (using clang's definition of signed integer conversion) is -128; however, using the LLVM sdiv operator on i8, the result is undefined (and actually crashes on X86). Attached patch fixes Sema to return the correct computation type for these cases. I'm not completely confident my fix is the right way to fix this, though. -Eli -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: tt.txt Url: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080518/b79181eb/attachment.txt From neil at daikokuya.co.uk Sun May 18 16:47:35 2008 From: neil at daikokuya.co.uk (Neil Booth) Date: Mon, 19 May 2008 06:47:35 +0900 Subject: [cfe-dev] Implementation of stddef.h In-Reply-To: References: Message-ID: <20080518214735.GJ23450@daikokuya.co.uk> Eli Friedman wrote:- > Should I take the fact that there have been no comments as meaning the > patch is bad, the patch is okay, or just that nobody with the > appropriate expertise had the time to take a look at it? My only concern would be that for typedef __typeof__(((int*)0)-((int*)0)) ptrdiff_t; I don't think subtracting NULL pointers is well-defined. That doesn't matter as long as clang doesn't complain about it and does what you expect though. Neil.
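For what it is worth, the operand of `__typeof__` is not evaluated (outside of variably modified types), so the typedef quoted above only asks the compiler for the type of the subtraction and never performs it. A small sketch, assuming a compiler with the GNU `__typeof__` extension (gcc or clang); the `my_` names and helper functions are made up here to avoid clashing with `<stddef.h>`:

```c
#include <stddef.h>

/* Derive the types from expressions; nothing here is evaluated at run time. */
typedef __typeof__(((int *)0) - ((int *)0)) my_ptrdiff_t;
typedef __typeof__(sizeof(int)) my_size_t;

/* Returns 1 if the derived types have the same size as the standard ones. */
int sizes_match(void) {
    return sizeof(my_ptrdiff_t) == sizeof(ptrdiff_t) &&
           sizeof(my_size_t) == sizeof(size_t);
}

/* Ordinary pointer subtraction within one array yields a my_ptrdiff_t. */
my_ptrdiff_t distance(const int *first, const int *last) {
    return last - first;
}

/* Subtract two pointers into the same array, which is always well defined. */
int demo_distance(void) {
    static int arr[5];
    return (int)distance(&arr[1], &arr[4]);
}
```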
From neil at daikokuya.co.uk Sun May 18 17:11:38 2008 From: neil at daikokuya.co.uk (Neil Booth) Date: Mon, 19 May 2008 07:11:38 +0900 Subject: [cfe-dev] Implementation of stddef.h In-Reply-To: References: Message-ID: <20080518221138.GK23450@daikokuya.co.uk> Eli Friedman wrote:- > Should I take the fact that there have been no comments as meaning the > patch is bad, the patch is okay, or just that nobody with the > appropriate expertise had the time to take a look at it? One more thing - since these macros use __typeof__, are we ever going to diagnose that as an extension? If we do that might cause a problem for user code, unless our diagnostic handlers can weasel through the macro expansion level(s) and figure out it was actually defined in a system header. Neil. From eli.friedman at gmail.com Sun May 18 17:23:29 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Sun, 18 May 2008 15:23:29 -0700 Subject: [cfe-dev] Implementation of stddef.h In-Reply-To: <20080518221138.GK23450@daikokuya.co.uk> References: <20080518221138.GK23450@daikokuya.co.uk> Message-ID: On Sun, May 18, 2008 at 3:11 PM, Neil Booth wrote: > Eli Friedman wrote:- > >> Should I take the fact that there have been no comments as meaning the >> patch is bad, the patch is okay, or just that nobody with the >> appropriate expertise had the time to take a look at it? > > One more thing - since these macros use __typeof__, are we ever > going to diagnose that as an extension? If we do that might cause > a problem for user code, unless our diagnostic handlers can weasel > through the macro expansion level(s) and figure out it was actually > defined in a system header. AFAIK, we already suppress warnings in system headers. And, at least for the moment, clang's implementation of -pedantic follows gcc's version in that we don't warn about the use of anything prefixed with double-underscore. 
(Since it's reserved by the standard, it's not a constraint violation to use it, so we're conforming even if we don't warn.) -Eli From eli.friedman at gmail.com Sun May 18 17:28:22 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Sun, 18 May 2008 15:28:22 -0700 Subject: [cfe-dev] Implementation of stddef.h In-Reply-To: <20080518214735.GJ23450@daikokuya.co.uk> References: <20080518214735.GJ23450@daikokuya.co.uk> Message-ID: On Sun, May 18, 2008 at 2:47 PM, Neil Booth wrote: > Eli Friedman wrote:- > >> Should I take the fact that there have been no comments as meaning the >> patch is bad, the patch is okay, or just that nobody with the >> appropriate expertise had the time to take a look at it? > > My only concern would be that for > > typedef __typeof__(((int*)0)-((int*)0)) ptrdiff_t; > > I don't think subtracting NULL pointers is well-defined. That doesn't > matter as long as clang doesn't complain about it and does what you > expect though. clang doesn't complain about it, and appears to work properly. On a side note, the C++ standard actually explicitly defines that ((int*)0)-((int*)0) is valid and compares equal to 0. -Eli From eli.friedman at gmail.com Sun May 18 17:55:16 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Sun, 18 May 2008 15:55:16 -0700 Subject: [cfe-dev] Cleanup/Bugfixing in SemaInit.cpp Message-ID: Per description, patch attached. Mostly small changes: a few trivial bugfixes, and some refactoring. This patch fixes all the test failures except for one minor warning change. The one non-obvious change is the change to CheckImplicitInitList, doing the typechecking before constructing the implicit init list. As I state in a comment, the reason is that we can't know how many elements we need to add to the implicit init list until we've typechecked the children. I think this is the most reasonable way for the code to work correctly in its current form.
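To illustrate why the element count is not known up front, here is a small C sketch of brace elision (the struct and function names are invented for this note and are not taken from SemaInit.cpp). In `outer_flat` the first two initializers are consumed by the nested struct `p` and the third goes to `z`; how many initializers an implicit sub-list swallows depends on the type of the subobject being initialized.

```c
struct point { int x, y; };
struct outer { struct point p; int z; };

/* Braces around the nested struct are elided: 1 and 2 initialize p, 3 initializes z. */
static const struct outer outer_flat = { 1, 2, 3 };

/* The fully braced equivalent. */
static const struct outer outer_braced = { { 1, 2 }, 3 };

/* Brace elision across array rows: the flat list fills row 0, then row 1. */
static const int grid[2][2] = { 1, 2, 3, 4 };

/* Returns 1 when the elided and fully braced forms initialize identically. */
int flat_matches_braced(void) {
    return outer_flat.p.x == outer_braced.p.x &&
           outer_flat.p.y == outer_braced.p.y &&
           outer_flat.z == outer_braced.z;
}

int grid_at(int row, int col) { return grid[row][col]; }
```

Compilers may warn about the missing braces, but both forms are valid C and produce the same objects.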
I won't commit any patches to SemaInit.cpp without snaroff's approval, since I don't want to step on his toes. I can also split up the patch, if that would make it easier to review. -Eli -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ttt.txt Url: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080518/e1cf1b04/attachment-0001.txt From kremenek at apple.com Sun May 18 21:45:57 2008 From: kremenek at apple.com (Ted Kremenek) Date: Sun, 18 May 2008 19:45:57 -0700 Subject: [cfe-dev] Constant expression checking rewrite In-Reply-To: References: <8083A3C2-4602-44EA-B31F-54CC01D36579@apple.com> Message-ID: <89216E18-C85E-4193-AC74-DF5F10F23237@apple.com> On May 18, 2008, at 10:09 AM, Eli Friedman wrote: > Essentially, yes; that's the biggest issue. Implementing this stuff > outside of Sema would require a complicated return convention for > reasonable diagnostics. It might end up being the best approach, but > it's a lot more complicated. Understood. The patch as you have also may just be an intermediate design to a final solution, as there is also the issue of your last point about undefined operations) where one may need to (partially) compute constant values (see below). > Also, there's really a few different kinds of constant expressions. > First, there are integer constant expressions, which are precisely > defined in the C99 standard to be essentially integers and integer > arithmetic. > > Then, there are general constant expressions, also defined in the C99 > standard, which are essentially either arithmetic constants (computed > with integer/floating-point arithmetic), or an easily computable > address (the details are slightly more complicated). This is what my > patch implements. 
> > Then, there is what Expr::isConstantExpr returns, which isn't exactly > clear; it's currently buggy, but it apparently tries to compute > whether an expression is constant in the sense that it doesn't depend > on the values of any variables/globals. This could potentially be > useful, but it's not acceptable for the Sema checking; we cannot > accept all expressions in this category as constant expressions > because some of them might not be computable by the linker. That is an extremely lucid answer. Thank you! > One issue with my patch I haven't mentioned: it doesn't try to check > whether an expression has a defined result when doing > constant-expression checking. Unfortunately, this will significantly > complicate the code, and I'm not sure what the best approach is yet > (maybe leaving it to an analysis pass would be best). I wouldn't > normally worry about that sort of thing, but division by zero in a > global currently crashes llc, so it would be best if we emitted a hard > error. Forgive my naivete, could Expr::isIntegerConstantExpr be used for some of these purposes? It seems like there are cases where you actually want to compute the constant value to do such checking, and this might be a valid approach (although I can see where it would be incomplete). I'm not certain if it would satisfy all the restrictions of a C99 constant expression. Alternatively, it does seem to me that to catch such cases in general one would need to recursively evaluate the subexpressions, compute the actual constants (when necessary), and see in what cases where an undefined result could occur (e.g., divide-by-zero, shift by too many bits). The checking done in your patch is written in a recursive fashion; it seems like the evaluation of the constant values could be done in this way also. 
I think the main complication it would add to the code is that it would create several more recursive methods to do the checking (e.g., one for checking integer binary operations, floating point binary operations, pointer arithmetic, etc., instead of just having a single case for BinaryOperator). Such logic would not be all that different from what is going on in the static analysis engine when evaluating constant values along paths. From eli.friedman at gmail.com Mon May 19 02:58:25 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Mon, 19 May 2008 00:58:25 -0700 Subject: [cfe-dev] Constant expression checking rewrite In-Reply-To: <89216E18-C85E-4193-AC74-DF5F10F23237@apple.com> References: <8083A3C2-4602-44EA-B31F-54CC01D36579@apple.com> <89216E18-C85E-4193-AC74-DF5F10F23237@apple.com> Message-ID: On Sun, May 18, 2008 at 7:45 PM, Ted Kremenek wrote: > Forgive my naivete, could Expr::isIntegerConstantExpr be used for some of > these purposes? It seems like there are cases where you actually want to > compute the constant value to do such checking, and this might be a valid > approach (although I can see where it would be incomplete). I'm not certain > if it would satisfy all the restrictions of a C99 constant expression. isIntegerConstantExpr doesn't accept general arithmetic expressions; only integer arithmetic is allowed. -Eli From mrs at apple.com Mon May 19 12:30:19 2008 From: mrs at apple.com (Mike Stump) Date: Mon, 19 May 2008 10:30:19 -0700 Subject: [cfe-dev] Implementation of stddef.h In-Reply-To: References: Message-ID: <5B1D136E-8ABC-48DB-AD9B-62640FC33854@apple.com> On May 18, 2008, at 11:04 AM, Eli Friedman wrote: > Should I take the fact that there have been no comments as meaning the > patch is bad, the patch is okay, or just that nobody with the > appropriate expertise had the time to take a look at it? I looked at it, seemed reasonable... 
From snaroff at apple.com Mon May 19 12:36:48 2008 From: snaroff at apple.com (Steve Naroff) Date: Mon, 19 May 2008 10:36:48 -0700 Subject: [cfe-dev] Cleanup/Bugfixing in SemaInit.cpp In-Reply-To: References: Message-ID: On May 18, 2008, at 3:55 PM, Eli Friedman wrote: > Per desccription, patch attached. Mostly small changes: a few trivial > bugfixes, and some refactoring. This patch fixes all the test > failures except for one minor warning change. > Excellent. > The one non-obvious change is the change to CheckImplicitInitList, > doing the typechecking before constructing the implicit init list. As > I state in a comment, the reason is that we can't know how many > elements we need to add to the implicit init list until we've > typechecked the children. I think this is the most reasonable way for > the code to work correctly in its current form. > Agreed. I had a feeling we'd need to separate type checking from implicit init list construction. > I won't commit any patches to SemaInit.cpp without snaroff's approval, > since I don't want to step on his toes. I can also split up the > patch, if that would make it easier to review. > No worries. Since you/I worked on CheckInitializerListTypes, it's really good to have you review/enhance CheckInitList. I find this part of the C language very tricky. fyi...I'm in the process of moving my family and Apple's World Wide Developers conference is coming up soon (as a result, CheckInitList hasn't received my attention for a couple weeks now). I noticed this patch doesn't enable any of these changes yet (i.e. you didn't change Sema::CheckInitializerTypes()). Was this intentional? I haven't reviewed the patch closely, however I think it moves the ball forward. Please commit (and we can iterate if necessary...). Thanks again - I greatly appreciate the work you are doing! 
snaroff > -Eli > _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev From natebegeman at mac.com Mon May 19 13:02:45 2008 From: natebegeman at mac.com (Nate Begeman) Date: Mon, 19 May 2008 11:02:45 -0700 Subject: [cfe-dev] __builtin_nan support Message-ID: Attached patch implements __builtin_nan, __builtin_nanf, and __builtin_nanl. When given a string literal as an argument, the expression becomes an FP literal that is a quiet NaN with the appropriate significand. When the argument is not a string literal, we let the code for handling builtin math library functions just turn the builtin into a call to the appropriate libm nan()/nanf()/nanl() call, which performs the same conversion. This behavior is similar to GCC, although I believe GCC's __builtin_nan does not call nan() if the argument is not a string literal. -------------- next part -------------- A non-text attachment was scrubbed... Name: builtin_nan.patch Type: application/octet-stream Size: 3745 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080519/e8ca4c07/attachment.obj -------------- next part -------------- From eli.friedman at gmail.com Mon May 19 13:12:28 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Mon, 19 May 2008 11:12:28 -0700 Subject: [cfe-dev] Cleanup/Bugfixing in SemaInit.cpp In-Reply-To: References: Message-ID: On Mon, May 19, 2008 at 10:36 AM, Steve Naroff wrote: > No worries. Since you/I worked on CheckInitializerListTypes, it's really > good to have you review/enhance CheckInitList. I find this part of the C > language very tricky. fyi...I'm in the process of moving my family and > Apple's World Wide Developers conference is coming up soon (as a result, > CheckInitList hasn't received my attention for a couple weeks now). Okay. > I noticed this patch doesn't enable any of these changes yet (i.e. 
you > didn't change Sema::CheckInitializerTypes()). Was this intentional? The enabling patch is trivial; I'll do it in a separate commit. > I haven't reviewed the patch closely, however I think it moves the ball > forward. Please commit (and we can iterate if necessary...). Okay, cool. > Thanks again - I greatly appreciate the work you are doing! Thanks. -Eli From eli.friedman at gmail.com Mon May 19 14:06:09 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Mon, 19 May 2008 12:06:09 -0700 Subject: [cfe-dev] __builtin_nan support In-Reply-To: References: Message-ID: On Mon, May 19, 2008 at 11:02 AM, Nate Begeman wrote: > Attached patch implements __builtin_nan, __builtin_nanf, and __builtin_nanl. @@ -58,6 +57,10 @@ if (SemaBuiltinUnorderedCompare(TheCall.get())) return true; return TheCall.take(); + case Builtin::BI__builtin_nan: + case Builtin::BI__builtin_nanf: + case Builtin::BI__builtin_nanl: + return SemaBuiltinNaN(FnInfo->getBuiltinID(), TheCall.get()); case Builtin::BI__builtin_shufflevector: return SemaBuiltinShuffleVector(TheCall.get()); } This is unsafe; in the case where SemaBuiltinNaN doesn't find a string literal, you end up with a dangling pointer. + llvm::APInt Val(64, 0x7ff8000000000000ULL, false); + + char *endp = 0; + uint64_t Significand = strtoull(data, &endp, 0); Windows doesn't have strtoull. -Eli From natebegeman at mac.com Mon May 19 14:29:36 2008 From: natebegeman at mac.com (Nate Begeman) Date: Mon, 19 May 2008 12:29:36 -0700 Subject: [cfe-dev] __builtin_nan support In-Reply-To: References: Message-ID: On May 19, 2008, at 12:06 PM, Eli Friedman wrote: > On Mon, May 19, 2008 at 11:02 AM, Nate Begeman > wrote: >> Attached patch implements __builtin_nan, __builtin_nanf, and >> __builtin_nanl. 
> > @@ -58,6 +57,10 @@ > if (SemaBuiltinUnorderedCompare(TheCall.get())) > return true; > return TheCall.take(); > + case Builtin::BI__builtin_nan: > + case Builtin::BI__builtin_nanf: > + case Builtin::BI__builtin_nanl: > + return SemaBuiltinNaN(FnInfo->getBuiltinID(), TheCall.get()); > case Builtin::BI__builtin_shufflevector: > return SemaBuiltinShuffleVector(TheCall.get()); > } > > This is unsafe; in the case where SemaBuiltinNaN doesn't find a string > literal, you end up with a dangling pointer. Sure enough, need a take vs. get. > > > + llvm::APInt Val(64, 0x7ff8000000000000ULL, false); > + > + char *endp = 0; > + uint64_t Significand = strtoull(data, &endp, 0); > > Windows doesn't have strtoull. What's the suggested replacement? Nate From neil at daikokuya.co.uk Mon May 19 17:27:30 2008 From: neil at daikokuya.co.uk (Neil Booth) Date: Tue, 20 May 2008 07:27:30 +0900 Subject: [cfe-dev] __builtin_nan support In-Reply-To: References: Message-ID: <20080519222730.GL23450@daikokuya.co.uk> Nate Begeman wrote:- > > > > > > + llvm::APInt Val(64, 0x7ff8000000000000ULL, false); > > + > > + char *endp = 0; > > + uint64_t Significand = strtoull(data, &endp, 0); > > > > Windows doesn't have strtoull. > > What's the suggested replacement? I suggest you make a constructor in APFloat to take this kind of input, rather than indirecting through APInt. It can already convert decimals very efficiently (used for decimal->binary) and then you nail the long-double case too, without strtoull and friends. Neil.
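As a rough illustration of what the quoted patch computes for the double case, a quiet NaN can be assembled from the 0x7ff8000000000000 pattern plus a payload. This sketch assumes IEEE-754 binary64 doubles; the helper names are hypothetical, and it takes an already-parsed integer rather than a string, so it sidesteps the strtoull portability question rather than answering it.

```c
#include <stdint.h>
#include <string.h>

#define QNAN_BITS    UINT64_C(0x7ff8000000000000) /* exponent all ones, quiet bit set */
#define PAYLOAD_MASK UINT64_C(0x0007ffffffffffff) /* remaining 51 significand bits */

/* Build a quiet NaN carrying the given payload (excess bits are dropped). */
double make_qnan(uint64_t payload) {
    uint64_t bits = QNAN_BITS | (payload & PAYLOAD_MASK);
    double d;
    memcpy(&d, &bits, sizeof d); /* reinterpret the bits without aliasing issues */
    return d;
}

/* Recover the payload from a NaN built by make_qnan. */
uint64_t qnan_payload(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits & PAYLOAD_MASK;
}

/* A NaN is the only value that compares unequal to itself. */
int is_nan(double d) { return d != d; }
```

The float and long double variants would need their own bit layouts, which is part of why a single APFloat entry point was suggested in the thread.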
From clattner at apple.com Tue May 20 18:41:04 2008 From: clattner at apple.com (Chris Lattner) Date: Tue, 20 May 2008 16:41:04 -0700 Subject: [cfe-dev] Implementation of stddef.h In-Reply-To: References: <20080518221138.GK23450@daikokuya.co.uk> Message-ID: On May 18, 2008, at 3:23 PM, Eli Friedman wrote: > On Sun, May 18, 2008 at 3:11 PM, Neil Booth > wrote: >> Eli Friedman wrote:- >> >>> Should I take the fact that there have been no comments as meaning >>> the >>> patch is bad, the patch is okay, or just that nobody with the >>> appropriate expertise had the time to take a look at it? >> >> One more thing - since these macros use __typeof__, are we ever >> going to diagnose that as an extension? If we do that might cause >> a problem for user code, unless our diagnostic handlers can weasel >> through the macro expansion level(s) and figure out it was actually >> defined in a system header. > > AFAIK, we already suppress warnings in system headers. And, at least > for the moment, clang's implementation of -pedantic follows gcc's > version in that we don't warn about the use of anything prefixed with > double-underscore. (Since it's reserved by the standard, it's not a > constraint violation to use it, so we're conforming even if we don't > warn.) Right. Another option for similar things is the __extension__ unary expression, which turns off extension warnings for its subexpression. -Chris From eli.friedman at gmail.com Tue May 20 22:25:46 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Tue, 20 May 2008 20:25:46 -0700 Subject: [cfe-dev] Test failure on Serialization/complex.c? Message-ID: I'm currently seeing a failure on Serialization/complex.c: glibc detected *** clang: corrupted double-linked list: 0x086896d0 ***. Does anyone else see this? I don't think I have any changes in my tree that could cause this, although I'll look more closely if nobody else is seeing this. 
-Eli From mymlreader at gmail.com Wed May 21 03:09:54 2008 From: mymlreader at gmail.com (Zhongxing Xu) Date: Wed, 21 May 2008 16:09:54 +0800 Subject: [cfe-dev] [PATCH] GRExprEngine bug Message-ID: <4619993f0805210109x3d14121ame941280d56ad5b49@mail.gmail.com> The patch is simple: Index: lib/Analysis/GRExprEngine.cpp =================================================================== --- lib/Analysis/GRExprEngine.cpp (revision 51366) +++ lib/Analysis/GRExprEngine.cpp (working copy) @@ -1596,7 +1596,7 @@ if (asLVal) MakeNode(Dst, U, *I, SetRVal(St, U, location)); else - EvalLoad(Dst, Ex, *I, St, location); + EvalLoad(Dst, U, *I, St, location); } return; Test case: int foo(void) { int i; int *p = &i; if (*p > 0) return 0; else return 1; } Before patch: no warning After patch: ANALYZE: 2.c foo 2.c:4:3: warning: [CHECKER] Branch condition evaluates to an uninitialized value. if (*p > 0) ^ ~~ 1 diagnostic generated. Reason: The loaded value should be bound to the UnaryOperator *p, not to its subexpression p. Note: This patch is very likely incomplete. GRExprEngine::EvalLoad() might also need to be modified. Ted should do better than me. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080521/d9f2c0fe/attachment.html From eli.friedman at gmail.com Wed May 21 04:36:32 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Wed, 21 May 2008 02:36:32 -0700 Subject: [cfe-dev] Implementation of mode attribute Message-ID: Per subject, implementation of the gcc mode attribute (e.g. int x __attribute__((mode(HI)));). Most significant usage is in the glibc headers to implement intN_t and friends (PR2204). The implementation is pretty straightforward: it changes the type of the declaration to the type corresponding to the specified mode. Currently, the mappings are hardcoded, and only completely correct for X86. We should probably add some way to get integers/floats of specific widths from the ASTContext.
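A short sketch of what the attribute looks like from the user's side, assuming gcc or clang on a target with 8-bit bytes; the typedef and function names are invented here. The mode names follow GCC's machine modes: QI, HI, SI and DI are 1-, 2-, 4- and 8-byte integers, and DF is the double-precision float mode.

```c
/* Roughly speaking, the declared base type supplies signedness and the
   integer-versus-float kind, and the mode picks the width. */
typedef int          int8_mode    __attribute__((mode(QI)));
typedef int          int16_mode   __attribute__((mode(HI)));
typedef int          int32_mode   __attribute__((mode(SI)));
typedef int          int64_mode   __attribute__((mode(DI)));
typedef unsigned int uint16_mode  __attribute__((mode(HI)));
typedef float        float64_mode __attribute__((mode(DF)));

unsigned width_qi(void) { return (unsigned)sizeof(int8_mode); }
unsigned width_hi(void) { return (unsigned)sizeof(int16_mode); }
unsigned width_si(void) { return (unsigned)sizeof(int32_mode); }
unsigned width_di(void) { return (unsigned)sizeof(int64_mode); }
unsigned width_df(void) { return (unsigned)sizeof(float64_mode); }

/* An unsigned HI value wraps around at 16 bits. */
unsigned wrap_hi(void) {
    uint16_mode v = 0xFFFF;
    v += 1;
    return v;
}
```

This is the pattern the message attributes to the glibc headers for the intN_t family.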
There's also a way to use the mode attribute to specify vectors, but per gcc it's deprecated, and there aren't any headers on my system using it. If we do end up needing it, it should be easy to add. One bug that's kind of out of the scope of this patch: we're currently processing attributes in the wrong order. For example, take the following declaration: float x __attribute((mode(DF),vector_size(16))); This should declare a variable with the LLVM type <2 x double> (a vector of size 16 with DFmode elements), but that doesn't work with this patch because we process the vector_size attribute first. I haven't actually seen any code which does this, though, so I won't worry about it for the moment. -Eli -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: t.txt Url: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080521/e8122c84/attachment.txt From eli.friedman at gmail.com Wed May 21 09:37:43 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Wed, 21 May 2008 07:37:43 -0700 Subject: [cfe-dev] Rewrite of codegen-level struct/union layout Message-ID: Per subject, attached patch almost completely rewrites the struct/union layout algorithm. The new version is a lot simpler; some of that was refactoring code, and some of that was depending a lot more on the information already calculated by the ASTContext. Depending on the ASTContext to do struct layout should make it easier to add support for constructs like packed and aligned, because this will pick up any changes in the way the ASTContext does struct layout for free. On a side note, after I finished this patch, PHP compiled with clang started working. I'm not sure if I fixed a struct layout bug, or some other change in my tree helped, but it was crashing on startup before this patch, and now it passes most of its testsuite (although this is with most of the extensions disabled).
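For readers unfamiliar with the constructs mentioned above, the sketch below shows how `packed` changes a struct's layout, assuming gcc or clang attribute syntax; the struct and function names are made up. Offsets like these are what the frontend layout and the codegen-level struct type both have to agree on.

```c
#include <stddef.h>

/* Default layout: padding is normally inserted so that i is aligned. */
struct plain  { char c; int i; };

/* Packed layout: members are placed back to back with no padding. */
struct packed { char c; int i; } __attribute__((packed));

size_t plain_offset(void)  { return offsetof(struct plain, i); }
size_t packed_offset(void) { return offsetof(struct packed, i); }
size_t packed_size(void)   { return sizeof(struct packed); }
```

On a typical target with 4-byte int, `plain_offset()` is 4 while `packed_offset()` is 1; only the packed values are guaranteed by the attribute itself.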
-Eli -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: t.txt Url: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080521/3993ede5/attachment-0001.txt From snaroff at apple.com Wed May 21 09:41:24 2008 From: snaroff at apple.com (Steve Naroff) Date: Wed, 21 May 2008 07:41:24 -0700 Subject: [cfe-dev] Rewrite of codegen-level struct/union layout In-Reply-To: References: Message-ID: <73CBFC14-B168-481C-A50E-6F2453189194@apple.com> On May 21, 2008, at 7:37 AM, Eli Friedman wrote: > Per subject, attached patch almost completely rewrites the > struct/union algorithm. The new version is a lot simpler; some of > that was refactoring code, and some of that was depending a lot more > on the information already calculated by the ASTContext. > > Depending on the ASTContext to do struct layout should make it easier > to add support for constructs like packed and aligned, because this > will pick up any changes in the way the ASTContext does struct layout > for free. > > On a side note, after I finished this patch, PHP compiled with clang > started working. I'm not sure if I fixed a struct layout bug, or some > other change in my tree helped, but it was crashing on startup before > this patch, and now it passes most of its testsuite (although this is > with most of the extensions disabled). > Wow! Great news, snaroff > -Eli > _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev From kremenek at apple.com Wed May 21 10:54:16 2008 From: kremenek at apple.com (Ted Kremenek) Date: Wed, 21 May 2008 08:54:16 -0700 Subject: [cfe-dev] Test failure on Serialization/complex.c? In-Reply-To: References: Message-ID: <8B4C3080-C148-4DD2-8119-63E6925EB053@apple.com> Thanks for fixing this. This failure didn't show up on Mac OS X. 
On May 20, 2008, at 8:25 PM, Eli Friedman wrote: > I'm currently seeing a failure on Serialization/complex.c: glibc > detected *** clang: corrupted double-linked list: 0x086896d0 ***. > > Does anyone else see this? I don't think I have any changes in my > tree that could cause this, although I'll look more closely if nobody > else is seeing this. > > -Eli > _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev From kremenek at apple.com Wed May 21 10:58:48 2008 From: kremenek at apple.com (Ted Kremenek) Date: Wed, 21 May 2008 08:58:48 -0700 Subject: [cfe-dev] [PATCH] GRExprEngine bug In-Reply-To: <4619993f0805210109x3d14121ame941280d56ad5b49@mail.gmail.com> References: <4619993f0805210109x3d14121ame941280d56ad5b49@mail.gmail.com> Message-ID: This patch looks good to me. Applied: http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20080519/005812.html The second argument to EvalLoad is the expression that the "loaded" value should bind to. By binding it to the subexpression, "U" would always bind to unknown instead. On May 21, 2008, at 1:09 AM, Zhongxing Xu wrote: > The patch is simple: > > Index: lib/Analysis/GRExprEngine.cpp > =================================================================== > --- lib/Analysis/GRExprEngine.cpp ??? 51366? > +++ lib/Analysis/GRExprEngine.cpp ?????? > @@ -1596,7 +1596,7 @@ > if (asLVal) > MakeNode(Dst, U, *I, SetRVal(St, U, location)); > else > - EvalLoad(Dst, Ex, *I, St, location); > + EvalLoad(Dst, U, *I, St, location); > } > > return; > > Test case: > > int foo(void) { > int i; > int *p = &i; > if (*p > 0) > return 0; > else > return 1; > } > > Before patch: > no warning > > After patch: > ANALYZE: 2.c foo > 2.c:4:3: warning: [CHECKER] Branch condition evaluates to an > uninitialized value. > if (*p > 0) > ^ ~~ > 1 diagnostic generated. > > Reason: > The loaded value should be set to the UnaryOperator *p, but not its > subexpr p. 
> > Note: > This patch is very likely incomplete. GRExprEngine::EvalLoad() might > also be modified. Ted should do better than me. > _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev From nunoplopes at sapo.pt Wed May 21 14:46:26 2008 From: nunoplopes at sapo.pt (Nuno Lopes) Date: Wed, 21 May 2008 20:46:26 +0100 Subject: [cfe-dev] Implementation of mode attribute In-Reply-To: References: Message-ID: You have a typo in a comment :P +/// HandleAddressSpaceTypeAttribute - Process a mode attribute on the +/// specified type. +QualType Sema::HandleModeTypeAttribute(QualType Type, + AttributeList *Attr) { Also these aren't supported in the switch() to look for the RetTy (not sure if it was by distraction or if they are really unsupported..). + if (!memcmp(Str, "XF", 2)) { DestWidth = 96; IntegerMode = false; break; } + if (!memcmp(Str, "TF", 2)) { DestWidth = 128; IntegerMode = false; break; } Nuno ----- Original Message ----- From: "Eli Friedman" To: "cfe-dev" Sent: Wednesday, May 21, 2008 10:36 AM Subject: [cfe-dev] Implementation of mode attribute > Per subject, implementation of the gcc mode attribute (e.g. int x > __attribute__((mode(HI)));). Most significant usage is in the glibc > headers to implement intN_t and friends (PR2204). > > The implementation is pretty straightforward: it modifies the type of > the declaration to correspond to the type corresponding to the > specified mode. Currently, the mappings are hardcoded, and only > completely correct for X86. We should probably add some way to get > integers/floats of specific widths from the ASTContext. > > There's also a way to use the mode attribute to specify vectors, but > per gcc it's deprecated, and there aren't any headers on my system > using it. If we do end up needing it, it's should be easy to add. 
> > One bug that's kind of out of the scope of this patch: we're currently > processing attributes in the wrong order. For example, take the > following declaration: > float x __attribute((mode(DF),vector_size(16))); > This should declare a variable with the LLVM type <2 x double> (a > vector of size 16 with DFMode elements), but that doesn't work with > this patch because we process the vector_size attribute first. I > haven't actually seen any code which does this, though, so I won't > worry about it for the moment. > > -Eli From eli.friedman at gmail.com Wed May 21 16:43:31 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Wed, 21 May 2008 14:43:31 -0700 Subject: [cfe-dev] Implementation of mode attribute In-Reply-To: References: Message-ID: On Wed, May 21, 2008 at 12:46 PM, Nuno Lopes wrote: > You have a typo in a comment :P > > +/// HandleAddressSpaceTypeAttribute - Process a mode attribute on the > +/// specified type. > +QualType Sema::HandleModeTypeAttribute(QualType Type, > + AttributeList *Attr) { Thanks. > Also these aren't supported in the switch() to look for the RetTy (not sure > if it was by distraction or if they are really unsupported..). > > + if (!memcmp(Str, "XF", 2)) { DestWidth = 96; IntegerMode = false; > break; } > + if (!memcmp(Str, "TF", 2)) { DestWidth = 128; IntegerMode = false; > break; } > > Nuno The issue is that there isn't any appropriate way to map them onto existing types. If I assumed X86, I could map XFmode to long double, but that isn't right for any other platform. On x86-64, XFmode is illegal, and TFmode is x87DoubleExtended; on PPC, XFmode is illegal, and TFmode is IEEEquad. The whole thing is rather messy. TImode is also unimplemented because clang doesn't have a 128-bit integer type currently. 
-Eli

From dpatel at apple.com Wed May 21 18:25:05 2008
From: dpatel at apple.com (Devang Patel)
Date: Wed, 21 May 2008 16:25:05 -0700
Subject: [cfe-dev] Rewrite of codegen-level struct/union layout
In-Reply-To:
References:
Message-ID: <88DBB1F1-1FC0-4179-89B6-5743F02DD1C5@apple.com>

On May 21, 2008, at 7:37 AM, Eli Friedman wrote:

> Per subject, attached patch almost completely rewrites the
> struct/union algorithm. The new version is a lot simpler; some of
> that was refactoring code, and some of that was depending a lot more
> on the information already calculated by the ASTContext.
>
> Depending on the ASTContext to do struct layout should make it easier
> to add support for constructs like packed and aligned, because this
> will pick up any changes in the way the ASTContext does struct layout
> for free.

One thing to note is, struct layout is very much target specific and
not all target specific information is required for Semantic Analysis.

I have not looked at your patch at all. I'll look at it and get back
to you as soon as I can.

> On a side note, after I finished this patch, PHP compiled with clang
> started working. I'm not sure if I fixed a struct layout bug, or some
> other change in my tree helped, but it was crashing on startup before
> this patch, and now it passes most of its testsuite (although this is
> with most of the extensions disabled).

Cool.

- Devang

From mrs at apple.com Wed May 21 18:49:00 2008
From: mrs at apple.com (Mike Stump)
Date: Wed, 21 May 2008 16:49:00 -0700
Subject: [cfe-dev] testing failures due to includes
Message-ID:

I'm seeing:

---- Sema/format-attribute.c failed ----
.
---- Sema/format-strings.c failed ----
........................................................................................
---- Sema/va-method-1.m failed ----
................
---- Sema/carbon.c failed ----
........................
---- Rewriter/va-method.m failed ----

---- Analysis-Apple/CFDateGC.m failed ----

---- Analysis-Apple/CFDate.m failed ----

---- Analysis-Apple/CFString.c failed ----

---- Analysis-Apple/uninit-msg-expr.m failed ----

---- Analysis-Apple/NSString.m failed ----

---- Analysis-Apple/NoReturn.m failed ----

---- Sema/cocoa.m failed ----

Basically the include files aren't being found.

The below fixes it. Any objections? If not, could someone drop this
in for me?

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: d.txt
Url: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080521/e8aba976/attachment.txt

From gohman at apple.com Wed May 21 16:31:35 2008
From: gohman at apple.com (Dan Gohman)
Date: Wed, 21 May 2008 14:31:35 -0700
Subject: [cfe-dev] [PATCH] AST dependencies on VMCore
Message-ID:

Hello,

I recently had the occasion to examine the dependencies from
clang's AST library on LLVM's VMCore library. Attached is a patch
which eliminates the dependencies I found. The changes are to move
getAccessedFieldNo out of lib/AST/Expr.cpp into
lib/CodeGen/CGExpr.cpp and to change include/clang/AST/Attr.h to
use its own enum for visibility types instead of using
llvm::GlobalValue::VisibilityTypes.

Dan

-------------- next part --------------
A non-text attachment was scrubbed...
Name: clang-dependencies.patch
Type: application/octet-stream
Size: 10294 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080521/dc6ab1cf/attachment-0001.obj

From clattner at apple.com Wed May 21 19:26:32 2008
From: clattner at apple.com (Chris Lattner)
Date: Wed, 21 May 2008 17:26:32 -0700
Subject: [cfe-dev] [PATCH] AST dependencies on VMCore
In-Reply-To:
References:
Message-ID: <18C0E902-1FBB-4AF3-AED6-A1FFC9D7AC1F@apple.com>

On May 21, 2008, at 2:31 PM, Dan Gohman wrote:
> I recently had the occasion to examine the dependencies from
> clang's AST library on LLVM's VMCore library. Attached is a patch
> which eliminates the dependencies I found. The changes are to move
> getAccessedFieldNo out of lib/AST/Expr.cpp into
> lib/CodeGen/CGExpr.cpp and to change include/clang/AST/Attr.h to
> use its own enum for visibility types instead of using
> llvm::GlobalValue::VisibilityTypes.

Excellent! Please apply,

-Chris

From dpatel at apple.com Wed May 21 19:33:37 2008
From: dpatel at apple.com (Devang Patel)
Date: Wed, 21 May 2008 17:33:37 -0700
Subject: [cfe-dev] testing failures due to includes
In-Reply-To:
References:
Message-ID: <0345DEA6-D670-47E8-B6AA-F6E7AD439887@apple.com>

Mike,

How about using this opportunity to make this a little bit more robust
by making it target specific (even if you continue hard-coding the
path for targets)? Are you interested in taking steps in that
direction?

On May 21, 2008, at 4:49 PM, Mike Stump wrote:

> I'm seeing:
>
> ---- Sema/format-attribute.c failed ----
> .
> ---- Sema/format-strings.c failed ----
> ........................................................................................
> ---- Sema/va-method-1.m failed ----
> ................
> ---- Sema/carbon.c failed ----
> ........................
> ---- Rewriter/va-method.m failed ----
>
> ---- Analysis-Apple/CFDateGC.m failed ----
>
> ---- Analysis-Apple/CFDate.m failed ----
>
> ---- Analysis-Apple/CFString.c failed ----
>
> ---- Analysis-Apple/uninit-msg-expr.m failed ----
>
> ---- Analysis-Apple/NSString.m failed ----
>
> ---- Analysis-Apple/NoReturn.m failed ----
>
> ---- Sema/cocoa.m failed ----
>
> Basically the include files aren't being found.
>
> The below fixes it. Any objections? If not, could someone drop
> this in for me?

- Devang

From eli.friedman at gmail.com Wed May 21 20:11:59 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Wed, 21 May 2008 18:11:59 -0700
Subject: [cfe-dev] Rewrite of codegen-level struct/union layout
In-Reply-To: <88DBB1F1-1FC0-4179-89B6-5743F02DD1C5@apple.com>
References: <88DBB1F1-1FC0-4179-89B6-5743F02DD1C5@apple.com>
Message-ID:

On Wed, May 21, 2008 at 4:25 PM, Devang Patel wrote:
>
> On May 21, 2008, at 7:37 AM, Eli Friedman wrote:
>
>> Per subject, attached patch almost completely rewrites the
>> struct/union algorithm. The new version is a lot simpler; some of
>> that was refactoring code, and some of that was depending a lot more
>> on the information already calculated by the ASTContext.
>>
>> Depending on the ASTContext to do struct layout should make it easier
>> to add support for constructs like packed and aligned, because this
>> will pick up any changes in the way the ASTContext does struct layout
>> for free.
>
> One thing to note is, struct layout is very much target specific and not all
> target specific information is required for Semantic Analysis.

Hmm? We need to be able to compute the size and alignment of
arbitrary structures in Sema, and we need to be able to compute the
positions of arbitrary non-bitfield members in Sema. What exactly is
left that Sema doesn't need?
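
A small sketch of why the front end needs this information before codegen: sizeof and offsetof yield integer constant expressions, so they can appear as array bounds at file scope. The struct S, the probe arrays and layout_probes_ok() are illustrative names only, not anything from the patch.

```c
#include <stddef.h>

struct S {
    char c;   /* usually followed by padding so that i is aligned */
    int  i;
};

/* Both bounds are integer constant expressions, so the size of S and
   the offset of i must be known during semantic analysis. */
char size_probe[sizeof(struct S)];
char offset_probe[offsetof(struct S, i) + 1];

/* Returns 1 if the probe arrays have the sizes the layout implies. */
int layout_probes_ok(void) {
    return sizeof(size_probe) == sizeof(struct S)
        && sizeof(offset_probe) == offsetof(struct S, i) + 1;
}
```

The actual numbers are target specific, but the compiler has to settle on them while type-checking these declarations.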
-Eli

From mrs at apple.com Wed May 21 20:33:04 2008
From: mrs at apple.com (Mike Stump)
Date: Wed, 21 May 2008 18:33:04 -0700
Subject: [cfe-dev] testing failures due to includes
In-Reply-To: <0345DEA6-D670-47E8-B6AA-F6E7AD439887@apple.com>
References: <0345DEA6-D670-47E8-B6AA-F6E7AD439887@apple.com>
Message-ID:

On May 21, 2008, at 5:33 PM, Devang Patel wrote:
> How about using this opportunity to make this little bit more robust
> by making target specific (even if you coutinue hard coding path for
> targets) ? Are you interested in taking steps in that direction ?

I'm more interested in getting more of C++ in.

From clattner at apple.com Thu May 22 00:17:47 2008
From: clattner at apple.com (Chris Lattner)
Date: Wed, 21 May 2008 22:17:47 -0700
Subject: [cfe-dev] Rewrite of codegen-level struct/union layout
In-Reply-To:
References:
Message-ID: <5FD8A617-40F7-4C60-95F2-6F4136650595@apple.com>

On May 21, 2008, at 7:37 AM, Eli Friedman wrote:

> Per subject, attached patch almost completely rewrites the
> struct/union algorithm. The new version is a lot simpler; some of
> that was refactoring code, and some of that was depending a lot more
> on the information already calculated by the ASTContext.
>
> Depending on the ASTContext to do struct layout should make it easier
> to add support for constructs like packed and aligned, because this
> will pick up any changes in the way the ASTContext does struct layout
> for free.
>
> On a side note, after I finished this patch, PHP compiled with clang
> started working. I'm not sure if I fixed a struct layout bug, or some
> other change in my tree helped, but it was crashing on startup before
> this patch, and now it passes most of its testsuite (although this is
> with most of the extensions disabled).

Very nice!
-Chris

From kremenek at apple.com Thu May 22 17:40:57 2008
From: kremenek at apple.com (Ted Kremenek)
Date: Thu, 22 May 2008 15:40:57 -0700
Subject: [cfe-dev] Second Annual LLVM Developers' Meeting
Message-ID: <9231281B-50AC-4E2F-9F28-9270D047E688@apple.com>

Second Annual LLVM Developers' Meeting
August 1, 2008 - Apple Inc. Campus, Cupertino, California, U.S.A.

The second annual LLVM Developers' Meeting will be held this year at
Apple Inc.'s main campus in Cupertino, California:

http://llvm.org/devmtg

Like last year's inaugural meeting, the meeting serves as a forum for
both LLVM developers and users to get acquainted, to learn how LLVM is
used, and to exchange ideas about LLVM and its (potential)
applications. We invite everyone to officially register by July 20,
2008 for this meeting via our website:

http://llvm.org/devmtg/register.php

We believe this meeting will be of interest to the following people:

• Active LLVM developers and users.
• Anyone interested in using LLVM, either as part of a commercial
  product, open-source project, or research.
• Compiler, programming language, and language runtime enthusiasts.
• Those interested in using compiler technology in novel and
  interesting ways.

Beyond discussing the core LLVM compiler infrastructure, this year's
meeting will also dedicate a significant amount of attention to Clang,
LLVM's new frontend for C-based languages.

We also invite you to sign up for the official Developer Meeting
mailing list to be kept informed of updates concerning the meeting:

http://lists.cs.uiuc.edu/mailman/listinfo/llvm-devmeeting

Last year's inaugural meeting was a success for LLVM and the LLVM
community at large. We fully expect that this year's meeting will be
an even greater success. Please join us!

Potential Speakers

If you are interested in presenting at this year's LLVM Developers'
Meeting, please submit your talk proposal to us by June 30, 2008 via
the website: http://www.llvm.org/devmtg/talk.php.

About LLVM

The Low-Level Virtual Machine (LLVM) is a collection of libraries and
tools that make it easy to build compilers, optimizers, Just-In-Time
code generators, and many other compiler-related programs. LLVM uses a
single, language-independent virtual instruction set both as an
offline code representation (to communicate code between compiler
phases and to run-time systems) and as the compiler internal
representation (to analyze and transform programs). This persistent
code representation allows a common set of sophisticated compiler
techniques to be applied at compile-time, link-time, install-time,
run-time, or "idle-time" (between program runs).

The strengths of the LLVM infrastructure are its extremely simple
design (which makes it easy to understand and use), source-language
independence, powerful mid-level optimizer, automated compiler
debugging support, extensibility, and its stability and reliability.
LLVM is currently being used to host a wide variety of academic
research projects and commercial projects.

For more information, please visit http://llvm.org.

About Clang

Clang is a new frontend for C-based languages, targeting support for
C, Objective-C, and C++. Like the rest of LLVM, Clang consists of a
collection of libraries, making it versatile in its applications. The
goal of Clang is to be multipurpose, allowing not only the creation of
standalone compilers for C-based languages, but also intelligent IDEs,
refactoring tools, source to source translators, static analysis
tools, and countless others. Other design goals of Clang include 100%
compatibility with GCC and a high quality of implementation that makes
Clang fast, scalable, and easy to customize and expand.

Clang was announced at last year's Developer Meeting. This year's
meeting will include an extensive discussion of Clang and its
applications (both currently existing and planned).

For more information, please visit http://clang.llvm.org.

From mrs at apple.com Thu May 22 21:02:04 2008
From: mrs at apple.com (Mike Stump)
Date: Thu, 22 May 2008 19:02:04 -0700
Subject: [cfe-dev] More throw fixups
Message-ID: <1FB7C8C8-46B6-4715-8A8F-F6BF9EE9581D@apple.com>

This fixes the last of the throw parsing issues I know about...

Ran the testsuite, no problems.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: d.txt
Url: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080522/32496d9a/attachment.txt

From csaba.hruska at gmail.com Fri May 23 04:35:52 2008
From: csaba.hruska at gmail.com (Csaba Hruska)
Date: Fri, 23 May 2008 11:35:52 +0200
Subject: [cfe-dev] external function codegen bug
Message-ID: <8914b92d0805230235n40d04d9asae15f36b51cfb042@mail.gmail.com>

Hi!
Here is a small piece of C code that compiles with gcc, but clang
asserts during codegen.
I've used clang svn version: revision 51478

I've found this bug while compiling the Blender open source modeler
(blender.org).

Cheers,
Csaba

From csaba.hruska at gmail.com Fri May 23 04:37:50 2008
From: csaba.hruska at gmail.com (Csaba Hruska)
Date: Fri, 23 May 2008 11:37:50 +0200
Subject: [cfe-dev] Fwd: external function codegen bug
In-Reply-To: <8914b92d0805230235n40d04d9asae15f36b51cfb042@mail.gmail.com>
References: <8914b92d0805230235n40d04d9asae15f36b51cfb042@mail.gmail.com>
Message-ID: <8914b92d0805230237n1a46bb3ch6605716d47b18fc8@mail.gmail.com>

Hi! (now I send the email with code :p )
Here is a small piece of C code that compiles with gcc, but clang
asserts during codegen.
I've used clang svn version: revision 51478

I've found this bug while compiling the Blender open source modeler
(blender.org).

Cheers,
Csaba

buggy code:

extern double g ();
void f()
{
  double t, g();
  t = g();
}

From csaba.hruska at gmail.com Fri May 23 05:17:58 2008
From: csaba.hruska at gmail.com (Csaba Hruska)
Date: Fri, 23 May 2008 12:17:58 +0200
Subject: [cfe-dev] clang bug: constant array size is recognized as variable array size
Message-ID: <8914b92d0805230317l6e594664t6a760a7d5abff81c@mail.gmail.com>

Hi!
It's me again ;)
I've found another bug.

gcc accepts it, but clang throws this:
error: variable length array declared outside of any function

I've compiled it with clang svn revision 51478.

code:
#define C1 100.
#define C2 120.

typedef struct
{
  int A [(unsigned int)(C1*C2)];
} s_t;

Cheers,
Csaba

From eli.friedman at gmail.com Fri May 23 05:58:47 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Fri, 23 May 2008 03:58:47 -0700
Subject: [cfe-dev] clang bug: constant array size is recognized as variable array size
In-Reply-To: <8914b92d0805230317l6e594664t6a760a7d5abff81c@mail.gmail.com>
References: <8914b92d0805230317l6e594664t6a760a7d5abff81c@mail.gmail.com>
Message-ID:

On Fri, May 23, 2008 at 3:17 AM, Csaba Hruska wrote:
> Hi!
> It's me again ;)
> I've found another bug.
>
> gcc accepts it, but clang throws this:
> error: variable length array declared outside of any function
>
> I've compiled it with clang svn revision 51478.
>
> code:
> #define C1 100.
> #define C2 120.
>
> typedef struct
> {
>   int A [(unsigned int)(C1*C2)];
> } s_t;

gcc is wrong here; "(unsigned int)(100.*120.)" isn't an integer
constant expression (per the definition in C99 6.6), so A is in fact
an illegal VLA per the standard. Not sure what to do here.

-Eli

From eli.friedman at gmail.com Fri May 23 06:15:39 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Fri, 23 May 2008 04:15:39 -0700
Subject: [cfe-dev] Fwd: external function codegen bug
In-Reply-To: <8914b92d0805230237n1a46bb3ch6605716d47b18fc8@mail.gmail.com>
References: <8914b92d0805230235n40d04d9asae15f36b51cfb042@mail.gmail.com> <8914b92d0805230237n1a46bb3ch6605716d47b18fc8@mail.gmail.com>
Message-ID:

On Fri, May 23, 2008 at 2:37 AM, Csaba Hruska wrote:
> buggy code:
>
> extern double g ();
> void f()
> {
>   double t, g();
>   t = g();
> }

I'm pretty sure it's an issue with merging function decls. The issue
is that t essentially disappears from the AST (try running this code
through -ast-dump to see what I mean), so CodeGen ends up seeing a
reference to an undeclared variable. Declaration merging and whatnot
is tricky code that I never really studied closely, though, so it's
not obvious why this is happening.

-Eli

From neil at daikokuya.co.uk Fri May 23 07:37:58 2008
From: neil at daikokuya.co.uk (Neil Booth)
Date: Fri, 23 May 2008 21:37:58 +0900
Subject: [cfe-dev] clang bug: constant array size is recognized as variable array size
In-Reply-To:
References: <8914b92d0805230317l6e594664t6a760a7d5abff81c@mail.gmail.com>
Message-ID: <20080523123758.GN23450@daikokuya.co.uk>

Eli Friedman wrote:-

> On Fri, May 23, 2008 at 3:17 AM, Csaba Hruska wrote:
> > Hi!
> > It's me again ;)
> > I've found another bug.
> >
> > gcc accepts it, but clang throws this:
> > error: variable length array declared outside of any function
> >
> > I've compiled it with clang svn revision 51478.
> >
> > code:
> > #define C1 100.
> > #define C2 120.
> >
> > typedef struct
> > {
> >   int A [(unsigned int)(C1*C2)];
> > } s_t;
>
> gcc is wrong here; "(unsigned int)(100.*120.)" isn't an integer
> constant expression (per the definition in C99 6.6), so A is in fact
> an illegal VLA per the standard. Not sure what to do here.

Agreed. I see little reason to accept it; it's not hard to fix in
the source.

Neil.

From clattner at apple.com Fri May 23 11:12:45 2008
From: clattner at apple.com (Chris Lattner)
Date: Fri, 23 May 2008 09:12:45 -0700
Subject: [cfe-dev] Fwd: external function codegen bug
In-Reply-To:
References: <8914b92d0805230235n40d04d9asae15f36b51cfb042@mail.gmail.com> <8914b92d0805230237n1a46bb3ch6605716d47b18fc8@mail.gmail.com>
Message-ID: <521BCB73-DADC-480B-8822-DEA950797307@apple.com>

On May 23, 2008, at 4:15 AM, Eli Friedman wrote:

> On Fri, May 23, 2008 at 2:37 AM, Csaba Hruska
> wrote:
>> buggy code:
>>
>> extern double g ();
>> void f()
>> {
>>   double t, g();
>>   t = g();
>> }
>
> I'm pretty sure it's an issue with merging function decls. The issue
> is that t essentially disappears from the AST (try running this code
> through -ast-dump to see what I mean), so CodeGen ends up seeing a
> reference to an undeclared variable. Declaration merging and whatnot
> is tricky code that I never really studied closely, though, so it's
> not obvious why this is happening.

One approach would be to have two predicates: one for "real i-c-e" and
one for "gcc i-c-e". If it is a GCC ICE but not a standards one,
accept but emit a diagnostic? In this case, it is pretty easy, in
other cases (such as when dealing with ?: promotion rules) it is much
harder.

I don't know if it is worth it though. The set of stuff accepted by
GCC is, uh, "poorly defined".
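
For the array-size report, a minimal sketch of two ways the source could be changed so that the bound is a standard integer constant expression. The second relies on the C99 6.6 wording that allows floating constants as the immediate operands of casts. The names F1, F2, s_int_t, s_cast_t and the two count helpers are invented here for illustration.

```c
/* Option 1: make the constants integers, so the product is an
   integer constant expression. */
#define C1 100
#define C2 120

typedef struct {
    int A[C1 * C2];
} s_int_t;

/* Option 2: keep the floating constants, but cast each one directly
   instead of casting the floating-point product. */
#define F1 100.
#define F2 120.

typedef struct {
    int A[(unsigned int)F1 * (unsigned int)F2];
} s_cast_t;

/* Number of array elements in each struct. */
unsigned long count_int(void)  { return sizeof(s_int_t) / sizeof(int); }
unsigned long count_cast(void) { return sizeof(s_cast_t) / sizeof(int); }
```

Either form avoids depending on how a particular compiler folds floating-point arithmetic inside an array bound.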
-Chris

From clattner at apple.com Fri May 23 11:28:50 2008
From: clattner at apple.com (Chris Lattner)
Date: Fri, 23 May 2008 09:28:50 -0700
Subject: [cfe-dev] More throw fixups
In-Reply-To: <1FB7C8C8-46B6-4715-8A8F-F6BF9EE9581D@apple.com>
References: <1FB7C8C8-46B6-4715-8A8F-F6BF9EE9581D@apple.com>
Message-ID: <0B2760CF-B1F8-453D-BFB6-7F3594486BB5@apple.com>

On May 22, 2008, at 7:02 PM, Mike Stump wrote:

> This fixes the last of the throw parsing issues I know about...
>
> Ran the testsuite, no problems.

Hi Mike,

Do you have a testcase? What does this fix?

-Chris

From eli.friedman at gmail.com Fri May 23 18:26:09 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Fri, 23 May 2008 16:26:09 -0700
Subject: [cfe-dev] Fwd: external function codegen bug
In-Reply-To: <521BCB73-DADC-480B-8822-DEA950797307@apple.com>
References: <8914b92d0805230235n40d04d9asae15f36b51cfb042@mail.gmail.com> <8914b92d0805230237n1a46bb3ch6605716d47b18fc8@mail.gmail.com> <521BCB73-DADC-480B-8822-DEA950797307@apple.com>
Message-ID:

On Fri, May 23, 2008 at 9:12 AM, Chris Lattner wrote:
>>> buggy code:
>>>
>>> extern double g ();
>>> void f()
>>> {
>>>   double t, g();
>>>   t = g();
>>> }

> One approach would be to have two predicates: one for "real i-c-e" and one
> for "gcc i-c-e". If it is a GCC ICE but not a standards one, accept but
> emit a diagnostic? In this case, it is pretty easy, in other cases (such as
> when dealing with ?: promotion rules) it is much harder.
>
> I don't know if it is worth it though. The set of stuff accepted by GCC is,
> uh, "poorly defined".
>
> -Chris

Wrong thread?

In reply to what you're saying though, I think it isn't worth it
unless it turns out to be a common construct, because it's difficult
to contain all the effects of deciding something is a "gcc i-c-e";
this case is easy, but stuff starts getting more complicated fast.
Besides, once/if we start properly supporting FP rounding modes, we
can actually end up with wrong code.
-Eli

From clattner at apple.com Fri May 23 18:36:15 2008
From: clattner at apple.com (Chris Lattner)
Date: Fri, 23 May 2008 16:36:15 -0700
Subject: [cfe-dev] Fwd: external function codegen bug
In-Reply-To:
References: <8914b92d0805230235n40d04d9asae15f36b51cfb042@mail.gmail.com> <8914b92d0805230237n1a46bb3ch6605716d47b18fc8@mail.gmail.com> <521BCB73-DADC-480B-8822-DEA950797307@apple.com>
Message-ID: <19784C19-6127-43D7-9293-5281B0FFC795@apple.com>

On May 23, 2008, at 4:26 PM, Eli Friedman wrote:
>> One approach would be to have two predicates: one for "real i-c-e"
>> and one
>> for "gcc i-c-e". If it is a GCC ICE but not a standards one,
>> accept but
>> emit a diagnostic? In this case, it is pretty easy, in other cases
>> (such as
>> when dealing with ?: promotion rules) it is much harder.
>>
>> I don't know if it is worth it though. The set of stuff accepted
>> by GCC is,
>> uh, "poorly defined".
>>
>> -Chris
>
> Wrong thread?

Hrm, yeah, how about that. Email is hard for me, very technical and
stuff.

> In reply to what you're saying though, I think it isn't worth it
> unless it turns out to be a common construct, because it's difficult
> to contain all the effects of deciding something is a "gcc i-c-e";
> this case is easy, but stuff starts getting more complicated fast.
> Besides, once/if we start properly supporting FP rounding modes, we
> can actually end up with wrong code.

Ok.

-Chris

From csaba.hruska at gmail.com Fri May 23 18:55:50 2008
From: csaba.hruska at gmail.com (Csaba Hruska)
Date: Sat, 24 May 2008 01:55:50 +0200
Subject: [cfe-dev] clang bug. (possibly preprocessor)
Message-ID: <8914b92d0805231655l491b04b8nb40fd57133ff0d5@mail.gmail.com>

Hi!
Here is my new bug report :)
gcc accepts it. It's from the xvid library, but it is heavily stripped
down to be simple and clean.
It throws an error with clang svn revision 51520.

code (filename is important, filename is: qpel.c) :

#ifndef XVID_AUTO_INCLUDE

#define XVID_AUTO_INCLUDE
#define FUNC_H H_Pass_16_C
#include "qpel.c"

#define FUNC_H H_Pass_8_C

#include "qpel.c"
#undef XVID_AUTO_INCLUDE

typedef void ff();
typedef struct { ff *a; } S;

S s = { H_Pass_8_C };

#endif

#if defined(XVID_AUTO_INCLUDE) && defined(REFERENCE_CODE)
#elif defined(XVID_AUTO_INCLUDE) && !defined(REFERENCE_CODE)

static void FUNC_H(){};
#undef FUNC_H

#endif

From clattner at apple.com Fri May 23 18:59:00 2008
From: clattner at apple.com (Chris Lattner)
Date: Fri, 23 May 2008 16:59:00 -0700
Subject: [cfe-dev] [PATCH]: C++ decl classes for the AST
In-Reply-To: <482E0EE4.7080105@gmail.com>
References: <482E0EE4.7080105@gmail.com>
Message-ID:

On May 16, 2008, at 3:47 PM, Argiris Kirtzidis wrote:

> Hi,
>
> The attached patch introduces new Decl subclasses that accommodate
> C++ class members (there are changes only to the AST library):

Nice!

> -'CXXRecordDecl' (inherits from RecordDecl) is for C++ struct/union/
> classes that are not simple C structs (i.e. they contain methods,
> nested types etc.)

Great.

> -'ClassMember' serves as a base class for members of a CXXRecord. It
> provides the access specifier and the parent CXXRecord. Decls that
> inherit ClassMember:
> CXXField - for instance fields (inherits FieldDecl)
> CXXMethod - for static and instance methods (inherits FunctionDecl)
> NestedTypedef - for nested typedefs (inherits TypedefDecl)
> NestedRecordDecl - for nested struct/union/classes (inherits
> CXXRecordDecl)
> ClassVar - for static data members (inherits VarDecl)

I'm less enthused about using multiple inheritance for this :). Is
there more common between members than just access specifiers? Would
it be horrible to just give each of those classes their own
AccessSpecifiers member?

It doesn't seem very useful to pass around generic ClassMember*'s. If
we need a generic "give me the access specifiers for this member", it
could be written as a top-level method on decl or something.

Alternatively, maybe AccessSpecifiers should be pushed up the class
hierarchy, say to NamedDecl? Though it would not be used by every
subclass, it would make things simpler that way. This would eliminate
the need for classes like 'NestedTypedef' and 'NestedRecordDecl' which
is strange. A typedef shouldn't be represented with a different class
based on where it is defined.

Some minor things: I don't think it makes sense for CXXMethod to
derive from FunctionDecl, a CXXMethod doesn't have the "isa" property
for FunctionDecl.

Should ClassVarDecl -> CXXClassVarDecl? Does it make sense to
distinguish these from CXXField in the class hierarchy?

> -I also moved the 'Decl' implementation to a separate 'DeclBase.cpp'
> file

Ok.

> The instance fields of CXXRecord are stored in the members array of
> RecordDecl, thus the data layout of CXXRecord is calculated through
> the Record.
> All the other members (including the static fields), are ScopedDecls
> with the CXXRecord as declaration context, so they can be iterated
> through a general DeclContext member iterator (not implemented yet).
> Name lookup for class members will be efficient through the use of
> the IdentifierResolver.

Nice!

-Chris

From kremenek at apple.com Fri May 23 19:04:19 2008
From: kremenek at apple.com (Ted Kremenek)
Date: Fri, 23 May 2008 17:04:19 -0700
Subject: [cfe-dev] clang bug. (possibly preprocessor)
In-Reply-To: <8914b92d0805231655l491b04b8nb40fd57133ff0d5@mail.gmail.com>
References: <8914b92d0805231655l491b04b8nb40fd57133ff0d5@mail.gmail.com>
Message-ID:

Hi Csaba,

Could you file a bugzilla report for this so that we can track the
status of the problem?

http://llvm.org/bugs/

Also, if you can, please make this test case self-contained.
I'm not certain how I'm supposed to reproduce this error, since it
includes "qpel.c", but your email implies the test case has that name,
etc. If you can provide the exact command line (for clang) and test
case then it will be much easier to figure out what is going wrong.

Thanks!

Ted

On May 23, 2008, at 4:55 PM, Csaba Hruska wrote:

> Hi!
> Here is my new bugreport :)
> gcc accepts it. Its from xvid library, but it is heavily stripped
> down to be simple and clean.
> it throws error with clang svn revision 51520.
>
> code (filename is important, filename is: qpel.c) :
>
> #ifndef XVID_AUTO_INCLUDE
>
> #define XVID_AUTO_INCLUDE
> #define FUNC_H H_Pass_16_C
> #include "qpel.c"
>
> #define FUNC_H H_Pass_8_C
>
> #include "qpel.c"
> #undef XVID_AUTO_INCLUDE
>
> typedef void ff();
> typedef struct { ff *a; } S;
>
> S s = { H_Pass_8_C };
>
> #endif
>
> #if defined(XVID_AUTO_INCLUDE) && defined(REFERENCE_CODE)
> #elif defined(XVID_AUTO_INCLUDE) && !defined(REFERENCE_CODE)
>
> static void FUNC_H(){};
> #undef FUNC_H
>
> #endif

From akyrtzi at gmail.com Sat May 24 04:22:47 2008
From: akyrtzi at gmail.com (Argiris Kirtzidis)
Date: Sat, 24 May 2008 02:22:47 -0700
Subject: [cfe-dev] [PATCH]: C++ decl classes for the AST
In-Reply-To:
References: <482E0EE4.7080105@gmail.com>
Message-ID: <4837DE67.9080209@gmail.com>

Chris Lattner wrote:
>
>> -'ClassMember' serves as a base class for members of a CXXRecord. It
>> provides the access specifier and the parent CXXRecord.
>> Decls that inherit ClassMember:
>> CXXField - for instance fields (inherits FieldDecl)
>> CXXMethod - for static and instance methods (inherits FunctionDecl)
>> NestedTypedef - for nested typedefs (inherits TypedefDecl)
>> NestedRecordDecl - for nested struct/union/classes (inherits
>> CXXRecordDecl)
>> ClassVar - for static data members (inherits VarDecl)
>
> I'm less enthused about using multiple inheritance for this :). Is
> there more common between members than just access specifiers? Would
> it be horrible to just give each of those classes their own
> AccessSpecifiers member?
>
> It doesn't seem very useful to pass around generic ClassMember*'s. If
> we need a generic "give me the access specifiers for this member", it
> could be written as a top-level method on decl or something.
>
> Alternatively, maybe AccessSpecifiers should be pushed up the class
> hierarchy, say to NamedDecl? Though it would not be used by every
> subclass, it would make things simpler that way. This would eliminate
> the need for classes like 'NestedTypedef' and 'NestedRecordDecl' which
> is strange. A typedef shouldn't be represented with a different class
> based on where it is defined.

All the extra classes were due to the "keep to a decl only the
absolutely necessary" philosophy so that a typedef in C doesn't carry
a redundant member, but I actually agree with you and would prefer
that things were simpler.

I'd also suggest to *not* use a CXXRecord. The only difference from a
Record is that CXXRecord can be used as a DeclContext, and it adds
this kind of complexity:

-create CXXRecord
-parse the struct definition
-determine that Record is sufficient
-create new Record with new Fields
-Replace CXXRecord with new Record in scope

But this is not so bad as this:

-struct Foo; // should create a Record #1
-Foo *x; #2
-struct Foo { typedef int Bar; } // should create a CXXRecord #3

Now merging #1 with #3 is not so simple anymore. x's type is of
pointer of #1 but Bar belongs to #3.
If there was only a Record class, we would easily say that Bar belongs
to #1.

To sum up, I suggest:

-Not using a separate CXXRecord
-Set Tag as DeclContext so that both Enum and Record can serve as
 DeclContext
-Add AccessSpecifier at TypeDecl; Typedef, Record, Enum can use it (I
 had forgotten about Enum)
-Add AccessSpecifier at CXXMethod, CXXClassVar. (Could be in ValueDecl
 but EnumConstant can get it from Enum and ParmVar don't use it)
-Add 'isCXXClassMember()' and 'getCXXClassParent()' to NamedDecl.

>
> Some minor things: I don't think it makes sense for CXXMethod to
> derive from FunctionDecl, a CXXMethod doesn't have the "isa" property
> for FunctionDecl.

I'd regard a CXXMethod as a Function with an implicit 'this'
parameter; everything from FunctionDecl can be reused. What is the
benefit of CXXMethodDecl as a different class from FunctionDecl?

>
> Should ClassVarDecl -> CXXClassVarDecl?

Ok.

> Does it make sense to distinguish these from CXXField in the class
> hierarchy?

-The logic is that Field/CXXFields are instance vars that contribute
 to the data layout of the Record. Everything that refers to the
 layout need only refer to the members array of the Record without
 having to check whether a member from the array is static or not; all
 C struct layout code can work on a C++ struct with no modifications.
-A static field has no use for Field's BitWidth, and as far as codegen
 is concerned, a static field has more in common with a VarDecl than
 with a FieldDecl.

-Argiris

From akyrtzi at gmail.com Sat May 24 05:00:46 2008
From: akyrtzi at gmail.com (Argiris Kirtzidis)
Date: Sat, 24 May 2008 03:00:46 -0700
Subject: [cfe-dev] [PATCH]: C++ decl classes for the AST
In-Reply-To: <4837DE67.9080209@gmail.com>
References: <482E0EE4.7080105@gmail.com> <4837DE67.9080209@gmail.com>
Message-ID: <4837E74E.6060903@gmail.com>

Argiris Kirtzidis wrote:
> I'd also suggest to *not* use a CXXRecord.
The only difference from a > Record is that CXXRecord can be used as a DeclContext, and it adds > this kind of complexity: > > -create CXXRecord > -parse the struct definition > -determine that Record is sufficient > -create new Record with new Fields > -Replace CXXRecord with new Record in scope I wasn't clear enough here; I didn't mean that there shouldn't be a check for simple structs and treating them differently, it's just that there is a minor complexity in Parser/Sema about starting with a CXXRecord but changing it to a Record later, but, as I mentioned later, the merging of Records is way more problematic. There is going to be a parsing of struct definitions like this (in C++): -parse the struct definition and create CXXFields -if the struct/class contains only public CXXFields, replace them with Fields. -Argiris From akyrtzi at gmail.com Sat May 24 05:53:14 2008 From: akyrtzi at gmail.com (Argiris Kirtzidis) Date: Sat, 24 May 2008 03:53:14 -0700 Subject: [cfe-dev] [PATCH]: C++ decl classes for the AST In-Reply-To: <4837DE67.9080209@gmail.com> References: <482E0EE4.7080105@gmail.com> <4837DE67.9080209@gmail.com> Message-ID: <4837F39A.3040903@gmail.com> Argiris Kirtzidis wrote: > > -struct Foo; // should create a Record #1 > -Foo *x; #2 > -struct Foo { typedef int Bar; } // should create a CXXRecord #3 > > Now merging #1 with #3 is not so simple anymore. x's type is of > pointer of #1 but Bar belongs to #3. If there was only a Record class, > we would easily say that Bar belongs to #1. > > To sum up, I suggest: > > -Not using a separate CXXRecord Another idea is to use Record for C and CXXRecord for C++, but CXXRecord will be used everywhere, Records won't be mixed with CXXRecords. -Argiris From csaba.hruska at gmail.com Sat May 24 07:39:02 2008 From: csaba.hruska at gmail.com (Csaba Hruska) Date: Sat, 24 May 2008 14:39:02 +0200 Subject: [cfe-dev] typedefed function prototype bug Message-ID: <8914b92d0805240539t52701f5k6735ae988ccbd2ea@mail.gmail.com> Hi! 
Here is my newest bugreport :) http://llvm.org/bugs/show_bug.cgi?id=2360 clang segfaults for this: typedef void fn_t(); fn_t a,b; void b() { } Cheers, Csaba -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080524/b0c05c4d/attachment.html From neil at daikokuya.co.uk Sat May 24 07:48:55 2008 From: neil at daikokuya.co.uk (Neil Booth) Date: Sat, 24 May 2008 21:48:55 +0900 Subject: [cfe-dev] typedefed function prototype bug In-Reply-To: <8914b92d0805240539t52701f5k6735ae988ccbd2ea@mail.gmail.com> References: <8914b92d0805240539t52701f5k6735ae988ccbd2ea@mail.gmail.com> Message-ID: <20080524124855.GO23450@daikokuya.co.uk> Csaba Hruska wrote:- > Hi! > Here is my newest bugreport :) > http://llvm.org/bugs/show_bug.cgi?id=2360 > > > clang segfaults for this: > > typedef void fn_t(); > > fn_t a,b; > > void b() > { > } Works fine for me. You need to report your command line. Your prior report is also unusable as Ted noted; please submit full coherent bug reports. Neil. From csdavec at swansea.ac.uk Sat May 24 12:56:33 2008 From: csdavec at swansea.ac.uk (David Chisnall) Date: Sat, 24 May 2008 18:56:33 +0100 Subject: [cfe-dev] Objective-C top-level constructs code generation Message-ID: Hi, I've started splitting the Objective-C code generation stuff up into smaller diffs. This one is a bit bigger than the others will be, but the places where it touches existing code are fairly isolated so it should be relatively easy to review. This provides implementations for the Objective-C classes, protocols and categories on the GNU runtime. I will include the implementation for the Étoilé runtime in a separate patch. David -------------- next part -------------- A non-text attachment was scrubbed...
Name: objc_top_level.diff Type: application/octet-stream Size: 57245 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080524/0d20a902/attachment-0001.obj From eli.friedman at gmail.com Sun May 25 04:32:23 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Sun, 25 May 2008 02:32:23 -0700 Subject: [cfe-dev] [PATCH] Function redeclaration and PR2360 Message-ID: The issue in PR2360 (http://llvm.org/bugs/show_bug.cgi?id=2360) is essentially that the destruction sees the first function definition multiple times, so the program essentially crashes trying to free it twice. The first function definition is seen multiple times because of the funny rearrangements that This patch rearranges things so that each top-level declaration is added to the scope in the usual way, rather than messing with the old declaration. This means that references to a function before a redeclaration refer to the old declaration, and references to a function after a redeclaration refer to the new declaration. The original arrangement where the old declaration was rearranged to look like new declaration was done in r50021; however, that rearrangement isn't needed to fix the original bug (test/Sema/redefinition.c). A nice side effect of this patch is that it makes the representation of function declarations in the AST more faithful to the original source, which might be useful for other purposes. I know there was an extremely long mailing list discussion related to this stuff, but I'm not entirely sure if any of it is relevant to this patch. There's one regression that I found with this patch relating to codegen of static forward declarations, but I don't think it's actually a bug in my patch; I'll have to study it a bit more, and I'll check in a fix for that separately before I check this in. -Eli -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: t.txt Url: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080525/647386ec/attachment.txt From akyrtzi at gmail.com Sun May 25 06:13:19 2008 From: akyrtzi at gmail.com (Argiris Kirtzidis) Date: Sun, 25 May 2008 04:13:19 -0700 Subject: [cfe-dev] [PATCH] Function redeclaration and PR2360 In-Reply-To: References: Message-ID: <483949CF.3040301@gmail.com> Hi Eli, Eli Friedman wrote: > The issue in PR2360 (http://llvm.org/bugs/show_bug.cgi?id=2360) is > essentially that the destruction sees the first function definition > multiple times, so the program essentially crashes trying to free it > twice. The first function definition is seen multiple times because > of the funny rearrangements that > > [...] > > I know there was an extremely long mailing list discussion related to > this stuff, but I'm not entirely sure if any of it is relevant to this > patch. > These kinds of issues with "swapping" decls triggered that discussion. > This patch rearranges things so that each top-level declaration is > added to the scope in the usual way, rather than messing with the old > declaration. This means that references to a function before a > redeclaration refer to the old declaration, and references to a > function after a redeclaration refer to the new declaration. The > original arrangement where the old declaration was rearranged to look > like new declaration was done in r50021; however, that rearrangement > isn't needed to fix the original bug (test/Sema/redefinition.c). > > A nice side effect of this patch is that it makes the representation > of function declarations in the AST more faithful to the original > source, which might be useful for other purposes. The consensus was that the same declaration node should be used throughout the AST for the same function, regardless of the redeclarations.
The purpose of "swapping" decls was to allow only one declaration node to be visible in scope and also allow walking through all redeclarations if the client needs them. During that discussion on the mailing list, a better alternative was suggested to accomplish this: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-May/001644.html IMHO, your patch should go in until someone implements that "proper" solution. -Argiris From eli.friedman at gmail.com Sun May 25 21:38:56 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Sun, 25 May 2008 19:38:56 -0700 Subject: [cfe-dev] stdarg.h standard header Message-ID: Patch per subject; not really that much to say about it, except that the __gnuc_va_list thing is ugly but unavoidable. After this come float.h and limits.h, which are more complicated if we want to implement http://lists.cs.uiuc.edu/pipermail/cfe-dev/2007-December/000560.html. I don't know how to hack the preprocessor to add something like that, though; would someone who knows that code better mind doing that? And after that, we're done with standard headers for C, at least for compiling programs using the standard library on Ubuntu. For freestanding programs, we'll also need our own stdint.h. And we still need to finish the SSE and Altivec intrinsics headers (if someone has some compact examples using those headers, that would be nice for testing). -Eli -------------- next part -------------- An embedded and charset-unspecified text was scrubbed...
Name: t.txt Url: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080525/c9d30b1a/attachment.txt From freeman1 at gmail.com Mon May 26 13:52:07 2008 From: freeman1 at gmail.com (John Freeman) Date: Mon, 26 May 2008 13:52:07 -0500 Subject: [cfe-dev] clang bug: constant array size is recognized as variable array size In-Reply-To: <20080523123758.GN23450@daikokuya.co.uk> References: <8914b92d0805230317l6e594664t6a760a7d5abff81c@mail.gmail.com> <20080523123758.GN23450@daikokuya.co.uk> Message-ID: Eli Friedman wrote:- > > > On Fri, May 23, 2008 at 3:17 AM, Csaba Hruska > wrote: > > > gcc accepts it, but clang thows this: > > > error: variable length array declared outside of any function > > > > gcc is wrong here; "(unsigned int)(100.*120.)" isn't an integer > > constant expression (per the definition in C99 6.6), so A is in fact > > an illegal VLA per the standard. Not sure what to do here. > Perhaps it could use a better diagnostic, like: error: array declared outside of function without integer constant length - John -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080526/e8dbe5cb/attachment.html From eli.friedman at gmail.com Mon May 26 21:03:47 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Mon, 26 May 2008 19:03:47 -0700 Subject: [cfe-dev] clang bug: constant array size is recognized as variable array size In-Reply-To: References: <8914b92d0805230317l6e594664t6a760a7d5abff81c@mail.gmail.com> <20080523123758.GN23450@daikokuya.co.uk> Message-ID: On Mon, May 26, 2008 at 11:52 AM, John Freeman wrote: >> > > gcc accepts it, but clang thows this: >> > > error: variable length array declared outside of any function > Perhaps it could use a better diagnostic, like: > > error: array declared outside of function without integer constant length Okay, I changed the diagnostic to something similar to your suggestion. 
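For illustration only (a sketch that is not part of the committed change): C99 6.6p6 allows a floating constant inside an integer constant expression only when it is the immediate operand of a cast. Casting each factor separately, instead of casting the floating-point product, should therefore give an array bound that does not depend on any compiler extension. The macros and the struct come from the original report; the helper name s_t_elems is made up for this example.

```c
#include <stddef.h>

#define C1 100.
#define C2 120.

/* Each floating constant is the immediate operand of a cast, so the
   product of the two casts is an integer constant expression and A is
   an ordinary fixed-size array, not a VLA. */
typedef struct {
    int A[(unsigned int)C1 * (unsigned int)C2];
} s_t;

/* Hypothetical helper: number of elements in s_t.A. sizeof does not
   evaluate its operand here because A is not a VLA. */
size_t s_t_elems(void) {
    return sizeof(((s_t *)0)->A) / sizeof(int);
}
```

The original form, (unsigned int)(C1*C2), multiplies the floating constants first, which is why it falls outside the 6.6p6 definition even though a compiler may still choose to fold it.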
-Eli From eli.friedman at gmail.com Mon May 26 23:29:55 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Mon, 26 May 2008 21:29:55 -0700 Subject: [cfe-dev] Possible fix for issue with computation type for compound assignment In-Reply-To: References: Message-ID: On Sun, May 18, 2008 at 12:09 PM, Eli Friedman wrote: > Take the following C snippet: > void a(unsigned char* a, unsigned b) {*a <<= b;} > > clang currently compiles this down to a shift of width i8. This is > incorrect; per the C standard, the shift should occur in the type int. > The difference doesn't really matter for codegen of a lot of > operations: for example, if a and b are of type unsigned char, a+b can > be computed in the width of unsigned char without affecting the > result. However, it does matter for some operations: in LLVM, shifts > greater than the width of the left operand are undefined. Another > case where this matters: > void a(signed char* a, signed char b) {*a /= b;} > In this case, if a is -128 and b is -1, the correct result (using > clang's definition of signed integer conversion) is -128; however, > using the LLVM sdiv operator on i8, the result is undefined (and > actually crashes on X86). > > Attached patch fixes Sema to return the correcct computation type for > these cases. I'm not completely confident my fix is the right way to > fix this, though. I think I'd like some sort of review for this patch before I commit it. -Eli From kremenek at apple.com Tue May 27 10:44:07 2008 From: kremenek at apple.com (Ted Kremenek) Date: Tue, 27 May 2008 08:44:07 -0700 Subject: [cfe-dev] [PATCH] Function redeclaration and PR2360 In-Reply-To: References: Message-ID: On May 25, 2008, at 2:32 AM, Eli Friedman wrote: > This means that references to a function before a > redeclaration refer to the old declaration, and references to a > function after a redeclaration refer to the new declaration. 
I like the idea of having multiple FunctionDecls, and this will be useful for any client that wishes to perform source-level transformations. For example, refactoring clients will need to know the location of every function declaration corresponding to the same function. One caveat: if a client inspects two separate DeclRefExprs, each one referring to a different FunctionDecl for the same function, is there a good, standard way to compare if they refer to the same function? It would be nice if we didn't have to do a full traversal of the "PreviousDeclaration" list for both FunctionDecls. From an efficiency standpoint, the number of redeclarations is small, so doing the list traversal itself might not matter, but it might be nice to have something like a Decl::isEqual method that could be used to compare if two Decls are "the same" from a semantic perspective. From eli.friedman at gmail.com Tue May 27 10:55:53 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Tue, 27 May 2008 08:55:53 -0700 Subject: [cfe-dev] [PATCH] Function redeclaration and PR2360 In-Reply-To: References: Message-ID: On Tue, May 27, 2008 at 8:44 AM, Ted Kremenek wrote: > One caveat: if a client inspects two separate DeclRefExprs, each one > referring to a different FunctionDecl for the same function, is there a > good, standard way to compare if they refer to the same function? I think we'd need some sort of map for global names to implement this; two completely unrelated FunctionDecls can refer to the same function if they're in disjoint scopes. 
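The chain-walking comparison discussed above could look roughly like the sketch below. It is a minimal model under stated assumptions: struct decl, first_declaration and same_function are hypothetical names, not clang's FunctionDecl API, and the sketch only covers declarations linked through a "previous declaration" pointer. It does not handle the disjoint-scope case, which would still need the global name map.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a function declaration node. */
struct decl {
    const char *name;
    struct decl *previous; /* earlier declaration of the same function, or NULL */
};

/* Walk back to the first declaration in the redeclaration chain. */
const struct decl *first_declaration(const struct decl *d) {
    while (d != NULL && d->previous != NULL)
        d = d->previous;
    return d;
}

/* Two declarations are treated as the same function when their chains
   end at the same first declaration. Names are deliberately not compared. */
bool same_function(const struct decl *a, const struct decl *b) {
    return a != NULL && b != NULL &&
           first_declaration(a) == first_declaration(b);
}
```

Because redeclaration chains are expected to be short, the linear walk should be cheap; caching the first declaration in each node would be an alternative if that assumption turned out to be wrong.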
-Eli From kremenek at apple.com Tue May 27 11:05:38 2008 From: kremenek at apple.com (Ted Kremenek) Date: Tue, 27 May 2008 09:05:38 -0700 Subject: [cfe-dev] [PATCH] Function redeclaration and PR2360 In-Reply-To: References: Message-ID: On May 27, 2008, at 8:55 AM, Eli Friedman wrote: > On Tue, May 27, 2008 at 8:44 AM, Ted Kremenek > wrote: >> One caveat: if a client inspects two separate DeclRefExprs, each one >> referring to a different FunctionDecl for the same function, is >> there a >> good, standard way to compare if they refer to the same function? > > I think we'd need some sort of map for global names to implement this; > two completely unrelated FunctionDecls can refer to the same function > if they're in disjoint scopes. > > -Eli Yes, that makes sense. From akyrtzi at gmail.com Tue May 27 11:17:52 2008 From: akyrtzi at gmail.com (Argiris Kirtzidis) Date: Tue, 27 May 2008 09:17:52 -0700 Subject: [cfe-dev] [PATCH] Function redeclaration and PR2360 In-Reply-To: References: Message-ID: <483C3430.7050607@gmail.com> Do you guys disagree with the suggestion here ? : http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-May/001644.html I was going to implement that one. The result will be that DeclRefExprs that refer to the same function will always refer to the same FunctionDecl as well. -Argiris Ted Kremenek wrote: > > On May 27, 2008, at 8:55 AM, Eli Friedman wrote: > >> On Tue, May 27, 2008 at 8:44 AM, Ted Kremenek >> wrote: >>> One caveat: if a client inspects two separate DeclRefExprs, each one >>> referring to a different FunctionDecl for the same function, is there a >>> good, standard way to compare if they refer to the same function? >> >> I think we'd need some sort of map for global names to implement this; >> two completely unrelated FunctionDecls can refer to the same function >> if they're in disjoint scopes. >> >> -Eli > > Yes, that makes sense. 
> From kremenek at apple.com Tue May 27 11:33:47 2008 From: kremenek at apple.com (Ted Kremenek) Date: Tue, 27 May 2008 09:33:47 -0700 Subject: [cfe-dev] [PATCH] Function redeclaration and PR2360 In-Reply-To: <483C3430.7050607@gmail.com> References: <483C3430.7050607@gmail.com> Message-ID: That sounds fine to me. We will still need a global map when doing things like inter-procedural analysis or analysis across translation units, but having all DeclRefExprs refer to the first FunctionDecl seems like a nice simplification. I agree with Doug's email that clients interested in source information can just walk the chain of Decls. On May 27, 2008, at 9:17 AM, Argiris Kirtzidis wrote: > Do you guys disagree with the suggestion here ? : http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-May/001644.html > I was going to implement that one. The result will be that > DeclRefExprs that refer to the same function will always refer to > the same FunctionDecl as well. > > > -Argiris > > > Ted Kremenek wrote: >> >> On May 27, 2008, at 8:55 AM, Eli Friedman wrote: >> >>> On Tue, May 27, 2008 at 8:44 AM, Ted Kremenek >>> wrote: >>>> One caveat: if a client inspects two separate DeclRefExprs, each >>>> one >>>> referring to a different FunctionDecl for the same function, is >>>> there a >>>> good, standard way to compare if they refer to the same function? >>> >>> I think we'd need some sort of map for global names to implement >>> this; >>> two completely unrelated FunctionDecls can refer to the same >>> function >>> if they're in disjoint scopes. >>> >>> -Eli >> >> Yes, that makes sense. 
>> From eli.friedman at gmail.com Tue May 27 11:44:09 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Tue, 27 May 2008 09:44:09 -0700 Subject: [cfe-dev] [PATCH] Function redeclaration and PR2360 In-Reply-To: <483C3430.7050607@gmail.com> References: <483C3430.7050607@gmail.com> Message-ID: On Tue, May 27, 2008 at 9:17 AM, Argiris Kirtzidis wrote: > Do you guys disagree with the suggestion here ? : > http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-May/001644.html > I was going to implement that one. The result will be that DeclRefExprs that > refer to the same function will always refer to the same FunctionDecl as > well. Mmm... the issue with that exact approach is that we lose information: once the AST is completely constructed, you can't tell which declaration a DeclRef refers to. Usually, it doesn't really matter; however, a rewriting tool might be interested in knowing which declaration is referred to, and we'd have funny cases where the type of the DeclRef and the type of the Decl itself are different. Still might be workable, though. -Eli From kremenek at apple.com Tue May 27 11:52:30 2008 From: kremenek at apple.com (Ted Kremenek) Date: Tue, 27 May 2008 09:52:30 -0700 Subject: [cfe-dev] [PATCH] Function redeclaration and PR2360 In-Reply-To: References: <483C3430.7050607@gmail.com> Message-ID: <3F7D4D73-7D10-4AB3-BCBE-12ED25A39D34@apple.com> On May 27, 2008, at 9:44 AM, Eli Friedman wrote: > On Tue, May 27, 2008 at 9:17 AM, Argiris Kirtzidis > wrote: >> Do you guys disagree with the suggestion here ? : >> http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-May/001644.html >> I was going to implement that one. The result will be that >> DeclRefExprs that >> refer to the same function will always refer to the same >> FunctionDecl as >> well. > > Mmm... the issue with that exact approach is that we lose information: > once the AST is completely constructed, you can't tell which > declaration a DeclRef refers to. This makes sense. 
> Usually, it doesn't really matter; > however, a rewriting tool might be interested in knowing which > declaration is referred to, and we'd have funny cases where the type > of the DeclRef and the type of the Decl itself are different. Interesting. Can you give an example of this in the case of FunctionDecls? > Still > might be workable, though. It can be difficult to accurately reconstruct information after an "abstraction leak", but a global map (as you suggested) would provide the same functionality as Argiris's proposed change. And, as I already mentioned, we will need some global name resolution anyway, especially when dealing with multiple translation units. Handling multiple Decls within the same translation unit that refer to the same entity might just be a special case within that more general problem. From eli.friedman at gmail.com Tue May 27 12:09:58 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Tue, 27 May 2008 10:09:58 -0700 Subject: [cfe-dev] [PATCH] Function redeclaration and PR2360 In-Reply-To: <3F7D4D73-7D10-4AB3-BCBE-12ED25A39D34@apple.com> References: <483C3430.7050607@gmail.com> <3F7D4D73-7D10-4AB3-BCBE-12ED25A39D34@apple.com> Message-ID: On Tue, May 27, 2008 at 9:52 AM, Ted Kremenek wrote: > Interesting. Can you give an example of this in the case of FunctionDecls? int a(); int b(void) {a(1);} int a(int i) {return i;} int c(void) {return a(2);} The resolved "a" in b has type int(), while the one in c has type int(int). Not usually significant, but could matter; for example, int(void) is compatible with int(), but not int(int). 
-Eli From kremenek at apple.com Tue May 27 12:13:13 2008 From: kremenek at apple.com (Ted Kremenek) Date: Tue, 27 May 2008 10:13:13 -0700 Subject: [cfe-dev] [PATCH] Function redeclaration and PR2360 In-Reply-To: References: <483C3430.7050607@gmail.com> <3F7D4D73-7D10-4AB3-BCBE-12ED25A39D34@apple.com> Message-ID: <8382565C-BEA0-4D16-9F3A-A805902F0069@apple.com> On May 27, 2008, at 10:09 AM, Eli Friedman wrote: > On Tue, May 27, 2008 at 9:52 AM, Ted Kremenek > wrote: >> Interesting. Can you give an example of this in the case of >> FunctionDecls? > > int a(); > int b(void) {a(1);} > int a(int i) {return i;} > int c(void) {return a(2);} > > The resolved "a" in b has type int(), while the one in c has type > int(int). Not usually significant, but could matter; for example, > int(void) is compatible with int(), but not int(int). > > -Eli Okay. This is what I thought you were referring to, but I wanted to make sure. Thanks! From akyrtzi at gmail.com Tue May 27 13:19:31 2008 From: akyrtzi at gmail.com (Argiris Kirtzidis) Date: Tue, 27 May 2008 11:19:31 -0700 Subject: [cfe-dev] [PATCH] Function redeclaration and PR2360 In-Reply-To: References: Message-ID: <483C50B3.5030901@gmail.com> Eli Friedman wrote: > This means that references to a function before a > redeclaration refer to the old declaration, and references to a > function after a redeclaration refer to the new declaration. How about these cases: int a(int); // #1 int a(); // #2 int b(void) {a();} // should refer to #2 or is more accurate to refer to #1 ? And: int a(int x = 5); // #1 int a(int x); // #2 int b(void) {a();} // Shouldn't it refer to #1 ? 
-Argiris From mrs at apple.com Tue May 27 14:36:22 2008 From: mrs at apple.com (Mike Stump) Date: Tue, 27 May 2008 12:36:22 -0700 Subject: [cfe-dev] clang bug: constant array size is recognized as variable array size In-Reply-To: References: <8914b92d0805230317l6e594664t6a760a7d5abff81c@mail.gmail.com> Message-ID: On May 23, 2008, at 3:58 AM, Eli Friedman wrote: > On Fri, May 23, 2008 at 3:17 AM, Csaba Hruska > wrote: >> Hi! >> It's me again ;) >> I've found an another bug. >> >> gcc accepts it, but clang thows this: >> error: variable length array declared outside of any function >> >> I'v compiled it with clang svn revision 51478. >> >> code: >> #define C1 100. >> #define C2 120. >> >> typedef struct >> { >> int A [(unsigned int)(C1*C2)]; >> } s_t; > > gcc is wrong here; No, it isn't: [#2] A constant expression can be evaluated during translation rather than runtime, and accordingly may be used in any place that a constant may be. [#10] An implementation may accept other forms of constant expressions. > "(unsigned int)(100.*120.)" isn't an integer > constant expression (per the definition in C99 6.6), so A is in fact > an illegal VLA per the standard. Not sure what to do here. If one can do FP math at compile time, an error is needlessly pedantic. From dpatel at apple.com Tue May 27 17:07:42 2008 From: dpatel at apple.com (Devang Patel) Date: Tue, 27 May 2008 15:07:42 -0700 Subject: [cfe-dev] Objective-C top-level constructs code generation In-Reply-To: References: Message-ID: Hi David, > > Index: CGObjCRuntime.h > =================================================================== > --- CGObjCRuntime.h (revision 51550) > +++ CGObjCRuntime.h (working copy) > @@ -15,6 +15,7 @@ > > #ifndef CLANG_CODEGEN_OBCJRUNTIME_H > #define CLANG_CODEGEN_OBCJRUNTIME_H > +#include "llvm/ADT/SmallVector.h" > > namespace llvm { > class IRBuilder; > @@ -25,11 +26,15 @@ > class Function; > } > > +//FIXME: The capitalisation of methods in this class is horribly > inconsistent. 
This is not useful. In some sense, it encourages someone to be sloppy and hope that everything will be fixed! > > namespace clang { > namespace CodeGen { > > -// Implements runtime-specific code generation functions > +//FIXME Several methods should be pure virtual but aren't to avoid > the > +//partially-implemented subclass breaking. This is not specific. Remove this from here and add individual FIXMEs. > + > +/// Implements runtime-specific code generation functions. > class CGObjCRuntime { > public: > virtual ~CGObjCRuntime(); > @@ -41,16 +46,71 @@ > llvm::Value *Receiver, > llvm::Value *Selector, > llvm::Value** ArgV, > - unsigned ArgC) = 0; > + unsigned ArgC) =0; > /// Generate the function required to register all Objective-C > components in > /// this compilation unit with the runtime library. > - virtual llvm::Function *ModuleInitFunction() { return 0; } > + virtual llvm::Function *ModuleInitFunction() =0; > + /// Get a selector for the specified name and type values > + virtual llvm::Value *getSelector(llvm::IRBuilder &Builder, > + llvm::Value *SelName, > + llvm::Value *SelTypes) =0; > + /// Generate a constant string object > + virtual llvm::Constant *GenerateConstantString(const char > *String, const size_t > + length) =0; > + /// Generate a category. A category contains a list of methods > (and > + /// accompanying metadata) and a list of protocols. > + virtual void GenerateCategory(const char *ClassName, const char > *CategoryName, > + const llvm::SmallVector > &InstanceMethodNames, If you use llvm::SmallVectorImpl &InstanceMethodNames, then you do not hard-code the vector size here, and each runtime implementation can use its own appropriate size. > + const llvm::SmallVector > &InstanceMethodTypes, > + const llvm::SmallVector > &ClassMethodNames, > + const llvm::SmallVector > &ClassMethodTypes, > + const llvm::SmallVector &Protocols) =0; > + /// Generate a class stucture for this class.
> + virtual void GenerateClass( > + const char *ClassName, > + const char *SuperClassName, > + const int instanceSize, > + const llvm::SmallVector > &IvarNames, > + const llvm::SmallVector > &IvarTypes, > + const llvm::SmallVector > &IvarOffsets, > + const llvm::SmallVector > &InstanceMethodNames, > + const llvm::SmallVector > &InstanceMethodTypes, > + const llvm::SmallVector > &ClassMethodNames, > + const llvm::SmallVector > &ClassMethodTypes, > + const llvm::SmallVector &Protocols) =0; > + /// Generate a reference to the named protocol. > + virtual llvm::Value *GenerateProtocolRef(llvm::IRBuilder > &Builder, const char > + *ProtocolName) =0; > + virtual llvm::Value *generateMessageSendSuper(llvm::IRBuilder > &Builder, GenereateMessage... > + const llvm::Type > *ReturnTy, > + llvm::Value *Sender, > + const char > *SuperClassName, > + llvm::Value *Receiver, > + llvm::Value *Selector, > + llvm::Value** ArgV, > + unsigned ArgC) {return > NULL;}; > + /// Generate the named protocol. Protocols contain method > metadata but no > + /// implementations. 
> + virtual void GenerateProtocol(const char *ProtocolName, > + const llvm::SmallVector &Protocols, > + const llvm::SmallVector > &InstanceMethodNames, > + const llvm::SmallVector > &InstanceMethodTypes, > + const llvm::SmallVector &ClassMethodNames, > + const llvm::SmallVector > &ClassMethodTypes) =0; > /// Generate a function preamble for a method with the specified > types > - virtual llvm::Function *MethodPreamble(const llvm::Type *ReturnTy, > + virtual llvm::Function *MethodPreamble( > + const std::string > &ClassName, > + const std::string > &CategoryName, > + const std::string > &MethodName, > + const llvm::Type *ReturnTy, > const llvm::Type *SelfTy, > const llvm::Type **ArgTy, > unsigned ArgC, > + bool isClassMethod, > bool isVarArg) = 0; > + /// Look up the class for the specified name > + virtual llvm::Value *LookupClass(llvm::IRBuilder &Builder, > llvm::Value > + *ClassName) =0; > /// If instance variable addresses are determined at runtime then > this should > /// return true, otherwise instance variables will be accessed > directly from > /// the structure. If this returns true then @defs is invalid > for this > Index: CGObjCGNU.cpp > =================================================================== > --- CGObjCGNU.cpp (revision 51550) > +++ CGObjCGNU.cpp (working copy) > @@ -16,11 +16,25 @@ > #include "llvm/Support/Compiler.h" > #include "llvm/Support/IRBuilder.h" > #include "llvm/ADT/SmallVector.h" > +#include > +#include > > +// Some zeros used for GEPs in lots of places. > +static llvm::Constant *Zeros[] = > {llvm::ConstantInt::get(llvm::Type::Int32Ty, 0), > + llvm::ConstantInt::get(llvm::Type::Int32Ty, 0) }; > +static llvm::Constant *NULLPtr = llvm::ConstantPointerNull::get( > + llvm::PointerType::getUnqual(llvm::Type::Int8Ty)); > + > +// The version of the runtime that this class targets. Must match > the version > +// in the runtime. 
> +const static int RuntimeVersion = 8; > +static llvm::Constant *ProtocolVersion = > + llvm::ConstantInt::get(llvm::Type::Int32Ty, 2); > namespace { > class CGObjCGNU : public clang::CodeGen::CGObjCRuntime { > private: > llvm::Module &TheModule; > + const llvm::StructType *SelStructTy; > const llvm::Type *SelectorTy; > const llvm::Type *PtrToInt8Ty; > const llvm::Type *IMPTy; > @@ -29,10 +43,51 @@ > const llvm::Type *PtrTy; > const llvm::Type *LongTy; > const llvm::Type *PtrToIntTy; > + std::vector Classes; > + std::vector Categories; > + std::vector ConstantStrings; > + llvm::Function *LoadFunction; > + std::map ExistingProtocols; > + typedef std::pair TypedSelector; > + std::map TypedSelectors; > + std::map UntypedSelectors; > +private: > + llvm::Constant *GenerateIvarList( > + const llvm::SmallVector &IvarNames, > + const llvm::SmallVector &IvarTypes, > + const llvm::SmallVector &IvarOffsets); > + llvm::Constant *GenerateMethodList(const std::string &ClassName, > + const std::string &CategoryName, > + const llvm::SmallVector &MethodNames, > + const llvm::SmallVector &MethodTypes, > + bool isClassMethodList); > + llvm::Constant *GenerateProtocolList( > + const llvm::SmallVector &Protocols); > + llvm::Constant *GenerateClassStructure( > + llvm::Constant *MetaClass, > + llvm::Constant *SuperClass, > + unsigned info, > + llvm::Constant *Name, > + llvm::Constant *Version, > + llvm::Constant *InstanceSize, > + llvm::Constant *IVars, > + llvm::Constant *Methods, > + llvm::Constant *Protocols); > + llvm::Constant *GenerateProtocolMethodList( > + const llvm::SmallVector &MethodNames, > + const llvm::SmallVector &MethodTypes); > + llvm::Constant *MakeConstantString(const std::string &Str, const > std::string > + &Name=""); > + llvm::Constant *MakeGlobal(const llvm::StructType *Ty, > + std::vector V, std::string Name=""); > + llvm::Constant *MakeGlobal(const llvm::ArrayType *Ty, > + std::vector V, std::string Name=""); > public: > CGObjCGNU(llvm::Module &Mp, > const 
llvm::Type *LLVMIntType, > const llvm::Type *LLVMLongType); > + virtual llvm::Constant *GenerateConstantString(const char > *String, const size_t > + length); > virtual llvm::Value *generateMessageSend(llvm::IRBuilder &Builder, > const llvm::Type > *ReturnTy, > llvm::Value *Sender, > @@ -40,17 +95,74 @@ > llvm::Value *Selector, > llvm::Value** ArgV, > unsigned ArgC); > - llvm::Value *getSelector(llvm::IRBuilder &Builder, > + virtual llvm::Value *generateMessageSendSuper(llvm::IRBuilder > &Builder, > + const llvm::Type > *ReturnTy, > + llvm::Value *Sender, > + const char > *SuperClassName, > + llvm::Value *Receiver, > + llvm::Value *Selector, > + llvm::Value** ArgV, > + unsigned ArgC); > + virtual llvm::Value *LookupClass(llvm::IRBuilder &Builder, > llvm::Value > + *ClassName); > + virtual llvm::Value *getSelector(llvm::IRBuilder &Builder, GetSelector ... > llvm::Value *SelName, > llvm::Value *SelTypes); > - virtual llvm::Function *MethodPreamble(const llvm::Type *ReturnTy, > - const llvm::Type *SelfTy, > - const llvm::Type **ArgTy, > - unsigned ArgC, > - bool isVarArg); > + virtual llvm::Function *MethodPreamble( > + const std::string > &ClassName, > + const std::string > &CategoryName, > + const std::string > &MethodName, > + const llvm::Type *ReturnTy, > + const llvm::Type *SelfTy, > + const llvm::Type **ArgTy, > + unsigned ArgC, > + bool isClassMethod, > + bool isVarArg); > + virtual void GenerateCategory(const char *ClassName, const char > *CategoryName, > + const llvm::SmallVector > &InstanceMethodNames, > + const llvm::SmallVector > &InstanceMethodTypes, > + const llvm::SmallVector > &ClassMethodNames, > + const llvm::SmallVector > &ClassMethodTypes, > + const llvm::SmallVector > &Protocols); > + virtual void GenerateClass( > + const char *ClassName, > + const char *SuperClassName, > + const int instanceSize, > + const llvm::SmallVector > &IvarNames, > + const llvm::SmallVector > &IvarTypes, > + const llvm::SmallVector > &IvarOffsets, > + const 
llvm::SmallVector > &InstanceMethodNames, > + const llvm::SmallVector > &InstanceMethodTypes, > + const llvm::SmallVector > &ClassMethodNames, > + const llvm::SmallVector > &ClassMethodTypes, > + const llvm::SmallVector > &Protocols); > + virtual llvm::Value *GenerateProtocolRef(llvm::IRBuilder > &Builder, const char > + *ProtocolName); > + virtual void GenerateProtocol(const char *ProtocolName, > + const llvm::SmallVector &Protocols, > + const llvm::SmallVector > &InstanceMethodNames, > + const llvm::SmallVector > &InstanceMethodTypes, > + const llvm::SmallVector > &ClassMethodNames, > + const llvm::SmallVector > &ClassMethodTypes); > + virtual llvm::Function *ModuleInitFunction(); > }; > } // end anonymous namespace > > + > + > +static std::string SymbolNameForClass(std::string ClassName) { > + return ".objc_class_" + ClassName; > +} > + > +static std::string SymbolNameForMethod(const std::string > &ClassName, const > + std::string CategoryName, const std::string MethodName, bool > isClassMethod) > +{ > + if (isClassMethod) { > + return "._objc_method_" + ClassName +"("+CategoryName+")"+ "+" > MethodName; > + } Nitpick: use { } only if the block is more than one line. > + return "._objc_method_" + ClassName +"("+CategoryName+")"+ "-" + > MethodName; > +} > + > CGObjCGNU::CGObjCGNU(llvm::Module &M, > const llvm::Type *LLVMIntType, > const llvm::Type *LLVMLongType) : > @@ -62,11 +174,12 @@ > PtrToInt8Ty = > llvm::PointerType::getUnqual(llvm::Type::Int8Ty); > // Get the selector Type. 
> - const llvm::Type *SelStructTy = llvm::StructType::get( > + SelStructTy = llvm::StructType::get( > PtrToInt8Ty, > PtrToInt8Ty, > NULL); > SelectorTy = llvm::PointerType::getUnqual(SelStructTy); > + M.addTypeName(".objc_selector", SelectorTy); > PtrToIntTy = llvm::PointerType::getUnqual(IntTy); > PtrTy = PtrToInt8Ty; > > @@ -77,24 +190,65 @@ > llvm::cast(OpaqueObjTy.get())- > >refineAbstractTypeTo(IdTy); > IdTy = llvm::cast(OpaqueObjTy.get()); > IdTy = llvm::PointerType::getUnqual(IdTy); > + M.addTypeName(".objc_id", IdTy); > > // IMP type > std::vector IMPArgs; > IMPArgs.push_back(IdTy); > IMPArgs.push_back(SelectorTy); > IMPTy = llvm::FunctionType::get(IdTy, IMPArgs, true); > - > + M.addTypeName(".objc_imp", IMPTy); > } > +// This has to perform the lookup every time, since posing and > related > +// techniques can modify the name -> class mapping. > +llvm::Value *CGObjCGNU::LookupClass(llvm::IRBuilder &Builder, > + llvm::Value *ClassName) { > + llvm::Constant *ClassLookupFn = > + TheModule.getOrInsertFunction("objc_lookup_class", IdTy, > PtrToInt8Ty, > + NULL); > + return Builder.CreateCall(ClassLookupFn, ClassName); > +} > > /// Looks up the selector for the specified name / type pair. > // FIXME: Selectors should be statically cached, not looked up on > every call. > llvm::Value *CGObjCGNU::getSelector(llvm::IRBuilder &Builder, > llvm::Value *SelName, > - llvm::Value *SelTypes) > -{ > - // Look up the selector. > + llvm::Value *SelTypes) { > + // For static selectors, we return an alias for now then store > them all in a > + // list that the runtime will initialise later. > + if (llvm::Constant *CName = > llvm::dyn_cast(SelName)) { > + // Untyped selector > + if (SelTypes == 0) { > + // If it's already cached, return it. > + if (UntypedSelectors[CName->getStringValue()]) { > + return Builder.CreateLoad(UntypedSelectors[CName- > >getStringValue()]); > + } > + // If it isn't, cache it. 
> + llvm::GlobalAlias *Sel = new > llvm::GlobalAlias(llvm::PointerType::getUnqual(SelectorTy), > + llvm::GlobalValue::InternalLinkage, > ".objc_untyped_selector_alias", NULL, > + &TheModule); > + UntypedSelectors[CName->getStringValue()] = Sel; > + return Builder.CreateLoad(Sel); > + } > + // Typed selectors > + if (llvm::Constant *CTypes = > llvm::dyn_cast(SelTypes)) { > + TypedSelector Selector = TypedSelector(CName->getStringValue(), > + CTypes->getStringValue()); > + // If it's already cached, return it. > + if (TypedSelectors[Selector]) { > + return Builder.CreateLoad(TypedSelectors[Selector]); > + } > + // If it isn't, cache it. > + llvm::GlobalAlias *Sel = new > llvm::GlobalAlias(llvm::PointerType::getUnqual(SelectorTy), > + llvm::GlobalValue::InternalLinkage, > ".objc_typed_selector_alias", NULL, > + &TheModule); > + TypedSelectors[Selector] = Sel; > + return Builder.CreateLoad(Sel); > + } > + } > + // Dynamically look up selectors from non-constant sources > llvm::Value *cmd; > - if(SelTypes == 0) { > + if (SelTypes == 0) { > llvm::Constant *SelFunction = > TheModule.getOrInsertFunction("sel_get_uid", > SelectorTy, > PtrToInt8Ty, > @@ -114,10 +268,88 @@ > } > > > +llvm::Constant *CGObjCGNU::MakeConstantString(const std::string > &Str, const > + std::string &Name) { > + llvm::Constant * ConstStr = llvm::ConstantArray::get(Str); > + ConstStr = new llvm::GlobalVariable(ConstStr->getType(), true, > + llvm::GlobalValue::InternalLinkage, > + ConstStr, Name, &TheModule); > + return llvm::ConstantExpr::getGetElementPtr(ConstStr, Zeros, 2); > +} > +llvm::Constant *CGObjCGNU::MakeGlobal(const llvm::StructType *Ty, > + std::vector V, std::string Name) { Please use vector reference here > + llvm::Constant *C = llvm::ConstantStruct::get(Ty, V); > + return new llvm::GlobalVariable(Ty, false, > + llvm::GlobalValue::InternalLinkage, C, Name, &TheModule); > +} > +llvm::Constant *CGObjCGNU::MakeGlobal(const llvm::ArrayType *Ty, > + std::vector V, std::string Name) { and 
here. > + llvm::Constant *C = llvm::ConstantArray::get(Ty, V); > + return new llvm::GlobalVariable(Ty, false, > + llvm::GlobalValue::InternalLinkage, C, Name, &TheModule); > +} > + > +/// Generate an NSConstantString object. > +//TODO: In case there are any crazy people still using Objective-C > without an > +//OpenStep implementation, this should let them select their own > class for > +//constant strings. :) You want to say "GNU Objective-C runtime" here. > +llvm::Constant *CGObjCGNU::GenerateConstantString(const char > *String, const > + size_t length) { > + std::vector Ivars; > + Ivars.push_back(NULLPtr); > + Ivars.push_back(MakeConstantString(String)); > + Ivars.push_back(llvm::ConstantInt::get(IntTy, length)); > + llvm::Constant *ObjCStr = MakeGlobal( > + llvm::StructType::get(PtrToInt8Ty, PtrToInt8Ty, IntTy, NULL), > + Ivars, ".objc_str"); > + ConstantStrings.push_back( > + llvm::ConstantExpr::getBitCast(ObjCStr, PtrToInt8Ty)); > + return ObjCStr; > +} > +llvm::Value *CGObjCGNU::generateMessageSendSuper(llvm::IRBuilder > &Builder, > + const llvm::Type > *ReturnTy, > + llvm::Value *Sender, > + const char > *SuperClassName, > + llvm::Value *Receiver, > + llvm::Value *Selector, > + llvm::Value** ArgV, > + unsigned ArgC) { > + // TODO: This should be cached, not looked up every time. > + llvm::Value *ReceiverClass = LookupClass(Builder, > + MakeConstantString(SuperClassName)); > + llvm::Value *cmd = getSelector(Builder, Selector, 0); > + std::vector impArgTypes; > + impArgTypes.push_back(Receiver->getType()); > + impArgTypes.push_back(SelectorTy); > + > + // Avoid an explicit cast on the IMP by getting a version that > has the right > + // return type. 
> + llvm::FunctionType *impType = llvm::FunctionType::get(ReturnTy, > impArgTypes, > + true); > + // Construct the structure used to look up the IMP > + llvm::StructType *ObjCSuperTy = llvm::StructType::get(Receiver- > >getType(), > + IdTy, NULL); > + llvm::Value *ObjCSuper = Builder.CreateAlloca(ObjCSuperTy); > + Builder.CreateStore(Receiver, Builder.CreateStructGEP(ObjCSuper, > 0)); > + Builder.CreateStore(ReceiverClass, > Builder.CreateStructGEP(ObjCSuper, 1)); > + > + // Get the IMP > + llvm::Constant *lookupFunction = > + TheModule.getOrInsertFunction("objc_msg_lookup_super", > + > llvm::PointerType::getUnqual(impType), > + > llvm::PointerType::getUnqual(ObjCSuperTy), > + SelectorTy, NULL); > + llvm::Value *lookupArgs[] = {ObjCSuper, cmd}; > + llvm::Value *imp = Builder.CreateCall(lookupFunction, lookupArgs, > lookupArgs+2); > + > + // Call the method > + llvm::SmallVector callArgs; > + callArgs.push_back(Receiver); > + callArgs.push_back(cmd); > + callArgs.insert(callArgs.end(), ArgV, ArgV+ArgC); > + return Builder.CreateCall(imp, callArgs.begin(), callArgs.end()); > +} > /// Generate code for a message send expression on the GNU runtime. > -// FIXME: Much of this code will need factoring out later. > -// TODO: This should take a sender argument (pointer to self in the > calling > -// context) > llvm::Value *CGObjCGNU::generateMessageSend(llvm::IRBuilder &Builder, > const llvm::Type > *ReturnTy, > llvm::Value *Sender, > @@ -129,12 +361,21 @@ > > // Look up the method implementation. > std::vector impArgTypes; > + const llvm::Type *RetTy; > + if (ReturnTy->isFirstClassType() && ReturnTy != > llvm::Type::VoidTy) { isSingleType() is now preferred over isFirstClassType(), here and in the other places where you check the return type. > + RetTy = ReturnTy; > + } else { > + // For struct returns allocate the space in the caller and pass > it up to > + // the sender. Note: LLVM is moving toward allowing functions to return aggregates directly. 
> + RetTy = llvm::Type::VoidTy; > + impArgTypes.push_back(llvm::PointerType::getUnqual(ReturnTy)); > + } > impArgTypes.push_back(Receiver->getType()); > impArgTypes.push_back(SelectorTy); > > // Avoid an explicit cast on the IMP by getting a version that > has the right > // return type. > - llvm::FunctionType *impType = llvm::FunctionType::get(ReturnTy, > impArgTypes, > + llvm::FunctionType *impType = llvm::FunctionType::get(RetTy, > impArgTypes, > true); > > llvm::Constant *lookupFunction = > @@ -144,20 +385,480 @@ > llvm::Value *imp = Builder.CreateCall2(lookupFunction, Receiver, > cmd); > > // Call the method. > - llvm::SmallVector lookupArgs; > - lookupArgs.push_back(Receiver); > - lookupArgs.push_back(cmd); > - lookupArgs.insert(lookupArgs.end(), ArgV, ArgV+ArgC); > - return Builder.CreateCall(imp, lookupArgs.begin(), > lookupArgs.end()); > + if (ReturnTy->isFirstClassType() && ReturnTy != > llvm::Type::VoidTy) { > + llvm::SmallVector Args; > + Args.push_back(Receiver); > + Args.push_back(cmd); > + Args.insert(Args.end(), ArgV, ArgV+ArgC); > + return Builder.CreateCall(imp, Args.begin(), Args.end()); > + } else { > + llvm::SmallVector Args; > + llvm::Value *Return = Builder.CreateAlloca(ReturnTy); > + Args.push_back(Return); > + Args.push_back(Receiver); > + Args.push_back(cmd); > + Args.insert(Args.end(), ArgV, ArgV+ArgC); > + Builder.CreateCall(imp, Args.begin(), Args.end()); > + return Return; > + } > } > > +/// Generates a MethodList. Used in construction of a objc_class and > +/// objc_category structures. > +llvm::Constant *CGObjCGNU::GenerateMethodList(const std::string > &ClassName, > + const std::string &CategoryName, const > llvm::SmallVector + 16> &MethodNames, const llvm::SmallVector > + &MethodTypes, bool isClassMethodList) { > + // Get the method structure type. > + llvm::StructType *ObjCMethodTy = llvm::StructType::get( > + PtrToInt8Ty, // Really a selector, but the runtime creates it us. 
> + PtrToInt8Ty, // Method types > + llvm::PointerType::getUnqual(IMPTy), //Method pointer > + NULL); > + std::vector Methods; > + std::vector Elements; > + for(unsigned int i=0 ; i + Elements.clear(); > + > Elements > .push_back( llvm::ConstantExpr::getGetElementPtr(MethodNames[i], > + Zeros, 2)); > + Elements.push_back( > + llvm::ConstantExpr::getGetElementPtr(MethodTypes[i], > Zeros, 2)); > + llvm::Constant *Method = > + TheModule.getFunction(SymbolNameForMethod(ClassName, > CategoryName, > + MethodNames[i]->getStringValue(), isClassMethodList)); > + Method = llvm::ConstantExpr::getBitCast(Method, > + llvm::PointerType::getUnqual(IMPTy)); > + Elements.push_back(Method); > + Methods.push_back(llvm::ConstantStruct::get(ObjCMethodTy, > Elements)); > + } > + > + // Array of method structures > + llvm::ArrayType *ObjCMethodArrayTy = > llvm::ArrayType::get(ObjCMethodTy, > + MethodNames.size()); > + llvm::Constant *MethodArray = > llvm::ConstantArray::get(ObjCMethodArrayTy, > + Methods); > + > + // Structure containing list pointer, array and array count > + llvm::SmallVector ObjCMethodListFields; > + llvm::PATypeHolder OpaqueNextTy = llvm::OpaqueType::get(); > + llvm::Type *NextPtrTy = llvm::PointerType::getUnqual(OpaqueNextTy); > + llvm::StructType *ObjCMethodListTy = > llvm::StructType::get(NextPtrTy, > + IntTy, > + ObjCMethodArrayTy, > + NULL); > + // Refine next pointer type to concrete type > + llvm::cast( > + OpaqueNextTy.get())->refineAbstractTypeTo(ObjCMethodListTy); > + ObjCMethodListTy = > llvm::cast(OpaqueNextTy.get()); > + > + Methods.clear(); > + Methods.push_back(llvm::ConstantPointerNull::get( > + llvm::PointerType::getUnqual(ObjCMethodListTy))); > + Methods.push_back(llvm::ConstantInt::get(llvm::Type::Int32Ty, > + MethodTypes.size())); > + Methods.push_back(MethodArray); > + > + // Create an instance of the structure > + return MakeGlobal(ObjCMethodListTy, Methods, ".objc_method_list"); > +} > + > +/// Generates an IvarList. 
Used in construction of a objc_class > +llvm::Constant *CGObjCGNU::GenerateIvarList( > + const llvm::SmallVector &IvarNames, > + const llvm::SmallVector &IvarTypes, > + const llvm::SmallVector &IvarOffsets) { > + // Get the method structure type. > + llvm::StructType *ObjCIvarTy = llvm::StructType::get( > + PtrToInt8Ty, > + PtrToInt8Ty, > + IntTy, > + NULL); > + std::vector Ivars; > + std::vector Elements; > + for(unsigned int i=0 ; i + Elements.clear(); > + > Elements.push_back( llvm::ConstantExpr::getGetElementPtr(IvarNames[i], > + Zeros, 2)); > + > Elements.push_back( llvm::ConstantExpr::getGetElementPtr(IvarTypes[i], > + Zeros, 2)); > + Elements.push_back(IvarOffsets[i]); > + Ivars.push_back(llvm::ConstantStruct::get(ObjCIvarTy, Elements)); > + } > + > + // Array of method structures > + llvm::ArrayType *ObjCIvarArrayTy = llvm::ArrayType::get(ObjCIvarTy, > + IvarNames.size()); > + > + > + Elements.clear(); > + Elements.push_back(llvm::ConstantInt::get( > + llvm::cast(IntTy), > (int)IvarNames.size())); > + Elements.push_back(llvm::ConstantArray::get(ObjCIvarArrayTy, > Ivars)); > + // Structure containing array and array count > + llvm::StructType *ObjCIvarListTy = llvm::StructType::get(IntTy, > + ObjCIvarArrayTy, > + NULL); > + > + // Create an instance of the structure > + return MakeGlobal(ObjCIvarListTy, Elements, ".objc_ivar_list"); > +} > + > +/// Generate a class structure > +llvm::Constant *CGObjCGNU::GenerateClassStructure( > + llvm::Constant *MetaClass, > + llvm::Constant *SuperClass, > + unsigned info, > + llvm::Constant *Name, > + llvm::Constant *Version, > + llvm::Constant *InstanceSize, > + llvm::Constant *IVars, > + llvm::Constant *Methods, > + llvm::Constant *Protocols) { > + // Set up the class structure > + // Note: Several of these are char*s when they should be ids. > This is > + // because the runtime performs this translation on load. 
> + llvm::StructType *ClassTy = llvm::StructType::get( > + PtrToInt8Ty, // class_pointer > + PtrToInt8Ty, // super_class > + PtrToInt8Ty, // name > + LongTy, // version > + LongTy, // info > + LongTy, // instance_size > + IVars->getType(), // ivars > + Methods->getType(), // methods > + // These are all filled in by the runtime, so we pretend > + PtrTy, // dtable > + PtrTy, // subclass_list > + PtrTy, // sibling_class > + PtrTy, // protocols > + PtrTy, // gc_object_type > + NULL); > + llvm::Constant *Zero = llvm::ConstantInt::get(LongTy, 0); > + llvm::Constant *NullP = > + > llvm::ConstantPointerNull::get(llvm::cast(PtrTy)); > + // Fill in the structure > + std::vector Elements; > + Elements.push_back(llvm::ConstantExpr::getBitCast(MetaClass, > PtrToInt8Ty)); > + Elements.push_back(SuperClass); > + Elements.push_back(Name); > + Elements.push_back(Zero); > + Elements.push_back(llvm::ConstantInt::get(LongTy, info)); > + Elements.push_back(InstanceSize); > + Elements.push_back(IVars); > + Elements.push_back(Methods); > + Elements.push_back(NullP); > + Elements.push_back(NullP); > + Elements.push_back(NullP); > + Elements.push_back(llvm::ConstantExpr::getBitCast(Protocols, > PtrTy)); > + Elements.push_back(NullP); > + // Create an instance of the structure > + return MakeGlobal(ClassTy, Elements, SymbolNameForClass(Name- > >getStringValue())); > +} Add extra line before starting new function. > +llvm::Constant *CGObjCGNU::GenerateProtocolMethodList( > + const llvm::SmallVector &MethodNames, > + const llvm::SmallVector &MethodTypes) { > + // Get the method structure type. > + llvm::StructType *ObjCMethodDescTy = llvm::StructType::get( > + PtrToInt8Ty, // Really a selector, but the runtime does the > casting for us. 
> + PtrToInt8Ty, > + NULL); > + std::vector Methods; > + std::vector Elements; > + for(unsigned int i=0 ; i + Elements.clear(); > + > Elements > .push_back( llvm::ConstantExpr::getGetElementPtr(MethodNames[i], > + Zeros, 2)); > + Elements.push_back( > + llvm::ConstantExpr::getGetElementPtr(MethodTypes[i], > Zeros, 2)); > + Methods.push_back(llvm::ConstantStruct::get(ObjCMethodDescTy, > Elements)); > + } > + llvm::ArrayType *ObjCMethodArrayTy = > llvm::ArrayType::get(ObjCMethodDescTy, > + MethodNames.size()); > + llvm::Constant *Array = > llvm::ConstantArray::get(ObjCMethodArrayTy, Methods); > + llvm::StructType *ObjCMethodDescListTy = llvm::StructType::get( > + IntTy, ObjCMethodArrayTy, NULL); > + Methods.clear(); > + Methods.push_back(llvm::ConstantInt::get(IntTy, > MethodNames.size())); > + Methods.push_back(Array); > + return MakeGlobal(ObjCMethodDescListTy, Methods, > ".objc_method_list"); > +} > +// Create the protocol list structure used in classes, categories > and so on > +llvm::Constant *CGObjCGNU::GenerateProtocolList( > + const llvm::SmallVector &Protocols) { > + llvm::ArrayType *ProtocolArrayTy = > llvm::ArrayType::get(PtrToInt8Ty, > + Protocols.size()); > + llvm::StructType *ProtocolListTy = llvm::StructType::get( > + PtrTy, //Should be a recurisve pointer, but it's always NULL > here. 
> + LongTy,//FIXME: Should be size_t > + ProtocolArrayTy, > + NULL); > + std::vector Elements; > + for(const std::string *iter=Protocols.begin() ; iter != > Protocols.end() ; > + iter++) { > + llvm::Constant *Ptr = > + llvm::ConstantExpr::getBitCast(ExistingProtocols[*iter], > PtrToInt8Ty); > + Elements.push_back(Ptr); > + } > + llvm::Constant * ProtocolArray = > llvm::ConstantArray::get(ProtocolArrayTy, > + Elements); > + Elements.clear(); > + Elements.push_back(NULLPtr); > + > Elements > .push_back > (llvm::ConstantInt::get(llvm::cast(LongTy), > + Protocols.size())); > + Elements.push_back(ProtocolArray); > + return MakeGlobal(ProtocolListTy, Elements, ".objc_protocol_list"); > +} > +llvm::Value *CGObjCGNU::GenerateProtocolRef(llvm::IRBuilder > &Builder, const char > + *ProtocolName) { > + return ExistingProtocols[ProtocolName]; > +} > +void CGObjCGNU::GenerateProtocol(const char *ProtocolName, > + const llvm::SmallVector &Protocols, > + const llvm::SmallVector > &InstanceMethodNames, > + const llvm::SmallVector > &InstanceMethodTypes, > + const llvm::SmallVector &ClassMethodNames, > + const llvm::SmallVector > &ClassMethodTypes) { > + > + llvm::Constant *ProtocolList = GenerateProtocolList(Protocols); > + llvm::Constant *InstanceMethodList = > + GenerateProtocolMethodList(InstanceMethodNames, > InstanceMethodTypes); > + llvm::Constant *ClassMethodList = > + GenerateProtocolMethodList(ClassMethodNames, ClassMethodTypes); > + // Protocols are objects containing lists of the methods > implemented and > + // protocols adopted. > + llvm::StructType *ProtocolTy = llvm::StructType::get(IdTy, > + PtrToInt8Ty, > + ProtocolList->getType(), > + InstanceMethodList->getType(), > + ClassMethodList->getType(), > + NULL); > + std::vector Elements; > + // The isa pointer must be set to a magic number so the runtime > knows it's > + // the correct layout. 
> + > Elements.push_back(llvm::ConstantExpr::getIntToPtr(ProtocolVersion, > IdTy)); > + Elements.push_back(MakeConstantString(ProtocolName, > ".objc_protocol_name")); > + Elements.push_back(ProtocolList); > + Elements.push_back(InstanceMethodList); > + Elements.push_back(ClassMethodList); > + ExistingProtocols[ProtocolName] = > + llvm::ConstantExpr::getBitCast(MakeGlobal(ProtocolTy, Elements, > + ".objc_protocol"), IdTy); > +} > + > +void CGObjCGNU::GenerateCategory( > + const char *ClassName, > + const char *CategoryName, > + const llvm::SmallVector > &InstanceMethodNames, > + const llvm::SmallVector > &InstanceMethodTypes, > + const llvm::SmallVector > &ClassMethodNames, > + const llvm::SmallVector > &ClassMethodTypes, > + const llvm::SmallVector &Protocols) { > + std::vector Elements; > + Elements.push_back(MakeConstantString(CategoryName)); > + Elements.push_back(MakeConstantString(ClassName)); > + // Instance method list > + > Elements > .push_back > (llvm::ConstantExpr::getBitCast(GenerateMethodList(ClassName, > + CategoryName, InstanceMethodNames, InstanceMethodTypes, > false), > + PtrTy)); > + // Class method list > + > Elements > .push_back > (llvm::ConstantExpr::getBitCast(GenerateMethodList(ClassName, > + CategoryName, ClassMethodNames, ClassMethodTypes, true), > PtrTy)); > + // Protocol list > + > Elements > .push_back > (llvm::ConstantExpr::getBitCast(GenerateProtocolList(Protocols), > + PtrTy)); > + Categories.push_back(llvm::ConstantExpr::getBitCast( > + MakeGlobal(llvm::StructType::get(PtrToInt8Ty, PtrToInt8Ty, > PtrTy, > + PtrTy, PtrTy, NULL), Elements), PtrTy)); > +} > +void CGObjCGNU::GenerateClass( > + const char *ClassName, > + const char *SuperClassName, > + const int instanceSize, > + const llvm::SmallVector > &IvarNames, > + const llvm::SmallVector > &IvarTypes, > + const llvm::SmallVector > &IvarOffsets, > + const llvm::SmallVector > &InstanceMethodNames, > + const llvm::SmallVector > &InstanceMethodTypes, > + const llvm::SmallVector > 
&ClassMethodNames, > + const llvm::SmallVector > &ClassMethodTypes, > + const llvm::SmallVector &Protocols) { > + // Get the superclass pointer. > + llvm::Constant *SuperClass; > + if (SuperClassName) { > + SuperClass = MakeConstantString(SuperClassName, > ".super_class_name"); > + } else { > + SuperClass = llvm::ConstantPointerNull::get( > + llvm::cast(PtrToInt8Ty)); > + } > + llvm::Constant * Name = MakeConstantString(ClassName, > ".class_name"); > + // Empty vector used to construct empty method lists > + llvm::SmallVector empty; > + // Generate the method and instance variable lists > + llvm::Constant *MethodList = GenerateMethodList(ClassName, "", > + InstanceMethodNames, InstanceMethodTypes, false); > + llvm::Constant *ClassMethodList = GenerateMethodList(ClassName, "", > + ClassMethodNames, ClassMethodTypes, true); > + llvm::Constant *IvarList = GenerateIvarList(IvarNames, IvarTypes, > + IvarOffsets); > + //Generate metaclass for class methods > + llvm::Constant *MetaClassStruct = GenerateClassStructure(NULLPtr, > + NULLPtr, 0x2L, NULLPtr, 0, Zeros[0], GenerateIvarList( > + empty, empty, empty), ClassMethodList, NULLPtr); > + // Generate the class structure > + llvm::Constant *ClassStruct = > GenerateClassStructure(MetaClassStruct, > + SuperClass, 0x1L, Name, 0, > + llvm::ConstantInt::get(llvm::Type::Int32Ty, instanceSize), > IvarList, > + MethodList, GenerateProtocolList(Protocols)); > + // Add class structure to list to be added to the symtab later > + ClassStruct = llvm::ConstantExpr::getBitCast(ClassStruct, > PtrToInt8Ty); > + Classes.push_back(ClassStruct); > +} > + > +llvm::Function *CGObjCGNU::ModuleInitFunction() { > + // Only emit an ObjC load function if no Objective-C stuff has > been called > + if (Classes.size() + Categories.size() + ConstantStrings.size() + > + ExistingProtocols.size() + TypedSelectors.size() + > + UntypedSelectors.size() == 0) { > + return NULL; > + } > + std::vector Elements; > + // Generate statics list: > + llvm::ArrayType 
*StaticsArrayTy = llvm::ArrayType::get(PtrToInt8Ty, > + ConstantStrings.size() + 1); > + ConstantStrings.push_back(NULLPtr); > + Elements.push_back(MakeConstantString("NSConstantString", > + ".objc_static_class_name")); > + Elements.push_back(llvm::ConstantArray::get(StaticsArrayTy, > ConstantStrings)); > + llvm::StructType *StaticsListTy = > llvm::StructType::get(PtrToInt8Ty, > + StaticsArrayTy, NULL); > + llvm::Constant *Statics = MakeGlobal(StaticsListTy, Elements, > ".objc_statics"); > + Statics = new > + > llvm::GlobalVariable(llvm::PointerType::getUnqual(StaticsListTy), > false, > + llvm::GlobalValue::InternalLinkage, Statics, > ".objc_statics_ptr", > + &TheModule); > + Statics = llvm::ConstantExpr::getBitCast(Statics, PtrTy); > + // Array of classes, categories, and constant objects > + llvm::ArrayType *ClassListTy = llvm::ArrayType::get(PtrToInt8Ty, > + Classes.size() + Categories.size() + 2); > + llvm::StructType *SymTabTy = llvm::StructType::get( > + LongTy, > + SelectorTy, > + llvm::Type::Int16Ty, > + llvm::Type::Int16Ty, > + ClassListTy, > + NULL); > + > + Elements.clear(); > + // Pointer to an array of selectors used in this module. > + std::vector Selectors; > + for(std::map::iterator > + iter=TypedSelectors.begin() ; iter!=TypedSelectors.end() ; > iter++) { > + Elements.push_back(MakeConstantString((*iter).first.first, > ".objc_sel_name")); > + Elements.push_back(MakeConstantString((*iter).first.first, > ".objc_sel_types")); > + Selectors.push_back(llvm::ConstantStruct::get(SelStructTy, > Elements)); > + Elements.clear(); > + } > + for(std::map::iterator > + iter=UntypedSelectors.begin() ; iter! 
> =UntypedSelectors.end() ; iter++) { > + Elements.push_back(MakeConstantString((*iter).first, > ".objc_sel_name")); > + Elements.push_back(NULLPtr); > + Selectors.push_back(llvm::ConstantStruct::get(SelStructTy, > Elements)); > + Elements.clear(); > + } > + Elements.push_back(NULLPtr); > + Elements.push_back(NULLPtr); > + Selectors.push_back(llvm::ConstantStruct::get(SelStructTy, > Elements)); > + Elements.clear(); > + // Number of static selectors > + Elements.push_back(llvm::ConstantInt::get(LongTy, > Selectors.size() )); > + llvm::Constant *SelectorList = MakeGlobal( > + llvm::ArrayType::get(SelStructTy, Selectors.size()), > Selectors, > + ".objc_selector_list"); > + Elements.push_back(llvm::ConstantExpr::getBitCast(SelectorList, > SelectorTy)); > + > + // Now that all of the static selectors exist, create pointers to > them. > + int index = 0; > + for(std::map::iterator > + iter=TypedSelectors.begin() ; iter!=TypedSelectors.end() ; > iter++) { Please use for(std::map::iterator iter=TypedSelectors.begin(), iterEnd =TypedSelectors.end(); iter != iterEnd; ++iter) form here and other places. > + llvm::Constant *Idxs[] = {Zeros[0], > + llvm::ConstantInt::get(llvm::Type::Int32Ty, index++), > Zeros[0]}; > + llvm::GlobalVariable *SelPtr = new > llvm::GlobalVariable(SelectorTy, true, > + llvm::GlobalValue::InternalLinkage, > + llvm::ConstantExpr::getGetElementPtr(SelectorList, Idxs, 2), > + ".objc_sel_ptr", &TheModule); > + (*iter).second->setAliasee(SelPtr); > + } > + for(std::map::iterator > + iter=UntypedSelectors.begin() ; iter! 
> =UntypedSelectors.end() ; iter++) { > + llvm::Constant *Idxs[] = {Zeros[0], > + llvm::ConstantInt::get(llvm::Type::Int32Ty, index++), > Zeros[0]}; > + llvm::GlobalVariable *SelPtr = new > llvm::GlobalVariable(SelectorTy, true, > + llvm::GlobalValue::InternalLinkage, > + llvm::ConstantExpr::getGetElementPtr(SelectorList, Idxs, 2), > + ".objc_sel_ptr", &TheModule); > + (*iter).second->setAliasee(SelPtr); > + } > + // Number of classes defined. > + Elements.push_back(llvm::ConstantInt::get(llvm::Type::Int16Ty, > + Classes.size())); > + // Number of categories defined > + Elements.push_back(llvm::ConstantInt::get(llvm::Type::Int16Ty, > + Categories.size())); > + // Create an array of classes, then categories, then static > object instances > + Classes.insert(Classes.end(), Categories.begin(), > Categories.end()); > + // NULL-terminated list of static object instances (mainly > constant strings) > + Classes.push_back(Statics); > + Classes.push_back(NULLPtr); > + llvm::Constant *ClassList = llvm::ConstantArray::get(ClassListTy, > Classes); > + Elements.push_back(ClassList); > + // Construct the symbol table > + llvm::Constant *SymTab= MakeGlobal(SymTabTy, Elements); > + > + // The symbol table is contained in a module which has some > version-checking > + // constants > + llvm::StructType * ModuleTy = llvm::StructType::get(LongTy, LongTy, > + PtrToInt8Ty, llvm::PointerType::getUnqual(SymTabTy), NULL); > + Elements.clear(); > + // Runtime version used for compatibility checking. 
> + Elements.push_back(llvm::ConstantInt::get(LongTy, RuntimeVersion)); > + //FIXME: Should be sizeof(ModuleTy) > + Elements.push_back(llvm::ConstantInt::get(LongTy, 16)); > + //FIXME: Should be the path to the file where this module was > declared > + Elements.push_back(NULLPtr); > + Elements.push_back(SymTab); > + llvm::Value *Module = MakeGlobal(ModuleTy, Elements); > + > + // Create the load function calling the runtime entry point with > the module > + // structure > + std::vector VoidArgs; > + llvm::Function * LoadFunction = llvm::Function::Create( > + llvm::FunctionType::get(llvm::Type::VoidTy, VoidArgs, false), > + llvm::GlobalValue::InternalLinkage, ".objc_load_function", > + &TheModule); > + llvm::BasicBlock *EntryBB = llvm::BasicBlock::Create("entry", > LoadFunction); > + llvm::IRBuilder Builder; > + Builder.SetInsertPoint(EntryBB); > + llvm::Value *Register = > TheModule.getOrInsertFunction("__objc_exec_class", > + llvm::Type::VoidTy, llvm::PointerType::getUnqual(ModuleTy), > NULL); > + Builder.CreateCall(Register, Module); > + Builder.CreateRetVoid(); > + return LoadFunction; > +} > llvm::Function *CGObjCGNU::MethodPreamble( > + const std::string > &ClassName, > + const std::string > &CategoryName, > + const std::string > &MethodName, > const llvm::Type *ReturnTy, > const llvm::Type *SelfTy, > const llvm::Type **ArgTy, > unsigned ArgC, > + bool isClassMethod, > bool isVarArg) { > std::vector Args; > + if (!ReturnTy->isFirstClassType() && ReturnTy != > llvm::Type::VoidTy) { > + Args.push_back(llvm::PointerType::getUnqual(ReturnTy)); > + ReturnTy = llvm::Type::VoidTy; > + } > Args.push_back(SelfTy); > Args.push_back(SelectorTy); > Args.insert(Args.end(), ArgTy, ArgTy+ArgC); > @@ -165,12 +866,20 @@ > llvm::FunctionType *MethodTy = llvm::FunctionType::get(ReturnTy, > Args, > isVarArg); > + std::string FunctionName = SymbolNameForMethod(ClassName, > CategoryName, > + MethodName, isClassMethod); > + > llvm::Function *Method = llvm::Function::Create(MethodTy, 
> llvm::GlobalValue::InternalLinkage, > - ".objc.method", > + FunctionName, > &TheModule); > - // Set the names of the hidden arguments > llvm::Function::arg_iterator AI = Method->arg_begin(); > + // Name the struct return argument. > + // FIXME: This is probably the wrong test. > + if (!ReturnTy->isFirstClassType() && ReturnTy != > llvm::Type::VoidTy) { > + AI->setName("agg.result"); > + ++AI; > + } > AI->setName("self"); > ++AI; > AI->setName("_cmd"); > Index: CodeGenModule.cpp > =================================================================== > --- CodeGenModule.cpp (revision 51550) > +++ CodeGenModule.cpp (working copy) > @@ -25,6 +25,7 @@ > #include "llvm/DerivedTypes.h" > #include "llvm/Module.h" > #include "llvm/Intrinsics.h" > +#include "llvm/Target/TargetData.h" > #include "llvm/Analysis/Verifier.h" > #include > using namespace clang; > @@ -49,10 +50,10 @@ > } > > CodeGenModule::~CodeGenModule() { > + EmitStatics(); > llvm::Function *ObjCInitFunction = Runtime->ModuleInitFunction(); > if (ObjCInitFunction) > AddGlobalCtor(ObjCInitFunction); > - EmitStatics(); > EmitGlobalCtors(); > EmitAnnotations(); > delete Runtime; > @@ -328,7 +329,159 @@ > if (OMD->getBody()) > CodeGenFunction(*this).GenerateObjCMethod(OMD); > } > +void CodeGenModule::EmitObjCProtocolImplementation(const > ObjCProtocolDecl *PD){ Why are these methods not runtime-specific? 
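One way to read this question, sketched under assumptions: the front end could hand the whole declaration to the runtime object and let each runtime decide how to lower it, instead of collecting names and types in CodeGenModule. The names below (ProtocolInfo, ObjCRuntimeSketch, GNURuntimeSketch, EmitProtocolDecl) are hypothetical stand-ins, not clang or LLVM classes; the GNU variant only borrows the ".objc_protocol" global name from the patch and records a string where the real code would build LLVM constants.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for ObjCProtocolDecl; NOT the clang class.
struct ProtocolInfo {
  std::string Name;
  std::vector<std::string> ReferencedProtocols;
  std::vector<std::string> InstanceMethods;
  std::vector<std::string> ClassMethods;
};

// Hypothetical runtime interface: every runtime lowers a protocol its own way.
class ObjCRuntimeSketch {
public:
  virtual ~ObjCRuntimeSketch() {}
  virtual void EmitProtocol(const ProtocolInfo &PI) = 0;
};

// Assumed GNU-flavoured implementation. It only records what it would emit;
// the real code would build the protocol structure as LLVM constants.
class GNURuntimeSketch : public ObjCRuntimeSketch {
public:
  std::vector<std::string> Emitted;
  void EmitProtocol(const ProtocolInfo &PI) override {
    Emitted.push_back(".objc_protocol:" + PI.Name);
  }
};

// The front end stays runtime-agnostic: it forwards the declaration and
// performs no runtime-specific collection of names or type encodings.
inline void EmitProtocolDecl(ObjCRuntimeSketch &RT, const ProtocolInfo &PI) {
  RT.EmitProtocol(PI);
}
```

With this shape, a second runtime would only need its own EmitProtocol override, and the three Emit* functions in CodeGenModule would shrink to one-line forwards. Whether that split is the right one for this patch is an open design question, not something the sketch settles.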
> + llvm::SmallVector Protocols; > + for(unsigned i=0 ; i<PD->getNumReferencedProtocols() ; i++) > + { > + Protocols.push_back(PD->getReferencedProtocols()[i]->getName()); > + } Avoid unnecessary { and } > + llvm::SmallVector InstanceMethodNames; > + llvm::SmallVector InstanceMethodTypes; > + for(ObjCProtocolDecl::instmeth_iterator iter = PD- > >instmeth_begin() ; > + iter != PD->instmeth_end() ; iter++) { > + std::string TypeStr; > + Context.getObjCEncodingForMethodDecl((*iter),TypeStr); > + InstanceMethodNames.push_back( > + GetAddrOfConstantString((*iter)->getSelector().getName())); > + InstanceMethodTypes.push_back(GetAddrOfConstantString(TypeStr)); > + } > + // Collect information about class methods: > + llvm::SmallVector ClassMethodNames; > + llvm::SmallVector ClassMethodTypes; > + for(ObjCProtocolDecl::classmeth_iterator iter = PD- > >classmeth_begin() ; > + iter != PD->classmeth_end() ; iter++) { > + std::string TypeStr; > + Context.getObjCEncodingForMethodDecl((*iter),TypeStr); > + ClassMethodNames.push_back( > + GetAddrOfConstantString((*iter)->getSelector().getName())); > + ClassMethodTypes.push_back(GetAddrOfConstantString(TypeStr)); > + } > + Runtime->GenerateProtocol(PD->getName(), Protocols, > InstanceMethodNames, > + InstanceMethodTypes, ClassMethodNames, ClassMethodTypes); > +} > > +void CodeGenModule::EmitObjCCategoryImpl(const ObjCCategoryImplDecl > *OCD) { > + > + // Collect information about instance methods > + llvm::SmallVector InstanceMethodNames; > + llvm::SmallVector InstanceMethodTypes; > + for(ObjCCategoryDecl::instmeth_iterator iter = OCD- > >instmeth_begin() ; > + iter != OCD->instmeth_end() ; iter++) { > + std::string TypeStr; > + Context.getObjCEncodingForMethodDecl((*iter),TypeStr); > + InstanceMethodNames.push_back( > + GetAddrOfConstantString((*iter)->getSelector().getName())); > + InstanceMethodTypes.push_back(GetAddrOfConstantString(TypeStr)); > + } > + > + // Collect information about class methods > + llvm::SmallVector 
ClassMethodNames; > + llvm::SmallVector ClassMethodTypes; > + for(ObjCCategoryDecl::classmeth_iterator iter = OCD- > >classmeth_begin() ; > + iter != OCD->classmeth_end() ; iter++) { > + std::string TypeStr; > + Context.getObjCEncodingForMethodDecl((*iter),TypeStr); > + ClassMethodNames.push_back( > + GetAddrOfConstantString((*iter)->getSelector().getName())); > + ClassMethodTypes.push_back(GetAddrOfConstantString(TypeStr)); > + } > + > + // Collect the names of referenced protocols > + llvm::SmallVector Protocols; > + ObjCInterfaceDecl * ClassDecl = (ObjCInterfaceDecl*)OCD- > >getClassInterface(); > + for(unsigned i=0 ; igetNumIntfRefProtocols() ; i++) { > + Protocols.push_back(ClassDecl->getReferencedProtocols()[i]- > >getName()); > + } > + > + // Generate the category > + Runtime->GenerateCategory(OCD->getClassInterface()->getName(), > + OCD->getName(), InstanceMethodNames, InstanceMethodTypes, > + ClassMethodNames, ClassMethodTypes, Protocols); > +} > + > +void CodeGenModule::EmitObjCClassImplementation( > + const ObjCImplementationDecl *OID) { > + // Get the superclass name. > + const ObjCInterfaceDecl * SCDecl = OID->getClassInterface()- > >getSuperClass(); > + const char * SCName = NULL; > + if (SCDecl) { > + SCName = SCDecl->getName(); > + } > + > + // Get the class name > + ObjCInterfaceDecl * ClassDecl = (ObjCInterfaceDecl*)OID- > >getClassInterface(); > + const char * ClassName = ClassDecl->getName(); > + > + // Get the size of instances. For runtimes that support late- > bound instances > + // this should probably be something different (size just of > instance > + // varaibles in this class, not superclasses?). > + int instanceSize = 0; > + const llvm::Type *ObjTy; > + if (!Runtime->LateBoundIVars()) { > + ObjTy = > getTypes().ConvertType(Context.getObjCInterfaceType(ClassDecl)); > + instanceSize = TheTargetData.getABITypeSize(ObjTy); > + } > + > + // Collect information about instance variables. 
> + llvm::SmallVector IvarNames; > + llvm::SmallVector IvarTypes; > + llvm::SmallVector IvarOffsets; > + const llvm::StructLayout *Layout = > + TheTargetData.getStructLayout(llvm::cast llvm::StructType>(ObjTy)); > + ObjTy = llvm::PointerType::getUnqual(ObjTy); > + for(ObjCInterfaceDecl::ivar_iterator iter = ClassDecl- > >ivar_begin() ; > + iter != ClassDecl->ivar_end() ; iter++) { > + // Store the name > + IvarNames.push_back(GetAddrOfConstantString((*iter)- > >getName())); > + // Get the type encoding for this ivar > + std::string TypeStr; > + llvm::SmallVector EncodingRecordTypes; > + Context.getObjCEncodingForType((*iter)->getType(), TypeStr, > + EncodingRecordTypes); > + IvarTypes.push_back(GetAddrOfConstantString(TypeStr)); > + // Get the offset > + int offset = > + (int)Layout- > >getElementOffset(getTypes().getLLVMFieldNo(*iter)); > + IvarOffsets.push_back( > + llvm::ConstantInt::get(llvm::Type::Int32Ty, offset)); > + } > + > + // Collect information about instance methods > + llvm::SmallVector InstanceMethodNames; > + llvm::SmallVector InstanceMethodTypes; > + for(ObjCImplementationDecl::instmeth_iterator iter = OID- > >instmeth_begin() ; > + iter != OID->instmeth_end() ; iter++) { > + std::string TypeStr; > + Context.getObjCEncodingForMethodDecl((*iter),TypeStr); > + InstanceMethodNames.push_back( > + GetAddrOfConstantString((*iter)->getSelector().getName())); > + InstanceMethodTypes.push_back(GetAddrOfConstantString(TypeStr)); > + } > + > + // Collect information about class methods > + llvm::SmallVector ClassMethodNames; > + llvm::SmallVector ClassMethodTypes; > + for(ObjCImplementationDecl::classmeth_iterator iter = OID- > >classmeth_begin() ; > + iter != OID->classmeth_end() ; iter++) { > + std::string TypeStr; > + Context.getObjCEncodingForMethodDecl((*iter),TypeStr); > + ClassMethodNames.push_back( > + GetAddrOfConstantString((*iter)->getSelector().getName())); > + ClassMethodTypes.push_back(GetAddrOfConstantString(TypeStr)); > + } > + // Collect the 
names of referenced protocols > + llvm::SmallVector Protocols; > + for(unsigned i=0 ; igetNumIntfRefProtocols() ; i++) { > + Protocols.push_back(ClassDecl->getReferencedProtocols()[i]- > >getName()); > + } > + > + // Generate the category > + Runtime->GenerateClass(ClassName, SCName, instanceSize, > IvarNames, IvarTypes, > + IvarOffsets, InstanceMethodNames, InstanceMethodTypes, > ClassMethodNames, > + ClassMethodTypes, Protocols); > +} > + > + > void CodeGenModule::EmitFunction(const FunctionDecl *FD) { > // If this is not a prototype, emit the body. > if (!FD->isThisDeclarationADefinition()) > @@ -600,6 +753,7 @@ > return MemSetFn = getIntrinsic(IID); > } > > +// FIXME: This needs moving into an Apple Objective-C runtime class > llvm::Constant *CodeGenModule:: > GetAddrOfConstantCFString(const std::string &str) { > llvm::StringMapEntry &Entry = > Index: CodeGenFunction.cpp > =================================================================== > --- CodeGenFunction.cpp (revision 51550) > +++ CodeGenFunction.cpp (working copy) > @@ -65,11 +65,22 @@ > for (unsigned i=0 ; iparam_size() ; i++) { > ParamTypes.push_back(ConvertType(OMD->getParamDecl(i)- > >getType())); > } > - CurFn =CGM.getObjCRuntime()->MethodPreamble(ConvertType(OMD- > >getResultType()), > - > llvm::PointerType::getUnqual(llvm::Type::Int32Ty), > - ParamTypes.begin(), > - OMD->param_size(), > - OMD->isVariadic()); > + std::string CategoryName = ""; > + if (ObjCCategoryImplDecl *OCD = > + dyn_cast(OMD->getMethodContext())) { > + CategoryName = OCD->getName(); > + } > + > + CurFn =CGM.getObjCRuntime()->MethodPreamble( > + OMD->getClassInterface()->getName(), > + CategoryName, > + OMD->getSelector().getName(), > + ConvertType(OMD->getResultType()), > + > llvm::PointerType::getUnqual(llvm::Type::Int32Ty), > + ParamTypes.begin(), > + OMD->param_size(), > + !OMD->isInstance(), > + OMD->isVariadic()); > llvm::BasicBlock *EntryBB = llvm::BasicBlock::Create("entry", > CurFn); > > // Create a marker to make it 
easy to insert allocas into the > entryblock > Index: CodeGenModule.h > =================================================================== > --- CodeGenModule.h (revision 51550) > +++ CodeGenModule.h (working copy) > @@ -32,6 +32,9 @@ > class ASTContext; > class FunctionDecl; > class ObjCMethodDecl; > + class ObjCImplementationDecl; > + class ObjCCategoryImplDecl; > + class ObjCProtocolDecl; > class Decl; > class Expr; > class Stmt; > @@ -112,6 +115,9 @@ > void EmitStatics(void); > > void EmitObjCMethod(const ObjCMethodDecl *OMD); > + void EmitObjCCategoryImpl(const ObjCCategoryImplDecl *OCD); > + void EmitObjCClassImplementation(const ObjCImplementationDecl > *OID); > + void EmitObjCProtocolImplementation(const ObjCProtocolDecl *PD); > void EmitFunction(const FunctionDecl *FD); > void EmitGlobalVar(const VarDecl *D); > void EmitGlobalVarInit(const VarDecl *D); > Index: ModuleBuilder.cpp > =================================================================== > --- ModuleBuilder.cpp (revision 51550) > +++ ModuleBuilder.cpp (working copy) > @@ -66,6 +66,24 @@ > > if (FunctionDecl *FD = dyn_cast(D)) { > Builder->EmitFunction(FD); > + } else if (dyn_cast(D)){ > + //Forward declaration. Only used for type checking. > + } else if (ObjCProtocolDecl *PD = > dyn_cast(D)){ > + // Generate Protocol object. > + Builder->EmitObjCProtocolImplementation(PD); > + } else if (dyn_cast(D)){ > + //Only used for typechecking. 
> + } else if (ObjCCategoryImplDecl *OCD = > dyn_cast(D)){ > + // Generate methods, attach to category structure > + Builder->EmitObjCCategoryImpl(OCD); > + } else if (ObjCImplementationDecl * OID = > + dyn_cast(D)){ > + // Generate methods, attach to class structure > + Builder->EmitObjCClassImplementation(OID); > + } else if (dyn_cast(D)){ > + // Ignore - generated when the implementation decl is > CodeGen'd > + } else if (ObjCMethodDecl *OMD = dyn_cast(D)){ > + Builder->EmitObjCMethod(OMD); > } else if (VarDecl *VD = dyn_cast(D)) { > if (VD->isFileVarDecl()) > Builder->EmitGlobalVarDeclarator(VD); > - Devang From eli.friedman at gmail.com Tue May 27 21:53:59 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Tue, 27 May 2008 19:53:59 -0700 Subject: [cfe-dev] clang bug: constant array size is recognized as variable array size In-Reply-To: References: <8914b92d0805230317l6e594664t6a760a7d5abff81c@mail.gmail.com> Message-ID: On Tue, May 27, 2008 at 12:36 PM, Mike Stump wrote: > [#10] An implementation may accept other forms of constant > expressions. That doesn't apply to integer constant expressions. See http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_312.htm. >> "(unsigned int)(100.*120.)" isn't an integer >> constant expression (per the definition in C99 6.6), so A is in fact >> an illegal VLA per the standard. Not sure what to do here. > > If one can do FP math at compile time, an error is needlessly pedantic. If we implement rounding modes, the result of FP math is no longer constant. And actually, in certain edge cases involving null pointer constants and conditionals, treating an expression that isn't an integer constant expression as an integer constant expression can lead to errors on valid code. By the way, this is gcc bug 456 (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=456). 
-Eli

From eli.friedman at gmail.com Tue May 27 22:06:35 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Tue, 27 May 2008 20:06:35 -0700
Subject: [cfe-dev] [PATCH] Function redeclaration and PR2360
In-Reply-To: <483C50B3.5030901@gmail.com>
References: <483C50B3.5030901@gmail.com>
Message-ID:

On Tue, May 27, 2008 at 11:19 AM, Argiris Kirtzidis wrote:
> Eli Friedman wrote:
>>
>> This means that references to a function before a
>> redeclaration refer to the old declaration, and references to a
>> function after a redeclaration refer to the new declaration.
>
> How about these cases:
>
> int a(int); // #1
> int a(); // #2
> int b(void) {a();} // should refer to #2 or is more accurate to refer to #1
> ?

a refers to #2; but note that per the rules about redeclaration, the
type of #2 is actually int(int). (See C99 6.2.7 p3 and p4.) clang
doesn't implement this bit correctly yet, though.

> And:
>
> int a(int x = 5); // #1
> int a(int x); // #2
> int b(void) {a();} // Shouldn't it refer to #1 ?

a again refers to #2; the only way that really makes sense here is to
have #2 point to the default argument from #1. Take the following
example:

int a(int x, int y = 3);
int a(int x = 5, int y);

Neither version of a has all the arguments; we have to propagate them
forward somehow. This is currently done in
Sema::MergeCXXFunctionDecl. Note that there is a small bug here at the
moment: declarations can't determine whether they own a default
argument, so we end up leaking them.
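To make the default-argument case above concrete, here is a minimal, self-contained C++ sketch. It is not clang code: the function body and the two helper names (call_with_no_args, call_with_one_arg) are made up for illustration. It only relies on the language rule that default arguments accumulate across redeclarations in the same scope.

```cpp
// First declaration: only the trailing parameter has a default.
int a(int x, int y = 3);

// Redeclaration in the same scope: adds a default for x. The default for y
// is not repeated here; it is carried over from the declaration above.
int a(int x = 5, int y);

// The definition repeats neither default.
int a(int x, int y) { return x * 10 + y; }

// Hypothetical helpers, only here to show which defaults get used.
int call_with_no_args() { return a(); }   // same as a(5, 3)
int call_with_one_arg() { return a(1); }  // same as a(1, 3)
```

A call written after both declarations sees the merged set of defaults, which is why the later declaration has to be able to reach the default argument introduced by the earlier one.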
-Eli From bolzoni at cs.unipr.it Wed May 28 06:52:09 2008 From: bolzoni at cs.unipr.it (Paolo Bolzoni) Date: Wed, 28 May 2008 13:52:09 +0200 Subject: [cfe-dev] Linking options In-Reply-To: References: Message-ID: <20080528135209.41912632@cs.unipr.it> Here a very simple program, t.cc: #include int main(int argc, char** argv) { using namespace clang; SourceManager source_mgr; } Compiling command (taken almost one-to-one from clang executable linking): g++ -o /dev/null \ t.cc \ -lclangCodeGen \ -lclangAnalysis \ -lclangRewrite \ -lclangSEMA \ -lclangAST \ -lclangParse \ -lclangLex \ -lclangBasic \ -lLLVMCore \ -lLLVMSupport \ -lLLVMSystem \ -lLLVMBitWriter \ -lLLVMBitReader \ -lLLVMCodeGen \ -lLLVMTarget \ -lpthread \ -ldl \ -lm \ -lelf Result: /usr/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../lib/libLLVMBitReader.a(Deserialize.o): In function `llvm::Deserializer::~Deserializer()': Deserialize.cpp:(.text+0xe1): undefined reference to `llvm::BumpPtrAllocator::~BumpPtrAllocator()' /usr/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../lib/libLLVMBitReader.a(Deserialize.o): In function `llvm::Deserializer::~Deserializer()': Deserialize.cpp:(.text+0x181): undefined reference to `llvm::BumpPtrAllocator::~BumpPtrAllocator()' /usr/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../lib/libLLVMBitReader.a(Deserialize.o): In function `llvm::Deserializer::Deserializer(llvm::BitstreamReader&)': Deserialize.cpp:(.text+0x20c): undefined reference to `llvm::BumpPtrAllocator::BumpPtrAllocator()' /usr/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../lib/libLLVMBitReader.a(Deserialize.o): In function `llvm::Deserializer::Deserializer(llvm::BitstreamReader&)': Deserialize.cpp:(.text+0x300): undefined reference to `llvm::BumpPtrAllocator::BumpPtrAllocator()' /usr/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../lib/libLLVMBitReader.a(Deserialize.o): In function `llvm::Deserializer::ReadUIntPtr(unsigned long&, unsigned int const&, bool)': Deserialize.cpp:(.text+0x9bb): 
undefined reference to `llvm::BumpPtrAllocator::Allocate(unsigned long, unsigned long)' collect2: ld returned 1 exit status What am I doing wrong? From eli.friedman at gmail.com Wed May 28 07:09:35 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Wed, 28 May 2008 05:09:35 -0700 Subject: [cfe-dev] Linking options In-Reply-To: <20080528135209.41912632@cs.unipr.it> References: <20080528135209.41912632@cs.unipr.it> Message-ID: On Wed, May 28, 2008 at 4:52 AM, Paolo Bolzoni wrote: > Here a very simple program, t.cc: > Compiling command (taken almost one-to-one from clang executable linking): > g++ -o /dev/null \ > t.cc \ > -lclangCodeGen \ > -lclangAnalysis \ > -lclangRewrite \ > -lclangSEMA \ > -lclangAST \ > -lclangParse \ > -lclangLex \ > -lclangBasic \ > -lLLVMCore \ > -lLLVMSupport \ > -lLLVMSystem \ > -lLLVMBitWriter \ > -lLLVMBitReader \ > -lLLVMCodeGen \ > -lLLVMTarget \ > -lpthread \ > -ldl \ > -lm \ > -lelf > What am I doing wrong? Try putting your libraries in the following order: -lclangCodeGen -lclangAnalysis -lclangRewrite -lclangSEMA -lclangAST -lclangParse -lclangLex -lclangBasic -lLLVMBitWriter -lLLVMBitReader -lLLVMCodeGen -lLLVMTarget -lLLVMSupport -lLLVMCore -lLLVMSystem -lpthread -ldl -lm I'm not entirely sure how clang avoids running into this; I don't know very much about linker magic. -Eli From eli.friedman at gmail.com Wed May 28 09:56:08 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Wed, 28 May 2008 07:56:08 -0700 Subject: [cfe-dev] Simplify/fix-up Sema-level struct layout Message-ID: Patch per subject. This simplifies the code a bit by sharing the same codepath for unions and structs, and fixes some bugs with some edge cases (the edge cases addressed are in the testcases in the patch). As far as I know, this is correct on X86 for all combinations of bitfields, the packed attribute, and the aligned attribute in both structs and unions. However, I haven't tested heavily. (Anyone up for making a struct/union fuzz tester?) 
At the moment, codegen for packed bit-fields is broken, but that's a separate issue. -Eli -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: tt.txt Url: http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080528/5038f6cf/attachment-0001.txt From csdavec at swansea.ac.uk Wed May 28 10:56:42 2008 From: csdavec at swansea.ac.uk (David Chisnall) Date: Wed, 28 May 2008 16:56:42 +0100 Subject: [cfe-dev] Objective-C top-level constructs code generation In-Reply-To: References: Message-ID: Hi Devang, Thanks for the feedback. Here is a new version. On 27 May 2008, at 23:07, Devang Patel wrote: >> +//FIXME: The capitalisation of methods in this class is horribly >> inconsistent. > > This is not useful. In some sense, it encourages someone to be sloppy > and hope that everything will be fixed! Yup, total cop-out. Fixed now. >> +//FIXME Several methods should be pure virtual but aren't to avoid >> the >> +//partially-implemented subclass breaking. > > This is not specific. Remove this from here and add individual FIXMEs. Ooops. This is left over from earlier and is actually fixed in the code. I've removed the comment. >> + virtual void GenerateCategory(const char *ClassName, const char >> *CategoryName, >> + const llvm::SmallVector >> &InstanceMethodNames, > > If use > llvm::SmallVectorImpl &InstanceMethodNames > then you do not hard code vector size here and let each runtime > implementation use it appropriate size. Thanks, I was wondering if there was a sensible way of doing this. I've fixed this in the interface. There are a few places now in caller that can have their SmallVector sizes tuned a bit, but I'll wait for a later patch to do that. >> + virtual llvm::Value *generateMessageSendSuper(llvm::IRBuilder >> &Builder, > > GenereateMessage... Fixed. >> + virtual llvm::Value *getSelector(llvm::IRBuilder &Builder, > > GetSelector ... Fixed. 
>> isClassMethod) >> +{ >> + if (isClassMethod) { >> + return "._objc_method_" + ClassName +"("+CategoryName+")"+ "+" >> + MethodName; >> + } > > nit pick. Use { } only if the block is more then 1 line. I prefer to use blocks everywhere, since it makes it easier to see fall-through in nested conditionals and makes inserting debugging lines easier, but if LLVM coding conventions require them to not be blocks then I will try to remember in future. I've removed this one. >> + std::vector V, std::string Name) { > > Please use vector reference here > >> +llvm::Constant *CGObjCGNU::MakeGlobal(const llvm::ArrayType *Ty, >> + std::vector V, std::string Name) { > > and here. Fixed. >> >> + llvm::Constant *C = llvm::ConstantArray::get(Ty, V); >> + return new llvm::GlobalVariable(Ty, false, >> + llvm::GlobalValue::InternalLinkage, C, Name, &TheModule); >> +} >> + >> +/// Generate an NSConstantString object. >> +//TODO: In case there are any crazy people still using Objective-C >> without an >> +//OpenStep implementation, this should let them select their own >> class for >> +//constant strings. > > :) Here, you want to say "GNU Objective-C runtime" here. Pedantic, but since I am using this code with a Smalltalk compiler too I guess it's valid so I'll change it... I stand by my statement that only crazy people program without OpenStep though :-) >> + if (ReturnTy->isFirstClassType() && ReturnTy != >> llvm::Type::VoidTy) { > > Now, isSingleType() is now preferred over isFirstClassType() here and > other places where you check return type. Fixed. (Presumably this was meant to be isSingleValueType()?) >> + RetTy = ReturnTy; >> + } else { >> + // For struct returns allocate the space in the caller and pass >> it up to >> + // the sender. > > Note, LLVM is moving in the direction to where return will be able to > return aggregates. Someone mentioned that C ABI issues make it difficult to use this in this particular case. 
I have added a TODO to revisit it when LLVM gets support for aggregate return types. >> + for(unsigned int i=0 ; i > for (unsigned int i = 0, e < MethodTypes.size(); i != e; ++i) { Fixed everywhere I found - I might have missed one or two... >> + return MakeGlobal(ClassTy, Elements, SymbolNameForClass(Name- >>> getStringValue())); >> +} > Add extra line before starting new function. Fixed. >> + for(std::map::iterator >> + iter=TypedSelectors.begin() ; iter!=TypedSelectors.end() ; >> iter++) { > > Please use > > for(std::map::iterator > iter=TypedSelectors.begin(), iterEnd =TypedSelectors.end(); > iter != iterEnd; ++iter) > > form here and other places. Fixed. Also added the missing space between for and the bracket. >> +void CodeGenModule::EmitObjCProtocolImplementation(const >> ObjCProtocolDecl *PD){ > > Why are these methods not runtime specific ? These methods correspond to language features, which are runtime agnostic. They then call the runtime-specific methods which perform the target-specific things. All runtimes, for example, need to know the method names and type encodings in a protocol (and how these are found is specific to the source language, in this case Objective-C). This then calls the runtime method Runtime->GenerateProtocol, which either sets up some data structures or calls some runtime functions, depending on the runtime library being used. >> + llvm::SmallVector Protocols; >> + for(unsigned i=0 ; igetNumReferencedProtocols() ; i++) >> + { >> + Protocols.push_back(PD->getReferencedProtocols()[i]->getName()); >> + } > > Avoid unnecessary { and } Fixed here and in three other places. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: objc.diff Type: application/octet-stream Size: 61379 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080528/ebaec4f1/attachment-0001.obj -------------- next part -------------- From clattner at apple.com Wed May 28 11:27:03 2008 From: clattner at apple.com (Chris Lattner) Date: Wed, 28 May 2008 09:27:03 -0700 Subject: [cfe-dev] Objective-C top-level constructs code generation In-Reply-To: References: Message-ID: <1474DEEB-30F8-4F09-B698-C5AE3BAE0078@apple.com> On May 28, 2008, at 8:56 AM, David Chisnall wrote: > Hi Devang, > > Thanks for the feedback. Here is a new version. Nice! This is looking good. A few more thoughts: +// Some zeros used for GEPs in lots of places. +static llvm::Constant *Zeros[] = {llvm::ConstantInt::get(llvm::Type::Int32Ty, 0), + llvm::ConstantInt::get(llvm::Type::Int32Ty, 0) }; +static llvm::Constant *NULLPtr = llvm::ConstantPointerNull::get( + llvm::PointerType::getUnqual(llvm::Type::Int8Ty)); +static llvm::Constant *ProtocolVersion = + llvm::ConstantInt::get(llvm::Type::Int32Ty, 2); This causes static initializers to be formed and run. Please stay away from them. Maybe these should be instance variables of the class? + std::map ExistingProtocols; + typedef std::pair TypedSelector; + std::map TypedSelectors; + std::map UntypedSelectors; std::map's from std::string are really inefficient. Can you use StringMap for these? These: +static std::string SymbolNameForClass(std::string ClassName) { +static std::string SymbolNameForMethod(const std::string &ClassName, const + std::string CategoryName, const std::string MethodName, bool isClassMethod) Copy the std::string objects. Please pass as "const std::string &foo" to avoid this. Likewise in a few other places (e.g. 
MakeGlobal) + if (isClassMethod) + return "._objc_method_" + ClassName +"("+CategoryName+")"+ "+" + MethodName; + return "._objc_method_" + ClassName +"("+CategoryName+")"+ "-" + MethodName; +} How about something like: return "._objc_method_" + ClassName +"("+CategoryName+")"+ (isClassMethod ? "+" : "-") + MethodName; To make it more obvious what the difference is between the two. + if (ReturnTy->isSingleValueType() && ReturnTy != llvm::Type::VoidTy) { + llvm::SmallVector Args; + Args.push_back(Receiver); + Args.push_back(cmd); + Args.insert(Args.end(), ArgV, ArgV+ArgC); + return Builder.CreateCall(imp, Args.begin(), Args.end()); + } else { + llvm::SmallVector Args; + llvm::Value *Return = Builder.CreateAlloca(ReturnTy); + Args.push_back(Return); + Args.push_back(Receiver); + Args.push_back(cmd); + Args.insert(Args.end(), ArgV, ArgV+ArgC); + Builder.CreateCall(imp, Args.begin(), Args.end()); + return Return; + } Is there a specific reason not to share the common code here? +void CGObjCGNU::GenerateProtocol(const char *ProtocolName, It would be nice to add a block comment above each of these methods with a C struct (in the comment) that describes the thing that you are generating. This would make it easier to follow the code. + if (Classes.size() + Categories.size() + ConstantStrings.size() + + ExistingProtocols.size() + TypedSelectors.size() + + UntypedSelectors.size() == 0) { It is generally better to query .empty() instead of .size() if you just care about whether it is empty or not. Also, using "if (x.empty() && y.empty() && z.empty() ...)" is more idiomatic than using additions. + Elements.push_back(MakeConstantString((*iter).first.first, ".objc_sel_types")); Watch out for 80 columns. + const llvm::StructLayout *Layout = + TheTargetData.getStructLayout(llvm::cast(ObjTy)); You don't need the llvm:: qualifier on cast, and you don't need 'const' in the type argument (the constness of the result follows the constness of the input). 
This should be enough: + const llvm::StructLayout *Layout = + TheTargetData.getStructLayout(cast(ObjTy)); +++ lib/CodeGen/ModuleBuilder.cpp (working copy) @@ -67,6 +67,24 @@ if (FunctionDecl *FD = dyn_cast(D)) { Builder->EmitFunction(FD); + } else if (dyn_cast(D)){ + //Forward declaration. Only used for type checking. If you're not using the result of the dyn_cast, please use isa(D) instead. Likewise for other cases in this file. This is looking nice! -Chris From dpatel at apple.com Wed May 28 11:58:42 2008 From: dpatel at apple.com (Devang Patel) Date: Wed, 28 May 2008 09:58:42 -0700 Subject: [cfe-dev] Simplify/fix-up Sema-level struct layout In-Reply-To: References: Message-ID: <1DE9685B-0C9F-486B-B6E3-FEAEF8E29FA1@apple.com> Eli, On May 28, 2008, at 7:56 AM, Eli Friedman wrote: > Patch per subject. This simplifies the code a bit by sharing the same > codepath for unions and structs, and fixes some bugs with some edge > cases (the edge cases addressed are in the testcases in the patch). > As far as I know, this is correct on X86 for all combinations of > bitfields, the packed attribute, and the aligned attribute in both > structs and unions. cool > However, I haven't tested heavily. (Anyone up > for making a struct/union fuzz tester?) 1. Checkout llvm-gcc-42 sources. It includes gcc testsuite. 2. Put clang and ccc in your path and do 3. runtest --srcdir /...../llvmgcc42/gcc/testsuite GCC_UNDER_TEST=ccc HOSTCC=gcc HOSTCFLAGS="-g" --tool gcc struct-layout-1.exp If this reports all passes and zero failure then you're golden! 
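As a rough illustration of the kind of layout property this thread is about, the sketch below checks a few struct and union layouts at compile time. It is an assumption-laden example rather than part of the patch or of the GCC testsuite: the type names are invented, and `__attribute__((packed))` and `__attribute__((aligned(N)))` are GCC/Clang extensions.

```cpp
#include <cstddef>

// Natural layout: padding is inserted after 'c' so that 'i' is aligned.
struct Plain { char c; int i; };

// 'packed' (GCC/Clang extension) removes the padding between members.
struct __attribute__((packed)) Packed { char c; int i; };

// 'aligned' (GCC/Clang extension) raises the alignment, and with it the size.
struct __attribute__((aligned(16))) Aligned { char c; };

// A union is at least as large as its largest member.
union Either { char c; int i; };

static_assert(offsetof(Plain, i) == alignof(int), "i starts at int alignment");
static_assert(sizeof(Plain) == alignof(int) + sizeof(int), "padded size");
static_assert(offsetof(Packed, i) == 1, "no padding before i");
static_assert(sizeof(Packed) == 1 + sizeof(int), "packed size");
static_assert(alignof(Aligned) == 16 && sizeof(Aligned) == 16, "over-aligned");
static_assert(sizeof(Either) == sizeof(int), "union sized by largest member");
```

If a compiler lays these types out differently from GCC, one of the assertions fails at compile time, which is the same idea a layout-compatibility test checks on a much larger scale.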
- Devang

From clattner at apple.com Wed May 28 12:00:24 2008
From: clattner at apple.com (Chris Lattner)
Date: Wed, 28 May 2008 10:00:24 -0700
Subject: [cfe-dev] Linking options
In-Reply-To:
References: <20080528135209.41912632@cs.unipr.it>
Message-ID:

On May 28, 2008, at 5:09 AM, Eli Friedman wrote:
> On Wed, May 28, 2008 at 4:52 AM, Paolo Bolzoni
> wrote:
>> Here a very simple program, t.cc:
>> Compiling command (taken almost one-to-one from clang executable
>> linking):
>> g++ -o /dev/null \
>> t.cc \
>>
> Try putting your libraries in the following order: -lclangCodeGen
> -lclangAnalysis -lclangRewrite -lclangSEMA -lclangAST -lclangParse
> -lclangLex -lclangBasic -lLLVMBitWriter -lLLVMBitReader -lLLVMCodeGen
> -lLLVMTarget -lLLVMSupport -lLLVMCore -lLLVMSystem -lpthread -ldl -lm
>
> I'm not entirely sure how clang avoids running into this; I don't know
> very much about linker magic.

In the llvm side of things, we use llvm-config to wrangle the library
dependencies and figure out what order to link things in. llvm-config
works by nm'ing the libraries and building a dependence graph. It
should work with clang, but I don't think anyone has tried making it
work. It would require rebuilding the llvm-config database after the
clang libs are built.

-Chris

From eli.friedman at gmail.com Wed May 28 12:42:26 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Wed, 28 May 2008 10:42:26 -0700
Subject: [cfe-dev] Simplify/fix-up Sema-level struct layout
In-Reply-To: <1DE9685B-0C9F-486B-B6E3-FEAEF8E29FA1@apple.com>
References: <1DE9685B-0C9F-486B-B6E3-FEAEF8E29FA1@apple.com>
Message-ID:

On Wed, May 28, 2008 at 9:58 AM, Devang Patel wrote:
> 1. Checkout llvm-gcc-42 sources. It includes gcc testsuite.
> 2. Put clang and ccc in your path
>
> and do
>
> 3. runtest --srcdir /...../llvmgcc42/gcc/testsuite GCC_UNDER_TEST=ccc
> HOSTCC=gcc HOSTCFLAGS="-g" --tool gcc struct-layout-1.exp
>
> If this reports all passes and zero failure then you're golden!

Mmmm...
I'll have to see what I can do about that; currently, I'm getting
*no* passes. That said, it's tripping over other issues; I think I
even found a preprocessor bug.

-Eli

From mrs at apple.com Wed May 28 13:37:11 2008
From: mrs at apple.com (Mike Stump)
Date: Wed, 28 May 2008 11:37:11 -0700
Subject: [cfe-dev] clang bug: constant array size is recognized as variable array size
In-Reply-To:
References: <8914b92d0805230317l6e594664t6a760a7d5abff81c@mail.gmail.com>
Message-ID:

On May 27, 2008, at 7:53 PM, Eli Friedman wrote:
> On Tue, May 27, 2008 at 12:36 PM, Mike Stump wrote:
>
>> [#10] An implementation may accept other forms of constant
>> expressions.
>
> That doesn't apply to integer constant expressions. See
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_312.htm.

Ah, I was wrong. Thanks.

From csdavec at swansea.ac.uk Wed May 28 15:26:31 2008
From: csdavec at swansea.ac.uk (David Chisnall)
Date: Wed, 28 May 2008 21:26:31 +0100
Subject: [cfe-dev] Objective-C top-level constructs code generation
In-Reply-To: <1474DEEB-30F8-4F09-B698-C5AE3BAE0078@apple.com>
References: <1474DEEB-30F8-4F09-B698-C5AE3BAE0078@apple.com>
Message-ID: <89AC97C0-60E5-4257-B608-5F7A88249A9C@swan.ac.uk>

On 28 May 2008, at 17:27, Chris Lattner wrote:
> On May 28, 2008, at 8:56 AM, David Chisnall wrote:
>> Hi Devang,
>>
>> Thanks for the feedback. Here is a new version.
>
> Nice! This is looking good. A few more thoughts:
>
> +// Some zeros used for GEPs in lots of places.
> +static llvm::Constant *Zeros[] =
> {llvm::ConstantInt::get(llvm::Type::Int32Ty, 0),
> + llvm::ConstantInt::get(llvm::Type::Int32Ty, 0) };
> +static llvm::Constant *NULLPtr = llvm::ConstantPointerNull::get(
> + llvm::PointerType::getUnqual(llvm::Type::Int8Ty));
> +static llvm::Constant *ProtocolVersion =
> + llvm::ConstantInt::get(llvm::Type::Int32Ty, 2);
>
> This causes static initializers to be formed and run. Please stay
> away from them. Maybe these should be instance variables of the
> class?

Okay.
> + std::map ExistingProtocols; > + typedef std::pair TypedSelector; > + std::map TypedSelectors; > + std::map UntypedSelectors; > > std::map's from std::string are really inefficient. Can you use > StringMap for these? Done for the two with string keys. I'm happy to accept suggestions for a more efficient way of storing typed selectors (although clang currently never uses that code path, so the inefficiency isn't important). > These: > +static std::string SymbolNameForClass(std::string ClassName) { > +static std::string SymbolNameForMethod(const std::string &ClassName, > const > + std::string CategoryName, const std::string MethodName, bool > isClassMethod) > > > Copy the std::string objects. Please pass as "const std::string &foo" > to avoid this. Likewise in a few other places (e.g. MakeGlobal) Done. > + if (isClassMethod) > + return "._objc_method_" + ClassName +"("+CategoryName+")"+ "+" + > MethodName; > + return "._objc_method_" + ClassName +"("+CategoryName+")"+ "-" + > MethodName; > +} > > How about something like: > return "._objc_method_" + ClassName +"("+CategoryName+")"+ > (isClassMethod ? "+" : "-") + MethodName; > > To make it more obvious what the difference is between the two. Yup, seems clearer. > + if (ReturnTy->isSingleValueType() && ReturnTy != > llvm::Type::VoidTy) { > + llvm::SmallVector Args; > + Args.push_back(Receiver); > + Args.push_back(cmd); > + Args.insert(Args.end(), ArgV, ArgV+ArgC); > + return Builder.CreateCall(imp, Args.begin(), Args.end()); > + } else { > + llvm::SmallVector Args; > + llvm::Value *Return = Builder.CreateAlloca(ReturnTy); > + Args.push_back(Return); > + Args.push_back(Receiver); > + Args.push_back(cmd); > + Args.insert(Args.end(), ArgV, ArgV+ArgC); > + Builder.CreateCall(imp, Args.begin(), Args.end()); > + return Return; > + } > > Is there a specific reason not to share the common code here? Other than incompetence? No. Fixed. 
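For reference, the two review points about SymbolNameForMethod quoted above (pass strings by const reference, and fold the two return statements into one) might combine as in the sketch below. This is a reconstruction from the snippets quoted in this thread, not the code from the scrubbed objc.diff attachment, so the exact signature in the patch may differ.

```cpp
#include <string>

// Builds the symbol name for an Objective-C method implementation.
// All strings are taken by const reference so no copies are made on the call.
// Class methods are marked with '+', instance methods with '-'.
static std::string SymbolNameForMethod(const std::string &ClassName,
                                       const std::string &CategoryName,
                                       const std::string &MethodName,
                                       bool isClassMethod) {
  return "._objc_method_" + ClassName + "(" + CategoryName + ")" +
         (isClassMethod ? "+" : "-") + MethodName;
}
```

With made-up names, SymbolNameForMethod("Foo", "Bar", "baz:", true) gives "._objc_method_Foo(Bar)+baz:", and an instance method outside any category gets empty parentheses, as in "._objc_method_Foo()-init".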
> +void CGObjCGNU::GenerateProtocol(const char *ProtocolName, > > It would be nice to add a block comment above each of these methods > with a C struct (in the comment) that describes the thing that you are > generating. This would make it easier to follow the code. I don't want to copy the structs directly from the runtime headers, since the header is GPL'd (with a special exemption for code compiled with GCC, but not for code compiled with LLVM) and, although you can't copyright an interface, you can copyright a representation of an interface (see AT&T Vs UCB). I've added a comment at the top of the file pointing people in the right direction for finding the structures. > + if (Classes.size() + Categories.size() + ConstantStrings.size() + > + ExistingProtocols.size() + TypedSelectors.size() + > + UntypedSelectors.size() == 0) { > > It is generally better to query .empty() instead of .size() if you > just care about whether it is empty or not. Also, using "if > (x.empty() && y.empty() && z.empty() ...)" is more idiomatic than > using additions. Fixed. > + Elements.push_back(MakeConstantString((*iter).first.first, > ".objc_sel_types")); > > Watch out for 80 columns. Fixed. > + const llvm::StructLayout *Layout = > + TheTargetData.getStructLayout(llvm::cast llvm::StructType>(ObjTy)); > > > You don't need the llvm:: qualifier on cast, and you don't need > 'const' in the type argument (the constness of the result follows the > constness of the input). This should be enough: > > + const llvm::StructLayout *Layout = > + TheTargetData.getStructLayout(cast(ObjTy)); Done. > +++ lib/CodeGen/ModuleBuilder.cpp (working copy) > @@ -67,6 +67,24 @@ > > if (FunctionDecl *FD = dyn_cast(D)) { > Builder->EmitFunction(FD); > + } else if (dyn_cast(D)){ > + //Forward declaration. Only used for type checking. > > If you're not using the result of the dyn_cast, please use > isa(D) instead. Likewise for other cases in this file. Fixed. 
David
-------------- next part --------------
A non-text attachment was scrubbed...
Name: objc.diff
Type: application/octet-stream
Size: 61969 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080528/8337f69f/attachment-0001.obj
-------------- next part --------------
From mrs at apple.com Wed May 28 16:41:30 2008
From: mrs at apple.com (Mike Stump)
Date: Wed, 28 May 2008 14:41:30 -0700
Subject: [cfe-dev] More throw fixups
In-Reply-To: <0B2760CF-B1F8-453D-BFB6-7F3594486BB5@apple.com>
References: <1FB7C8C8-46B6-4715-8A8F-F6BF9EE9581D@apple.com> <0B2760CF-B1F8-453D-BFB6-7F3594486BB5@apple.com>
Message-ID:

On May 23, 2008, at 9:28 AM, Chris Lattner wrote:
> On May 22, 2008, at 7:02 PM, Mike Stump wrote:
>> This fixes the last of the throw parsing issues I know about...
>>
>> Ran the testsuite, no problems.

> Do you have a testcase?

I was using:

int i;
void foo() {
  (throw,throw);
  (1 ? throw 1 : throw 2);
  throw int(1);
  throw;
  throw 1;
  throw;
  (void)throw;     // ERROR - expected expression
  switch (i)
    case throw: ;  // ERROR - case label does not reduce to an integer constant
}

to develop it. I haven't massaged this into a real live clang testcase yet.

> What does this fix?

Primarily the first line.

From asl at math.spbu.ru Thu May 29 12:53:35 2008
From: asl at math.spbu.ru (Anton Korobeynikov)
Date: Thu, 29 May 2008 21:53:35 +0400
Subject: [cfe-dev] Files were renamed
Message-ID: <1212083615.832.28.camel@localhost>

Hello, Everyone.

Several .h files were renamed in the LLVM repository. We forgot to do
this a year ago, during the Subversion migration. Most of these files
shouldn't be widely used directly (except ADT/iterator, for which a
special forwarding header will be introduced).

Please look into PR1338 for more information.

--
With best regards, Anton Korobeynikov.
Faculty of Mathematics & Mechanics, Saint Petersburg State University.
From hs4233 at mail.mn-solutions.de Fri May 30 05:53:07 2008
From: hs4233 at mail.mn-solutions.de (Holger Schurig)
Date: Fri, 30 May 2008 12:53:07 +0200
Subject: [cfe-dev] segfault with -serialize
Message-ID: <200805301253.07659.hs4233@mail.mn-solutions.de>

$ cat main.c
int main(int argc, char *argv[])
{
	printf("argc: %d\n", argc);
	printf("argv[0]: %s\n", argv[0]);
}

$ clang main.c -serialize
clang[0x8375f51]
Segmentation fault

Unfortunately, right now I have a release build, so the backtrace
doesn't say too much:

(gdb) bt
#0  0x096bae28 in ?? ()
#1  0x0823ce14 in clang::TranslationUnit::~TranslationUnit ()
#2  0xb7e10ff4 in ?? () from /lib/tls/i686/cmov/libc.so.6
#3  0xb7e124c0 in ?? () from /lib/tls/i686/cmov/libc.so.6
#4  0xbf885fa0 in ?? ()
#5  0x096b9c88 in ?? ()
#6  0x096b9ae8 in ?? ()
#7  0xb7e124c0 in ?? () from /lib/tls/i686/cmov/libc.so.6
#8  0x096bfd98 in ?? ()
#9  0x096bfda0 in ?? ()
#10 0x00000000 in ?? ()

From hs4233 at mail.mn-solutions.de Fri May 30 05:56:00 2008
From: hs4233 at mail.mn-solutions.de (Holger Schurig)
Date: Fri, 30 May 2008 12:56:00 +0200
Subject: [cfe-dev] uninitialized variable generates a "Pass-by-value argument in function" warning ?!?
Message-ID: <200805301256.00202.hs4233@mail.mn-solutions.de>

$ cat main.c
int main(int argc, char *argv[])
{
	int i = 0;
	int j;
	printf("i %d\n", i);
	printf("j %d\n", j);
}

$ clang main.c -checker-simple
ANALYZE: main.c main
main.c:6:2: warning: [CHECKER] Pass-by-value argument in function is undefined.
        printf("j %d\n", j);
        ^                ~
1 diagnostic generated.

From hs4233 at mail.mn-solutions.de Fri May 30 05:58:23 2008
From: hs4233 at mail.mn-solutions.de (Holger Schurig)
Date: Fri, 30 May 2008 12:58:23 +0200
Subject: [cfe-dev] erraneous html with -emit-html
Message-ID: <200805301258.23165.hs4233@mail.mn-solutions.de>

See the attached picture. It contains two "tr>" at the top that
shouldn't be there. Displayed this way with Konqueror and
Firefox.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: emit-html.png
Type: image/png
Size: 2861 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20080530/625a8121/attachment.png
From hs4233 at mail.mn-solutions.de Fri May 30 06:06:14 2008
From: hs4233 at mail.mn-solutions.de (Holger Schurig)
Date: Fri, 30 May 2008 13:06:14 +0200
Subject: [cfe-dev] clang doesn't know about it's installation prefix when searching header files
Message-ID: <200805301306.14414.hs4233@mail.mn-solutions.de>

I configured llvm with

/usr/src/llvm/svn.llvm/configure \
	--prefix=/usr/src/llvm/dist \
	--with-llvmgccdir=/usr/src/llvm/dist \
	--enable-optimized --disable-debug

When I later compile and install clang, it installs its own
include files in the $prefix:

$ find /usr/src/llvm/dist -name "std*.h" | tail -n3
/usr/src/llvm/dist/Headers/stddef.h
/usr/src/llvm/dist/Headers/stdarg.h
/usr/src/llvm/dist/Headers/stdbool.h

But clang searches in the wrong places:

$ clang -v main.c --emit-llvm-bc
ignoring nonexistent directory "/Headers"
ignoring nonexistent directory "/usr/lib/gcc/i686-apple-darwin10/4.2.1/include"
ignoring nonexistent directory "/usr/lib/gcc/powerpc-apple-darwin10/4.2.1/include"
ignoring nonexistent directory "/usr/lib/gcc/i686-apple-darwin9/4.0.1/include"
ignoring nonexistent directory "/usr/lib/gcc/powerpc-apple-darwin9/4.0.1/include"
ignoring nonexistent directory "/usr/lib/gcc/powerpc-apple-darwin9/4.0.1/../../../../powerpc-apple-darwin0/include"
ignoring nonexistent directory "/usr/lib/gcc/i686-apple-darwin8/4.0.1/include"
ignoring nonexistent directory "/usr/lib/gcc/powerpc-apple-darwin8/4.0.1/include"
ignoring nonexistent directory "/usr/lib/gcc/powerpc-apple-darwin8/4.0.1/../../../../powerpc-apple-darwin8/include"
ignoring nonexistent directory "/usr/lib/gcc/i486-linux-gnu/4.1.3/include"
ignoring nonexistent directory "/usr/lib/gcc/i386-redhat-linux/4.1.2/include"
ignoring nonexistent directory
"/usr/lib/gcc/i486-linux-gnu/4.2.3/include" ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/4.2.3/include" ignoring nonexistent directory "/System/Library/Frameworks" ignoring nonexistent directory "/Library/Frameworks" #include "..." search starts here: #include <...> search starts here: /usr/local/include /usr/include End of search list. This is on Linux (Debian Etch). Maybe llvm::sys::Path::GetMainExecutable is buggy here? From eli.friedman at gmail.com Fri May 30 06:29:30 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Fri, 30 May 2008 04:29:30 -0700 Subject: [cfe-dev] segfault with -serialize In-Reply-To: <200805301253.07659.hs4233@mail.mn-solutions.de> References: <200805301253.07659.hs4233@mail.mn-solutions.de> Message-ID: On Fri, May 30, 2008 at 3:53 AM, Holger Schurig wrote: > $ cat main.c > int main(int argc, char *argv[]) > { > printf("argc: %d\n", argc); > printf("argv[0]: %s\n", argv[0]); > } > > > $ clang main.c -serialize > clang[0x8375f51] > Segmentation fault Thanks for the report; fixed committed. For future reference, it's easier to keep track of bug reports in Bugzilla; reports to the list tend to get lost. -Eli From eli.friedman at gmail.com Fri May 30 06:33:07 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Fri, 30 May 2008 04:33:07 -0700 Subject: [cfe-dev] erraneous html with -emit-html In-Reply-To: <200805301258.23165.hs4233@mail.mn-solutions.de> References: <200805301258.23165.hs4233@mail.mn-solutions.de> Message-ID: On Fri, May 30, 2008 at 3:58 AM, Holger Schurig wrote: > See the attached picture. It contains two "tr>" at the top that > shouldn't be there. Displayed this way with Konqueror and > Firefox. I can't reproduce; would you mind listing the exact source file and exact command line you used? Oh, and would you mind putting this into Bugzilla? Bug reports to the list tend to get lost. 
-Eli

From hs4233 at mail.mn-solutions.de Fri May 30 06:47:23 2008
From: hs4233 at mail.mn-solutions.de (Holger Schurig)
Date: Fri, 30 May 2008 13:47:23 +0200
Subject: [cfe-dev] clang doesn't know about it's installation prefix when searching header files
In-Reply-To: <200805301306.14414.hs4233@mail.mn-solutions.de>
References: <200805301306.14414.hs4233@mail.mn-solutions.de>
Message-ID: <200805301347.23177.hs4233@mail.mn-solutions.de>

> This is on Linux (Debian Etch). Maybe
> llvm::sys::Path::GetMainExecutable is buggy here?

It is indeed buggy. This

   llvm::sys::Path MainExecutablePath =
      llvm::sys::Path::GetMainExecutable(Argv0,
         (void*)(intptr_t)InitializeIncludePaths);
+  fprintf(stderr, "MainExecutablePath %s\n", MainExecutablePath.c_str());

says that MainExecutablePath just contains "clang". But on the
shell, it says otherwise:

$ which clang
/usr/src/llvm/dist/bin/clang

Unfortunately, my shell (bash) doesn't call the program in a way
so that this can be re-used. A simple test program reveals this:

#include <stdio.h>

int main(int argc, char *argv[])
{
	printf("argv[0] %s\n", argv[0]);
	return 0;
}

and run it like this:

$ ./argc
argv[0] ./argc
$ mv argc /usr/src/llvm/dist/bin
$ which argc
/usr/src/llvm/dist/bin/argc
$ argc
argv[0] argc

... I see that using argv[0] is not sufficient, at least not if
bash 3.1.17 handles the PATH. Only when I specify the full path
does argv[0] work:

$ /usr/src/llvm/dist/bin/argc
argv[0] /usr/src/llvm/dist/bin/argc

So, should I amend llvm::sys::Path::GetMainExecutable so that it:

* checks if there is a "/" in the path?
* if not, iterates over env['PATH'] to search for itself and
  uses the first match?
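
[Editor's sketch] The two steps proposed above could look roughly like
the following. This is only an illustration under stated assumptions,
not code from LLVM: FindExecutableInPath is a hypothetical helper name,
the PATH value is passed in explicitly, and the existence check relies
on POSIX access(), so it would not apply to Windows. It also does not
reject directories that happen to have the same name as the program.

```cpp
#include <string>
#include <unistd.h>  // access(), X_OK (POSIX only)

// Hypothetical helper (not an LLVM API): resolve Argv0 the way a shell would.
//  - If Argv0 contains a '/', it is returned unchanged.
//  - Otherwise each colon-separated entry of PathEnv is tried in order and
//    the first candidate that access() reports as executable is returned.
//  - An empty string is returned if nothing matches.
std::string FindExecutableInPath(const std::string &Argv0,
                                 const std::string &PathEnv) {
  if (Argv0.empty())
    return std::string();
  if (Argv0.find('/') != std::string::npos)
    return Argv0;

  std::string::size_type Start = 0;
  while (Start <= PathEnv.size()) {
    std::string::size_type End = PathEnv.find(':', Start);
    if (End == std::string::npos)
      End = PathEnv.size();

    std::string Dir = PathEnv.substr(Start, End - Start);
    if (Dir.empty())
      Dir = ".";  // an empty PATH entry means the current directory

    std::string Candidate = Dir + "/" + Argv0;
    if (access(Candidate.c_str(), X_OK) == 0)
      return Candidate;

    Start = End + 1;
  }
  return std::string();
}
```

A caller would pass argv[0] and the value of getenv("PATH"); an empty
result means some other fallback is needed.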
From hs4233 at mail.mn-solutions.de Fri May 30 06:58:38 2008
From: hs4233 at mail.mn-solutions.de (Holger Schurig)
Date: Fri, 30 May 2008 13:58:38 +0200
Subject: [cfe-dev] erraneous html with -emit-html
In-Reply-To:
References: <200805301258.23165.hs4233@mail.mn-solutions.de>
Message-ID: <200805301358.38150.hs4233@mail.mn-solutions.de>

> Oh, and would you mind putting this into Bugzilla? Bug
> reports to the list tend to get lost.

Done: http://llvm.org/bugs/show_bug.cgi?id=2386

From kremenek at apple.com Fri May 30 10:42:06 2008
From: kremenek at apple.com (Ted Kremenek)
Date: Fri, 30 May 2008 08:42:06 -0700
Subject: [cfe-dev] uninitialized variable generates a "Pass-by-value argument in function" warning ?!?
In-Reply-To: <200805301256.00202.hs4233@mail.mn-solutions.de>
References: <200805301256.00202.hs4233@mail.mn-solutions.de>
Message-ID: <17B30B33-64D7-466E-ABA0-14BE72E70CD2@apple.com>

Hi Holger,

Do you have a particular question? The warning is about the value of
'j' being uninitialized when it is passed as an argument to printf.
Do you feel that the diagnostic is unclear? I need a little bit more
than a screen dump to understand your concerns.

Ted

On May 30, 2008, at 3:56 AM, Holger Schurig wrote:

> $ cat main.c
> int main(int argc, char *argv[])
> {
> 	int i = 0;
> 	int j;
> 	printf("i %d\n", i);
> 	printf("j %d\n", j);
> }
>
> $ clang main.c -checker-simple
> ANALYZE: main.c main
> main.c:6:2: warning: [CHECKER] Pass-by-value argument in function
> is undefined.
>        printf("j %d\n", j);
>        ^                ~
> 1 diagnostic generated.
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

From peter.neumark at gmail.com Sat May 31 06:37:43 2008
From: peter.neumark at gmail.com (Peter Neumark)
Date: Sat, 31 May 2008 13:37:43 +0200
Subject: [cfe-dev] file exchange protocol in clang/distcc
Message-ID: <2e837e3a0805310437h1780c10ak1901e8cfa83750f4@mail.gmail.com>

Hi!
I'd like to discuss the file exchange method for clang/distcc. I'd like
to use the ssh protocol, because it is a standard and supports
authentication.
Any opinions?

From sanxiyn at gmail.com Sat May 31 11:09:52 2008
From: sanxiyn at gmail.com (Sanghyeon Seo)
Date: Sun, 1 Jun 2008 01:09:52 +0900
Subject: [cfe-dev] clang doesn't know about it's installation prefix when searching header files
In-Reply-To: <200805301347.23177.hs4233@mail.mn-solutions.de>
References: <200805301306.14414.hs4233@mail.mn-solutions.de> <200805301347.23177.hs4233@mail.mn-solutions.de>
Message-ID: <5b0248170805310909l5ceba146m1a41968d31dbe892@mail.gmail.com>

2008/5/30 Holger Schurig:
> So, should I amend llvm::sys::Path::GetMainExecutable so that it:
>
> * checks if there is a "/" in the path?
> * if not, iterates over env['PATH'] to search for itself and
>   uses the first match?

"GetMainExecutable" is a tricky function to implement in a
cross-platform way. The standard Linux method is to resolve the
symlink /proc/self/exe. On Windows, you call GetModuleFileName.

See http://autopackage.org/docs/binreloc/ for extensive discussion.

--
Seo Sanghyeon

From kremenek at apple.com Sat May 31 14:24:37 2008
From: kremenek at apple.com (Ted Kremenek)
Date: Sat, 31 May 2008 12:24:37 -0700
Subject: [cfe-dev] file exchange protocol in clang/distcc
In-Reply-To: <2e837e3a0805310437h1780c10ak1901e8cfa83750f4@mail.gmail.com>
References: <2e837e3a0805310437h1780c10ak1901e8cfa83750f4@mail.gmail.com>
Message-ID: <00557B22-5735-43B9-B4FD-CE5B25010DC8@apple.com>

On May 31, 2008, at 4:37 AM, Peter Neumark wrote:

> Hi!
> I'd like to discuss the file exchange method for clang/distcc. I'd
> like to use the ssh protocol, because it is a standard and supports
> authentication.
> Any opinions?

Hi Peter,

I think it depends on your goals.
Consider the following points:

1) Should a user require full account access on a remote machine in
order to use it for distcc? Is this absolutely necessary? What are
the tradeoffs? Having a "distcc" account on a remote machine that is
dedicated to doing remote compiles isn't necessarily a bad thing, as
it isolates access to resources. There is an administrative cost,
however, of setting up an account and installing ssh keys on every
single machine. In order to make clang-distcc useful, we probably
want to keep the administrative cost of using it low.

What does the standard distcc do?

2) There is a high cost of initiating the ssh connection (a key
exchange has to be done using cryptography primitives that are
typically more expensive to use than the ciphers used by ssh to
encrypt data). Do you plan on keeping an ssh connection open
throughout the compilation of a source file(s) on a remote host? The
answer to this question directly affects scalability. Keeping a bunch
of connections open is also a scalability concern. I'm not an expert
on distributed network protocols, so hopefully someone else has more
insight here.

3) Encrypting data that does not need to be encrypted wastes a
non-negligible amount of CPU cycles, both on the sender and the
receiver. This has immediate ramifications on scalability, as you rob
precious cycles from other potential compilation jobs.

Ted

From clattner at apple.com Sat May 31 14:57:13 2008
From: clattner at apple.com (Chris Lattner)
Date: Sat, 31 May 2008 12:57:13 -0700
Subject: [cfe-dev] file exchange protocol in clang/distcc
In-Reply-To: <00557B22-5735-43B9-B4FD-CE5B25010DC8@apple.com>
References: <2e837e3a0805310437h1780c10ak1901e8cfa83750f4@mail.gmail.com> <00557B22-5735-43B9-B4FD-CE5B25010DC8@apple.com>
Message-ID: <50A66016-1CF3-4661-A217-922ACC75DB2B@apple.com>

On May 31, 2008, at 12:24 PM, Ted Kremenek wrote:
>> Hi!
>> I'd like to discuss the file exchange method for clang/distcc.
>> I'd like to use the ssh protocol, because it is a standard and supports
>> authentication.
>> Any opinions?
>
> Hi Peter,
>
> I think it depends on your goals. Consider the following points:
>
> What does the standard distcc do?

I'd 1) keep it simple, and 2) make the transport pluggable in the
future so that people can choose the right protocol for their needs.

In the short term, do whatever is simplest... following what standard
distcc does is a great way to start.

-Chris

From filcab at gmail.com Sat May 31 17:07:28 2008
From: filcab at gmail.com (Filipe Cabecinhas)
Date: Sat, 31 May 2008 23:07:28 +0100
Subject: [cfe-dev] Unable to build clang in MacOS 10.5.3
Message-ID:

Hi,

I'm trying to build clang on MacOS 10.5.3 (XCode 3.0,
i686-apple-darwin9-gcc-4.0.1) but I get the following error:

/usr/include/c++/4.0.0/debug/formatter.h: In constructor
'__gnu_debug::_Error_formatter::_Parameter::_Parameter(const
__gnu_debug::_Safe_iterator<_Iterator, _Sequence>&, const char*,
__gnu_debug::_Error_formatter::_Is_iterator)':
/usr/include/c++/4.0.0/debug/formatter.h:214: error: cannot use typeid with -fno-rtti
/usr/include/c++/4.0.0/debug/formatter.h:220: error: cannot use typeid with -fno-rtti

I commented out all the CXX.flags = -fno-rtti lines and it built. I'm
now testing it, but I'm checking with you to see if it's a known bug
(in llvm's IRC channel no-one knew about the bug; they only suggested
commenting out the -fno-rtti).
Thanks for the help,

- Filipe Cabecinhas

From clattner at apple.com Sat May 31 17:15:27 2008
From: clattner at apple.com (Chris Lattner)
Date: Sat, 31 May 2008 15:15:27 -0700
Subject: [cfe-dev] Unable to build clang in MacOS 10.5.3
In-Reply-To:
References:
Message-ID:

On May 31, 2008, at 3:07 PM, Filipe Cabecinhas wrote:

> Hi,
>
> I'm trying to build clang on MacOS 10.5.3 (XCode 3.0,
> i686-apple-darwin9-gcc-4.0.1) but I get the following error:
> /usr/include/c++/4.0.0/debug/formatter.h: In constructor
> '__gnu_debug::_Error_formatter::_Parameter::_Parameter(const
> __gnu_debug::_Safe_iterator<_Iterator, _Sequence>&, const char*,
> __gnu_debug::_Error_formatter::_Is_iterator)':
> /usr/include/c++/4.0.0/debug/formatter.h:214: error: cannot use typeid
> with -fno-rtti
> /usr/include/c++/4.0.0/debug/formatter.h:220: error: cannot use typeid
> with -fno-rtti

I'm not familiar with debug/formatter.h. Can you please include the
full error message from the compiler?

-Chris

>
>
> I commented out all the CXX.flags = -fno-rtti lines and it built. I'm
> now testing it, but I'm checking with you to see if it's a known bug
> (in llvm's IRC channel no-one knew about the bug; they only suggested
> commenting out the -fno-rtti).
>
> Thanks for the help,
>
> - Filipe Cabecinhas

From filcab at gmail.com Sat May 31 17:27:56 2008
From: filcab at gmail.com (Filipe Cabecinhas)
Date: Sat, 31 May 2008 23:27:56 +0100
Subject: [cfe-dev] Unable to build clang in MacOS 10.5.3
In-Reply-To:
References:
Message-ID:

Hi,

On 31 May, 2008, at 23:15, Chris Lattner wrote:

> On May 31, 2008, at 3:07 PM, Filipe Cabecinhas wrote:
>
> I'm not familiar with debug/formatter.h. Can you please include the
> full error message from the compiler?
>
> -Chris

Here it is. I uncommented Driver/Makefile and tried to rebuild clang:
...
llvm[1]: Compiling ASTConsumers.cpp for Debug+Checks build
/usr/include/c++/4.0.0/debug/formatter.h: In constructor
'__gnu_debug::_Error_formatter::_Parameter::_Parameter(const
__gnu_debug::_Safe_iterator<_Iterator, _Sequence>&, const char*,
__gnu_debug::_Error_formatter::_Is_iterator)':
/usr/include/c++/4.0.0/debug/formatter.h:214: error: cannot use typeid with -fno-rtti
/usr/include/c++/4.0.0/debug/formatter.h:220: error: cannot use typeid with -fno-rtti
/usr/include/c++/4.0.0/debug/formatter.h: In constructor
'__gnu_debug::_Error_formatter::_Parameter::_Parameter(const _Type*&,
const char*, __gnu_debug::_Error_formatter::_Is_iterator)':
/usr/include/c++/4.0.0/debug/formatter.h:243: error: cannot use typeid with -fno-rtti
/usr/include/c++/4.0.0/debug/formatter.h: In constructor
'__gnu_debug::_Error_formatter::_Parameter::_Parameter(_Type*&, const
char*, __gnu_debug::_Error_formatter::_Is_iterator)':
/usr/include/c++/4.0.0/debug/formatter.h:256: error: cannot use typeid with -fno-rtti
/usr/include/c++/4.0.0/debug/formatter.h: In constructor
'__gnu_debug::_Error_formatter::_Parameter::_Parameter(const
_Iterator&, const char*, __gnu_debug::_Error_formatter::_Is_iterator)':
/usr/include/c++/4.0.0/debug/formatter.h:269: error: cannot use typeid with -fno-rtti
/usr/include/c++/4.0.0/debug/formatter.h: In constructor
'__gnu_debug::_Error_formatter::_Parameter::_Parameter(const
__gnu_debug::_Safe_sequence<_Sequence>&, const char*,
__gnu_debug::_Error_formatter::_Is_sequence)':
/usr/include/c++/4.0.0/debug/formatter.h:285: error: cannot use typeid with -fno-rtti
/usr/include/c++/4.0.0/debug/formatter.h: In constructor
'__gnu_debug::_Error_formatter::_Parameter::_Parameter(const
_Sequence&, const char*, __gnu_debug::_Error_formatter::_Is_sequence)':
/usr/include/c++/4.0.0/debug/formatter.h:294: error: cannot use typeid with -fno-rtti
make[1]: *** [/Users/filcab/dev/stuff/llvm/llvm/tools/clang/Driver/Debug+Checks/ASTConsumers.o] Error 1
make: *** [all] Error 1

Thanks for the help,

- Filipe Cabecinhas