[LLVMdev] Question regarding basic-block placement optimization
chandlerc at google.com
Tue Oct 18 17:07:09 CDT 2011
On Tue, Oct 18, 2011 at 2:59 PM, Cameron Zwarich <zwarich at apple.com> wrote:
> On Oct 18, 2011, at 2:53 AM, Chandler Carruth wrote:
> > Hello,
> > I'm working on basic-block placement optimizations based on branch
> probability information. I've run into a stumbling block though. One of the
> existing passes to do this, essentially a dead pass 'block-placement'
> operates on the IR, reordering the blocks of the function, and relying on
> the code generator to preserve that order unless it has a reason to
> rearrange it. This pass is entirely untested AFAICT, and it doesn't work any
> > That said, I've replicated its functionality with a much updated
> transformation algorithm (no longer O(N^2), and finds better orderings!) and
> predicated it on the new frequency and probability information. It "works"
> great, but it doesn't actually optimize anything. It re-arranges all the
> blocks of the IR function, laying them out perfectly. And then the code
> generation layer throws that away and seems to establish its own ordering
> > What's the best approach here? Is ignoring the order of blocks in the IR
> function desired behavior? (It does seem reasonable, it just seems to have
> not always been that way, so I wondered.)
> > Should I sink my pass down to a codegen pass?
> I think this should really live as a CodeGen pass. Is there any good reason
> to make it an IR pass?
So, as it happens, I was *completely* wrong here. CodeGen correctly
preserves the ordering of blocks from IR, *unless* it can do folding, etc.
My original test cases ended up being folded into other structures, hiding
the fact that the ordering wasn't what was changing.
I've run much more real-world style test cases through the pass, and it is
working beautifully. I'll be attaching a patch shortly, I'm currently
cleaning it up based on some over-the-shoulder feedback from Nick Lewycky.
As for why it should be an IR pass, mostly because once the selection dag
runs through the code, we can never recover all of the freedom we have at
the IR level. To start with, splicing MBBs around requires known about the
terminators (which we only some of the time do), and it requires re-writing
them a touch to account for the different fall-through pattern. To make
matters worse, at this point we don't have the nicely analyzable 'switch'
terminator (I think), and so the existing MBB placement code just bails on
Operating at the IR level makes writing the pass a breeze, and we can
quickly and efficiently simply lay the blocks out in the ideal ordering.
Then the SelectionDAG and other layers can preserve this modulo folding and
other optimization opportunities.
At the most, we should likely add probability awareness to some of these
CodeGen passes to turn off (or on) their transforms around unlikely (or
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the LLVMdev