[llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp LegalizeDAG.cpp SelectionDAG.cpp SelectionDAGISel.cpp TargetLowering.cpp
clattner at apple.com
Tue Jul 3 17:07:26 CDT 2007
On Jul 3, 2007, at 2:13 PM, Dan Gohman wrote:
>>> We overload ISD::FADD and quite a lot of others. Why not
>>> ISD::ConstantFP too?
>> Fair enough, after pondering on it, I agree with you. The proposed
>> semantics are that a ConstantFP (and also a normal Constant?) produce
>> the splatted immediate value?
> Constant sounds good too. And UNDEF, for that matter. And yes,
> that's the
> semantics I mean.
Ok, makes sense. I think we already use UNDEF for vectors.
>> Please add a dag combine xform from build_vector [c,c,c,c] ->
>> constantfp and friends.
> I sketched out some of the code for this. One question that's come
> up so far is
> whether if the vector has some undef elements but all the non-undef
> are equal it should still be folded. My initial preference is to
> still fold it,
> since that lets things like isBuildVectorAllZeros become trivial to
> but it is a pessimization in some obscure cases.
I'm not sure about it. One specific issue is with shuffle masks,
which we want to retain the undef element values for. I don't think
there is a good way to retain shuffle masks but not other build
vectors, so we probably need to keep the individual undef elements in
isBuildVectorAllZeros and friends are another issue. To me there are
actually two issues that should be resolved at some point:
1. vector constant and shuffle mask matching code is crazily complex,
particularly in the x86 backend. For vector constants, this is only
slightly annoying. For vector shuffle masks, the selected shuffles
are currently whatever is best for yonah, and it's not really
possible to prefer different shuffles on different subtargets. We
really want to add a layer of abstraction in the shuffle/constant
matching code, which would make the undef handling stuff happen
implicitly. Making a more declarative description of the various
masks would make it much easier to maintain, understand, and debug.
2. the x86 backend specifically has a problem with the way it selects
vector constants (I think this is in the readme). In particular, if
you have a 4 x f32 and a 4 x i32 zero vector, you'll get two
different pxor instructions, because they are of different type.
There are two different ways to solve this problem:
The easy answer is to do what the ppc backend does. It always
selects zero (and -1) vectors to 4 x i32 IIRC, and then does a
bitcast to the desired type if needed. This ensures that the
constant vectors always get CSEd. The tricky part of this is to
ensure that the 0/-1 vectors still get folded if you have operations
(like ~) that require one of these as an operand. This ugliness is
why we have "vnot" and "vnot_conv" and have to duplicate patterns.
The better fix is to change the way the select phase produces code.
In particular, the reason these two zero vectors don't get CSE'd
after selection is because they have two different value types, and
the autocse stuff doesn't "know" that the two VTs end up in the same
To solve this, it seems like we can add a new MVT type, where a
certain range of MVTs (128-255?) correspond to register class ID's.
At selection time, instead of giving the new nodes their old MVT's,
they would get new MVT's that correspond to the regclass of the
result (ok, we'd keep MVT::Other, MVT::Flag and maybe some others).
This makes the scheduler slightly simpler (because it doesn't need to
map MVT -> regclass) anymore, and opens up future possibilities.
In particular, it lets us fix a long-standing class of issues where
we can't have fp stack and SSE registers around at the same time,
both with MVT::f32 or f64 type. The current scheduler can only map
f32 to one register class (thus, it can't keep the distinction) but
with this change the select pass can pick any regclass it wants.
Anyway, this is a bit of a crazy tangent, but I think undef's in
buildvector should probably stay :)
More information about the llvm-commits