[llvm-commits] shufflevector on ARM (clumsy x-post from llvmdev)
bob.wilson at apple.com
Fri Jan 7 15:42:31 CST 2011
On Jan 7, 2011, at 6:57 AM, Tim Northover wrote:
> On 07/01/11 07:28, Bob Wilson wrote:
>> The extract_subvector patch looks good, except for the testsuite
>> changes. Those tests are supposed to test spill code, and your patch
>> causes them to stop spilling. I'll commit the patch after I fix the
>> tests to continue spilling in spite of your change.
> Ah thanks. I'd convinced myself it was spilling, just slightly
> differently. Glad you picked that up.
You were right -- it was spilling. The difference was in whether there was an aligned stack slot for something other than a spill. I added an aligned alloca and that fixed the test.
>> The build_vector patch looks good, too. Can you also provide some tests
>> that exercise this? (The test/CodeGen/ARM/vext.ll file would be a good
>> place to put them.)
> Yep, I've attached the replacement patch (hopefully).
> It incorporates your comments and some tests that I believe exercise
> most of the code (some is just there in case weird lowering creates
> something unexpected and I can't actually produce an example).
Looks good. I've committed it as svn 123035.
>> This new code will apply to <4 x i32> vectors. The following code to
>> implement the BUILD_VECTOR by directly assigning subregisters will also
>> handle that case. Have you looked at which is better? It might be better
>> to swap the order of these. I suppose accessing S subregisters can be
>> slow since the move instructions will run in the VFP pipeline and cause
>> stalls on some processors.
> I hadn't thought of anything so cunning. If both pieces of code apply
> then the result is <4 x i32> and both source vectors are <4 x i32> as
> well (if <2 x i32> I bail). I think this means that the result of my
> code would be identical to a perfect shuffle with no added overhead (no
> So assuming that perfect shuffles are indeed handled optimally the order
> I gave happens to be correct. More by luck than judgement.
OK, that makes sense to me.
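
For anyone following along, the equivalence being discussed can be illustrated with a small sketch. This is not the code from the patch and not LLVM's API; the function name and the (source, lane) encoding are hypothetical, chosen only for illustration. It assumes a BUILD_VECTOR whose operands are all element extracts from at most two source vectors of the same width as the result, and shows that such a node carries the same information as a two-input shuffle mask.

```python
def build_vector_to_shuffle_mask(elements, lanes=4):
    """Map a BUILD_VECTOR made of element extracts onto a shuffle mask.

    Hypothetical helper for illustration only.

    elements: one entry per result lane, either None (undef) or a
              (source, lane) pair, where source is 0 or 1 (first or
              second input vector) and lane is the extracted lane.
    Returns a shufflevector-style mask in which indices 0..lanes-1
    select from the first vector and lanes..2*lanes-1 select from the
    second, or None when the pattern does not apply.
    """
    # Bail when the result width differs from the source width,
    # e.g. a <4 x i32> result built from <2 x i32> sources.
    if len(elements) != lanes:
        return None

    mask = []
    for elt in elements:
        if elt is None:
            mask.append(None)  # undef lane stays undef in the mask
            continue
        source, lane = elt
        if source not in (0, 1) or not 0 <= lane < lanes:
            return None  # not an extract from one of the two inputs
        mask.append(source * lanes + lane)
    return mask


# Lanes 1, 2, 3 of the first vector followed by lane 0 of the second.
print(build_vector_to_shuffle_mask([(0, 1), (0, 2), (0, 3), (1, 0)]))
```

The example input yields the mask 1, 2, 3, 4, a run of consecutive indices across the two inputs, which is the kind of pattern the tests in vext.ll are concerned with. Once the node is in mask form, the existing shuffle lowering can handle it, which is why the ordering relative to the subregister code only matters if the shuffle path is not already optimal.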