Search
Primary links
Please donate to libfreevec to ensure its continuing development! Donations are done via Paypal.
Matrix 4x4 Transpose (floats)
markos — Sat, 01/03/2008 - 20:13
(Please see Matrix 4x4 addition/subtraction (floats) for the typedefs and definitions used.)
For the theory behind matrix transposition, please see here.
So, the 4x4 transpose would be:
| A1 | A2 | A3 | A4 |T | A1 | B1 | C1 | D1 | | B1 | B2 | B3 | B4 | | A2 | B2 | C2 | D2 | | C1 | C2 | C3 | C4 | = | A3 | B3 | C3 | D3 | | D1 | D2 | D3 | C4 | | A4 | B4 | C4 | D4 |
Basically we want to shuffle the elements in each vector in a particular way to end up with the transposed matrix. We could do it with a series of vec_permute instructions, but AltiVec offers us an easier and faster way: vec_mergeh/vec_mergel. Basically, what they do is this:
| A1 | A2 | A3 | A4 | | B1 | B2 | B3 | B4 | | \ vec_mergeh vec_mergel | \ v \ | A1 | B1 | A2 | B2 | | A3 | B3 | A4 | B4 |
So in effect we do two steps of mergeh/mergel operations:
vec_mergeh(Line 1, Line3) -> | A1 | C1 | A2 | C2 | vec_mergel(Line 1, Line3) -> | A3 | C3 | A4 | C4 | vec_mergeh(Line 2, Line4) -> | B1 | D1 | B2 | D2 | vec_mergel(Line 2, Line4) -> | B3 | D3 | B4 | D4 |
2nd step:
vec_mergeh(Line 1, Line3) -> | A1 | B1 | C1 | D1 | vec_mergel(Line 1, Line3) -> | A2 | B2 | C2 | D2 | vec_mergeh(Line 2, Line4) -> | A3 | B3 | C3 | D3 | vec_mergel(Line 2, Line4) -> | A4 | B4 | C4 | D4 |
And here is the final code:
void Mat44Transp(Mat44 m) { vector float vm_1, vm_2, vm_3, vm_4, vr_1, vr_2, vr_3, vr_4; // Load matrix LOAD_ALIGNED_MATRIX(m, vm_1, vm_2, vm_3, vm_4); // Do the transpose, first set of moves vr_1 = vec_mergeh(vm_1, vm_3); vr_2 = vec_mergel(vm_1, vm_3); vr_3 = vec_mergeh(vm_2, vm_4); vr_4 = vec_mergel(vm_2, vm_4); // Get the resulting vectors vm_1 = vec_mergeh(vr_1, vr_3); vm_2 = vec_mergel(vr_1, vr_3); vm_3 = vec_mergeh(vr_2, vr_4); vm_4 = vec_mergel(vr_2, vr_4); // Store back the result STORE_ALIGNED_MATRIX(m1, vm_1, vm_2, vm_3, vm_4); }