## SIMD book, first draft published!

Submitted by markos on Tue, 03/06/2014 - 19:42

Check activity here:

https://www.indiegogo.com/projects/simd-engines-comparative-reference-guide/x/4966960#activity

From the update:

Ok, I've been busy the past few days: I started writing the book (using LaTeX :) and I'd like to say that progress has been good. I have finalized the list of SIMD engines that I'm going to include, and it's a long one:

## Crowdfunded campaign for a SIMD Comparison Reference Book on Indiegogo!

Submitted by markos on Sat, 24/05/2014 - 12:03

## Powerbook G4 12" revamping, part 1

Submitted by markos on Tue, 14/01/2014 - 13:20

I decided to give my trusty PowerBook G4 a second chance, and I thought it would be a good idea to upgrade some parts of it along the way. As it stands, I can't upgrade the CPU or RAM (the G4 is fixed at 1 GHz and the RAM at 1.2GB), but I can upgrade the disk and the screen. This time I upgraded the disk and replaced the thermal paste with something much more efficient, so it wouldn't get as hot.

I won't go into the actual details of doing the upgrades; these are covered by the excellent ifixit.com articles:

## 32-bit *signed* integer multiplication with AltiVec

Submitted by markos on Sat, 23/08/2008 - 22:55

While completing Eigen2 AltiVec support (it should be almost complete now), I noticed that 32-bit integer multiplication did not always work correctly. As AltiVec does not include any instruction for 32-bit integer multiplication, I used Apple's routine from the Apple Developer site. But this didn't work, and some results were totally off. After some debugging, I found out that this routine works for unsigned 32-bit integers, whereas Eigen2 uses signed integers! So I had to search further and, to my surprise, I found no reference to any similar work. That left me with two choices: a) ditch AltiVec integer vectorisation from Eigen2 (not acceptable!), or b) implement my own method! It is obvious which choice I followed :)

UPDATE: Thanks to Matt Sealey, who noticed I could have used vec_abs() instead of vec_sub() and vec_max(). Duh! :D
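
Going only by the notes above (a routine that is correct for unsigned operands, plus a mention of `vec_abs()`), one plausible shape for such a method is: take absolute values, multiply the magnitudes with the unsigned routine, then put the sign back. The scalar C below is a rough per-lane model of that idea, not the code from the post; the function names are made up for illustration, and the 16-bit split stands in for the 16-bit multiplies that AltiVec does provide.

```c
#include <stdint.h>

/* Low 32 bits of an unsigned 32x32 product, assembled from 16-bit halves.
 * Assumption: this mirrors, per lane, what an unsigned vector routine does. */
static uint32_t mul_u32_from_halves(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;
    uint32_t low   = a_lo * b_lo;               /* lo x lo partial product    */
    uint32_t cross = a_hi * b_lo + a_lo * b_hi; /* cross terms, shifted below */
    return low + (cross << 16);                 /* hi x hi falls outside 32 bits */
}

/* Hypothetical signed wrapper: |a| * |b|, then restore the sign. */
int32_t mul_s32(int32_t a, int32_t b)
{
    uint32_t mag_a = (a < 0) ? 0u - (uint32_t)a : (uint32_t)a; /* like vec_abs() */
    uint32_t mag_b = (b < 0) ? 0u - (uint32_t)b : (uint32_t)b;
    uint32_t p = mul_u32_from_halves(mag_a, mag_b);
    if ((a < 0) != (b < 0))
        p = 0u - p;                             /* signs differ: negate result */
    return (int32_t)p;
}
```

For example, `mul_s32(-3, 7)` gives `-21` and `mul_s32(-5, -6)` gives `30`. In a real AltiVec version each of these steps would operate on four lanes at once.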


## Inverse of Matrix 4x4 using partitioning in Altivec

Submitted by markos on Fri, 18/04/2008 - 18:31

We tackle the 4x4 matrix inversion using the matrix partitioning method, as described in the "Numerical Recipes in C" book (2nd ed., though I guess it will be similar in the 3rd edition). Using the AltiVec SIMD unit, we achieve an almost 300% increase in performance, making this routine the fastest matrix inversion method, at least among those known to us!
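
To give a feel for the partitioning idea, here is a minimal scalar sketch in plain C (no AltiVec, and not the routine from the post). It splits the matrix into four 2x2 blocks `[P Q; R S]`, inverts `P` and the Schur complement `S - R*P^-1*Q`, and assembles the inverse from those. The `Mat4`/`Mat2` types and all function names are hypothetical, and the sketch simply gives up when `P` is singular, even though the full matrix may still be invertible.

```c
typedef struct { float m[4][4]; } Mat4;    /* hypothetical row-major 4x4 */
typedef struct { float a, b, c, d; } Mat2; /* 2x2 block [a b; c d]       */

static Mat2 mul2(Mat2 x, Mat2 y)
{
    Mat2 r = { x.a * y.a + x.b * y.c, x.a * y.b + x.b * y.d,
               x.c * y.a + x.d * y.c, x.c * y.b + x.d * y.d };
    return r;
}

static Mat2 sub2(Mat2 x, Mat2 y)
{
    Mat2 r = { x.a - y.a, x.b - y.b, x.c - y.c, x.d - y.d };
    return r;
}

static Mat2 neg2(Mat2 x)
{
    Mat2 r = { -x.a, -x.b, -x.c, -x.d };
    return r;
}

/* Closed-form 2x2 inverse; returns 0 if the determinant is exactly zero. */
static int inv2(Mat2 x, Mat2 *out)
{
    float det = x.a * x.d - x.b * x.c;
    if (det == 0.0f)
        return 0;
    float s = 1.0f / det;
    out->a =  x.d * s;  out->b = -x.b * s;
    out->c = -x.c * s;  out->d =  x.a * s;
    return 1;
}

static Mat2 get_block(const Mat4 *m, int r, int c)
{
    Mat2 b = { m->m[r][c], m->m[r][c + 1], m->m[r + 1][c], m->m[r + 1][c + 1] };
    return b;
}

static void put_block(Mat4 *m, int r, int c, Mat2 b)
{
    m->m[r][c] = b.a;      m->m[r][c + 1] = b.b;
    m->m[r + 1][c] = b.c;  m->m[r + 1][c + 1] = b.d;
}

/* Returns 1 on success, 0 if P or the Schur complement is singular. */
int mat4_inverse_partitioned(const Mat4 *in, Mat4 *out)
{
    Mat2 P = get_block(in, 0, 0), Q = get_block(in, 0, 2);
    Mat2 R = get_block(in, 2, 0), S = get_block(in, 2, 2);
    Mat2 Pinv, Sbar;

    if (!inv2(P, &Pinv))
        return 0;
    Mat2 RPinv = mul2(R, Pinv);
    Mat2 PinvQ = mul2(Pinv, Q);
    if (!inv2(sub2(S, mul2(R, PinvQ)), &Sbar))   /* Sbar = (S - R P^-1 Q)^-1 */
        return 0;

    Mat2 Rbar = neg2(mul2(Sbar, RPinv));         /* -Sbar (R P^-1)     */
    Mat2 Qbar = neg2(mul2(PinvQ, Sbar));         /* -(P^-1 Q) Sbar     */
    Mat2 Pbar = sub2(Pinv, mul2(PinvQ, Rbar));   /* P^-1 - (P^-1 Q) Rbar */

    put_block(out, 0, 0, Pbar);  put_block(out, 0, 2, Qbar);
    put_block(out, 2, 0, Rbar);  put_block(out, 2, 2, Sbar);
    return 1;
}
```

The appeal for SIMD is that everything reduces to 2x2 products and one reciprocal per block inverse, with no pivoting loop. Note that `in` and `out` should not point to the same matrix in this sketch.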

Relevant URLs:

## AltiVec runtime detection in Linux

Submitted by markos on Thu, 10/04/2008 - 15:01

After a little searching on Google for how to detect AltiVec at runtime in Linux (I used keywords such as "runtime altivec detection" and similar), I found that there is no single nice article anywhere that describes something so simple. Thankfully, I got a few good answers from benh and dwmw2 in #mklinux/FreeNode, and I decided to put these down in a cleaned-up form.
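
As a rough illustration, one way to do this on a current Linux system is to ask the kernel for the hardware capability bits through glibc's `getauxval()` and test the AltiVec flag. This is only a sketch under a few assumptions: it is not necessarily the approach from the answers mentioned above, `getauxval()` needs glibc 2.16 or later (it did not exist when this post was written), and the helper name `has_altivec` is made up here.

```c
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP (glibc >= 2.16) */

/* On PowerPC the system headers normally define this bit; the fallback
 * value matches the kernel's definition. */
#ifndef PPC_FEATURE_HAS_ALTIVEC
#define PPC_FEATURE_HAS_ALTIVEC 0x10000000
#endif

/* Hypothetical helper: returns 1 if the kernel reports AltiVec, else 0. */
int has_altivec(void)
{
#if defined(__powerpc__) || defined(__powerpc64__) || defined(__ppc__)
    /* AT_HWCAP carries the CPU feature bits the kernel exposes to userspace. */
    return (getauxval(AT_HWCAP) & PPC_FEATURE_HAS_ALTIVEC) != 0;
#else
    /* The same bit means something else on other architectures. */
    return 0;
#endif
}
```

A caller would typically check `has_altivec()` once at startup and pick the vectorised or the scalar code path accordingly. On older systems without `getauxval()`, the same `AT_HWCAP` entry can be read from `/proc/self/auxv`.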


## Matrix 4x4 Identity matrix

Submitted by markos on Sat, 01/03/2008 - 20:54

The nice thing about the identity matrix is that we don't have to read the matrix at all. And since the form of the identity matrix is already known:
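
In other words, the routine is write-only: it stores ones on the diagonal and zeros everywhere else. A minimal scalar sketch of that, assuming a hypothetical row-major `Mat4` type (the post's own typedefs live in another article and are not shown here):

```c
typedef struct { float m[4][4]; } Mat4;   /* hypothetical row-major 4x4 */

/* Overwrite *out with the identity; the previous contents are never read. */
void mat4_set_identity(Mat4 *out)
{
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            out->m[i][j] = (i == j) ? 1.0f : 0.0f;
}
```

An AltiVec version would presumably store four constant row vectors instead of looping, but the principle is the same: four stores and no loads from the matrix.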

## Matrix 4x4 Multiply with Vector (floats)

Submitted by markos on Sat, 01/03/2008 - 20:45

(Please see Matrix 4x4 addition/subtraction (floats) for the typedefs and definitions used.)

## Matrix 4x4 Transpose (floats)

Submitted by markos on Sat, 01/03/2008 - 20:13

For the theory behind matrix transposition, please see here.

So, the 4x4 transpose would be:
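
As a plain scalar reference (a sketch only, not the AltiVec listing from the post), the operation just swaps row and column indices. The `Mat4` type and the function name are hypothetical.

```c
typedef struct { float m[4][4]; } Mat4;   /* hypothetical row-major 4x4 */

/* out = transpose(in); in and out must not point to the same matrix. */
void mat4_transpose(const Mat4 *in, Mat4 *out)
{
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            out->m[j][i] = in->m[i][j];
}
```

A vectorised version would normally avoid the element-by-element copies and build the result with merge or permute operations on whole rows, which is where the SIMD gain comes from.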