Eigen NEON port extended to ARMv8!

Soon after the VSX port, and as promised I have completed the ARMv8 NEON (a.k.a. Advanced SIMD) port. Basically this extends support to 64-bit doubles and also provides faster alternatives to division as ARMv8 has builtin instructions for division both for 32-bit floats and 64-bit doubles. Preliminary benchmarks (bench_gemm):

SIMD optimizations, cont.

A friend of mine told me that I should advertise my passion and know-how about SIMD more, and I decided to follow his advice. Though I am terrible at marketing and even more at personal marketing, I've made an attempt to do just that, advertise the fact that I'm offering SIMD Optimization Services (with emphasis on PowerPC AltiVec/VMX/VSX, and ARM NEON, but I'm ok with SSE as well, the logic is pretty much the same, though the difference(s) are in the details). For this reason I'm offering a free evaluation of your performance critical code (open/closed, able to sign NDAs if needed) to let you know if it's worth optimizing it, what kind of a performance gain you would get and how much it would cost you to get that result.
You can read more here.

EfikaMX updated wheezy and jessie images available

A while ago, I promised to some people in powerdeveloper.org forum that I would provide bootable armhf images for wheezy but most importantly for jessie with an updated kernel. After a delay -I did have the images ready and working, but had to clean them up a bit- I decided to publish them here first.

So, here are the images:

http://freevec.org/files/efikamx-wheezy-armhf-20140921.img.xz (559MB)
http://freevec.org/files/efikamx-jessie-armhf-20140921.img.xz (635MB)

VSX port added to Eigen!

Being the SIMD fanatic that I am, a few years ago I did the PowerPC Altivec and ARM NEON port for the Eigen linear algebra library, one of the best and most popular libraries -and most ported.

Recently I thought it would be a good idea to extend both ports to 64-bit, and it would also help me with the SIMD book, using VSX in one case and ARMv8 NEON (or Advanced SIMD as ARM likes to call it) in the latter. ARMv8 hardware is a bit scarce at the moment, so I thought I'd start with VSX. Being in Debian, I have access to a number of porterboxes in several architectures, and luckily one of those was a Power7 (with VSX) running ppc64. So I started the porting -or rather extending the code- to use VSX in the 64-bit doubles case. Unluckily, I could not test anything because Debian kernels do not have VSX enabled in wheezy -which is what the porterbox is running and enabling it is a non-option(#758620). So, running VSX code would turn out to be quite hard.

SIMD book, "Sponsored by ARM"!

Ok, took a while but I got the final word about this and can announce that the sponsor who donated 500 EUR to the Indiegogo campaign was ARM itself! I have to thank my friends at ARM@Cambridge and especially Dr Monika Biddulph, General Manager, Partner Enablement Group at ARM. When the book goes to print you can be sure it will include "Sponsored by ARM" somewhere! :)

Also a friendly reminder that even if the campaign is over, I still welcome the support in the form of preorders/sponsorships.

SIMD book, update 3! Addition/Subtraction mostly finished

Finally. Apologies for the delay, but it's been a busy month. This time I will hold true to my word and upload a PDF for people to see (attached to this page).

So, what's new? Here is a list of things done:

* Finished ALL addition/subtraction related instructions for all engines and major derivatives (SSE*/AVX, VMX/VSX, NEON/armv8 NEON). With diagrams (these were the reasons it has taken so long).
* Reorganized the structure (split the book into Parts I/II, the instruction index will be in Part II, Part I will carry the design analysis of each SIMD engine.
* Added an TOC/index.
* So far, with just Addition/Subtraction Chapter and the rest empty sections, it has reached 175 pages (B5, again I'm not fixed on the size, it might actually change)! My estimate is that the whole book will surpass 800 pages with everything included.


SIMD book, second update!

From the Indiegogo page:


  • Added titlepage (simple, but will do the job)
  • Reorganized ALL instructions to include both unsigned/signed in the same entity
  • Added Saturated Addition, Subtraction and Saturated Subtraction
  • Added ARMv8 NEON instructions taken from ARM infocenter draft
  • Fixed some instructions (added 64-bit arithmetic for NEON)
  • Added some special addition/subtraction, like add/sub with carry(vmx/vsx), addsub(SSE3/AVX)
  • Added some in-vector sum additions, sum reductions but no descriptions yet
  • Added diagram for 8-bit addition/subtraction (still need lots more).
  • Removed VMX128, couldn't find enough information, an email to some IBM toolchain developers was left unanswered, so I guess noone really will really care if that engine is left out, if enough people insist on it, please also be kind enough to provide some documentation on it.

SIMD book, first draft published!

Check activity here:


From the update:

Ok, I've been busy the past days, I started writing the book (using LaTeX :), and I'd like to say that progress has been good. I fixed the current list of SIMD engines that I'm going to include and it's a long one:

Crowdfunded campaign for a SIMD Comparison Reference Book on Indiegogo!

This is my first ever attempt to do a project using crowdfunding. The project is something I've always wanted to do, a SIMD Engines Comparison Reference, that is a reference book that compares all three major SIMD engines (Altivec, NEON, SSE*). If, like me, you're tired of googling all the time to find information about a specific SSE instruction and how to do addition/multiplication/shuffling/etc on more than one SIMD engine, please help fund this project!

Powerbook G4 12" revamping, part 1

I decided to give my trusty powerbook G4 a second chance. But I thought it might be a good idea to upgrade some parts of it in the meantime. Now being as it is, I can't upgrade the CPU or RAM (G4 is fixed at 1Ghz and RAM at 1.2GB), but I could upgrade the disk and screen. This time I upgraded the disk plus I replaced the thermal toothpaste with something much more efficient so it wouldn't get as hot.

I won't go into the actual details of doing the upgrades, these are covered by the excellent ifixit.com articles:


Subscribe to Front page feed