freevec.org


AltiVec


libfreevec is not dead!

markos — Wed, 09/02/2011 - 16:15

Lately, I've been receiving emails asking whether the project is dead. I understand: it's been a while since my last post, there has been no release in a long time, and the last one (1.0.4) is offline because it has bugs that I was unable to fix at the time.

The reason is that I've been working on libfreevec v2. This one is a complete rewrite, supporting all three of the major SIMD engines (SSE, AltiVec, ARM NEON). It will take a while to develop, but it will get there. Of course, due to my current job at Genesi USA and my current work on a new Debian port for ARM (armhf), priority will be given to the ARM NEON port, then AltiVec, and finally SSE, time permitting.

Konstantinos Margaritis

  • AltiVec
  • libfreevec
  • NEON
  • SSE

Yellow Dog Linux 6.2 includes libfreevec!

markos — Fri, 10/07/2009 - 13:26

Here's the link to the announcement:

http://lists.fixstars.com/pipermail/yellowdog-announce/2009-June/000214.html

From the press release:

Quote:

YDL 6.2 now offers libfreevec, a (LGPL) library with replacement routines for GLIBC, such as memcpy(), strlen(), etc. These routines, which have been rewritten and optimized to use the AltiVec vector engine found in the G4/G4+ PowerPC CPUs, can provide for up to 25% increase in application performance.

  • AltiVec
  • libfreevec

32-bit *signed* integer multiplication with AltiVec

markos — Sat, 23/08/2008 - 21:55

While completing Eigen2 AltiVec support (it should be almost complete now), I noticed that 32-bit integer multiplication didn't always work correctly. As AltiVec does not include an instruction for 32-bit integer multiplication, I used Apple's routine from the Apple Developer site. But this didn't work, and some results were totally off. With some debugging, I found out that this routine works for unsigned 32-bit integers, whereas Eigen2 uses signed integers! So I had to search further and, to my surprise, I found no reference to any similar work. That left me with two choices: a) ditch AltiVec integer vectorisation from Eigen2 (not acceptable!) or b) implement my own method! It is obvious which choice I followed :)
UPDATE: Thanks to Matt Sealey, who noticed that I could have used vec_abs() instead of vec_sub() and vec_max(). Duh! :D
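
The full routine is not reproduced in this summary, so what follows is only a scalar sketch of the general idea, not the actual AltiVec code. It assumes the approach hinted at by the vec_abs() remark: take the absolute values of both operands, build the low 32 bits of the product from 16-bit partial products (classic AltiVec only multiplies 16-bit elements, for example with vec_mulo()), and then restore the sign. The names mul32_from_halves() and mul32_signed() are made up for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Low 32 bits of an unsigned 32x32 product, built only from
   16x16 partial products (hypothetical helper, for illustration). */
static uint32_t mul32_from_halves(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    uint32_t low   = a_lo * b_lo;                /* bits 0..31 */
    uint32_t cross = a_hi * b_lo + a_lo * b_hi;  /* wraps mod 2^32, which is fine */

    /* a_hi * b_hi only affects bits 32 and up, so it is dropped. */
    return low + (cross << 16);
}

/* Signed variant: multiply the magnitudes, then restore the sign.
   The result wraps modulo 2^32 if the true product does not fit. */
static int32_t mul32_signed(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;

    /* Magnitudes computed in unsigned arithmetic, so INT32_MIN is safe. */
    uint32_t mag_a = (a < 0) ? 0u - ua : ua;
    uint32_t mag_b = (b < 0) ? 0u - ub : ub;

    uint32_t p = mul32_from_halves(mag_a, mag_b);

    /* The product is negative only if exactly one operand is negative. */
    if ((a < 0) != (b < 0))
        p = 0u - p;

    /* Copy the bit pattern back into a signed value. */
    int32_t r;
    memcpy(&r, &p, sizeof r);
    return r;
}
```

For example, mul32_signed(-3, 7) yields -21. In a vectorised version the same steps would be applied to four 32-bit lanes at once, with the sign fix-up done through a mask rather than a branch.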

  • Algebra
  • AltiVec
  • Code

Comments/Conclusions

markos — Thu, 21/08/2008 - 11:31

There are many comments to be made looking at the results:

  • First, the AltiVec unit is a very powerful SIMD engine which is totally underused throughout the OS (Linux, that is), apart from specific applications. In fact, its use in the kernel is strongly discouraged; whether rightly or not still remains to be proven.

  • It's a fact that the Athlon X2 is a faster CPU than every PowerPC CPU we tested, but still there are plenty of cases where it proved to be slower than its counterparts and that was not always the "fault" of AltiVec and libfreevec.

  • libmotovec behaves strangely and apart from strlen() is slower than libfreevec anyway. Still, it seems to be doing some really clever tricks and I intend to look at it closer in the future.

  • The MPC8610 is a very powerful CPU, and especially when we consider its specs (1.3GHz, 25W maximum power draw, WITHOUT the need for a northbridge) we see that the CPU is made to be a winner! Imagine a PowerPC netbook with an 8610, built-in display, fast RAM access and very low power consumption. We just have to remember that the Intel Atom, the Via Nano AND the AMD Athlon 64 2000 all require a northbridge chip which consumes too much power. However, from our own tests, the MPC8610 developer system consumes just 35W in total at idle, and 37W at full load!!! This is a complete system with many features not required for an end consumer product (e.g. a netbook). Coupled with the power of the AltiVec unit, it is just a question of time before someone produces (again) a consumer PowerPC system using the MPC8610.

  • Finally, with regard to glibc performance, even if we take into account that some common routines are optimised (like strlen(), memcpy(), memcmp() plus some more), most string functions are NOT optimised. Not only that, for those functions glibc only includes reference implementations that perform the operations one byte at a time! How's that for inefficient? We're not talking about dummy, unused joke functions like memfrob() here, but really important string and memory functions that are used pretty much everywhere, like strcmp(), strncmp(), strncpy(), etc.

  • At a time when power consumption has become so important, I would think that the first thing to do to save power is to optimise the software, and what better place to start than the core parts of an operating system? I can't speak for the kernel (though I'm sure it's actually very well optimised), but having looked at the glibc code extensively over the past years, I can say that it's grossly unoptimised, so much that it hurts.
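
To make the glibc point above a bit more concrete, here is a minimal sketch of the difference between a one-byte-at-a-time routine and one that handles a machine word per iteration. It is neither glibc's nor libfreevec's code; cmp_bytewise() and cmp_wordwise() are hypothetical names, and a memcmp()-style comparison is used only because a known length keeps the example simple.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reference style: one byte per loop iteration. */
static int cmp_bytewise(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a, *q = b;
    for (size_t i = 0; i < n; i++) {
        if (p[i] != q[i])
            return (p[i] < q[i]) ? -1 : 1;
    }
    return 0;
}

/* Word-at-a-time: skip over equal 8-byte chunks, then let the
   byte loop handle the first differing chunk or the tail. */
static int cmp_wordwise(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a, *q = b;
    size_t i = 0;

    for (; i + sizeof(uint64_t) <= n; i += sizeof(uint64_t)) {
        uint64_t x, y;
        memcpy(&x, p + i, sizeof x);  /* memcpy avoids unaligned-access issues */
        memcpy(&y, q + i, sizeof y);
        if (x != y)
            break;                    /* the difference is inside this chunk */
    }
    return cmp_bytewise(p + i, q + i, n - i);
}
```

A SIMD version follows the same pattern with 16-byte vectors instead of 8-byte words, which is where an engine like AltiVec can help.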

  • AltiVec
  • Benchmarking
  • libfreevec

libfreevec 1.0.4 benchmarks updated!

markos — Thu, 21/08/2008 - 11:23

Hello again,

I managed to find time to update all of the libfreevec benchmarks to the latest version, 1.0.4. I also included more complete tests and added a non-PPC architecture (an Athlon X2 5000 @2.6GHz), on which the same tests were run (as 32-bit apps on a 64-bit Linux) for comparison. This is important for two reasons:

  • to find how PowerPC CPUs compare to a current popular x86 CPU (the same benchmarks will be done on an Intel CPU soon)
  • to find any deficiencies in glibc itself (as you will see there are many).

All benchmarks were run on openSUSE 11.0, except for the G5, which uses Debian Lenny/testing. The compiler used was gcc 4.3.2. All functions have been tested to work correctly on each platform.

  • AltiVec
  • Benchmarking
  • libfreevec
Copyright (c)2008 by CODEX.