freevec.org


memcmp()

markos — Thu, 06/03/2008 - 13:57

Description

According to the man page, the memcmp() function compares the first n bytes of the memory areas s1 and s2. It returns an integer less than, equal to, or greater than zero if the first n bytes of s1 are found, respectively, to be less than, to match, or to be greater than the first n bytes of s2.
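As a small illustration of that contract, the sketch below wraps memcmp() in a helper that collapses the result to -1, 0 or +1. The helper name memcmp_sign is hypothetical and not part of libc or libfreevec; it only relies on the documented guarantee that the sign of the result is meaningful, while the magnitude is not.

```c
#include <string.h>

/* Hypothetical helper (not a libc or libfreevec function):
   collapse memcmp()'s result to -1, 0 or +1. Only the sign of the
   return value is specified, so callers should not rely on its size. */
static int memcmp_sign(const void *s1, const void *s2, size_t n)
{
    int r = memcmp(s1, s2, n);
    return (r > 0) - (r < 0);
}
```

Note that memcmp() compares bytes as unsigned char, so a byte of 0xff sorts after 0x01 regardless of whether plain char is signed on the platform.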

memcmp() is one of the few functions that are assembly-optimised for most architectures. It is definitely optimised for x86 and x86_64, but sadly not for PowerPC: glibc offers a POWER4-only optimised version, and the rest of the PowerPC subarchitectures do not use an optimised one. In any case, the glibc implementation compares 32-bit (or 64-bit, depending on the architecture) blocks where possible. libfreevec does the same, but it also uses modern SIMD units (AltiVec for PowerPC CPUs and, in the future, SSE for x86 CPUs). As a result, performance is the same for smaller sizes but increases dramatically for larger ones.
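To make the block-wise idea concrete, here is a minimal portable sketch of comparing one machine word at a time and falling back to bytes. It is not the glibc or libfreevec code; the function name wordwise_memcmp and the choice of uintptr_t as the word type are assumptions made for illustration. A SIMD version would follow the same shape with 16-byte vector loads in place of the word loads.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch, not the glibc or libfreevec implementation. */
static int wordwise_memcmp(const void *s1, const void *s2, size_t n)
{
    const unsigned char *a = s1;
    const unsigned char *b = s2;

    /* Compare a full machine word per iteration while one remains.
       memcpy() into locals sidesteps alignment and aliasing issues;
       compilers typically reduce it to a single load. */
    while (n >= sizeof(uintptr_t)) {
        uintptr_t wa, wb;
        memcpy(&wa, a, sizeof wa);
        memcpy(&wb, b, sizeof wb);
        if (wa != wb)
            break;              /* the mismatch lies inside this word */
        a += sizeof wa;
        b += sizeof wb;
        n -= sizeof wa;
    }

    /* Byte loop: finds the first differing byte of a mismatching word
       and handles any tail shorter than a word. */
    while (n--) {
        if (*a != *b)
            return (*a < *b) ? -1 : 1;
        a++;
        b++;
    }
    return 0;
}
```

Locating the differing byte with a byte loop keeps the sketch independent of endianness; optimised implementations usually do that step with bit tricks instead.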

Each CPU in detail:

And for comparison, here is the result of the same benchmark run on an Athlon X2 5000 (2.5 GHz), running 32-bit code:

Results/Comments

The Athlon has a much faster FSB (800 MHz vs 533 MHz for the G5 and the MPC8610), which explains its superior performance in memcmp(). For aligned comparisons, however, the G5 is actually faster than the Athlon; for some reason, the G5's memcmp() performance is almost halved when comparing unaligned blocks. Sadly, performance for unaligned blocks drops on the MPC8610 as well. What is more amazing is the excellent performance of the Athlon X2 at very small sizes.
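For readers who want to try the aligned versus unaligned comparison themselves, the following is a rough sketch of how such a measurement could be set up. It is not the harness used for the results on this page; the function name time_memcmp, the 16-byte alignment boundary and the use of clock() are all assumptions chosen for illustration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper, not the benchmark used on this page.
   Times `iters` memcmp() calls on two equal buffers of `len` bytes.
   The first buffer starts on a 16-byte boundary; the second starts
   `offset` bytes past one. Returns elapsed CPU seconds, or -1.0 if
   allocation fails. */
static double time_memcmp(size_t len, size_t offset, long iters)
{
    unsigned char *raw_a = malloc(len + 16);
    unsigned char *raw_b = malloc(len + 16 + offset);
    if (!raw_a || !raw_b) {
        free(raw_a);
        free(raw_b);
        return -1.0;
    }

    /* Round each pointer up to the next 16-byte boundary. */
    unsigned char *a = raw_a + (16 - (uintptr_t)raw_a % 16) % 16;
    unsigned char *b = raw_b + (16 - (uintptr_t)raw_b % 16) % 16 + offset;
    memset(a, 0x5a, len);
    memset(b, 0x5a, len);

    /* Calling through a volatile pointer keeps the compiler from
       hoisting or eliminating the repeated calls. */
    int (*volatile cmp)(const void *, const void *, size_t) = memcmp;
    volatile int sink = 0;

    clock_t start = clock();
    for (long i = 0; i < iters; i++)
        sink |= cmp(a, b, len);
    clock_t end = clock();

    free(raw_a);
    free(raw_b);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling it once with offset 0 and once with offset 1 for the same len gives a crude aligned versus unaligned comparison; equal buffers are used so that every call scans the full length.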


Copyright (c)2008 by CODEX.
All Google charts have been created by the CSV Chart and Chart API Drupal modules.