- License: New BSD License
- Copyright (c) 2012, Piotr Tarsa
##Project description
This project contains implementations of a single compression algorithm in different languages and performance measurements of those implementations under available execution environments.
##Project goal
The goal is to evaluate programming platforms usefullness when it comes to develop algorithms that are heavy on computations. Ideally, a programming platform should have comparatively low resource usage and execute programs quickly while hiding complexities related to the programming platform and allowing the programmer to focus only on the problem that he was originally trying to solve. Usually, comparisons between execution environments for different programming languages are done using microbenchmarks. Unfortunately those microbenchmarks are too simple to draw any sensible conclusion. This project provides several implementations for a single nontrivial problem and focuses on side effects (ie produced output) while providing freedom to implementation details.
The requirements for any implementation are:
- Each implementation must produce identical output when fed with identical input and compression options
- Each implementation should expose all compression options that the format defines and allow for adjustment of that options at encoding phase
- Languages mixing is not allowed, eg mixing Python code with non-standard C modules
- There must be no side effects other than the output, ie the decompressed stream when decompressing, compressed stream when compressing and compression options when checking them
Parallelization is allowed.
##Performance results
All tests are performed on following system:
- CPU: Intel Core 2 Duo E8400 @ 3.00 GHz (stock),
- RAM: 8 GiB A-Data DDR3 RAM @ CL5, 1000 MHz
- OS: Ubuntu 12.04 64-bit
Encoding speed on [enwik8] 1 (command line interface implementations) - revision [53a85ed] 2:
Language | Real time | User time | Sys time | Execution environment | Notes |
---|---|---|---|---|---|
C | 7.473s | 7.165s | 0.276s | GNU GCC 4.6.3 | Three runs average, SSE optimizations (vectors and prefetching) |
C | 12.734s | 12.437s | 0.279s | GNU GCC 4.6.3 | Three runs average, no SSE optimizations |
Java | 20.587s | 20.326s | 0.280s | Oracle JDK 7 update 15 | Three runs average |
Python | 60.346s | 59.907s | 0.367s | ShedSkin 0.9.3 + GNU GCC 4.6.3 | Three runs average |
Python | 139.983s | 132.384s | 6.339s | PyPy 1.9.0 | Three runs average |
Python | 1806.230s | 1804.573s | 0.436s | CPython 2.7.3 | Single run, -OO |
Encoding speed on [enwik8] 1 (JavaScript in browsers) - revision [53a85ed] 2:
Browser | Reported time | Browser version | Notes |
---|---|---|---|
Chrome | 47.136s | 25.0.1364.97 | Three runs average |
Opera | 137.397s | 12.11 (build 1661) | Three runs average |
Firefox | 153.289s | 19.0 | Three runs average |
Encoding speed on [enwik9] 1 (command line interface implementations) - revision [53a85ed] 2:
Language | Real time | User time | Sys time | Execution environment | Notes |
---|---|---|---|---|---|
C | 67.853s | 66.129s | 0.764s | GNU GCC 4.6.3 | Three runs average, SSE optimizations (vectors and prefetching) |
C | 114.255s | 112.308s | 0.757s | GNU GCC 4.6.3 | Three runs average, no SSE optimizations |
Java | 189.163s | 186.413s | 0.845s | Oracle JDK 7 update 15 | Three runs average |
Python | 545.170s | 540.163s | 1.023s | ShedSkin 0.9.3 + GNU GCC 4.6.3 | Three runs average |
Python | 1086.181s | 1069.226s | 6.840s | PyPy 1.9.0 | Three runs average |
Python | 16799.145s | 16789.525s | 3.668s | CPython 2.7.3 | Single run, -OO |
Encoding speed on [enwik9] 1 (JavaScript in browsers) - revision [53a85ed] 2:
Browser | Reported time | Browser version | Notes |
---|---|---|---|
Chrome | 422.509s | 25.0.1364.97 | Three runs average |
Opera | 1287.090s | 12.11 (build 1661) | Three runs average |
Firefox | 1434.559s | 19.0 | Three runs average |