Creating Docs folder and improving README. #131

Merged: 20 commits, Nov 18, 2024
76 changes: 67 additions & 9 deletions README.md
@@ -1,5 +1,30 @@
# ATSC - Advanced Time Series Compressor

**NOTE:** This is still under development. Current status is unsupported!

## Table of Contents

1. [TL;DR](#tldr)
2. [Documentation](#documentation)
3. [Building ATSC](#building-atsc)
4. [What is ATSC?](#what-is-atsc)
5. [Where does ATSC fit?](#where-does-atsc-fit)
6. [ATSC Usage](#atsc-usage)
7. [Releases](#releases)
8. [Roadmap](#roadmap)

## TL;DR

The fastest way to test ATSC is with a CSV file!

1. Download the latest [release](https://github.com/instaclustr/atsc/releases)
2. Get a CSV file with the proper format (or get one from [tests folder](https://github.com/instaclustr/atsc/tree/main/atsc/tests/csv))
3. Run it

```bash
cargo run --release -- --csv <input-file>
```
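
A minimal input file might look like the fragment below. This is illustrative only: the default field names are `time,value` (per the `--fields` option), but the actual timestamp and value formats should be checked against the sample files in the tests folder linked above.

```csv
time,value
1,20.5
2,20.7
3,20.6
```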

## Documentation

For full documentation please go to [Docs](https://github.com/instaclustr/atsc/tree/main/docs)
@@ -22,11 +47,16 @@
## What is ATSC?

Advanced Time Series Compressor (ATSC) is a configurable, *lossy* compressor that uses the characteristics of a time series to create a function approximation of it.

This way, ATSC only needs to store the parametrization of the function, not the data itself.

ATSC draws inspiration from established compression and signal analysis techniques, achieving significant compression ratios.

In internal testing, ATSC compressed the time series of our databases by factors of 46x to 880x, with a fitting error within 1% of the original time series.

In some cases, ATSC produces highly compressed data without any data loss (perfectly fitting functions).
ATSC is meant for long-term storage of time series, as it benefits from having more points to achieve a better fit.

Decompression is much faster than compression (up to 40x), as data is expected to be compressed once and decompressed several times.
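
The core idea of storing a function's parameters instead of the samples can be sketched as follows. This is a deliberately simplified illustration in Python using a straight-line fit, not ATSC's actual algorithm; all function and variable names are invented for the example:

```python
def compress_linear(ys):
    """'Compress' a series to two parameters (slope, intercept) via least squares."""
    n = len(ys)
    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept  # the whole "compressed" representation

def decompress_linear(params, n):
    """Rebuild n samples from the stored parametrization."""
    slope, intercept = params
    return [intercept + slope * x for x in range(n)]

# A slowly drifting series: 1000 points stored as just 2 numbers.
series = [20.0 + 0.01 * t for t in range(1000)]
params = compress_linear(series)
restored = decompress_linear(params, len(series))
max_rel_err = max(abs(a - b) / abs(a) for a, b in zip(series, restored))
```

For a series the function fits perfectly, as here, the round trip is effectively lossless; in general, the fitting error is what the `--error` option bounds.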

Internally ATSC uses the following methods for time series fitting:
@@ -36,11 +66,11 @@
* Interpolation - Catmull-Rom
* Interpolation - Inverse Distance Weight
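
To give a flavour of the interpolation-based methods, here is a generic, textbook inverse-distance-weighting interpolator in Python. This is not ATSC's implementation, and the names are illustrative:

```python
def idw_interpolate(known, x, power=2):
    """Inverse Distance Weighting: estimate the value at x from (position, value) samples."""
    num = 0.0
    den = 0.0
    for xi, yi in known:
        d = abs(x - xi)
        if d == 0:
            return yi  # x coincides with a stored sample
        w = 1.0 / d ** power  # closer samples weigh more
        num += w * yi
        den += w
    return num / den

# Reconstruct a point between two stored samples; at the midpoint the weights are equal.
samples = [(0, 10.0), (10, 20.0)]
midpoint = idw_interpolate(samples, 5)
```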

For a more detailed insight into ATSC read the paper here: [ATSC - A novel approach to time-series compression](https://some.url.com)
For a more detailed insight into ATSC read the paper here: [ATSC - A novel approach to time-series compression](https://github.com/instaclustr/atsc/tree/main/paper/ATCS-AdvancedTimeSeriesCompressor.pdf)
**Contributor:** It seems that the PDF is not available for now

**Collaborator (author):** Good catch!

**Collaborator (author):** Ok, it will be valid once we merge. It is pointing to the main branch.


Currently, ATSC uses an internal format to process time series (WBRO) and outputs a compressed format (BRO). A CSV to WBRO format is available here: [CSV Compressor](https://github.com/instaclustr/atsc/tree/main/csv-compressor)

## Where does ATSC fits?
## Where does ATSC fit?

ATSC fits anywhere that space savings are worth a trade-off in precision.
ATSC is to time series what JPG/MP3 is to image/audio.
@@ -53,7 +83,7 @@
Examples of use cases:
* Long, slow-moving data series (e.g. weather data). These will most probably follow an easy-to-fit pattern.
* Data that is meant to be visualized by humans rather than machine-processed (e.g. by operations teams). With such a small error, under 1%, it shouldn't impact analysis.

## Usage ATSC
## ATSC Usage

### Prerequisites

@@ -67,22 +97,38 @@
Those files would work as input for the compressor.

Compressor usage:

```bash
```txt
Usage: atsc [OPTIONS] <INPUT>

Arguments:
<INPUT> input file

--compressor <COMPRESSOR>
Select a compressor, default is auto [default: auto] [possible values: auto, noop, fft, constant, polynomial, idw]
-e, --error <ERROR>
Sets the maximum allowed error for the compressed data, must be between 0 and 50. Default is 5 (5%). 0 is lossless compression 50 will do a median filter on the data. In between will pick optimize for the error [default: 5]
Sets the maximum allowed error for the compressed data; must be between 0 and 50. Default is 5 (5%).
0 is lossless compression.
50 will apply a median filter to the data.
Values in between optimize for the given error [default: 5]
-u
Uncompresses the input file/directory
-c, --compression-selection-sample-level <COMPRESSION_SELECTION_SAMPLE_LEVEL>
Samples the input data instead of using all the data for selecting the optimal compressor. Only impacts speed, might or not increased compression ratio. For best results use 0 (default). Only works when compression = Auto. 0 will use all the data (slowest) 6 will sample 128 data points (fastest) [default: 0]
Samples the input data instead of using all the data for selecting the optimal compressor.
Only impacts speed; it may or may not increase the compression ratio. For best results use 0 (default).
Only works when compression = Auto.
0 will use all the data (slowest)
6 will sample 128 data points (fastest) [default: 0]
--verbose
Verbose output, dumps every sample in the input file (for compression) and in the output file (for decompression)
--csv
Defines user input as a CSV file
--no-header
Defines if the CSV has no header
--fields <FIELDS>
Defines the names of the fields in the CSV file. It should follow this format:
--fields=TIME_FIELD_NAME,VALUE_FIELD_NAME
The name before the comma is the time field; the name after it is the value field. [default: time,value]
-h, --help
Print help
-V, --version
@@ -97,15 +143,27 @@
To compress a file using ATSC, run:

```bash
atsc <input-file>
```

### Decompress a File
#### Decompress a File

To decompress a file, use:

```bash
atsc -u <input-file>
```
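
After a round trip (compress, then `atsc -u`), the configured error bound can be sanity-checked by comparing the restored values against the originals. A hypothetical checker in Python (ATSC does not ship this helper, and the sample values below are made up):

```python
def max_relative_error(original, restored):
    """Largest per-sample relative deviation, in percent."""
    worst = 0.0
    for a, b in zip(original, restored):
        if a != 0:
            worst = max(worst, abs(a - b) / abs(a) * 100)
    return worst

# Hypothetical values before compression and after decompression.
original = [100.0, 102.0, 98.0, 101.0]
restored = [99.0, 103.0, 98.5, 100.0]
worst = max_relative_error(original, restored)  # about 1%, within the default 5% bound
```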

## Releases

### v0.5 - 30/11/2023

* Added Polynomial Compressor (with 2 variants)
* Created and Integrated a proper file type (wbro)
* Benchmarks of the different compressors
* Integration testing
* Several fixes and cleanups

## Roadmap

* Frame expansion (Allowing new data to be appended to existing frames)
* Dynamic function loading (e.g. providing more functions without touching the whole code base)
* Global/Per frame error storage
* Efficient error encoding
156 changes: 102 additions & 54 deletions atsc/demo/run_demo.sh
@@ -1,58 +1,106 @@
#!/bin/bash
infilename=$1

echo "Original Size: "
du -sb $infilename

for i in 1 3;
do
echo "### Error Level: $i";
mfile="comparison-error-$i.m"

cp $infilename tmp.wbro

../../target/debug/atsc --compressor fft --error $i --verbose tmp.wbro > $mfile
echo "FFT Size: "
du -sb tmp.bro
../../target/debug/atsc -u --verbose tmp.bro >> $mfile

sed -i -e 's/Output/output_fft/g' $mfile

cp $infilename tmp.wbro

../../target/debug/atsc --compressor idw --error $i tmp.wbro > /dev/null
echo "IDW Size: "
du -sb tmp.bro
../../target/debug/atsc -u --verbose tmp.bro >> $mfile

sed -i -e 's/Output/output_idw/g' $mfile

cp $infilename tmp.wbro

../../target/debug/atsc --compressor polynomial --error $i tmp.wbro > /dev/null
echo "Polynomial Size: "
du -sb tmp.bro
../../target/debug/atsc -u --verbose tmp.bro >> $mfile

sed -i -e 's/Output/output_poly/g' $mfile

cp $infilename tmp.wbro

../../target/debug/atsc --error $i tmp.wbro > /dev/null
echo "Auto Size: "
du -sb tmp.bro
../../target/debug/atsc -u --verbose tmp.bro >> $mfile

sed -i -e 's/Output/output_auto/g' $mfile

echo "hold on;" >> $mfile
echo "plot(Input,'g+', output_fft,'r', output_auto, 'b', output_poly, 'k')" >> $mfile
echo "plot(Input.*((100+$i)/100), 'color','#D95319');" >> $mfile
echo "plot(Input.*((100-$i)/100), 'color','#D95319');" >> $mfile
echo "legend('Data','FFT Compression', 'Auto Compression', 'Poly compression', 'Upper Error', 'Lower Error')" >> $mfile
echo "print -dpng comparison-$i.png" >> $mfile

done

rm tmp.wbro
rm tmp.bro
for i in 1 3; do
echo "### Error Level: $i"
htmlfile="comparison-error-$i.html"
cp $infilename tmp.wbro
../../target/debug/atsc --compressor fft --error $i --verbose tmp.wbro > input.txt
echo "FFT Size: "
du -sb tmp.bro
../../target/debug/atsc -u --verbose tmp.bro > tmp_fft.txt
cp $infilename tmp.wbro
../../target/debug/atsc --compressor idw --error $i tmp.wbro > /dev/null
echo "IDW Size: "
du -sb tmp.bro
../../target/debug/atsc -u --verbose tmp.bro > tmp_idw.txt
cp $infilename tmp.wbro
../../target/debug/atsc --compressor polynomial --error $i tmp.wbro > /dev/null
echo "Polynomial Size: "
du -sb tmp.bro
../../target/debug/atsc -u --verbose tmp.bro > tmp_poly.txt

# Create HTML file
echo "<!DOCTYPE html>" > $htmlfile
echo "<html lang=\"en\">" >> $htmlfile
echo "<head>" >> $htmlfile
echo "<meta charset=\"UTF-8\">" >> $htmlfile
echo "<meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\">" >> $htmlfile
echo "<title>Comparison Error Level $i</title>" >> $htmlfile
echo "<script src=\"https://cdn.jsdelivr.net/npm/chart.js\"></script>" >> $htmlfile
echo "<script src=\"https://cdn.jsdelivr.net/npm/chartjs-plugin-zoom\"></script>" >> $htmlfile
echo "</head>" >> $htmlfile
echo "<body>" >> $htmlfile
echo "<canvas id=\"myChart\" width=\"400\" height=\"200\"></canvas>" >> $htmlfile
echo "<script>" >> $htmlfile

# Read data from tmp files and convert to JavaScript arrays

file_content=$(<input.txt)
array_data=$(echo $file_content | grep -oP '\[.*?\]')
js_array="const inputData = $array_data;"
echo "$js_array" >> $htmlfile

file_content=$(<tmp_fft.txt)
array_data=$(echo $file_content | grep -oP '\[.*?\]')
js_array="const fftData = $array_data;"
echo "$js_array" >> $htmlfile

file_content=$(<tmp_idw.txt)
array_data=$(echo $file_content | grep -oP '\[.*?\]')
js_array="const idwData = $array_data;"
echo "$js_array" >> $htmlfile

file_content=$(<tmp_poly.txt)
array_data=$(echo $file_content | grep -oP '\[.*?\]')
js_array="const polyData = $array_data;"
echo "$js_array" >> $htmlfile

# JavaScript code to create the chart
echo "const ctx = document.getElementById('myChart').getContext('2d');" >> $htmlfile
echo "const myChart = new Chart(ctx, {" >> $htmlfile
echo " type: 'line'," >> $htmlfile
echo " data: {" >> $htmlfile
echo " labels: Array.from({length: inputData.length}, (_, i) => i + 1)," >> $htmlfile
echo " datasets: [" >> $htmlfile
echo " { label: 'Data', data: inputData, borderColor: 'green', borderWidth: 1 }," >> $htmlfile
echo " { label: 'FFT Compression', data: fftData, borderColor: 'red', borderWidth: 1 }," >> $htmlfile
echo " { label: 'IDW Compression', data: idwData, borderColor: 'blue', borderWidth: 1 }," >> $htmlfile
echo " { label: 'Poly Compression', data: polyData, borderColor: 'black', borderWidth: 1 }" >> $htmlfile
echo " ]" >> $htmlfile
echo " }," >> $htmlfile
echo " options: {" >> $htmlfile
echo " scales: {" >> $htmlfile
echo " y: { beginAtZero: true }" >> $htmlfile
echo " }," >> $htmlfile
echo " plugins: {" >> $htmlfile
echo " zoom: {" >> $htmlfile
echo " pan: {" >> $htmlfile
echo " enabled: true," >> $htmlfile
echo " mode: 'xy'" >> $htmlfile
echo " }," >> $htmlfile
echo " zoom: {" >> $htmlfile
echo " wheel: {" >> $htmlfile
echo " enabled: true" >> $htmlfile
echo " }," >> $htmlfile
echo " pinch: {" >> $htmlfile
echo " enabled: true" >> $htmlfile
echo " }," >> $htmlfile
echo " mode: 'xy'" >> $htmlfile
echo " }" >> $htmlfile
echo " }" >> $htmlfile
echo " }" >> $htmlfile
echo " }" >> $htmlfile
echo "});" >> $htmlfile
echo "</script>" >> $htmlfile
echo "</body>" >> $htmlfile
echo "</html>" >> $htmlfile

rm tmp.wbro
rm tmp.bro
rm tmp_fft.txt
rm tmp_idw.txt
rm tmp_poly.txt
rm input.txt
done