Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSV compressor implementation #94

Merged
merged 4 commits into from
Aug 1, 2024
Merged

CSV compressor implementation #94

merged 4 commits into from
Aug 1, 2024

Conversation

worryg0d
Copy link
Collaborator

@worryg0d worryg0d commented Jul 1, 2024

Overview

This PR adds csv-compressor cli tool which utilizes Brro Compressor functionality to compress CSV formatted metrics data.
All parsed data from CSV is stored in a single WavBrro and Vsri for solution simplification.

Compress flow

  1. Read Samples from CSV reader (file, buffer, etc.)
  2. Transform samples into WavBrro and generate VSRI
  3. Compress generated WavBrro via Brro Compressor
  4. Output both compressed data (.bro) and indexes (.vsri)

Currently, it only parses CSV data of the following structure:

timestamp value
000001 0.01
000006 0.05

Decompress flow

  1. Read compressed data
  2. Decompress via Brro Compressor
  3. Generate samples from decompressed data and indexes by iterating over decompressed data and retrieving timestamps from VSRI for each data point
  4. Output generated samples as CSV

@worryg0d worryg0d added the enhancement New feature or request label Jul 1, 2024
@worryg0d worryg0d self-assigned this Jul 1, 2024
@worryg0d worryg0d marked this pull request as draft July 1, 2024 07:55
@worryg0d worryg0d changed the title [WIP] CSV compressor implementation CSV compressor implementation Jul 8, 2024
@worryg0d worryg0d marked this pull request as ready for review July 8, 2024 06:04
Copy link
Collaborator

@cjrolo cjrolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this!

For review pass I would think we are wasting a good opportunity to kill prometheus-remote. It served its purpose, but CSV integration will be easier to use. For this lib_vsri.rs this to be moved into it's own crate (and we need a different PR for that).

Also remove any IDE artifacts.

Good work, thanks!

.idea/vcs.xml Outdated Show resolved Hide resolved
prometheus-remote/src/lib.rs Outdated Show resolved Hide resolved
@worryg0d
Copy link
Collaborator Author

Hello @cjrolo. Thanks a lot for your feedback!

I like the idea of creating a new crate for VSRI lib. I'll handle this one. Also, I'll update this PR after VSRI crate is merged.

@worryg0d worryg0d mentioned this pull request Jul 18, 2024
@worryg0d worryg0d force-pushed the csv-compressor branch 5 times, most recently from 416f594 to 00dfec8 Compare July 26, 2024 11:27
@cjrolo
Copy link
Collaborator

cjrolo commented Jul 29, 2024

@rukai Can you please review this?

Thanks!

Copy link
Contributor

@rukai rukai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved to avoid blocking with timezones, please address the feedback before merging.

csv-compressor/src/main.rs Outdated Show resolved Hide resolved
csv-compressor/src/main.rs Show resolved Hide resolved
csv-compressor/src/main.rs Outdated Show resolved Hide resolved
csv-compressor/src/main.rs Outdated Show resolved Hide resolved
csv-compressor/src/main.rs Outdated Show resolved Hide resolved
csv-compressor/src/metric.rs Outdated Show resolved Hide resolved
csv-compressor/src/metric.rs Outdated Show resolved Hide resolved
csv-compressor/src/metric.rs Outdated Show resolved Hide resolved
csv-compressor/src/metric.rs Outdated Show resolved Hide resolved
csv-compressor/src/metric.rs Outdated Show resolved Hide resolved
csv-compressor/src/csv.rs Outdated Show resolved Hide resolved
@worryg0d
Copy link
Collaborator Author

Hey @rukai, thanks a lot for your feedback!
I usually work with Go so your reviews are a good chance to improve my knowledge of Rust and coding skills!
Thanks!

@worryg0d worryg0d force-pushed the csv-compressor branch 2 times, most recently from 7a4713e to 0310eba Compare July 31, 2024 08:35
@worryg0d
Copy link
Collaborator Author

Oof, I finally did it. This GitHub linter makes me crazy because of those indents!

@rukai, could you please look at this one again? I've updated the PR following your suggestions. I hope it looks a bit better now.

Thanks!

@rukai
Copy link
Contributor

rukai commented Jul 31, 2024

I'll take a proper look tomorrow since I'm finished for the day, but just so you know you can run cargo fmt to automatically apply the required formatting.

log = "0.4.19"
env_logger = "0.10.0"
vsri = { version = "0.1.0", path = "../vsri" }
tempdir = "0.3.7"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is only used for tests it should be included as a dev-dependency so it does not need to be built when building just csv-compressor without the tests.

Suggested change
tempdir = "0.3.7"
[dev-dependencies]
tempdir = "0.3.7"

@rukai rukai merged commit 780ea7c into main Aug 1, 2024
2 checks passed
@rukai
Copy link
Contributor

rukai commented Aug 1, 2024

I made tempdir into a dev-dependency and merged the PR.
Nice work!
Moving file IO into the read/write methods is a good idea 👍

@worryg0d
Copy link
Collaborator Author

worryg0d commented Aug 1, 2024

@rukai Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants