Reduce memory allocations during tag and metric serialization #294

Merged on Sep 18, 2024 (3 commits)

schlubbi (Contributor) commented Sep 10, 2024

During memory profiling we noticed a high number of allocations originating from the TagSerializer and the StatSerializer.
This change reduces the number of allocations on the happy path by returning early when the tag or metric names contain none of the edge cases that need extra processing.
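
As a rough illustration of the early-return idea described above (a minimal sketch, not the code from this PR): the method names, the constants, and the assumption that `|` and `,` are the characters to strip are all hypothetical.

```ruby
# Hypothetical sketch, not the library's implementation. The names and the
# set of reserved characters are assumptions made for illustration.
PIPE = '|'.freeze
COMMA = ','.freeze
RESERVED_TAG_CHARS = '|,'.freeze

# Slow path only: String#delete always builds a new String,
# even when there is nothing to remove.
def escape_always(tag)
  tag.to_s.delete(RESERVED_TAG_CHARS)
end

# Happy path first: hand the input back untouched when it contains
# no reserved characters, and only fall through to delete otherwise.
def escape_with_early_return(tag)
  tag = tag.to_s # returns self when tag is already a String
  return tag unless tag.include?(PIPE) || tag.include?(COMMA)

  tag.delete(RESERVED_TAG_CHARS)
end

clean = 'env:prod'
dirty = 'env:pr|od,'

escape_with_early_return(clean).equal?(clean) # true: same object returned
escape_always(clean).equal?(clean)            # false: a copy was built
escape_with_early_return(dirty)               # reserved characters removed
```

For a String with no reserved characters, the early-return version hands back the very same object, so producing the result costs no allocation. Only inputs that actually contain `|` or `,` pay for `String#delete`.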

The benchmark comparison below shows the effect.

Before (TagSerializer)
rspec spec/statsd/serialization/tag_serializer_spec.rb:219
Run options: include {:locations=>{"./spec/statsd/serialization/tag_serializer_spec.rb"=>[219]}}

Datadog::Statsd::Serialization::TagSerializer
  #format
    benchmark
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-linux]
Warming up --------------------------------------
             no tags     2.992M i/100ms
         global tags     2.999M i/100ms
          tags Array   161.844k i/100ms
           tags Hash    92.146k i/100ms
tags Array + global tags
                       127.560k i/100ms
tags Hash + global tags
                        94.333k i/100ms
Calculating -------------------------------------
             no tags     29.033M (± 4.4%) i/s -    146.589M in   5.059500s
         global tags     28.592M (± 4.0%) i/s -    143.975M in   5.044502s
          tags Array      1.690M (± 8.4%) i/s -      8.416M in   5.017852s
           tags Hash      1.083M (± 6.2%) i/s -      5.437M in   5.042581s
tags Array + global tags
                          1.222M (± 8.2%) i/s -      6.123M in   5.044381s
tags Hash + global tags
                        831.802k (±12.1%) i/s -      4.151M in   5.074196s

Comparison:
             no tags: 29033428.5 i/s
         global tags: 28592301.5 i/s - same-ish: difference falls within error
          tags Array:  1690469.5 i/s - 17.17x  slower
tags Array + global tags:  1222161.7 i/s - 23.76x  slower
           tags Hash:  1082883.0 i/s - 26.81x  slower
tags Hash + global tags:   831802.3 i/s - 34.90x  slower

      measure IPS
Calculating -------------------------------------
             no tags     0.000  memsize (     0.000  retained)
                         0.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
         global tags     0.000  memsize (     0.000  retained)
                         0.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
          tags Array   240.000  memsize (     0.000  retained)
                         5.000  objects (     0.000  retained)
                         4.000  strings (     0.000  retained)
           tags Hash   400.000  memsize (     0.000  retained)
                         8.000  objects (     0.000  retained)
                         4.000  strings (     0.000  retained)
tags Array + global tags
                       392.000  memsize (   200.000  retained)
                         6.000  objects (     5.000  retained)
                         4.000  strings (     3.000  retained)
tags Hash + global tags
                       552.000  memsize (   200.000  retained)
                         9.000  objects (     5.000  retained)
                         4.000  strings (     3.000  retained)

Comparison:
             no tags:          0 allocated
         global tags:          0 allocated - same
          tags Array:        240 allocated - Infx more
tags Array + global tags:        392 allocated - Infx more
           tags Hash:        400 allocated - Infx more
tags Hash + global tags:        552 allocated - Infx more
      measure memory

Finished in 42.34 seconds (files took 0.09172 seconds to load)
2 examples, 0 failures
After (TagSerializer)
rspec spec/statsd/serialization/tag_serializer_spec.rb:219
Run options: include {:locations=>{"./spec/statsd/serialization/tag_serializer_spec.rb"=>[219]}}

Datadog::Statsd::Serialization::TagSerializer
  #format
    benchmark
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-linux]
Warming up --------------------------------------
             no tags     3.014M i/100ms
         global tags     2.956M i/100ms
          tags Array   243.821k i/100ms
           tags Hash   124.236k i/100ms
tags Array + global tags
                       165.977k i/100ms
tags Hash + global tags
                        97.519k i/100ms
Calculating -------------------------------------
             no tags     29.620M (± 2.4%) i/s -    150.715M in   5.091350s
         global tags     29.984M (± 2.1%) i/s -    150.757M in   5.030275s
          tags Array      2.487M (± 2.1%) i/s -     12.435M in   5.003250s
           tags Hash      1.330M (± 2.5%) i/s -      6.709M in   5.046724s
tags Array + global tags
                          1.663M (± 3.7%) i/s -      8.465M in   5.097958s
tags Hash + global tags
                          1.050M (± 3.5%) i/s -      5.266M in   5.021279s

Comparison:
         global tags: 29984432.6 i/s
             no tags: 29620311.3 i/s - same-ish: difference falls within error
          tags Array:  2486521.6 i/s - 12.06x  slower
tags Array + global tags:  1662829.5 i/s - 18.03x  slower
           tags Hash:  1330273.7 i/s - 22.54x  slower
tags Hash + global tags:  1050157.0 i/s - 28.55x  slower

      measure IPS
Calculating -------------------------------------
             no tags     0.000  memsize (     0.000  retained)
                         0.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
         global tags     0.000  memsize (     0.000  retained)
                         0.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
          tags Array   120.000  memsize (     0.000  retained)
                         2.000  objects (     0.000  retained)
                         1.000  strings (     0.000  retained)
           tags Hash   280.000  memsize (     0.000  retained)
                         5.000  objects (     0.000  retained)
                         4.000  strings (     0.000  retained)
tags Array + global tags
                       272.000  memsize (    80.000  retained)
                         3.000  objects (     2.000  retained)
                         1.000  strings (     0.000  retained)
tags Hash + global tags
                       432.000  memsize (   240.000  retained)
                         6.000  objects (     5.000  retained)
                         4.000  strings (     3.000  retained)

Comparison:
             no tags:          0 allocated
         global tags:          0 allocated - same
          tags Array:        120 allocated - Infx more
tags Array + global tags:        272 allocated - Infx more
           tags Hash:        280 allocated - Infx more
tags Hash + global tags:        432 allocated - Infx more
      measure memory

Finished in 42.27 seconds (files took 0.09182 seconds to load)
2 examples, 0 failures
Before (StatSerializer)
rspec spec/statsd/serialization/stat_serializer_spec.rb:97
Run options: include {:locations=>{"./spec/statsd/serialization/stat_serializer_spec.rb"=>[97]}}

Datadog::Statsd::Serialization::StatSerializer
  benchmark
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-linux]
Warming up --------------------------------------
             no tags   241.660k i/100ms
no tags + sample rate
                       134.558k i/100ms
           with tags   101.980k i/100ms
with tags + sample rate
                        76.582k i/100ms
Calculating -------------------------------------
             no tags      2.444M (± 1.0%) i/s -     12.325M in   5.043100s
no tags + sample rate
                          1.361M (± 1.5%) i/s -      6.862M in   5.043675s
           with tags      1.020M (± 0.8%) i/s -      5.099M in   5.001806s
with tags + sample rate
                        763.286k (± 2.3%) i/s -      3.829M in   5.019571s

Comparison:
             no tags:  2444105.8 i/s
no tags + sample rate:  1360915.8 i/s - 1.80x  slower
           with tags:  1019507.1 i/s - 2.40x  slower
with tags + sample rate:   763286.0 i/s - 3.20x  slower

    measure IPS
Calculating -------------------------------------
             no tags   160.000  memsize (     0.000  retained)
                         4.000  objects (     0.000  retained)
                         3.000  strings (     0.000  retained)
no tags + sample rate
                       240.000  memsize (     0.000  retained)
                         5.000  objects (     0.000  retained)
                         4.000  strings (     0.000  retained)
           with tags   480.000  memsize (     0.000  retained)
                         8.000  objects (     0.000  retained)
                         7.000  strings (     0.000  retained)
with tags + sample rate
                       520.000  memsize (     0.000  retained)
                         9.000  objects (     0.000  retained)
                         8.000  strings (     0.000  retained)

Comparison:
             no tags:        160 allocated
no tags + sample rate:        240 allocated - 1.50x more
           with tags:        480 allocated - 3.00x more
with tags + sample rate:        520 allocated - 3.25x more
    measure memory

Finished in 28.13 seconds (files took 0.09106 seconds to load)
2 examples, 0 failures
After (StatSerializer)
rspec spec/statsd/serialization/stat_serializer_spec.rb:97
Run options: include {:locations=>{"./spec/statsd/serialization/stat_serializer_spec.rb"=>[97]}}

Datadog::Statsd::Serialization::StatSerializer
  benchmark
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-linux]
Warming up --------------------------------------
             no tags   343.036k i/100ms
no tags + sample rate
                       159.735k i/100ms
           with tags   140.044k i/100ms
with tags + sample rate
                        89.899k i/100ms
Calculating -------------------------------------
             no tags      3.401M (± 3.8%) i/s -     17.152M in   5.052869s
no tags + sample rate
                          1.596M (± 0.4%) i/s -      7.987M in   5.004351s
           with tags      1.400M (± 1.6%) i/s -      7.002M in   5.004013s
with tags + sample rate
                        960.181k (± 0.9%) i/s -      4.855M in   5.056339s

Comparison:
             no tags:  3400803.1 i/s
no tags + sample rate:  1595983.7 i/s - 2.13x  slower
           with tags:  1399691.8 i/s - 2.43x  slower
with tags + sample rate:   960181.4 i/s - 3.54x  slower

    measure IPS
Calculating -------------------------------------
             no tags   120.000  memsize (     0.000  retained)
                         3.000  objects (     0.000  retained)
                         2.000  strings (     0.000  retained)
no tags + sample rate
                       200.000  memsize (     0.000  retained)
                         4.000  objects (     0.000  retained)
                         3.000  strings (     0.000  retained)
           with tags   320.000  memsize (     0.000  retained)
                         4.000  objects (     0.000  retained)
                         3.000  strings (     0.000  retained)
with tags + sample rate
                       360.000  memsize (     0.000  retained)
                         5.000  objects (     0.000  retained)
                         4.000  strings (     0.000  retained)

Comparison:
             no tags:        120 allocated
no tags + sample rate:        200 allocated - 1.67x more
           with tags:        320 allocated - 2.67x more
with tags + sample rate:        360 allocated - 3.00x more
    measure memory

Finished in 28.13 seconds (files took 0.08992 seconds to load)
2 examples, 0 failures

Co-authored-by: Arthur Schreiber <arthurschreiber@github.com>

@schlubbi schlubbi requested a review from a team as a code owner September 10, 2024 07:30
rayz (Contributor) left a comment

Thanks! Will get this in the next release 🙂

schlubbi (Contributor, Author) commented

> Thanks! Will get this in the next release 🙂

You're welcome. Thanks for the review! Could I bother you to take a look at #295 too? Do you already know when the next release will be cut?

rayz merged commit a3291f2 into DataDog:master on Sep 18, 2024 (3 checks passed)
rayz (Contributor) commented Sep 20, 2024

The new release should be published now 🙂

nikitug commented Sep 24, 2024

Hi everyone, I think this line introduced a regression: `undefined method name for :...:Symbol`. The condition may be inverted.

ehdieunguyen commented

Hi team,
We get the same issue when bumping to the version that includes these changes.
[Screenshot 2024-10-22 at 22 09 34]

jhawthorn (Contributor) commented

`Symbol#name` only exists on Ruby 3.0+. Ruby 2.7 has been EOL upstream for a year and a half now, but if dogstatsd-ruby aims to keep supporting older versions, this probably needs a conditional or a rescue to work around that.
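
One way to apply the conditional suggested above, as a minimal sketch: the `SymbolNames` module and its `fetch` method are hypothetical names, not part of dogstatsd-ruby, and the fix that shipped in #297 may look different. The check runs once at load time, so the per-call path stays a single method call.

```ruby
# Hypothetical helper, not the library's code: choose the implementation
# once, depending on whether this Ruby provides Symbol#name.
module SymbolNames
  if Symbol.method_defined?(:name)
    # Ruby 3.0+: Symbol#name returns a frozen String.
    def self.fetch(sym)
      sym.name
    end
  else
    # Older Rubies: fall back to Symbol#to_s, which returns a new String.
    def self.fetch(sym)
      sym.to_s
    end
  end
end

SymbolNames.fetch(:env) # the String "env" on either Ruby version
```

Compared with a `rescue NoMethodError` around every call, the load-time branch avoids raising on old Rubies for each tag that is serialized.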

ivoanjo (Member) commented Oct 30, 2024

Fix WIP in #297

rayz (Contributor) commented Nov 4, 2024

Hi, sorry for the delay. I've published a new release, 5.6.3, with the fixes in #297.
