Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MetricSummary::bounds cleanup #390

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from
Draft

MetricSummary::bounds cleanup #390

wants to merge 2 commits into from

Conversation

epgts
Copy link
Contributor

@epgts epgts commented Apr 14, 2022

MetricSummary:

  • Delegate to I64Range for infinity rather than duplicating the concept.

I64Range

  • Cleanup and encapsulate I64Range's handling of infinite bounds.
  • Enforce validity in new factory functions empty and infinite.
    (method extend already did)
  • Expose #[inline] left and right methods.
  • Privatize left and right fields.
  • Factor out new method both out of various places that needed access
    to finite left and right.

Closes issue #386.

epgts added 2 commits April 14, 2022 10:36
MetricSummary:
- Delegate to I64Range for infinity rather than duplicating the concept.

I64Range
- Cleanup and encapsulate I64Range's handling of infinite bounds.
- Enforce validity in new factory functions `empty` and `infinite`.
  (method `extend` already did)
- Expose #[inline] left and right methods.
- Privatize left and right fields.
- Factor out new method `both` out of various places that needed access
  to finite left and right.

Closes issue #386.
Comment on lines +234 to +241
let (left, right);
match self.bounds.both() {
None => return Err(CounterError::BoundsInvalid),
Some((l, r)) => {
left = l;
right = r;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems there really should be a cleaner way to do this unwrap_or_return behavior, but if there is I don't know it.

@@ -0,0 +1,62 @@
//! These data structures are stable, meaning they may not change even in
//! their layout in memory, as the raw bytes-in-memory are serialized and
//! exchanged by PostgreSQL.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JLockerman can correct me if I have this wrong, but this is not actually the case for these objects. The MetricSummary is only used in the internal state, which is never persisted across version upgrades. The objects that aren't allowed to change are the pg_type declared final objects, CounterSummary and FlatSummary in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that would be a relief! If that's so, why do we test it? If we don't actually require stability of that serialization, we're just paying a tax every time we change it:

https://github.com/timescale/timescaledb-toolkit/blob/main/extension/src/counter_agg.rs#L972

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TLDR: you're both partially right, the situation is confusing, and the problem goes away once we stop caring about timescaledb versions older than 2.7.0

Ah this is the confusing bit: postgres has two different methods of serialziation

  1. varlena types must exist as a flat memory buffer, and are written to disk in the same format as they exist in memory.
  2. If you want to use an aggregate in parallel execution, you need to be able to serialize the transition state into some (machine-agnostic?) wire format for IPC. Notably, we used to store the output of this in continuous aggregates, though that's slated to change in timescaledb 2.7.0

@WireBaron is saying that these types aren't the first kind; they only exist in transition functions–ephemerally in memory–and therefore they can change between version. The test is checking that we can serialize this to disk stably in continuous aggregates, which is a slightly weaker invariant; we can change the in-memory representation freely as long as we can keep deserializing old partials. Hopefully we'll be able to get everyone onto the new continuous aggregates format soon, and we'll be able to stop caring about the second form of stability.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notably this means that the following comment

//! Note that [MetricSummary] is already in violation, as it does not lock in
//! a memory representation and the Rust project makes no guarantees to
//! preserve this across releases of the compiler.  We should bump its
//! serialization version and repr(C) the new one.

is incorrect: the rust-memory layout doesn't really matter here, what matters is that serde guarantees layout-stability for the serialized form of a given struct definition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(the incorrect leap i was making was from "we directly derive the serialization form from the in-memory layout" (true) to "we serialize the actual in-memory layout" (false); the latter necessarily leads to "we need repr(C)", it's just not so! :)

Josh points out on a call that what I'm doing here is backwards: adding a second struct from which to derive (de)serialize code from when instead I should stop deriving and implement them directly. Which I'll do when I return to this, likely next week.

Thanks!

#[derive(Clone, Debug, PartialEq, Deserialize, Serialize)]
// TODO Our one serialization test (counter_byte_io) passes, but is that just luck?
//#[repr(C)]
pub struct MetricSummary {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love creating this new struct that only differs by bounds being an option. Is this because you thought the original MetricSummary wasn't something that could be changed? In this case, I'd prefer changing the original object. Failing that, maybe we could make bounds non-public and add a bounds() accessor that returns an option?

@epgts epgts marked this pull request as draft April 27, 2022 21:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants