Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

engine: fix range query duplicate label handling #402

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

MichaHoffmann
Copy link
Contributor

Prometheus seems to care about duplicate samples for label only during the same timestamp. Since our operators might return samples for one result in two different series we need to merge them to one result again.

@@ -144,6 +144,13 @@ func TestQueriesAgainstOldEngine(t *testing.T) {
(http_requests_total < 2*http_requests_total)
)`,
},
{
name: "duplicate label - top level presentation",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we extend the description with what the expected result here is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a slightly better test name! The underlying issue is that we would return something like

{} =>
a @ [0]
b @ [3000]
{} =>
c @ [6000]
d @ [9000]

on main since the results appear in different series slots at different times

@MichaHoffmann MichaHoffmann force-pushed the mhoffm-fix-range-query-duplicate-sample-detection branch from db54372 to 515a5a8 Compare January 14, 2024 14:37
Prometheus seems to care about duplicate samples for label only during
the same timestamp. Since our operators might return samples for one
result in two different series we need to merge them to one result
again.

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
@MichaHoffmann MichaHoffmann force-pushed the mhoffm-fix-range-query-duplicate-sample-detection branch from 515a5a8 to 57e4899 Compare January 14, 2024 14:41
@fpetkovski
Copy link
Collaborator

I would recommend that we do the mapping between different series IDs in the dedup checking operator. This way two series with the same ID would get mapped to a single ID and we will not break the operator contract.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants