Releases: thanos-io/thanos
v0.37.0-rc.0
The first release candidate of v0.37.0 is out!
We have some really interesting features this time around, with several improvements across components, a new replication protocol for Receivers, and even fixes for Prometheus v3! Do take a look at some of the breaking changes below!
Thank you to all contributors who have contributed to this release. It wouldn't be possible without you! 💜
Please try it out and let us know if you find any issues! 🚀
Changelog
Fixed
- #7511 Query Frontend: fix doubled gzip compression for response body.
- #7592 Ruler: Only increment `thanos_rule_evaluation_with_warnings_total` metric for non PromQL warnings.
- #7614 *: fix debug log formatting.
- #7492 Compactor: update filtered blocks list before second downsample pass.
- #7658 Store: Fix panic because too small buffer in pool.
- #7643 Receive: fix thanos_receive_write_{timeseries,samples} stats
- #7644 fix(ui): add null check to find overlapping blocks logic
- #7814 Store: label_values: if matchers contain name=="something", do not add != "" to fetch less postings.
- #7679 Query: respect store.limit.* flags when evaluating queries
- #7821 Query/Receive: Fix goroutine leak introduced in #7796.
- #7843 Query Frontend: fix slow query logging for non-query endpoints.
- #7852 Query Frontend: pass "stats" parameter forward to queriers and fix Prometheus stats merging.
- #7832 Query Frontend: Fix cache keys for dynamic split intervals.
- #7885 Store: Return chunks to the pool after completing a Series call.
- #7893 Sidecar: Fix retrieval of external labels for Prometheus v3.0.0.
- #7903 Query: Fix panic on regex store matchers.
- #7915 Store: Close block series client at the end to not reuse chunk buffer
Added
- #7763 Ruler: use native histograms for client latency metrics.
- #7609 API: Add limit param to metadata APIs (series, label names, label values).
- #7429 Reloader: introduce `TolerateEnvVarExpansionErrors` to allow suppressing errors when expanding environment variables in the configuration file. When set, this will ensure that the reloader won't consider the operation to fail when an unset environment variable is encountered. Note that all unset environment variables are left as is, whereas all set environment variables are expanded as usual.
- #7560 Query: Added the possibility of filtering rules by rule_name, rule_group or file to HTTP api.
- #7652 Store: Implement metadata API limit in stores.
- #7659 Receive: Add support for replication using Cap'n Proto. This protocol has a lower CPU and memory footprint, which leads to a reduction in resource usage in Receivers. Before enabling it, make sure that all receivers are updated to a version which supports this replication method.
- #7853 UI: Add support for selecting graph time range with mouse drag.
- #7855 Compact/Query: Add support for comma separated replica labels.
- #7654 *: Add '--grpc-server-tls-min-version' flag to allow user to specify TLS version, otherwise default to TLS 1.3
- #7854 Query Frontend: Add `--query-frontend.force-query-stats` flag to force collection of query statistics from upstream queriers.
- #7860 Store: Support hedged requests
- #7924 *: Upgrade promql-engine to `v0.0.0-20241106100125-097e6e9f425a` and objstore to `v0.0.0-20241111205755-d1dd89d41f97`
- #7835 Ruler: Add ability to do concurrent rule evaluations
- #7722 Query: Add partition labels flag to partition leaf querier in distributed mode
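
The tolerant expansion behaviour described for #7429 above can be illustrated with a small sketch. This is not the reloader's actual code: the `$(VAR)` placeholder syntax, the `expandEnv` function, and its `tolerate` parameter are assumptions made for this example only.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholder matches $(NAME) references. The syntax is an assumption for this sketch.
var placeholder = regexp.MustCompile(`\$\(([a-zA-Z_0-9]+)\)`)

// expandEnv replaces every set variable with its value. Unset variables are
// always left untouched; they produce an error only when tolerate is false.
func expandEnv(config string, lookup func(string) (string, bool), tolerate bool) (string, error) {
	var firstErr error
	out := placeholder.ReplaceAllStringFunc(config, func(ref string) string {
		name := placeholder.FindStringSubmatch(ref)[1]
		if val, ok := lookup(name); ok {
			return val
		}
		if !tolerate && firstErr == nil {
			firstErr = fmt.Errorf("environment variable %q is not set", name)
		}
		return ref // leave the unset reference as-is
	})
	if firstErr != nil {
		return "", firstErr
	}
	return out, nil
}

func main() {
	// A map stands in for the process environment to keep the example deterministic.
	env := map[string]string{"REGION": "eu-west-1"}
	lookup := func(k string) (string, bool) { v, ok := env[k]; return v, ok }

	// Tolerant mode: REGION is expanded, the unset POD_NAME stays in place.
	out, err := expandEnv("region: $(REGION), replica: $(POD_NAME)", lookup, true)
	fmt.Println(out, err)

	// Strict mode: the same unset variable is reported as an error.
	_, err = expandEnv("replica: $(POD_NAME)", lookup, false)
	fmt.Println(err)
}
```

Passing the lookup function in, rather than calling `os.LookupEnv` directly, keeps the sketch testable without touching the real environment.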
Changed
- #7494 Ruler: remove trailing period from SRV records returned by discovery `dnsnosrva` lookups
- #7567 Query: Use thanos resolver for endpoint groups.
- #7741 Deps: Bump Objstore to `v0.0.0-20240913074259-63feed0da069`
- #7813 Receive: enable initial TSDB compaction time randomization
- #7820 Sidecar: Use prometheus metrics for min timestamp
- #7886 Discovery: Preserve results from other resolve calls
- #7669 Receive: Change quorum calculation for rf=2
Removed
- #7704 *: breaking ⚠️ remove Store gRPC Info function. This has been deprecated for 3 years, it's time to remove it.
- #7793 Receive: Disable dedup proxy in multi-tsdb
- #7678 Query: Skip formatting strings if debug logging is disabled
New Contributors
- @eqfarhad made their first contribution in #7335
- @derrix060 made their first contribution in #7389
- @wndhydrnt made their first contribution in #7363
- @jeroenvandelockand made their first contribution in #7412
- @aritra24 made their first contribution in #7428
- @axeoman made their first contribution in #7469
- @rexagod made their first contribution in #7429
- @rootxrishabh made their first contribution in #7506
- @tghartland made their first contribution in #7492
- @NishantBansal2003 made their first contribution in #7552
- @djosix made their first contribution in #7581
- @mjtrangoni made their first contribution in #7641
- @harshitasao made their first contribution in #7650
- @riteshsonawane1372 made their first contribution in #7653
- @pureiboi made their first contribution in #7644
- @milinddethe15 made their first contribution in #7642
- @dominicqi made their first contribution in #7658
- @dongjiang1989 made their first contribution in #7764
- @xnet-mobile made their first contribution in #7785
- @niaurys made their first contribution in #7787
- @ntk148v made their first contribution in #7834
- @Reimirno made their first contribution in #7853
- @lachruzam made their first contribution in #7859
Full Commit History: v0.35.1...v0.37.0-rc.0
v0.36.1
v0.36.0
v0.36.0 is out now!
Thank you to all contributors who have contributed to this release. It wouldn't be possible without you!
Please try it out and let us know if you find any issues! 🚀
Changelog
Fixed
- #7326 Query: fixing exemplars proxy when querying stores with multiple tenants.
- #7403 Sidecar: fix startup sequence
- #7484 Proxy: fix panic in lazy response set
Added
- #7317 Tracing: allow specifying resource attributes for the OTLP configuration.
- #7367 Store Gateway: log request ID in request logs.
- #7361 Query: breaking ⚠️ pass query stats from remote execution from server to client. We changed the protobuf of the QueryAPI, if you use `query.mode=distributed` you need to update your client (upper level Queriers) first, before updating leaf Queriers (servers).
- #7363 Query-frontend: set value of remote_user field in Slow Query Logs from HTTP header
- #7335 Dependency: Update minio-go to v7.0.70 which includes support for EKS Pod Identity.
- #7477 *: Bump objstore to `20240622095743-1afe5d4bc3cd`
Changed
- #7334 Compactor: do not vertically compact downsampled blocks. Such cases are now marked with `no-compact-mark.json`. Fixes panic `panic: unexpected seriesToChunkEncoder lack of iterations`.
- #7393 *: breaking ⚠️ Using native histograms for grpc middleware metrics. Metrics `grpc_client_handling_seconds` and `grpc_server_handling_seconds` will now be native histograms, if you have enabled native histogram scraping you will need to update your PromQL expressions to use the new metric names.
New Contributors
- @eqfarhad made their first contribution in #7335
- @derrix060 made their first contribution in #7389
- @wndhydrnt made their first contribution in #7363
- @jeroenvandelockand made their first contribution in #7412
- @aritra24 made their first contribution in #7428
- @axeoman made their first contribution in #7469
Full Changelog: v0.35.1...v0.36.0
v0.36.0-rc.1
The second release candidate of v0.36.0 is out!
We include a server gRPC histogram fix in this release.
Thank you to all contributors who have contributed to this release. It wouldn't be possible without you!
Please try it out and let us know if you find any issues! 🚀
Changelog
Fixed
- #7326 Query: fixing exemplars proxy when querying stores with multiple tenants.
- #7403 Sidecar: fix startup sequence
- #7484 Proxy: fix panic in lazy response set
Added
- #7317 Tracing: allow specifying resource attributes for the OTLP configuration.
- #7367 Store Gateway: log request ID in request logs.
- #7361 Query: breaking ⚠️ pass query stats from remote execution from server to client. We changed the protobuf of the QueryAPI, if you use `query.mode=distributed` you need to update your client (upper level Queriers) first, before updating leaf Queriers (servers).
- #7363 Query-frontend: set value of remote_user field in Slow Query Logs from HTTP header
- #7335 Dependency: Update minio-go to v7.0.70 which includes support for EKS Pod Identity.
- #7477 *: Bump objstore to `20240622095743-1afe5d4bc3cd`
Changed
- #7334 Compactor: do not vertically compact downsampled blocks. Such cases are now marked with `no-compact-mark.json`. Fixes panic `panic: unexpected seriesToChunkEncoder lack of iterations`.
- #7393 *: breaking ⚠️ Using native histograms for grpc middleware metrics. Metrics `grpc_client_handling_seconds` and `grpc_server_handling_seconds` will now be native histograms, if you have enabled native histogram scraping you will need to update your PromQL expressions to use the new metric names.
New Contributors
- @eqfarhad made their first contribution in #7335
- @derrix060 made their first contribution in #7389
- @wndhydrnt made their first contribution in #7363
- @jeroenvandelockand made their first contribution in #7412
- @aritra24 made their first contribution in #7428
- @axeoman made their first contribution in #7469
Full Changelog: v0.35.1...v0.36.0-rc.0
0.36.0-rc.0
The first release candidate of v0.36.0 is out!
We have mostly dependency bumps and bugfixes but some minor breaking changes, please see the changelog below for details.
Thank you to all contributors who have contributed to this release. It wouldn't be possible without you!
Please try it out and let us know if you find any issues! 🚀
Changelog
Fixed
- #7326 Query: fixing exemplars proxy when querying stores with multiple tenants.
- #7403 Sidecar: fix startup sequence
- #7484 Proxy: fix panic in lazy response set
Added
- #7317 Tracing: allow specifying resource attributes for the OTLP configuration.
- #7367 Store Gateway: log request ID in request logs.
- #7361 Query: breaking ⚠️ pass query stats from remote execution from server to client. We changed the protobuf of the QueryAPI, if you use `query.mode=distributed` you need to update your client (upper level Queriers) first, before updating leaf Queriers (servers).
- #7363 Query-frontend: set value of remote_user field in Slow Query Logs from HTTP header
- #7335 Dependency: Update minio-go to v7.0.70 which includes support for EKS Pod Identity.
- #7477 *: Bump objstore to `20240622095743-1afe5d4bc3cd`
Changed
- #7334 Compactor: do not vertically compact downsampled blocks. Such cases are now marked with `no-compact-mark.json`. Fixes panic `panic: unexpected seriesToChunkEncoder lack of iterations`.
- #7393 *: breaking ⚠️ Using native histograms for grpc middleware metrics. Metrics `grpc_client_handling_seconds` and `grpc_server_handling_seconds` will now be native histograms, if you have enabled native histogram scraping you will need to update your PromQL expressions to use the new metric names.
New Contributors
- @eqfarhad made their first contribution in #7335
- @derrix060 made their first contribution in #7389
- @wndhydrnt made their first contribution in #7363
- @jeroenvandelockand made their first contribution in #7412
- @aritra24 made their first contribution in #7428
- @axeoman made their first contribution in #7469
Full Changelog: v0.35.1...v0.36.0-rc.0
v0.35.1
This patch release brings a few fixes to all components and addresses a security concern! Please try it out and let us know if you face issues! 🚀
Changelog
Fixed
- #7323 Sidecar: wait for prometheus on startup
- #6948 Receive: fix goroutines leak during series requests to thanos store api.
- #7382 *: Ensure objstore flag values are masked & disable debug/pprof/cmdline
- #7392 Query: fix broken min, max for pre 0.34.1 sidecars
- #7373 Receive: Fix stats for remote write
- #7318 Compactor: Recover from panic to log block ID
Full Changelog: v0.35.0...v0.35.1
v0.35.0
v0.35.0 is out now!
We have several amazing features this time, including distributed query execution, receive tenant-label based request splitting, better query analysis, and loads of bugfixes and optimizations!
Thank you to all contributors who have contributed to this release. It wouldn't be possible without you!
Please try it out and let us know if you find any issues! 🚀
Changelog
Fixed
- #7083 Store Gateway: Fix lazy expanded postings with 0 length failed to be cached.
- #7080 Receive: race condition in handler Close() when stopped early
- #7132 Documentation: fix broken helm installation instruction
- #7134 Store, Compact: Revert the recursive block listing mechanism introduced in #6474 and use the same strategy as in 0.31. Introduce a `--block-discovery-strategy` flag to control the listing strategy so that a recursive lister can still be used if the tradeoff of slower but cheaper discovery is preferred.
- #7122 Store Gateway: Fix lazy expanded postings estimate base cardinality using posting group with remove keys.
- #7166 Receive/MultiTSDB: Do not delete non-uploaded blocks
- #7179 Query: Fix merging of query analysis
- #7224 Query-frontend: Add Redis username to the client configuration.
- #7220 Store Gateway: Fix lazy expanded postings caching partial expanded postings and bug of estimating remove postings with non existent value. Added `PromQLSmith` based fuzz test to improve correctness.
- #7225 Compact: Don't halt due to overlapping sources when vertical compaction is enabled
- #7244 Query: Fix Internal Server Error unknown targetHealth: "unknown" when trying to open the targets page.
- #7248 Receive: Fix RemoteWriteAsync was sequentially executed causing high latency in the ingestion path.
- #7271 Query: fixing dedup iterator when working on mixed sample types.
- #7289 Query Frontend: show warnings from downstream queries.
- #7308 Store: Batch TSDB Infos for blocks.
Added
- #7155 Receive: Add tenant globbing support to hashring config
- #7231 Tracing: added missing sampler types
- #7194 Downsample: retry objstore related errors
- #7105 Rule: add flag `--query.enable-x-functions` to allow usage of extended promql functions (xrate, xincrease, xdelta) in loaded rules
- #6867 Query UI: Tenant input box added to the Query UI, in order to be able to specify which tenant the query should use.
- #7186 Query UI: Only show tenant input box when query tenant enforcement is enabled
- #7175 Query: Add `--query.mode=distributed` which enables the new distributed mode of the Thanos query engine.
- #7199 Reloader: Add support for watching and decompressing Prometheus configuration directories
- #7200 Query: Add `--selector.relabel-config` and `--selector.relabel-config-file` flags which allows scoping the Querier to a subset of matched TSDBs.
- #7233 UI: Showing Block Size Stats
- #7256 Receive: Split remote-write HTTP requests via tenant labels of series
- #7269 Query UI: Show peak/total samples in query analysis
- #7280 *: Adding User-Agent to request logs
- #7219 Receive: add `--remote-write.client-tls-secure` and `--remote-write.client-tls-skip-verify` flags to stop relying on grpc server config to determine grpc client secure/skipVerify.
- #7297 *: mark as not queryable if status is not ready
- #7302 Considering the `X-Forwarded-For` header for the remote address in the logs.
- #7304 Store: Use loser trees for merging results
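
As a rough illustration of the idea behind #7256 above (splitting one remote-write request by the tenant label carried on each series), here is a minimal sketch. The `Series` type, the `splitByTenant` function, the `tenant_id` label name, and the fallback to a default tenant are all hypothetical choices for this example and are not taken from the Receive implementation.

```go
package main

import "fmt"

// Series is a hypothetical, simplified stand-in for a remote-write time series.
type Series struct {
	Labels map[string]string
}

// splitByTenant groups series by the value of tenantLabel. Series that lack
// the label (or carry an empty value) fall back to defaultTenant.
func splitByTenant(series []Series, tenantLabel, defaultTenant string) map[string][]Series {
	out := make(map[string][]Series)
	for _, s := range series {
		tenant := s.Labels[tenantLabel]
		if tenant == "" {
			tenant = defaultTenant
		}
		out[tenant] = append(out[tenant], s)
	}
	return out
}

func main() {
	batch := []Series{
		{Labels: map[string]string{"__name__": "up", "tenant_id": "team-a"}},
		{Labels: map[string]string{"__name__": "up", "tenant_id": "team-b"}},
		{Labels: map[string]string{"__name__": "up"}}, // no tenant label
	}

	groups := splitByTenant(batch, "tenant_id", "default-tenant")
	for _, tenant := range []string{"team-a", "team-b", "default-tenant"} {
		fmt.Printf("%s: %d series\n", tenant, len(groups[tenant]))
	}
}
```

Each resulting group could then be handled as its own per-tenant write; whether the tenant label is kept on or stripped from the series afterwards is left open here.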
Changed
- #7123 Rule: Change default Alertmanager API version to v2.
- #7192 Rule: Do not turn off ruler even if resolving fails
- #7223 Automatic detection of memory limits and configure GOMEMLIMIT to match.
- #7283 Compact: breaking ⚠️ Replace group with resolution in compact downsample metrics to avoid cardinality explosion with large numbers of groups.
- #7305 Query|Receiver: Do not log full request on ProxyStore by default.
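
For readers unfamiliar with what "automatic detection of memory limits" (#7223 above) involves, the general technique can be sketched as follows. This is an illustration under stated assumptions, not the code Thanos ships: it assumes a cgroup v2 style `memory.max` value, and the helper names and the 0.9 ratio are made up for the example. Only `debug.SetMemoryLimit` is a real Go runtime API (Go 1.19+), equivalent to setting `GOMEMLIMIT`.

```go
package main

import (
	"fmt"
	"runtime/debug"
	"strconv"
	"strings"
)

// parseCgroupLimit parses the content of a cgroup v2 memory.max file.
// It returns ok=false when the container has no limit ("max").
func parseCgroupLimit(content string) (limit int64, ok bool, err error) {
	s := strings.TrimSpace(content)
	if s == "max" {
		return 0, false, nil
	}
	limit, err = strconv.ParseInt(s, 10, 64)
	if err != nil {
		return 0, false, err
	}
	return limit, true, nil
}

// applyMemLimit sets the Go runtime's soft memory limit to ratio * limit
// and returns the value that was applied.
func applyMemLimit(limit int64, ratio float64) int64 {
	target := int64(float64(limit) * ratio)
	debug.SetMemoryLimit(target)
	return target
}

func main() {
	// In a container this string would typically be read from
	// /sys/fs/cgroup/memory.max; it is hard-coded so the sketch runs anywhere.
	limit, ok, err := parseCgroupLimit("1073741824\n")
	if err != nil {
		fmt.Println("could not parse limit:", err)
		return
	}
	if !ok {
		fmt.Println("no memory limit set, leaving GOMEMLIMIT untouched")
		return
	}
	// Leave some headroom below the hard limit (ratio is an assumption).
	fmt.Println("soft memory limit set to", applyMemLimit(limit, 0.9), "bytes")
}
```

Keeping the soft limit below the container's hard limit gives the garbage collector a chance to reclaim memory before the process is OOM-killed.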
New Contributors
- @hanyuting8 made their first contribution in #7078
- @cincinnat made their first contribution in #7087
- @bavarianbidi made their first contribution in #7132
- @chetanpdeshmukh made their first contribution in #7141
- @munir131 made their first contribution in #7181
- @payalraviya made their first contribution in #7193
- @TheSpiritXIII made their first contribution in #7199
- @outofrange made their first contribution in #7233
- @roth-wine made their first contribution in #7250
- @suhas-chikkanna made their first contribution in #7254
- @NeerajNagure made their first contribution in #7231
- @tizki made their first contribution in #7245
- @NotAFile made their first contribution in #7266
- @yj-yoo made their first contribution in #7143
- @magiceses made their first contribution in #7268
- @guillaumelecerf made their first contribution in #7219
Full Commit History: v0.34.1...v0.35.0-rc.0
v0.35.0-rc.0
The first release candidate of v0.35.0 is out!
We have several amazing features this time, including distributed query execution, receive tenant-label based request splitting, better query analysis, and loads of bugfixes and optimizations!
Thank you to all contributors who have contributed to this release. It wouldn't be possible without you!
Please try it out and let us know if you find any issues! 🚀
Changelog
Fixed
- #7083 Store Gateway: Fix lazy expanded postings with 0 length failed to be cached.
- #7080 Receive: race condition in handler Close() when stopped early
- #7132 Documentation: fix broken helm installation instruction
- #7134 Store, Compact: Revert the recursive block listing mechanism introduced in #6474 and use the same strategy as in 0.31. Introduce a `--block-discovery-strategy` flag to control the listing strategy so that a recursive lister can still be used if the tradeoff of slower but cheaper discovery is preferred.
- #7122 Store Gateway: Fix lazy expanded postings estimate base cardinality using posting group with remove keys.
- #7166 Receive/MultiTSDB: Do not delete non-uploaded blocks
- #7179 Query: Fix merging of query analysis
- #7224 Query-frontend: Add Redis username to the client configuration.
- #7220 Store Gateway: Fix lazy expanded postings caching partial expanded postings and bug of estimating remove postings with non existent value. Added `PromQLSmith` based fuzz test to improve correctness.
- #7225 Compact: Don't halt due to overlapping sources when vertical compaction is enabled
- #7244 Query: Fix Internal Server Error unknown targetHealth: "unknown" when trying to open the targets page.
- #7248 Receive: Fix RemoteWriteAsync was sequentially executed causing high latency in the ingestion path.
- #7271 Query: fixing dedup iterator when working on mixed sample types.
- #7289 Query Frontend: show warnings from downstream queries.
- #7308 Store: Batch TSDB Infos for blocks.
Added
- #7155 Receive: Add tenant globbing support to hashring config
- #7231 Tracing: added missing sampler types
- #7194 Downsample: retry objstore related errors
- #7105 Rule: add flag `--query.enable-x-functions` to allow usage of extended promql functions (xrate, xincrease, xdelta) in loaded rules
- #6867 Query UI: Tenant input box added to the Query UI, in order to be able to specify which tenant the query should use.
- #7186 Query UI: Only show tenant input box when query tenant enforcement is enabled
- #7175 Query: Add `--query.mode=distributed` which enables the new distributed mode of the Thanos query engine.
- #7199 Reloader: Add support for watching and decompressing Prometheus configuration directories
- #7200 Query: Add `--selector.relabel-config` and `--selector.relabel-config-file` flags which allows scoping the Querier to a subset of matched TSDBs.
- #7233 UI: Showing Block Size Stats
- #7256 Receive: Split remote-write HTTP requests via tenant labels of series
- #7269 Query UI: Show peak/total samples in query analysis
- #7280 *: Adding User-Agent to request logs
- #7219 Receive: add `--remote-write.client-tls-secure` and `--remote-write.client-tls-skip-verify` flags to stop relying on grpc server config to determine grpc client secure/skipVerify.
- #7297 *: mark as not queryable if status is not ready
- #7302 Considering the `X-Forwarded-For` header for the remote address in the logs.
- #7304 Store: Use loser trees for merging results
Changed
- #7123 Rule: Change default Alertmanager API version to v2.
- #7192 Rule: Do not turn off ruler even if resolving fails
- #7223 Automatic detection of memory limits and configure GOMEMLIMIT to match.
- #7283 Compact: breaking ⚠️ Replace group with resolution in compact downsample metrics to avoid cardinality explosion with large numbers of groups.
- #7305 Query|Receiver: Do not log full request on ProxyStore by default.
New Contributors
- @hanyuting8 made their first contribution in #7078
- @cincinnat made their first contribution in #7087
- @bavarianbidi made their first contribution in #7132
- @chetanpdeshmukh made their first contribution in #7141
- @munir131 made their first contribution in #7181
- @payalraviya made their first contribution in #7193
- @TheSpiritXIII made their first contribution in #7199
- @outofrange made their first contribution in #7233
- @roth-wine made their first contribution in #7250
- @suhas-chikkanna made their first contribution in #7254
- @NeerajNagure made their first contribution in #7231
- @tizki made their first contribution in #7245
- @NotAFile made their first contribution in #7266
- @yj-yoo made their first contribution in #7143
- @magiceses made their first contribution in #7268
- @guillaumelecerf made their first contribution in #7219
Full Commit History: v0.34.1...v0.35.0-rc.0
v0.34.1
This patch release includes a fix for CVE-2023-44478, thanks @hanyuting8!
Changelog
Fixed
- #7078 *: Bump gRPC to 1.57.2
Added
Changed
Removed
Full Changelog: v0.34.0...v0.34.1
v0.34.0
v0.34.0 is out!
Thank you to all contributors who have contributed to this release. It wouldn't be possible without you.
Please take note that the default value of the flag `--sync-block-duration` has been updated from 3m to 15m!
You can find the changelog with all of the details below. Let's also celebrate all our new contributors!
Changelog
Fixed
- #7011 Query Frontend: queries with negative offset should check whether it is cacheable or not.
- #6874 Sidecar: fix labels returned by 'api/v1/series' in presence of conflicting external and inner labels.
- #7009 Rule: Fix spacing error in URL.
- #7082 Stores: fix label values edge case when requesting external label values with matchers
Added
- #6756 Query: Add `query.enable-tenancy` & `query.tenant-label-name` options to allow enforcement of tenancy on the query path, by injecting labels into queries (uses prom-label-proxy internally).
- #6944 Receive: Added a new flag for maximum retention bytes.
- #6891 Objstore: Bump `objstore` which adds support for Azure Workload Identity.
- #6453 Sidecar: Added `--reloader.method` to support configuration reloads via SIGHUP signal.
- #6925 Store Gateway: Support float native histogram.
- #6954 Index Cache: Support tracing for fetch APIs.
- #6943 Ruler: Added `keep_firing_for` field in alerting rule.
- #6972 Store Gateway: Apply series limit when streaming series for series actually matched if lazy postings is enabled.
- #6984 Store Gateway: Added `--store.index-header-lazy-download-strategy` to specify how to lazily download index headers when lazy mmap is enabled.
- #6887 Query Frontend: breaking ⚠️ Add tenant label to relevant exported metrics. Note that this change may cause some pre-existing custom dashboard queries to be incorrect due to the added label.
- #7028 Query|Query Frontend: Add new `--query-frontend.enable-x-functions` flag to enable experimental extended functions.
- #6884 Tools: Add upload-block command to upload blocks to object storage.
Changed
- #6539 Store: breaking ⚠️ Changed `--sync-block-duration` default 3m to 15m.
Removed
New Contributors
- @lpreethvika made their first contribution in #6829
- @danielblando made their first contribution in #6850
- @rikhil-s made their first contribution in #6891
- @sinkingpoint made their first contribution in #6886
- @MeenuyD made their first contribution in #6907
- @mercxry made their first contribution in #6933
- @wenxu1024 made their first contribution in #6902
- @kartikaysaxena made their first contribution in #6927
- @sagnik3788 made their first contribution in #6952
- @JHeilCoveo made their first contribution in #6943
- @pawarpranav83 made their first contribution in #6998
- @tasrieit made their first contribution in #7023
- @Pratham1812 made their first contribution in #7026
- @alecrajeev made their first contribution in #7032
- @Player256 made their first contribution in #6539
Full Changelog: v0.33.0...v0.34.0-rc.0