Releases: grafana/mimir
2.6.2
Changelog
2.6.2
Grafana Mimir
- [BUGFIX] Security: updates Go to version 1.20.4 to fix CVE-2023-24539, CVE-2023-24540, CVE-2023-29400. #4903
Full Changelog: mimir-2.6.1...mimir-2.6.2
2.8.0-rc.2
This release contains 2 PRs from 2 authors. Thank you!
Changelog
2.8.0-rc.2
Grafana Mimir
- [ENHANCEMENT] Ruler: Improve rule upload performance when not enforcing per-tenant rule group limits. #4828
All changes in this release: mimir-2.8.0-rc.1...mimir-2.8.0-rc.2
2.8.0-rc.1
This release contains 8 PRs from 2 authors. Thank you!
Changelog
2.8.0-rc.1
Grafana Mimir
- [ENHANCEMENT] Improved memory limit on the in-memory cache used for regular expression matchers. #4751
- [ENHANCEMENT] Go: update to 1.20.3. #4773
- [BUGFIX] Packaging: fix preremove script preventing upgrades. #4801
All changes in this release: mimir-2.8.0-rc.0...mimir-2.8.0-rc.1
2.6.1
This release contains 3 PRs from 2 authors. Thank you!
Changelog
2.6.1
Grafana Mimir
- [BUGFIX] Security: updates Go to version 1.20.3 to fix CVE-2023-24538 #4798
All changes in this release: mimir-2.6.0...mimir-2.6.1
2.7.2
This release contains 2 PRs from 2 authors. Thank you!
Changelog
2.7.2
Grafana Mimir
- [BUGFIX] Security: updated Go version to 1.20.3 to fix CVE-2023-24538 #4795
All changes in this release: mimir-2.7.1...mimir-2.7.2
2.8.0-rc.0
This release contains 210 PRs from 53 authors, including new contributors Abdurrahman J. Allawala, Ashray Jain, Cyrill N, Daniel Barnes, Dave, David van der Spek, day4me, Devin Trejo, Dmitriy Okladin, Gabriel Santos, inbarpatashnik, Johannes Tandler, Julien Girard, KingJ, Miller, Rafał Boniecki, Raphael Ferreira, Raúl Marín, Ruslan Kovalov, Shagit Ziganshin, shanmugara, Wilfried ROSET. Thank you!
Grafana Mimir version 2.8.0-rc.0 release notes
Grafana Labs is excited to announce version 2.8 of Grafana Mimir.
The highlights that follow include the top features, enhancements, and bugfixes in this release. For the complete list of changes, see the changelog.
Features and enhancements
- Changed default value of block storage retention period: The default value for -blocks-storage.tsdb.retention-period was 24h and now is 13h.
- Query-frontend cached results now contain timestamp: This allows Mimir to check if cached results are still valid based on current TTL configured for tenant. Results cached by previous Mimir version are used until they expire from cache, which can take up to 7 days. If you need to use per-tenant TTL sooner, please flush results cache manually.
- Experimental support for using Redis as cache: Mimir now can use Redis for caching results, chunks, index and metadata.
- Experimental support for fetching secret from Vault for TLS configuration.
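The cached-results change can be pictured as a freshness check against the tenant's current TTL. The following is only a minimal sketch of that idea, not Mimir's implementation; the function name and its arguments are hypothetical, and treating an entry exactly as old as the TTL as expired is an assumption.

```python
from datetime import datetime, timedelta, timezone


def cached_result_is_valid(cached_at: datetime, ttl: timedelta, now: datetime) -> bool:
    """Return True while the cached entry is younger than the current TTL.

    Hypothetical helper: the timestamp stored with the entry is compared
    against the TTL configured *now*, so lowering the TTL takes effect
    without waiting for old entries to expire from the cache.
    """
    return now - cached_at < ttl


now = datetime(2023, 5, 1, 12, 0, tzinfo=timezone.utc)
cached_at = now - timedelta(days=2)

# Entry is 2 days old: valid under a 7-day TTL, stale under a 1-day TTL.
print(cached_result_is_valid(cached_at, timedelta(days=7), now))
print(cached_result_is_valid(cached_at, timedelta(days=1), now))
```

Entries written by an earlier version carry no timestamp, which is why the notes above say they stay in use until they expire from the cache or the cache is flushed.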
Helm chart improvements
The Grafana Mimir and Grafana Enterprise Metrics Helm chart is now released independently. See the Grafana Mimir Helm chart documentation.
Important changes
In Grafana Mimir 2.8 we have removed the following previously deprecated or experimental configuration options or metrics.
The following metrics have been removed: cortex_bucket_store_series_get_all_duration_seconds, cortex_bucket_store_series_merge_duration_seconds, cortex_ingester_tsdb_wal_replay_duration_seconds.
The following configuration options are deprecated and will be removed in Grafana Mimir 2.10:
- The CLI flag -blocks-storage.tsdb.max-tsdb-opening-concurrency-on-startup and its respective YAML configuration option tsdb.max_tsdb_opening_concurrency_on_startup.
The following experimental options and features are now stable:
- Use protobuf internal query result payload format by default.
Bug fixes
- Querier: Streaming remote read will now continue to return multiple chunks per frame after the first frame. PR 4423
- Query-frontend: don't retry queries which error inside PromQL. PR 4643
- Store-gateway & query-frontend: report more consistent statistics for fetched index bytes. PR 4671
- Native histograms: fix how IsFloatHistogram determines if mimirpb.Histogram is a float histogram. PR 4706
- Query-frontend: fix query sharding for native histograms. PR 4666
Changelog
2.8.0-rc.0
Grafana Mimir
- [CHANGE] Ingester: changed experimental CLI flag from -out-of-order-blocks-external-label-enabled to -ingester.out-of-order-blocks-external-label-enabled. #4440
- [CHANGE] Store-gateway: The following metrics have been removed: cortex_bucket_store_series_get_all_duration_seconds, cortex_bucket_store_series_merge_duration_seconds. #4332
- [CHANGE] Ingester: changed default value of -blocks-storage.tsdb.retention-period from 24h to 13h. If you're running Mimir with a custom configuration and you're overriding -querier.query-store-after to a value greater than the default 12h then you should increase -blocks-storage.tsdb.retention-period accordingly. #4382
- [CHANGE] Ingester: the configuration parameter -blocks-storage.tsdb.max-tsdb-opening-concurrency-on-startup has been deprecated and will be removed in Mimir 2.10. #4445
- [CHANGE] Query-frontend: Cached results now contain timestamp which allows Mimir to check if cached results are still valid based on current TTL configured for tenant. Results cached by previous Mimir version are used until they expire from cache, which can take up to 7 days. If you need to use per-tenant TTL sooner, please flush results cache manually. #4439
- [CHANGE] Ingester: the cortex_ingester_tsdb_wal_replay_duration_seconds metric has been removed. #4465
- [CHANGE] Query-frontend and ruler: use protobuf internal query result payload format by default. This feature is no longer considered experimental. #4557 #4709
- [CHANGE] Ruler: reject creating federated rule groups while tenant federation is disabled. Previously the rule groups would be silently dropped during bucket sync. #4555
- [CHANGE] Compactor: the /api/v1/upload/block/{block}/finish endpoint now returns a 429 status code when the compactor has reached the limit specified by -compactor.max-block-upload-validation-concurrency. #4598
- [CHANGE] Compactor: when starting a block upload the maximum byte size of the block metadata provided in the request body is now limited to 1 MiB. If this limit is exceeded a 413 status code is returned. #4683
- [CHANGE] Store-gateway: cache key format for expanded postings has changed. This will invalidate the expanded postings in the index cache when deployed. #4667
- [FEATURE] Cache: Introduce experimental support for using Redis for results, chunks, index, and metadata caches. #4371
- [FEATURE] Vault: Introduce experimental integration with Vault to fetch secrets used to configure TLS for clients. Server TLS secrets will still be read from a file. tls-ca-path, tls-cert-path and tls-key-path will denote the path in Vault for the following CLI flags when -vault.enabled is true: #4446
  -distributor.ha-tracker.etcd.*, -distributor.ring.etcd.*, -distributor.forwarding.grpc-client.*, -querier.store-gateway-client.*, -ingester.client.*, -ingester.ring.etcd.*, -querier.frontend-client.*, -query-frontend.grpc-client-config.*, -query-frontend.results-cache.redis.*, -blocks-storage.bucket-store.index-cache.redis.*, -blocks-storage.bucket-store.chunks-cache.redis.*, -blocks-storage.bucket-store.metadata-cache.redis.*, -compactor.ring.etcd.*, -store-gateway.sharding-ring.etcd.*, -ruler.client.*, -ruler.alertmanager-client.*, -ruler.ring.etcd.*, -ruler.query-frontend.grpc-client-config.*, -alertmanager.sharding-ring.etcd.*, -alertmanager.alertmanager-client.*, -memberlist.*, -query-scheduler.grpc-client-config.*, -query-scheduler.ring.etcd.*, -overrides-exporter.ring.etcd.*
- [FEATURE] Distributor, ingester, querier, query-frontend, store-gateway: add experimental support for native histograms. Requires that the experimental protobuf query result response format is enabled by -query-frontend.query-result-response-format=protobuf on the query frontend. #4286 #4352 #4354 #4376 #4377 #4387 #4396 #4425 #4442 #4494 #4512 #4513 #4526
- [FEATURE] Added -<prefix>.s3.storage-class flag to configure the S3 storage class for objects written to S3 buckets. #4300
- [FEATURE] Add freebsd to the target OS when generating binaries for a Mimir release. #4654
- [FEATURE] Ingester: Add prepare-shutdown endpoint which can be used as part of Kubernetes scale down automations. #4718
- [ENHANCEMENT] Add timezone information to Alpine Docker images. #4583
- [ENHANCEMENT] Ruler: Sync rules when the ruler is JOINING the ring instead of ACTIVE, in order to reduce missed rule iterations during ruler restarts. #4451
- [ENHANCEMENT] Allow to define service name used for tracing via JAEGER_SERVICE_NAME environment variable. #4394
- [ENHANCEMENT] Querier and query-frontend: add experimental, more performant protobuf query result response format enabled with -query-frontend.query-result-response-format=protobuf. #4304 #4318 #4375
- [ENHANCEMENT] Compactor: added experimental configuration parameter -compactor.first-level-compaction-wait-period, to configure how long the compactor should wait before compacting 1st level blocks (uploaded by ingesters). This configuration option allows to reduce the chances compactor begins compacting blocks before all ingesters have uploaded their blocks to the storage. #4401
- [ENHANCEMENT] Store-gateway: use more efficient chunks fetching and caching. #4255
- [ENHANCEMENT] Query-frontend and ruler: add experimental, more performant protobuf internal query result response format enabled with -ruler.query-frontend.query-result-response-format=protobuf. #4331
- [ENHANCEMENT] Ruler: increased tolerance for missed iterations on alerts, reducing the chances of flapping firing alerts during ruler restarts. #4432
- [ENHANCEMENT] Optimized .* and .+ regular expression label matchers. #4432
- [ENHANCEMENT] Optimized regular expression label matchers with alternates (e.g. a|b|c). #4647
- [ENHANCEMENT] Added an in-memory cache for regular expression matchers, to avoid parsing and compiling the same expression multiple times when used in recurring queries. #4633
- [ENHANCEMENT] Query-frontend: results cache TTL is now configurable by using -query-frontend.results-cache-ttl and -query-frontend.results-cache-ttl-for-out-of-order-time-window options. These values can also be specified per tenant. Default values are unchanged (7 days and 10 minutes respectively). #4385
- [ENHANCEMENT] Ingester: added advanced configuration parameter -blocks-storage.tsdb.wal-replay-concurrency representing the maximum number of CPUs used during WAL replay. #4445
- [ENHANCEMENT] Ingester: added metrics cortex_ingester_tsdb_open_duration_seconds_total to measure the total time it takes to open all existing TSDBs. T...
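The retention-period entry in this changelog implies a relationship between two settings: ingesters should retain blocks for longer than the querier waits before it starts querying the store. A rough pre-upgrade sanity check could look like the sketch below. The helper names are hypothetical, only simple Go-style durations made of h, m and s units are parsed, and the "strictly greater" rule is an assumption inferred from the defaults (13h versus 12h), not a documented requirement.

```python
import re
from datetime import timedelta

_UNITS = {"h": "hours", "m": "minutes", "s": "seconds"}


def parse_duration(text: str) -> timedelta:
    """Parse a simple Go-style duration such as '13h' or '12h30m'."""
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject anything the simplified grammar above does not fully cover.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return sum((timedelta(**{_UNITS[u]: int(n)}) for n, u in parts), timedelta())


def retention_covers_query_store_after(retention: str, query_store_after: str) -> bool:
    """Assumed rule: the retention period must exceed query-store-after."""
    return parse_duration(retention) > parse_duration(query_store_after)


# New defaults: 13h retention against a 12h query-store-after.
print(retention_covers_query_store_after("13h", "12h"))
# A custom 24h query-store-after would need a longer retention period.
print(retention_covers_query_store_after("13h", "24h"))
```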
2.7.1
This release contains 177 PRs from 43 authors, including new contributors Bartosz Cisek, dggmsa, gmintoco, Ihor Urazov, James Ross, Jean-Philippe Quéméner, Jon Gutschon, l3ioo, lpugoy, Nicolás Pazos, Oscar, Reto Kupferschmid, ying-jeanne. Thank you!
Grafana Mimir version 2.7.1 release notes
Grafana Labs is excited to announce version 2.7.1 of Grafana Mimir.
The highlights that follow include the top features, enhancements, and bugfixes in this release. For the complete list of changes, see the changelog.
Note: During the release process, version 2.7.0 was tagged too early, before completing the release checklist and production testing. Release 2.7.1 doesn't include any code changes since 2.7.0, but now has proper release notes, published documentation, and has been fully tested in our production environment.
Features and enhancements
- Store-gateway streaming enabled by default: The new default value of 5000 for -blocks-storage.bucket-store.batch-series-size enables store-gateway streaming in the default configuration. This means that series are loaded from object storage in batches rather than buffering them all in memory before returning to the querier. Enabling streaming can reduce memory utilization peaks in the store-gateway.
- Store-gateway index header reader no longer uses mmap by default: Along with streaming enabled in the store-gateway, this change contributes to more efficient memory usage. See the Important changes section for more details.
- Support for keep_firing_for option to ruler configuration: This new option determines the amount of time an alert should keep firing while the ruler expression doesn't return results.
- More efficient chunks fetching and caching: Enable with the new experimental feature flag -blocks-storage.bucket-store.chunks-cache.fine-grained-chunks-caching-enabled=true. This should reduce CPU, memory utilization, and receive bandwidth of a store-gateway.
- Experimental query sharding improvements: A new configuration parameter, -query-frontend.query-sharding-target-series-per-shard, allows query sharding to take into account cardinality of similar requests executed previously when computing the maximum number of shards to use. If you want to try it out, we recommend starting with a value of 2500.
- Experimental support for native histogram ingestion: Native histograms can now be ingested. The new per-tenant limit -ingester.native-histograms-ingestion-enabled controls whether native histograms are stored or ignored. The support for querying native histograms is not complete yet and it's expected to be available in the next release.
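The keep_firing_for behaviour described in the list above can be read as a small decision rule. The sketch below is a simplified illustration under stated assumptions, not the ruler's actual state machine: the function and parameter names are hypothetical, the alert is assumed to have been firing when the expression last returned results, and the exact boundary (strictly less than the window) is a guess.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def alert_is_firing(
    expr_has_results: bool,
    last_result_at: Optional[datetime],
    now: datetime,
    keep_firing_for: timedelta,
) -> bool:
    """Decide whether an alert is still firing at time `now`."""
    if expr_has_results:
        return True
    if last_result_at is None:
        # The expression never returned results, so nothing to keep firing.
        return False
    # No results right now: keep firing only inside the configured window.
    return now - last_result_at < keep_firing_for


now = datetime(2023, 3, 1, 10, 0, tzinfo=timezone.utc)
window = timedelta(minutes=5)

# Expression went quiet 2 minutes ago: still inside the 5-minute window.
print(alert_is_firing(False, now - timedelta(minutes=2), now, window))
# Expression went quiet 10 minutes ago: the window has elapsed.
print(alert_is_firing(False, now - timedelta(minutes=10), now, window))
```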
Alertmanager improvements
- New metrics: The following upstream metrics are now exposed: cortex_alertmanager_dispatcher_aggregation_groups, cortex_alertmanager_dispatcher_alert_processing_duration_seconds.
Helm chart improvements
The Grafana Mimir and Grafana Enterprise Metrics Helm chart is now released independently. See the Grafana Mimir Helm chart documentation.
Important changes
In Grafana Mimir 2.7, the default values of the following configuration options have changed:
- -blocks-storage.bucket-store.batch-series-size is now enabled by default with a value of 5000.
- -ruler.evaluation-delay-duration has changed from 0 to 1m.
In Grafana Mimir 2.7, the following configuration options are now deprecated:
- -blocks-storage.bucket-store.chunks-cache.subrange-size, since there's no benefit to changing the default of 16000.
- -blocks-storage.bucket-store.consistency-delay has been deprecated and will be removed in Mimir 2.9.
- -compactor.consistency-delay has been deprecated and will be removed in Mimir 2.9.
- -ingester.ring.readiness-check-ring-health has been deprecated and will be removed in Mimir 2.9.
In Grafana Mimir 2.7, the following options, metrics, and labels have been removed:
- Experimental support for ephemeral storage introduced in Mimir 2.6.0 has been removed.
  - Following options are no longer available: -blocks-storage.ephemeral-tsdb.*, -distributor.ephemeral-series-enabled, -distributor.ephemeral-series-matchers, -ingester.max-ephemeral-series-per-user, -ingester.instance-limits.max-ephemeral-series
  - The following metrics have been removed: cortex_ingester_ephemeral_series, cortex_ingester_ephemeral_series_created_total, cortex_ingester_ephemeral_series_removed_total, cortex_ingester_ingested_ephemeral_samples_total, cortex_ingester_ingested_ephemeral_samples_failures_total, cortex_ingester_memory_ephemeral_users, cortex_ingester_queries_ephemeral_total, cortex_ingester_queried_ephemeral_samples, cortex_ingester_queried_ephemeral_series
  - Additionally, querying using the {__mimir_storage__="ephemeral"} selector no longer works. All label values with the ephemeral- prefix within the reason label of the cortex_discarded_samples_total metric are no longer available.
- The store-gateway default index header reader no longer uses mmap and the mmap-based index header reader has been removed. The following flags have been changed:
  - -blocks-storage.bucket-store.index-header.map-populate-enabled has been removed
  - -blocks-storage.bucket-store.index-header.stream-reader-enabled has been removed
  - -blocks-storage.bucket-store.index-header.stream-reader-max-idle-file-handles has been renamed to -blocks-storage.bucket-store.index-header.max-idle-file-handles, and the corresponding configuration file option has been renamed from stream_reader_max_idle_file_handles to max_idle_file_handles
Bug fixes
- Store-gateway: return Canceled rather than Aborted or Internal error when the calling querier cancels a label names or values request, and return Internal if processing the request fails for another reason. PR 4061
- Querier: track canceled requests with status code 499 in the metrics instead of 503 or 422. PR 4099
- Ingester: compact out-of-order data during /ingester/flush or when TSDB is idle. PR 4180
- Ingester: conversion of global limits max-series-per-user, max-series-per-metric, max-metadata-per-user and max-metadata-per-metric into corresponding local limits now takes into account the number of ingesters in each zone. PR 4238
- Ingester: track cortex_ingester_memory_series metric consistently with cortex_ingester_memory_series_created_total and cortex_ingester_memory_series_removed_total. PR 4312
- Querier: fixed a bug which was incorrectly matching series with regular expression label matchers with begin/end anchors in the middle of the regular expression. PR 4340
Changelog
2.7.1
Grafana Mimir
- [CHANGE] Ingester: the configuration parameter -ingester.ring.readiness-check-ring-health has been deprecated and will be removed in Mimir 2.9. #4422
- [CHANGE] Ruler: changed default value of -ruler.evaluation-delay-duration option from 0 to 1m. #4250
- [CHANGE] Querier: Errors with status code 422 coming from the store-gateway are propagated and not converted to the consistency check error anymore. #4100
- [CHANGE] Store-gateway: When a query hits max_fetched_chunks_per_query and max_fetched_series_per_query limits, an error with the status code 422 is created and returned. #4056
- [CHANGE] Packaging: Migrate FPM packaging solution to NFPM. Rationalize packages dependencies and add package for all binaries. #3911
- [CHANGE] Store-gateway: Deprecate flag -blocks-storage.bucket-store.chunks-cache.subrange-size since there's no benefit to changing the default of 16000. #4135
- [CHANGE] Experimental support for ephemeral storage introduced in Mimir 2.6.0 has been removed. Following options are no longer available: #4252
  -blocks-storage.ephemeral-tsdb.*, -distributor.ephemeral-series-enabled, -distributor.ephemeral-series-matchers, -ingester.max-ephemeral-series-per-user, -ingester.instance-limits.max-ephemeral-series
  Querying using the {__mimir_storage__="ephemeral"} selector no longer works. All label values with ephemeral- prefix in reason label of cortex_discarded_samples_total metric are no longer available. Following metrics have been removed: cortex_ingester_ephemeral_series, cortex_ingester_ephemeral_series_created_total, cortex_ingester_ephemeral_series_removed_total, cortex_ingester_ingested_ephemeral_samples_total, cortex_ingester_ingested_ephemeral_samples_failures_total, cortex_ingester_memory_ephemeral_users, cortex_ingester_queries_ephemeral_total, cortex_ingester_queried_ephemeral_samples, cortex_ingester_queried_ephemeral_series
- [CHANGE] Store-gateway: use mmap-less index-header reader by default and remove mmap-based index header reader. The following flags have changed: #4280
  - -blocks-storage.bucket-store.index-header.map-populate-enabled has been removed
  - -blocks-storage.bucket-store.index-header.stream-reader-enabled has been removed
  - -blocks-storage.bucket-store.index-header.stream-reader-max-idle-file-handles has been renamed to -blocks-storage.bucket-store.index-header.max-idle-file-handles, and the corresponding configuration file option has been renamed from stream_reader_max_idle_file_handles to max_idle_file_handles
- [CHANGE] Store-gateway: the streaming store-gateway is now enabled by default. The new default setting for `-blocks-storage.bucket-store.batc...
2.6.0
This release contains 259 PRs from 40 authors, including new contributors breadly7, bubu11e, Đurica Yuri Nikolić, Felix Beuke, Jack, klagroix, Martin Chodur, Ørjan Ommundsen, Sascha Sternheim, Wu Zhiyuan. Thank you!
Grafana Mimir version 2.6.0 release notes
Grafana Labs is excited to announce version 2.6 of Grafana Mimir.
The highlights that follow include the top features, enhancements, and bugfixes in this release. For the complete list of changes, see the changelog.
Features and enhancements
- Lower memory usage in store-gateway by streaming series results: The store-gateway can now stream results back to the querier instead of buffering them. This is expected to greatly reduce peak memory consumption while keeping latency the same. This is still an experimental feature but Grafana Labs is already running it in production and there's no known issue. This feature can be enabled by setting the -blocks-storage.bucket-store.batch-series-size configuration option (if you want to try it out, we recommend setting it to 5000).
- Improved stability in store-gateway by removing mmap usage: The store-gateway can now use an alternate code path to read index-headers that does not use memory mapped files. This is expected to improve stability of the store-gateway. This is still an experimental feature but Grafana Labs is already running it in production and there's no known issue. This feature can be enabled by setting -blocks-storage.bucket-store.index-header.stream-reader-enabled=true.
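The memory benefit of streaming comes from holding one batch at a time instead of the whole result. The following generator is a generic illustration of that batching pattern, not the store-gateway's code; the function name is hypothetical and series are stood in for by plain strings.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def stream_in_batches(series: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `batch_size` series.

    Only one batch is materialized at a time, so peak memory is bounded by
    the batch size rather than by the total number of series.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(series)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# 12000 series streamed with the recommended batch size of 5000.
all_series = (f"series-{i}" for i in range(12000))
for batch in stream_in_batches(all_series, 5000):
    print(len(batch))
```

A larger batch size means fewer round trips but more memory held per batch, which is the trade-off the batch-series-size option exposes.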
Alertmanager improvements
- Webex support: Alertmanager can now use Webex to send alerts.
- tenantID template function: A new template function tenantID, returning the ID of the tenant owning the alert, has been added.
- grafanaExploreURL template function: A new template function grafanaExploreURL, returning the URL to the Grafana explore page with range query, has been added.
Helm chart improvements
The Grafana Mimir and Grafana Enterprise Metrics Helm chart is now released independently. See the corresponding documentation for more information.
Important changes
In Grafana Mimir 2.6 we have removed the following previously deprecated or experimental configuration options:
- The CLI flag -blocks-storage.bucket-store.max-concurrent-reject-over-limit and its respective YAML configuration option blocks_storage.bucket_store.max_concurrent_reject_over_limit.
- The CLI flag -query-frontend.align-querier-with-step and its respective YAML configuration option frontend.align_querier_with_step.
The following configuration options are deprecated and will be removed in Grafana Mimir 2.8:
- The CLI flag -store.max-query-length and its respective YAML configuration option limits.max_query_length have been replaced with -querier.max-partial-query-length and limits.max_partial_query_length.
The following experimental options and features are now stable:
- The CLI flag -query-frontend.max-total-query-length and its respective YAML configuration option limits.max_total_query_length.
- The CLI flags -distributor.request-rate-limit and -distributor.request-burst-limit and their respective YAML configuration options limits.request_rate_limit and limits.request_rate_burst.
- The CLI flag -ingester.max-global-exemplars-per-user and its respective YAML configuration option limits.max_global_exemplars_per_user.
- The CLI flag -ingester.tsdb-config-update-period and its respective YAML configuration option ingester.tsdb_config_update_period.
- The API endpoint /api/v1/query_exemplars.
Bug fixes
- Alertmanager: Fix template spurious deletion with relative data dir. PR 3604
- Security: Update prometheus/exporter-toolkit for CVE-2022-46146. PR 3675
- Security: Update golang.org/x/net for CVE-2022-41717. PR 3755
- Debian package: Fix post-install, environment file path and user creation. PR 3720
- Memberlist: Fix panic during Mimir startup when Mimir receives gossip message before it's ready. PR 3746
- Update github.com/thanos-io/objstore to address issue with Multipart PUT on s3-compatible Object Storage. PR 3802 PR 3821
- Querier: Canceled requests are no longer reported as "consistency check" failures. PR 3837 PR 3927
- Distributor: Don't panic when metric_relabel_configs in overrides contains null element. PR 3868
- Ingester, Compactor: Fix panic that can occur when compaction fails. PR 3955
Changelog
2.6.0
Grafana Mimir
- [CHANGE] Querier: Introduce -querier.max-partial-query-length to limit the time range for partial queries at the querier level and deprecate -store.max-query-length. #3825 #4017
- [CHANGE] Store-gateway: Remove experimental -blocks-storage.bucket-store.max-concurrent-reject-over-limit flag. #3706
- [CHANGE] Ingester: If shipping is enabled block retention will now be relative to the upload time to cloud storage. If shipping is disabled block retention will be relative to the creation time of the block instead of the mintime of the last block created. #3816
- [CHANGE] Query-frontend: Deprecated CLI flag -query-frontend.align-querier-with-step has been removed. #3982
- [FEATURE] Store-gateway: streaming of series. The store-gateway can now stream results back to the querier instead of buffering them. This is expected to greatly reduce peak memory consumption while keeping latency the same. You can enable this feature by setting -blocks-storage.bucket-store.batch-series-size to a value in the high thousands (5000-10000). This is still an experimental feature and is subject to a changing API and instability. #3540 #3546 #3587 #3606 #3611 #3620 #3645 #3355 #3697 #3666 #3687 #3728 #3739 #3751 #3779 #3839
- [FEATURE] Alertmanager: Added support for the Webex receiver. #3758
- [FEATURE] Limits: Added the -validation.separate-metrics-group-label flag. This allows further separation of the cortex_discarded_samples_total metric by an additional group label - which is configured by this flag to be the value of a specific label on an incoming timeseries. Active groups are tracked and inactive groups are cleaned up on a defined interval. The maximum number of groups tracked is controlled by the -max-separate-metrics-groups-per-user flag. #3439
- [FEATURE] Overrides-exporter: Added experimental ring support to overrides-exporter via -overrides-exporter.ring.enabled. When enabled, the ring is used to establish a leader replica for the export of limit override metrics. #3908 #3953
- [FEATURE] Ephemeral storage (experimental): Mimir can now accept samples into "ephemeral storage". Such samples are available for querying for a short amount of time (-blocks-storage.ephemeral-tsdb.retention-period, defaults to 10 minutes), and then removed from memory. To use ephemeral storage, distributor must be configured with -distributor.ephemeral-series-enabled option. Series matching -distributor.ephemeral-series-matchers will be marked for storing into ephemeral storage in ingesters. Each tenant needs to have ephemeral storage enabled by using -ingester.max-ephemeral-series-per-user limit, which defaults to 0 (no ephemeral storage). Ingesters have new -ingester.instance-limits.max-ephemeral-series limit for total number of series in ephemeral storage across all tenants. If ingestion of samples into ephemeral storage fails, cortex_discarded_samples_total metric will use values prefixed with ephemeral- for reason label. Querying of ephemeral storage is possible by using {__mimir_storage__="ephemeral"} as metric selector. Following new metrics related to ephemeral storage are introduced: #3897 #3922 #3961 #3997 #4004
  cortex_ingester_ephemeral_series, cortex_ingester_ephemeral_series_created_total, cortex_ingester_ephemeral_series_removed_total, cortex_ingester_ingested_ephemeral_samples_total, cortex_ingester_ingested_ephemeral_samples_failures_total, cortex_ingester_memory_ephemeral_users, cortex_ingester_queries_ephemeral_total, cortex_ingester_queried_ephemeral_samples, cortex_ingester_queried_ephemeral_series
- [ENHANCEMENT] Added new metric thanos_shipper_last_successful_upload_time: Unix timestamp (in seconds) of the last successful TSDB block uploaded to the bucket. #3627
- [ENHANCEMENT] Ruler: Added -ruler.alertmanager-client.tls-enabled configuration for alertmanager client. #3432 #3597
- [ENHANCEMENT] Activity tracker logs now have component=activity-tracker label. #3556
- [ENHANCEMENT] Distributor: remove labels with empty values. #2439
- [ENHANCEMENT] Query-frontend: track query HTTP requests in the Activity Tracker. #3561
- [ENHANCEMENT] Store-gateway: Add experimental alternate implementation of index-header reader that does not use memory mapped files. The index-header reader is expected to improve stability of the store-gateway. You can enable this implementation with the flag -blocks-storage.bucket-store.index-header.stream-reader-enabled. #3639 #3691 #3703 #3742 #3785 #3787 #3797
- [ENHANCEMENT] Query-scheduler: add cortex_query_scheduler_cancelled_requests_total metric to track the number of requests that are already cancelled when dequeued. #3696
- [ENHANCEMENT] Store-gateway: add cortex_bucket_store_partitioner_extended_ranges_total metric to keep ...
2.6.0-rc.0
This release contains 255 PRs from 40 authors, including new contributors breadly7, bubu11e, Đurica Yuri Nikolić, Felix Beuke, Jack, klagroix, Martin Chodur, Ørjan Ommundsen, Sascha Sternheim, Wu Zhiyuan. Thank you!
Grafana Mimir version 2.6.0-rc.0 release notes
Grafana Labs is excited to announce version 2.6.0-rc.0 of Grafana Mimir.
The highlights that follow include the top features, enhancements, and bugfixes in this release. For the complete list of changes, see the changelog.
Features and enhancements
- Lower memory usage in store-gateway by streaming series results: The store-gateway can now stream results back to the querier instead of buffering them. This is expected to greatly reduce peak memory consumption while keeping latency the same. This is still an experimental feature but Grafana Labs is already running it in production and there's no known issue. This feature can be enabled by setting the -blocks-storage.bucket-store.batch-series-size configuration option (if you want to try it out, we recommend setting it to 5000).
- Improved stability in store-gateway by removing mmap usage: The store-gateway can now use an alternate code path to read index-headers that does not use memory mapped files. This is expected to improve stability of the store-gateway. This is still an experimental feature but Grafana Labs is already running it in production and there's no known issue. This feature can be enabled by setting -blocks-storage.bucket-store.index-header.stream-reader-enabled=true.
Alertmanager improvements
- Webex support: Alertmanager can now use Webex to send alerts.
- tenantID template function: A new template function tenantID, returning the ID of the tenant owning the alert, has been added.
- grafanaExploreURL template function: A new template function grafanaExploreURL, returning the URL to the Grafana explore page with range query, has been added.
Helm chart improvements
The Grafana Mimir and Grafana Enterprise Metrics Helm chart is now released independently. See the corresponding documentation for more information.
Important changes
In Grafana Mimir 2.6 we have removed the following previously deprecated or experimental configuration options:
- The CLI flag
-blocks-storage.bucket-store.max-concurrent-reject-over-limitand its respective YAML configuration optionblocks_storage.bucket_store.max_concurrent_reject_over_limit. - The CLI flag
-query-frontend.align-querier-with-stepand its respective YAML configuration optionfrontend.align_querier_with_step.
The following configuration options are deprecated and will be removed in Grafana Mimir 2.8:
- The CLI flag
-store.max-query-lengthand its respective YAML configuration optionlimits.max_query_lengthhave been replaced with-querier.max-partial-query-lengthandlimits.max_partial_query_length.
The following experimental options and features are now stable:
- The CLI flag -query-frontend.max-total-query-length and its respective YAML configuration option limits.max_total_query_length.
- The CLI flags -distributor.request-rate-limit and -distributor.request-burst-limit and their respective YAML configuration options limits.request_rate_limit and limits.request_rate_burst.
- The CLI flag -ingester.max-global-exemplars-per-user and its respective YAML configuration option limits.max_global_exemplars_per_user.
- The CLI flag -ingester.tsdb-config-update-period and its respective YAML configuration option ingester.tsdb_config_update_period.
- The API endpoint /api/v1/query_exemplars.
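The replacement of limits.max_query_length by limits.max_partial_query_length amounts to a key rename in the limits block. As a rough illustration only, the following Python sketch renames the key in an already-parsed limits mapping. The helper name migrate_limits and the lookup table are hypothetical and not part of Mimir, and the sketch assumes the value can be carried over unchanged, which these notes do not confirm.

```python
# Hypothetical helper, not part of Mimir: renames the deprecated limits key
# named in the release notes to its replacement.
DEPRECATED_TO_NEW = {"max_query_length": "max_partial_query_length"}


def migrate_limits(limits):
    """Return a copy of a parsed `limits` mapping with deprecated keys renamed.

    Assumption: the value keeps its format under the new key.
    If both the old and the new key are set, the new key's value is kept.
    """
    migrated = dict(limits)  # work on a copy so the caller's mapping is untouched
    for old, new in DEPRECATED_TO_NEW.items():
        if old in migrated:
            value = migrated.pop(old)
            migrated.setdefault(new, value)
    return migrated
```

Applied to a mapping that only sets max_query_length, the helper returns the same value under max_partial_query_length and leaves every other key alone.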
Bug fixes
- Alertmanager: Fix template spurious deletion with relative data dir. PR 3604
- Security: Update prometheus/exporter-toolkit for CVE-2022-46146. PR 3675
- Security: Update golang.org/x/net for CVE-2022-41717. PR 3755
- Debian package: Fix post-install, environment file path and user creation. PR 3720
- Memberlist: Fix panic during Mimir startup when Mimir receives gossip message before it's ready. PR 3746
- Update github.com/thanos-io/objstore to address issue with Multipart PUT on s3-compatible Object Storage. PR 3802 PR 3821
- Querier: Canceled requests are no longer reported as "consistency check" failures. PR 3837 PR 3927
- Distributor: Don't panic when metric_relabel_configs in overrides contains null element. PR 3868
- Ingester, Compactor: Fix panic that can occur when compaction fails. PR 3955
Changelog
2.6.0-rc.0
Grafana Mimir
- [CHANGE] Querier: Introduce -querier.max-partial-query-length to limit the time range for partial queries at the querier level and deprecate -store.max-query-length. #3825 #4017
- [CHANGE] Store-gateway: Remove experimental -blocks-storage.bucket-store.max-concurrent-reject-over-limit flag. #3706
- [CHANGE] Ingester: If shipping is enabled block retention will now be relative to the upload time to cloud storage. If shipping is disabled block retention will be relative to the creation time of the block instead of the mintime of the last block created. #3816
- [CHANGE] Query-frontend: Deprecated CLI flag -query-frontend.align-querier-with-step has been removed. #3982
- [FEATURE] Store-gateway: streaming of series. The store-gateway can now stream results back to the querier instead of buffering them. This is expected to greatly reduce peak memory consumption while keeping latency the same. You can enable this feature by setting -blocks-storage.bucket-store.batch-series-size to a value in the high thousands (5000-10000). This is still an experimental feature and is subject to a changing API and instability. #3540 #3546 #3587 #3606 #3611 #3620 #3645 #3355 #3697 #3666 #3687 #3728 #3739 #3751 #3779 #3839
- [FEATURE] Alertmanager: Added support for the Webex receiver. #3758
- [FEATURE] Limits: Added the -validation.separate-metrics-group-label flag. This allows further separation of the cortex_discarded_samples_total metric by an additional group label, which is configured by this flag to be the value of a specific label on an incoming timeseries. Active groups are tracked and inactive groups are cleaned up on a defined interval. The maximum number of groups tracked is controlled by the -max-separate-metrics-groups-per-user flag. #3439
- [FEATURE] Overrides-exporter: Added experimental ring support to overrides-exporter via -overrides-exporter.ring.enabled. When enabled, the ring is used to establish a leader replica for the export of limit override metrics. #3908 #3953
- [FEATURE] Ephemeral storage (experimental): Mimir can now accept samples into "ephemeral storage". Such samples are available for querying for a short amount of time (-blocks-storage.ephemeral-tsdb.retention-period, defaults to 10 minutes), and then removed from memory. To use ephemeral storage, distributor must be configured with -distributor.ephemeral-series-enabled option. Series matching -distributor.ephemeral-series-matchers will be marked for storing into ephemeral storage in ingesters. Each tenant needs to have ephemeral storage enabled by using -ingester.max-ephemeral-series-per-user limit, which defaults to 0 (no ephemeral storage). Ingesters have new -ingester.instance-limits.max-ephemeral-series limit for total number of series in ephemeral storage across all tenants. If ingestion of samples into ephemeral storage fails, cortex_discarded_samples_total metric will use values prefixed with ephemeral- for reason label. Querying of ephemeral storage is possible by using {__mimir_storage__="ephemeral"} as metric selector. Following new metrics related to ephemeral storage are introduced: #3897 #3922 #3961 #3997 #4004
  - cortex_ingester_ephemeral_series
  - cortex_ingester_ephemeral_series_created_total
  - cortex_ingester_ephemeral_series_removed_total
  - cortex_ingester_ingested_ephemeral_samples_total
  - cortex_ingester_ingested_ephemeral_samples_failures_total
  - cortex_ingester_memory_ephemeral_users
  - cortex_ingester_queries_ephemeral_total
  - cortex_ingester_queried_ephemeral_samples
  - cortex_ingester_queried_ephemeral_series
- [ENHANCEMENT] Added new metric thanos_shipper_last_successful_upload_time: Unix timestamp (in seconds) of the last successful TSDB block uploaded to the bucket. #3627
- [ENHANCEMENT] Ruler: Added -ruler.alertmanager-client.tls-enabled configuration for alertmanager client. #3432 #3597
- [ENHANCEMENT] Activity tracker logs now have component=activity-tracker label. #3556
- [ENHANCEMENT] Distributor: remove labels with empty values #2439
- [ENHANCEMENT] Query-frontend: track query HTTP requests in the Activity Tracker. #3561
- [ENHANCEMENT] Store-gateway: Add experimental alternate implementation of index-header reader that does not use memory mapped files. The index-header reader is expected to improve stability of the store-gateway. You can enable this implementation with the flag -blocks-storage.bucket-store.index-header.stream-reader-enabled. #3639 #3691 #3703 #3742 #3785 #3787 #3797
- [ENHANCEMENT] Query-scheduler: add cortex_query_scheduler_cancelled_requests_total metric to track the number of requests that are already cancelled when dequeued. #3696
- [ENHANCEMENT] Store-gateway: add `cortex_bucket_store_partitioner_ex...
2.5.0
This release contains 230 PRs from 43 authors, including new contributors Aldo D'Aquino, Anıl Mısırlıoğlu, Charles Korn, Danny Staple, Dylan Crees, Eduardo Silvi, FG, Jesse Weaver, KarlisAG, Leegin-darknight, Rohan Kumar, Wille Faler, Y.Horie, manohar-koukuntla, paulroche, songjiayang, Éamon Ryan. Thank you!
Grafana Mimir version 2.5 release notes
Grafana Labs is excited to announce version 2.5 of Grafana Mimir.
The highlights that follow include the top features, enhancements, and bugfixes in this release. For the complete list of changes, see the changelog.
Features and enhancements
- Alertmanager Discord support: Alertmanager can now be configured to send alerts in Discord channels.
- Configurable TLS minimum version and cipher suites: We added the flags -server.tls-min-version and -server.tls-cipher-suites that can be used to define the minimum TLS version and the supported cipher suites in all HTTP and gRPC servers in Mimir.
- Lower memory usage in store-gateway, ingester and alertmanager: We made various changes related to how index lookups are performed and how the active series custom trackers are implemented, which results in better performance and lower overall memory usage in the store-gateway and ingester. We also optimized the alertmanager, which results in a 50% reduction in memory usage in use cases with larger numbers of tenants.
- Improved Mimir dashboards: We added two new dashboards named Mimir / Overview resources and Mimir / Overview networking. Furthermore, we have made various improvements to the following existing dashboards:
  - Mimir / Overview: Add "remote read", "metadata", and "exemplar" queries.
  - Mimir / Writes: Add optional row about the distributor's new forwarding feature.
  - Mimir / Tenants: Add insights into the read path.
Helm chart improvements
- Zone aware replication: Helm now supports deploying the ingesters and store-gateways across different availability zones. The replication is also zone-aware, so multiple instances in one zone can fail without any service interruption, and rollouts can be performed faster because many instances of each zone can be restarted together instead of all restarting in sequence. This is a breaking change; for details on how to upgrade, review the Helm changelog.
- Running without root privileges: All Mimir, GEM and Agent processes no longer require root privileges to run.
- Unified reverse proxy (gateway) configuration for Mimir and GEM: This change allows for an easier upgrade path from Mimir to GEM, without any downtime. The unified configuration also makes it possible to autoscale the GEM gateway pods and it supports OpenShift Route. The change also deprecates the nginx section in the configuration. The section will be removed in release 7.0.0.
- Updated MinIO: The MinIO sub-chart was updated from 4.x to 5.0.0. Note that this update inherits a breaking change because the MinIO gateway mode was removed.
- Updated sizing plans: We updated our sizing plans to better reflect how we recommend running Mimir and GEM in production. Note that this includes a breaking change for users of the "small" plan; more details can be found in the Helm changelog.
- Various quality of life improvements:
  - Rollout strategies without downtime
  - Read path and compactor configuration refresh, providing better default settings
  - OTLP ingestion support in the Nginx configuration
  - A default configuration for alertmanager, so the user interface and the sending of alerts from the ruler works out of the box
Bug fixes
- Flusher: Added Overrides as a dependency to prevent panics when starting with -target=flusher. PR 3151
- Query-frontend: properly close gRPC streams to the query-scheduler to stop memory and goroutines leak. PR 3302
- Ruler: persist evaluation delay configured in the rulegroup. PR 3392
- Fix panics in OTLP ingest path when parse errors occur. PR 3538
Changelog
2.5.0
Grafana Mimir
- [CHANGE] Flag -azure.msi-resource is now ignored, and will be removed in Mimir 2.7. This setting is now made automatically by Azure. #2682
- [CHANGE] Experimental flag -blocks-storage.tsdb.out-of-order-capacity-min has been removed. #3261
- [CHANGE] Distributor: Wrap errors from pushing to ingesters with useful context, for example clarifying timeouts. #3307
- [CHANGE] The default value of -server.http-write-timeout has changed from 30s to 2m. #3346
- [CHANGE] Reduce period of health checks in connection pools for querier->store-gateway, ruler->ruler, and alertmanager->alertmanager clients to 10s. This reduces the time to fail a gRPC call when the remote stops responding. #3168
- [CHANGE] Hide TSDB block ranges period config from doc and mark it experimental. #3518
- [FEATURE] Alertmanager: added Discord support. #3309
- [ENHANCEMENT] Added -server.tls-min-version and -server.tls-cipher-suites flags to configure cipher suites and min TLS version supported by HTTP and gRPC servers. #2898
- [ENHANCEMENT] Distributor: Add age filter to forwarding functionality, to not forward samples which are older than defined duration. If such samples are not ingested, cortex_discarded_samples_total{reason="forwarded-sample-too-old"} is increased. #3049 #3113
- [ENHANCEMENT] Store-gateway: Reduce memory allocation when generating ids in index cache. #3179
- [ENHANCEMENT] Query-frontend: truncate queries based on the configured creation grace period (--validation.create-grace-period) to avoid querying too far into the future. #3172
- [ENHANCEMENT] Ingester: Reduce activity tracker memory allocation. #3203
- [ENHANCEMENT] Query-frontend: Log more detailed information in the case of a failed query. #3190
- [ENHANCEMENT] Added -usage-stats.installation-mode configuration to track the installation mode via the anonymous usage statistics. #3244
- [ENHANCEMENT] Compactor: Add new cortex_compactor_block_max_time_delta_seconds histogram for detecting if compaction of blocks is lagging behind. #3240 #3429
- [ENHANCEMENT] Ingester: reduced the memory footprint of active series custom trackers. #2568
- [ENHANCEMENT] Distributor: Include X-Scope-OrgId header in requests forwarded to configured forwarding endpoint. #3283 #3385
- [ENHANCEMENT] Alertmanager: reduced memory utilization in Mimir clusters with a large number of tenants. #3309
- [ENHANCEMENT] Add experimental flag -shutdown-delay to allow components to wait after receiving SIGTERM and before stopping. In this time the component returns 503 from /ready endpoint. #3298
- [ENHANCEMENT] Go: update to go 1.19.3. #3371
- [ENHANCEMENT] Alerts: added RulerRemoteEvaluationFailing alert, firing when communication between ruler and frontend fails in remote operational mode. #3177 #3389
- [ENHANCEMENT] Clarify which S3 signature versions are supported in the error "unsupported signature version". #3376
- [ENHANCEMENT] Store-gateway: improved index header reading performance. #3393 #3397 #3436
- [ENHANCEMENT] Store-gateway: improved performance of series matching. #3391
- [ENHANCEMENT] Move the validation of incoming series before the distributor's forwarding functionality, so that we don't forward invalid series. #3386 #3458
- [ENHANCEMENT] S3 bucket configuration now validates that the endpoint does not have the bucket name prefix. #3414
- [ENHANCEMENT] Query-frontend: added "fetched index bytes" to query statistics, so that the statistics contain the total bytes read by store-gateways from TSDB block indexes. #3206
- [ENHANCEMENT] Distributor: push wrapper should only receive unforwarded samples. #2980
- [BUGFIX] Flusher: Add Overrides as a dependency to prevent panics when starting with -target=flusher. #3151
- [BUGFIX] Updated golang.org/x/text dependency to fix CVE-2022-32149. #3285
- [BUGFIX] Query-frontend: properly close gRPC streams to the query-scheduler to stop memory and goroutines leak. #3302
- [BUGFIX] Ruler: persist evaluation delay configured in the rulegroup. #3392
- [BUGFIX] Ring status pages: show 100% ownership as "100%", not "1e+02%". #3435
- [BUGFIX] Fix panics in OTLP ingest path when parse errors exist. #3538
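The ring status page fix above ("100%" instead of "1e+02%") is typical of general-format number verbs with limited precision. Mimir is written in Go and the changelog does not show the actual patch, so the following Python sketch only illustrates that class of bug, assuming the cause was a %g-style verb. Both function names are hypothetical.

```python
def format_ownership_general(percent):
    # %.2g keeps two significant digits and switches to exponent notation
    # once the value needs more digits than that, which is what happens at 100.
    return "%.2g%%" % percent


def format_ownership_fixed(percent):
    # Fixed-point formatting never switches to exponent notation.
    # Trailing zeros and a dangling decimal point are stripped for readability.
    text = "%.2f" % percent
    return text.rstrip("0").rstrip(".") + "%"
```

With these definitions, format_ownership_general(100.0) yields "1e+02%" while format_ownership_fixed(100.0) yields "100%".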
Mixin
- [CHANGE] Alerts: Change MimirSchedulerQueriesStuck for time to 7 minutes to account for the time it takes for HPA to scale up. #3223
- [CHANGE] Dashboards: Removed the Querier > Stages panel from the Mimir / Queries dashboard. #3311
- [CHANGE] Configuration: The format of the autoscaling section of the configuration has changed to support more components. #3378
  - Instead of specific config variables for each component, they are listed in a dictionary. For example, autoscaling.querier_enabled becomes autoscaling.querier.enabled.
- [FEATURE] Dashboards: Added "Mimir / Overview resources" dashboard, providing a high-level view of a Mimir cluster's resource utilization. #3481
- [FEATURE] Dashboards: Added "Mimir / Overview networking" dashboard, providing a high-level view of a Mimir cluster's network bandwidth, inflight requests and TCP connections. #3487
- [FEATURE] Compile baremetal mixin along k8s mixin. #3162 #3514
- [ENHANCEMENT] Alerts: Add MimirRingMembersMismatch firing when a component does not have the expected number of running jobs. #2404
- [ENHANCEMENT] Dashboards: Add optional row about the Distributor's metric forwarding feature ...