Time series data streams

A time series data stream (TSDS) is a type of data stream optimized for indexing metrics data. A TSDS helps you analyze a sequence of data points as a whole.

A TSDS can also help you store metrics data more efficiently. In our benchmarks, metrics data stored in a TSDS used 70% less disk space than a regular data stream. The exact impact varies by data set.

Before setting up a time series data stream, make sure you're familiar with general data stream concepts.

Metrics consist of data point–timestamp pairs, identified by dimension fields, that can be used in aggregation queries. Both a regular data stream and a time series data stream can store metrics data.

Choose a time series data stream if you typically add metrics data to Elasticsearch in near real-time and in @timestamp order. For other timestamped data, such as logs or traces, use a logs data stream or a regular data stream.

To make sure a TSDS is right for your use case, review the list of differences from a regular data stream on this page.

A time series is a sequence of observations for a specific entity. Together, these observations let you track changes to the entity over time. For example, a time series can track:

  • CPU and disk usage for a computer
  • The price of a stock
  • Temperature and humidity readings from a weather sensor

[Figure: time series chart]

Compared to a regular data stream, a TSDS uses some additional fields specific to time series: dimension fields and metric fields, plus an internal _tsid metadata field.

Dimension fields often correspond to characteristics of the items you're measuring. For example, documents related to the same weather sensor might have the same sensor_id and location values.

Tip

Elasticsearch uses dimensions and timestamps to generate time series document _id values. Two documents with the same dimensions and timestamp are considered duplicates. Duplicates are rejected during ingestion with a 409 Conflict status.
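
The duplicate rule can be illustrated with a small sketch. This is not how Elasticsearch computes _id values internally; it only assumes, as described above, that a data point's identity is its dimension values plus its @timestamp. The field names sensor_id and location follow the weather sensor example, and the helper names identity_key and find_duplicates are hypothetical.

```python
from typing import Any

# Hypothetical dimension fields, following the weather sensor example.
DIMENSIONS = ("sensor_id", "location")


def identity_key(doc: dict[str, Any]) -> tuple:
    """Return the (dimension values..., timestamp) tuple that identifies a data point."""
    return tuple(doc[name] for name in DIMENSIONS) + (doc["@timestamp"],)


def find_duplicates(docs: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return documents whose dimensions and timestamp were already seen."""
    seen: set[tuple] = set()
    duplicates = []
    for doc in docs:
        key = identity_key(doc)
        if key in seen:
            duplicates.append(doc)
        else:
            seen.add(key)
    return duplicates


docs = [
    {"@timestamp": "2024-01-01T00:00:00Z", "sensor_id": "s1", "location": "roof", "temperature": 20.5},
    # Same dimensions and timestamp as the first document: a duplicate.
    {"@timestamp": "2024-01-01T00:00:00Z", "sensor_id": "s1", "location": "roof", "temperature": 21.0},
    # Same dimensions, later timestamp: a new data point in the same series.
    {"@timestamp": "2024-01-01T00:01:00Z", "sensor_id": "s1", "location": "roof", "temperature": 21.0},
]
duplicates = find_duplicates(docs)
print(len(duplicates))
```

Note that the metric value plays no part in the key: two readings with different temperatures still collide if their dimensions and timestamp match.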

To mark a field as a dimension, set the Boolean time_series_dimension mapping parameter to true. Only certain field types support the time_series_dimension parameter.

To work with a flattened field, use the time_series_dimensions parameter to configure an array of fields as dimensions. For details, refer to flattened.

You can also simplify dimension definitions by using pass-through fields.
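
As a minimal sketch of what a dimension mapping might look like, the following Python snippet builds a mapping fragment and prints it as JSON. The field names sensor_id and location are assumptions taken from the weather sensor example, and keyword is used here as one field type that accepts the parameter.

```python
import json

# Mapping fragment that marks two keyword fields as dimensions.
# Field names are hypothetical, taken from the weather sensor example.
dimension_mappings = {
    "properties": {
        "sensor_id": {"type": "keyword", "time_series_dimension": True},
        "location": {"type": "keyword", "time_series_dimension": True},
    }
}

# json.dumps renders Python's True as the JSON literal true.
rendered = json.dumps(dimension_mappings, indent=2)
print(rendered)
```

The printed JSON is the kind of fragment that would sit under the mappings section of an index template.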

Metrics are numeric measurements that change over time. Documents in a TSDS typically contain one or more metric fields.

To mark a field as a metric, use the time_series_metric mapping parameter. This parameter tells Elasticsearch to store the field in a way that is optimized for time series analysis. Only certain field types support the time_series_metric parameter.

The valid values for time_series_metric are counter and gauge:

counter
A cumulative metric that either increases monotonically or resets to 0 (zero). For example, a count of errors or completed tasks that resets when a serving process restarts.
gauge
A metric that represents a single numeric value that can arbitrarily increase or decrease. For example, a temperature or available disk space.
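
The two metric types can be sketched as follows. The mapping fragment uses hypothetical field names (temperature, error_count), and the counter_increase helper is an illustrative assumption about how a consumer might interpret counter resets; it is not Elasticsearch's own rate computation.

```python
# Hypothetical metric fields: one gauge and one counter.
metric_mappings = {
    "properties": {
        "temperature": {"type": "double", "time_series_metric": "gauge"},
        "error_count": {"type": "long", "time_series_metric": "counter"},
    }
}


def counter_increase(samples: list[float]) -> float:
    """Total increase of a counter across samples, treating any drop as a reset to 0."""
    total = 0.0
    for previous, current in zip(samples, samples[1:]):
        if current >= previous:
            total += current - previous
        else:
            # The counter restarted from 0, so the whole current value is new.
            total += current
    return total


# The drop from 12 to 2 is read as a reset, not as a decrease of 10.
increase = counter_increase([5, 8, 12, 2, 6])
print(increase)
```

A gauge needs no such handling, because any increase or decrease is a valid reading in its own right.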

The _tsid is an automatically generated object derived from the document’s dimensions. It's intended for internal Elasticsearch use, so in most cases you won't need to work with it. The format of the _tsid field is subject to change.

A time series data stream works like a regular data stream, with some key differences:

  • Time series index mode: The matching index template for a TSDS must include a data_stream object and set the index.mode index setting to time_series. This setting enables most TSDS-related functionality.
  • Fields: In a TSDS, each document contains:
    • A @timestamp field
    • One or more dimension fields, set with time_series_dimension: true
    • One or more metric fields
    • An auto-generated document _id (custom _id values are not supported)
  • Backing indices: A TSDS uses time-bound indices to store data from the same time period in the same backing index.
  • Dimension-based routing: The routing logic uses dimension fields to map all data points of a time series to the same shard, improving storage efficiency and query performance. Duplicate data points are rejected.
  • Sorting: A TSDS uses internal index sorting to order shard segments by _tsid and @timestamp, for better compression. Time series data streams do not use index.sort.* settings.
  • Source field: A TSDS uses synthetic _source, so the restrictions and modifications that apply to synthetic _source also apply to its documents.
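
The differences listed above can be pulled together in a sketch of an index template body. The index pattern and field names are hypothetical, and the helper build_tsds_template is only an illustration. Depending on your Elasticsearch version, additional settings (for example a template priority or an explicit routing path) may be needed, so treat this as a starting point rather than a complete configuration.

```python
import json


def build_tsds_template(index_pattern: str) -> dict:
    """Build an index template body for a time series data stream (illustrative only)."""
    return {
        "index_patterns": [index_pattern],
        # The presence of this object makes the template create data streams.
        "data_stream": {},
        "template": {
            # Time series index mode enables most TSDS-related functionality.
            "settings": {"index.mode": "time_series"},
            "mappings": {
                "properties": {
                    "@timestamp": {"type": "date"},
                    # At least one dimension field.
                    "sensor_id": {"type": "keyword", "time_series_dimension": True},
                    # At least one metric field.
                    "temperature": {"type": "double", "time_series_metric": "gauge"},
                }
            },
        },
    }


# "weather_sensors*" is a hypothetical index pattern.
template_body = build_tsds_template("weather_sensors*")
print(json.dumps(template_body, indent=2))
```

The printed JSON could be sent as the body of a create index template request. No _id field is mapped or supplied, because document IDs in a TSDS are generated automatically.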

Serverless Preview

You can use the ES|QL TS command to query time series data streams. The TS command is optimized for time series data. It also enables the use of aggregation functions that efficiently process metrics per time series, before aggregating results.
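
As a rough sketch, a TS query might first aggregate each time series over time and then combine the per-series results. The data stream name weather_sensors, the field names, and the helper build_ts_query are assumptions for illustration, and the exact set of available functions depends on your Elasticsearch version, so check the ES|QL reference before relying on this query.

```python
import json


def build_ts_query(data_stream: str) -> dict:
    """Build a request body for an ES|QL query that uses the TS command (illustrative only)."""
    query = (
        f"TS {data_stream}\n"
        # Inner function: average each time series within the bucket.
        # Outer function: take the maximum of those per-series averages.
        "| STATS max_avg_temperature = MAX(AVG_OVER_TIME(temperature)) "
        "BY location, time_bucket = BUCKET(@timestamp, 1 hour)"
    )
    return {"query": query}


# "weather_sensors" is a hypothetical data stream name.
request_body = build_ts_query("weather_sensors")
print(json.dumps(request_body, indent=2))
```

The nested form reflects the two-step processing described above: the inner function works per time series, and the outer aggregation combines the results.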