Time series data streams
Serverless Stack
A time series data stream (TSDS) is a type of data stream optimized for indexing metrics data. A TSDS helps you analyze a sequence of data points as a whole.
A TSDS can also help you store metrics data more efficiently. In our benchmarks, metrics data stored in a TSDS used 70% less disk space than a regular data stream. The exact impact varies by data set.
Before setting up a time series data stream, make sure you're familiar with general data stream concepts.
Metrics consist of data point–timestamp pairs, identified by dimension fields, that can be used in aggregation queries. Both a regular data stream and a time series data stream can store metrics data.
Choose a time series data stream if you typically add metrics data to Elasticsearch in near real-time and in @timestamp order. For other timestamped data, such as logs or traces, use a logs data stream or a regular data stream.
To make sure a TSDS is right for your use case, review the list of differences from a regular data stream on this page.
A time series is a sequence of observations for a specific entity. Together, these observations let you track changes to the entity over time. For example, a time series can track:
- CPU and disk usage for a computer
- The price of a stock
- Temperature and humidity readings from a weather sensor
Compared to a regular data stream, a TSDS uses some additional fields specific to time series: dimension fields and metric fields, plus an internal _tsid metadata field.
Dimension fields often correspond to characteristics of the items you're measuring. For example, documents related to the same weather sensor might have the same sensor_id and location values.
Elasticsearch uses dimensions and timestamps to generate time series document _id values. Two documents with the same dimensions and timestamp are considered duplicates. Duplicates are rejected during ingestion with a 409 Conflict status.
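The duplicate rule above can be illustrated with a toy model. This is only a sketch: the hashing scheme, the `series_doc_id` and `ingest` helpers, and the sample field names are assumptions made for illustration, not Elasticsearch's actual _id algorithm.

```python
# Toy model of TSDS duplicate rejection; NOT Elasticsearch's real _id algorithm.
import hashlib
import json

DIMENSIONS = ("sensor_id", "location")  # hypothetical dimension fields


def series_doc_id(doc):
    """Derive a deterministic ID from the dimension values and @timestamp."""
    key = {name: doc[name] for name in DIMENSIONS}
    key["@timestamp"] = doc["@timestamp"]
    # sort_keys makes the serialized key independent of field order.
    payload = json.dumps(key, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:20]


def ingest(store, doc):
    """Return an HTTP-like status: 201 if stored, 409 if a duplicate."""
    doc_id = series_doc_id(doc)
    if doc_id in store:
        return 409
    store[doc_id] = doc
    return 201


store = {}
reading = {
    "@timestamp": "2024-05-01T12:00:00Z",
    "sensor_id": "S-17",
    "location": "roof",
    "temperature": 21.4,
}
first = ingest(store, reading)
# Same dimensions and timestamp, different metric value: still a duplicate.
second = ingest(store, {**reading, "temperature": 22.0})
# A later timestamp for the same sensor is a new data point.
third = ingest(store, {**reading, "@timestamp": "2024-05-01T12:01:00Z"})
print(first, second, third)  # 201 409 201
```

Note that metric values play no part in the ID in this model, which is why the second document is rejected even though its temperature differs.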
To mark a field as a dimension, set the Boolean time_series_dimension mapping parameter to true. The following field types support the time_series_dimension parameter:
To work with a flattened field, use the time_series_dimensions parameter to configure an array of fields as dimensions. For details, refer to flattened.
You can also simplify dimension definitions by using pass-through fields.
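As a rough illustration of dimension mappings, the following sketch builds a mapping fragment as a Python dictionary. The field names (sensor_id, location, labels) and the keys listed under time_series_dimensions are hypothetical; only the time_series_dimension and time_series_dimensions parameters come from the text above.

```python
import json

# Hypothetical mapping fragment for weather-sensor documents.
mappings = {
    "properties": {
        "@timestamp": {"type": "date"},
        # Individual fields marked as dimensions.
        "sensor_id": {"type": "keyword", "time_series_dimension": True},
        "location": {"type": "keyword", "time_series_dimension": True},
        # A flattened field that lists which of its keys act as dimensions.
        "labels": {
            "type": "flattened",
            "time_series_dimensions": ["region", "owner"],
        },
    }
}

# Collect the names of fields marked directly with time_series_dimension.
dimension_fields = sorted(
    name
    for name, spec in mappings["properties"].items()
    if spec.get("time_series_dimension") is True
)

print(json.dumps(mappings, indent=2))
print(dimension_fields)
```

The dictionary can be serialized with json.dumps and used as the mappings portion of an index template.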
Metrics are numeric measurements that change over time. Documents in a TSDS typically contain one or more metric fields.
To mark a field as a metric, use the time_series_metric mapping parameter. This parameter ensures data is stored in an optimal way for time series analysis. The following field types support the time_series_metric parameter:
- All numeric field types
- aggregate_metric_double, for internal use during downsampling (rarely user-populated)
The valid values for time_series_metric are counter and gauge:
- counter: A cumulative metric that only monotonically increases or resets to 0 (zero). For example, a count of errors or completed tasks that resets when a serving process restarts.
- gauge: A metric that represents a single numeric value that can arbitrarily increase or decrease. For example, a temperature or available disk space.
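To make the counter and gauge semantics concrete, here is a minimal sketch. The field names errors_total and temperature are hypothetical, and the reset handling in counter_increase follows a common convention (treat a drop as a restart from zero); it is an assumption for illustration, not a description of how Elasticsearch computes rates.

```python
# Hypothetical metric mappings: one counter and one gauge.
metric_mappings = {
    "properties": {
        "errors_total": {"type": "long", "time_series_metric": "counter"},
        "temperature": {"type": "double", "time_series_metric": "gauge"},
    }
}


def counter_increase(samples):
    """Total increase of a counter across ordered samples, tolerating resets."""
    total = 0.0
    for previous, current in zip(samples, samples[1:]):
        if current >= previous:
            total += current - previous
        else:
            # A drop means the counter restarted from 0, so the new value
            # is the entire increase since the restart.
            total += current
    return total


def gauge_change(samples):
    """Net change of a gauge: last value minus first value."""
    return samples[-1] - samples[0]


# The counter resets between 150 and 5 (for example, a process restart).
print(counter_increase([100, 120, 150, 5, 30]))  # 80.0
print(gauge_change([21.5, 19.0, 23.0]))  # 1.5
```

A raw difference between the first and last counter samples would give -70 here, which is why counters need reset-aware handling while gauges can be compared directly.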
The _tsid is an automatically generated object derived from the document's dimensions. It's intended for internal Elasticsearch use, so in most cases you won't need to work with it. The format of the _tsid field is subject to change.
A time series data stream works like a regular data stream, with some key differences:
- Time series index mode: The matching index template for a TSDS must include a data_stream object with index.mode set to time_series. This option enables most TSDS-related functionality.
- Fields: In a TSDS, each document contains:
  - A @timestamp field
  - One or more dimension fields, set with time_series_dimension: true
  - One or more metric fields
  - An auto-generated document _id (custom _id values are not supported)
- Backing indices: A TSDS uses time-bound indices to store data from the same time period in the same backing index.
- Dimension-based routing: The routing logic uses dimension fields to map all data points of a time series to the same shard, improving storage efficiency and query performance. Duplicate data points are rejected.
- Sorting: A TSDS uses internal index sorting to order shard segments by _tsid and @timestamp, for better compression. Time series data streams do not use index.sort.* settings.
- Source field: A TSDS uses synthetic _source, and as a result is subject to some restrictions and modifications applied to the _source field.
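The template requirements in the list above can be sketched as an index template body built in Python. The index pattern, field names, and priority are hypothetical, and check_tsds_template is an illustrative helper, not part of any Elasticsearch client. The sketch places index.mode under template.settings, which is where index settings normally go in an index template; confirm the exact shape against the reference for your version.

```python
import json

# Hypothetical index template body for a weather-metrics TSDS.
template_body = {
    "index_patterns": ["metrics-weather-*"],
    "data_stream": {},  # marks the template as a data stream template
    "template": {
        "settings": {"index.mode": "time_series"},
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date"},
                "sensor_id": {"type": "keyword", "time_series_dimension": True},
                "temperature": {"type": "double", "time_series_metric": "gauge"},
            }
        },
    },
    "priority": 500,
}


def check_tsds_template(body):
    """Return a list of problems found against the requirements listed above."""
    problems = []
    if "data_stream" not in body:
        problems.append("missing data_stream object")
    template = body.get("template", {})
    # Only the flat "index.mode" key is checked in this sketch.
    if template.get("settings", {}).get("index.mode") != "time_series":
        problems.append("index.mode is not time_series")
    props = template.get("mappings", {}).get("properties", {})
    if not any(p.get("time_series_dimension") for p in props.values()):
        problems.append("no dimension field")
    if not any("time_series_metric" in p for p in props.values()):
        problems.append("no metric field")
    return problems


print(check_tsds_template(template_body))  # []
print(json.dumps(template_body, indent=2))
```

The printed JSON could be sent as the body of an index template request; the check function only mirrors the bullet list and does not replace server-side validation.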
Serverless
You can use the ES|QL TS command to query time series data streams. The TS command is optimized for time series data. It also enables the use of aggregation functions that efficiently process metrics per time series, before aggregating results.
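As a hedged sketch of what such a query might look like, the following builds a request body for the ES|QL query API. The data stream name metrics-weather-dev and the fields errors_total and sensor_id are hypothetical, and the use of RATE and TBUCKET is an assumption about the functions available with TS in your version; check the ES|QL reference before relying on it.

```python
import json

# Hypothetical query: compute a per-series rate first (inner function),
# then sum those rates per sensor and per one-hour bucket (outer aggregation).
query = (
    "TS metrics-weather-dev\n"
    "| WHERE @timestamp >= NOW() - 1 day\n"
    "| STATS SUM(RATE(errors_total)) BY TBUCKET(1 hour), sensor_id"
)

# ES|QL queries are submitted as a JSON body with a "query" string.
request_body = {"query": query}

print(json.dumps(request_body, indent=2))
```

The nesting reflects the two-step evaluation described above: the inner function runs per time series, and the outer aggregation combines the per-series results.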
- Try the quickstart for a hands-on introduction
- Set up a time series data stream
- Ingest data using the OpenTelemetry Protocol (OTLP)
- Learn about downsampling to reduce storage footprint