
daft.functions.video_frames

video_frames

video_frames(file_expr: Expression, *, start_time: float = 0, end_time: float | None = None, width: int | None = None, height: int | None = None, is_key_frame: bool | None = None, sample_interval_seconds: float | None = None) -> Expression

Decode all video frames within a time range, with per-frame metadata.

Mirrors the per-frame schema of daft.read_video_frames().

Parameters:

file_expr (VideoFile Expression, required): The video file to decode frames from.

start_time (float, default 0): Start of the time range in seconds.

end_time (float | None, default None): End of the time range in seconds. If None, all frames from start_time onward are decoded.

width (int | None, default None): Target width for resizing frames. Must be provided together with height.

height (int | None, default None): Target height for resizing frames. Must be provided together with width.

is_key_frame (bool | None, default None): If True, decode only keyframes. If False, decode only non-keyframes. If None, decode all frames.

sample_interval_seconds (float | None, default None): If provided and > 0, sample frames at approximately this interval, in seconds, based on frame_time. The algorithm picks the first frame whose timestamp is >= the next target time (start_time, start_time + interval, start_time + 2*interval, ...). Frames without valid timestamps are skipped. Same semantics as the source-side daft.read_video_frames(). If None, no sampling is applied.
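The sampling rule described for sample_interval_seconds can be illustrated in plain Python. This is a minimal sketch of the documented rule, not Daft's implementation: the helper name sample_frame_indices is hypothetical, and the way the target advances after a gap longer than one interval (skipping ahead until the target is past the kept frame) is an assumption that the description above does not spell out.

```python
from __future__ import annotations


def sample_frame_indices(
    frame_times: list[float | None],
    start_time: float = 0.0,
    sample_interval_seconds: float | None = None,
) -> list[int]:
    """Return the indices of the frames that the sampling rule would keep.

    Hypothetical helper for illustration only; it works on a plain list of
    timestamps (None marks a frame without a valid timestamp).
    """
    # No interval, or a non-positive one: no sampling, every frame is kept.
    if sample_interval_seconds is None or sample_interval_seconds <= 0:
        return list(range(len(frame_times)))

    kept: list[int] = []
    k = 0  # the current target time is start_time + k * interval
    for index, t in enumerate(frame_times):
        if t is None:
            # Frames without a valid timestamp are skipped when sampling.
            continue
        if t >= start_time + k * sample_interval_seconds:
            kept.append(index)
            # Assumption: move the target past this frame so that a long gap
            # between frames does not cause a burst of consecutive picks.
            while start_time + k * sample_interval_seconds <= t:
                k += 1
    return kept
```

With frames at 0.0, 0.4, 0.8, 1.2, 1.6, 2.0 and 2.4 seconds and a one-second interval, the targets are 0, 1 and 2 seconds, so the sketch keeps the frames at 0.0, 1.2 and 2.0 seconds. This is why the sampling is described as approximate: the kept frame is the first one at or after each target, not necessarily one exactly on it.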

Returns:

Expression (List[Struct] Expression): List of structs, each containing:

- frame_index (int): 0-based index of the frame in the video stream
- frame_time (float): Presentation time in seconds
- frame_time_base (str): Time base as a fraction string
- frame_pts (int): Presentation timestamp in stream time_base units
- frame_dts (int): Decode timestamp in stream time_base units
- frame_duration (int): Duration in stream time_base units
- is_key_frame (bool): Whether this frame is a keyframe
- data (Image): The decoded frame as an image
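The timestamp fields in the returned structs are related through the time base: a value in stream time_base units becomes seconds when multiplied by the time base. The sketch below shows that conversion in plain Python. It assumes frame_time_base is formatted as "numerator/denominator" (for example "1/90000"); the helper name units_to_seconds is hypothetical and not part of Daft.

```python
from fractions import Fraction


def units_to_seconds(value: int, time_base: str) -> float:
    """Convert a count of time_base units (pts, dts or duration) to seconds.

    Assumes time_base is a fraction string such as "1/90000".
    """
    # Fraction parses "numerator/denominator" strings exactly, so the
    # multiplication is done without floating-point error until the end.
    return float(value * Fraction(time_base))


# A frame with frame_pts=180000 in a 1/90000 time base is presented at 2 s.
presentation_seconds = units_to_seconds(180000, "1/90000")

# A frame_duration of 3003 units in a 1/30000 time base is about 0.1001 s.
duration_seconds = units_to_seconds(3003, "1/30000")
```

Under that assumption, applying the same conversion to frame_pts should agree with frame_time up to floating-point rounding.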

Source code in daft/functions/video.py
def video_frames(
    file_expr: Expression,
    *,
    start_time: float = 0,
    end_time: float | None = None,
    width: int | None = None,
    height: int | None = None,
    is_key_frame: bool | None = None,
    sample_interval_seconds: float | None = None,
) -> Expression:
    """Decode all video frames within a time range, with per-frame metadata.

    Mirrors the per-frame schema of ``daft.read_video_frames()``.

    Args:
        file_expr (VideoFile Expression): The video file to decode frames from.
        start_time (float, optional): Start of the time range in seconds. Defaults to 0.
        end_time (float | None, optional): End of the time range in seconds. Defaults to None (all frames).
        width (int | None, optional): Target width for resizing frames. Must be provided with ``height``.
        height (int | None, optional): Target height for resizing frames. Must be provided with ``width``.
        is_key_frame (bool | None, optional): If True, decode only keyframes. If False,
            decode only non-keyframes. If None, decode all frames.
        sample_interval_seconds (float | None, optional): If provided and > 0, sample frames at
            approximately this time interval in seconds based on ``frame_time``. The algorithm
            picks the first frame whose timestamp is >= the next target time (``start_time``,
            ``start_time + interval``, ``start_time + 2*interval``, ...). Frames without valid
            timestamps are skipped. Same semantics as the source-side
            :func:`daft.read_video_frames`. Defaults to None (no sampling).

    Returns:
        Expression (List[Struct] Expression): List of structs, each containing:
            - frame_index (int): 0-based index of the frame in the video stream
            - frame_time (float): Presentation time in seconds
            - frame_time_base (str): Time base as a fraction string
            - frame_pts (int): Presentation timestamp in stream time_base units
            - frame_dts (int): Decode timestamp in stream time_base units
            - frame_duration (int): Duration in stream time_base units
            - is_key_frame (bool): Whether this frame is a keyframe
            - data (Image): The decoded frame as an image
    """
    return video_frames_fn(
        file_expr,
        start_time=start_time,
        end_time=end_time,
        width=width,
        height=height,
        is_key_frame=is_key_frame,
        sample_interval_seconds=sample_interval_seconds,
    )  # type: ignore
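
The wrapper above forwards its arguments unchanged, so the rule that width and height must be provided together is not checked in this function. A caller who wants an early, explicit error could validate the pair before building the expression. The sketch below is one way to do that; the helper name check_resize_args and its error message are hypothetical, and it makes no claim about what Daft itself does when only one of the two is given.

```python
from __future__ import annotations


def check_resize_args(width: int | None, height: int | None) -> None:
    """Raise ValueError unless width and height are both set or both None.

    Hypothetical caller-side check; not part of the Daft API.
    """
    # Exactly one of the two being None means the pair is incomplete.
    if (width is None) != (height is None):
        raise ValueError(
            "width and height must be provided together, "
            f"got width={width!r}, height={height!r}"
        )


# Both omitted (no resizing) and both given (resize) are accepted.
check_resize_args(None, None)
check_resize_args(640, 360)
```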