Understanding the Flux State Machine

With a traditional STT+VAD pipeline (speech-to-text plus voice activity detection), you have to build complex interruption logic yourself. Flux handles this natively.

Emitted events follow the state machine rules below for managing turns:

  1. Update messages are sent approximately once per 0.25 seconds of transcribed audio, whether or not the transcript has changed, unless a state change has occurred.
  2. An EagerEndOfTurn message always contains a nonempty transcript.
  3. A TurnResumed message can only follow a preceding EagerEndOfTurn message.
  4. The EndOfTurn transcript will always match the immediately preceding EagerEndOfTurn transcript. If the transcript changes after an EagerEndOfTurn, a TurnResumed event will occur first.
  5. The turn_index increments immediately following an EndOfTurn message.
  6. When using flux-general-multi, all TurnInfo events include languages (detected languages sorted by word count) and languages_hinted (active language hints). See Language Prompting.
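
Rules 2 through 5 can be checked mechanically, which is handy when testing a client against recorded sessions. The following is a minimal sketch, not part of Flux: the function name check_turn_events is hypothetical, and it assumes each event is a dict with the event, turn_index, and transcript fields shown in the example further down.

```python
from typing import Iterable, List, Optional


def check_turn_events(events: Iterable[dict]) -> List[str]:
    """Return the rule violations found in a sequence of Flux-style events."""
    problems: List[str] = []
    pending_eager: Optional[str] = None   # latest EagerEndOfTurn transcript not yet resumed
    expected_index: Optional[int] = None  # turn_index the next event should carry

    for i, ev in enumerate(events):
        kind = ev["event"]
        transcript = ev.get("transcript", "")
        index = ev["turn_index"]

        # Rule 5: turn_index only changes right after an EndOfTurn.
        if expected_index is None:
            expected_index = index
        elif index != expected_index:
            problems.append(f"event {i}: turn_index {index}, expected {expected_index}")
            expected_index = index  # resynchronise so one slip is reported once

        if kind == "EagerEndOfTurn":
            # Rule 2: the transcript must be nonempty.
            if not transcript:
                problems.append(f"event {i}: EagerEndOfTurn with empty transcript")
            pending_eager = transcript
        elif kind == "TurnResumed":
            # Rule 3: only valid after an EagerEndOfTurn.
            if pending_eager is None:
                problems.append(f"event {i}: TurnResumed without a preceding EagerEndOfTurn")
            pending_eager = None
        elif kind == "EndOfTurn":
            # Rule 4: must match the most recent EagerEndOfTurn, if there was one.
            if pending_eager is not None and transcript != pending_eager:
                problems.append(f"event {i}: EndOfTurn transcript differs from the preceding EagerEndOfTurn")
            pending_eager = None
            expected_index = index + 1
    return problems


# A sequence that breaks rule 4: the transcript changed with no TurnResumed in between.
demo = [
    {"event": "EagerEndOfTurn", "turn_index": 0, "transcript": "Hi"},
    {"event": "EndOfTurn", "turn_index": 0, "transcript": "Hi there"},
]
print(check_turn_events(demo))
```

Update and StartOfTurn events pass through without affecting the eager state, so Updates may appear between an EagerEndOfTurn and its EndOfTurn.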

Configuring Event Behavior: The EagerEndOfTurn and TurnResumed events are triggered only when you set the eager_eot_threshold parameter. EndOfTurn behavior is controlled by the eot_threshold and eot_timeout_ms parameters. See End-of-Turn Configuration for details on tuning these thresholds for your use case.
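
As a rough illustration of how these three parameters relate, the sketch below builds a connection URL that includes eager_eot_threshold only when eager events are wanted. Several things here are assumptions rather than facts from this page: the endpoint URL, the model query key, passing the parameters as query-string values, and the numeric values themselves. Check the API reference and End-of-Turn Configuration before relying on any of them.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoint; confirm against the API reference.
BASE_URL = "wss://api.deepgram.com/v2/listen"


def build_flux_url(
    eot_threshold: float,
    eot_timeout_ms: int,
    eager_eot_threshold: Optional[float] = None,
) -> str:
    """Build a connection URL; eager events are requested only if a threshold is given."""
    params = {
        "model": "flux-general-multi",  # assumed query key; model name taken from rule 6
        "eot_threshold": eot_threshold,
        "eot_timeout_ms": eot_timeout_ms,
    }
    if eager_eot_threshold is not None:
        # Without this parameter, EagerEndOfTurn and TurnResumed are never emitted.
        params["eager_eot_threshold"] = eager_eot_threshold
    return f"{BASE_URL}?{urlencode(params)}"


# Illustrative values only.
print(build_flux_url(0.7, 5000))
print(build_flux_url(0.7, 5000, eager_eot_threshold=0.5))
```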

Barge-in and audio quality: Flux’s StartOfTurn event is the recommended way to trigger barge-in. It is more reliable than an external VAD because every StartOfTurn is guaranteed to contain a nonempty transcript. For guidance on echo cancellation, noise suppression, and other audio preprocessing that affects turn detection, see Audio Preprocessing & Barge-In.
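
A barge-in handler built on StartOfTurn can be very small. The sketch below is one possible shape, not a prescribed implementation: AgentAudio is a hypothetical stand-in for whatever plays your agent's synthesized speech, and handle_barge_in is an illustrative name.

```python
class AgentAudio:
    """Hypothetical stand-in for the component that plays the agent's speech."""

    def __init__(self) -> None:
        self.playing = False

    def start(self) -> None:
        self.playing = True

    def stop(self) -> None:
        self.playing = False


def handle_barge_in(event: dict, audio: AgentAudio) -> bool:
    """Stop agent playback when the user starts a turn.

    Returns True if playback was actually interrupted.
    """
    if event.get("event") == "StartOfTurn" and audio.playing:
        audio.stop()
        return True
    return False


audio = AgentAudio()
audio.start()  # the agent is mid-reply
interrupted = handle_barge_in(
    {"event": "StartOfTurn", "turn_index": 1, "transcript": "Hi I"}, audio
)
print(interrupted, audio.playing)
```

Update events are ignored here on purpose, so playback stops only once the StartOfTurn event arrives.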

Turn Lifecycle Example

Here’s how Flux processes a customer calling support saying “Hi I need to cancel my subscription please.”

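In the event sequence below, notice how end_of_turn_confidence builds up and how the EagerEndOfTurn event fires before the final EndOfTurn. The first EagerEndOfTurn is followed by a TurnResumed because the speaker goes on to say “please”; the second is confirmed by an EndOfTurn with an identical transcript. With EagerEndOfTurn, your voice agent can begin preparing a response before the end of the turn is confirmed. This allows you to send a synchronous request with early context, creating the effect of a faster, more natural reply.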

{
  "event": "Update",
  "turn_index": 0,
  "audio_window_start": 0.0,
  "audio_window_end": 0.2,
  "transcript": "",
  "words": [],
  "end_of_turn_confidence": 0.1
}

{
  "event": "Update",
  "turn_index": 0,
  "audio_window_start": 0.0,
  "audio_window_end": 0.5,
  "transcript": "",
  "words": [],
  "end_of_turn_confidence": 0.1
}

{
  "event": "StartOfTurn",
  "turn_index": 0,
  "audio_window_start": 0.0,
  "audio_window_end": 0.6,
  "transcript": "Hi I",
  "words": [
    {
      "word": "Hi",
      "confidence": 0.95,
      "start": 0.42,
      "end": 0.56
    },
    {
      "word": "I",
      "confidence": 0.92,
      "start": 0.56,
      "end": 0.6
    }
  ],
  "end_of_turn_confidence": 0.1
}

{
  "event": "Update",
  "turn_index": 0,
  "audio_window_start": 0.0,
  "audio_window_end": 0.8,
  "transcript": "Hi I need to",
  "words": [...],
  "end_of_turn_confidence": 0.1
}

{
  "event": "Update",
  "turn_index": 0,
  "audio_window_start": 0.0,
  "audio_window_end": 1.0,
  "transcript": "Hi I need to cancel my subscription.",
  "words": [...],
  "end_of_turn_confidence": 0.3
}

{
  "event": "EagerEndOfTurn",
  "turn_index": 0,
  "audio_window_start": 0.0,
  "audio_window_end": 1.1,
  "transcript": "Hi I need to cancel my subscription.",
  "words": [...],
  "end_of_turn_confidence": 0.3
}

{
  "event": "TurnResumed",
  "turn_index": 0,
  "audio_window_start": 0.0,
  "audio_window_end": 1.2,
  "transcript": "Hi I need to cancel my subscription please",
  "words": [...],
  "end_of_turn_confidence": 0.1
}

{
  "event": "Update",
  "turn_index": 0,
  "audio_window_start": 0.0,
  "audio_window_end": 1.4,
  "transcript": "Hi I need to cancel my subscription please.",
  "words": [...],
  "end_of_turn_confidence": 0.3
}

{
  "event": "EagerEndOfTurn",
  "turn_index": 0,
  "audio_window_start": 0.0,
  "audio_window_end": 1.5,
  "transcript": "Hi I need to cancel my subscription please.",
  "words": [...],
  "end_of_turn_confidence": 0.3
}

{
  "event": "Update",
  "turn_index": 0,
  "audio_window_start": 0.0,
  "audio_window_end": 1.6,
  "transcript": "Hi I need to cancel my subscription please.",
  "words": [...],
  "end_of_turn_confidence": 0.5
}

{
  "event": "EndOfTurn",
  "turn_index": 0,
  "audio_window_start": 0.0,
  "audio_window_end": 1.7,
  "transcript": "Hi I need to cancel my subscription please.",
  "words": [...],
  "end_of_turn_confidence": 0.7
}
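
The lifecycle above suggests a simple client pattern: draft a reply on EagerEndOfTurn, throw the draft away on TurnResumed, and speak it on EndOfTurn. The following is a minimal sketch under stated assumptions. EagerResponder and the generate callable are hypothetical, and generation is synchronous here for brevity, whereas a real agent would more likely run it as a cancellable background task.

```python
from typing import Callable, Optional, Tuple


class EagerResponder:
    """Prepares a reply on EagerEndOfTurn and commits or discards it later."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self.generate = generate  # hypothetical: transcript -> reply text
        self.draft: Optional[Tuple[str, str]] = None  # (transcript, reply)

    def on_event(self, event: dict) -> Optional[str]:
        """Return the reply to speak once the turn ends, otherwise None."""
        kind = event["event"]
        transcript = event.get("transcript", "")
        if kind == "EagerEndOfTurn":
            # Start work early on the probable final transcript.
            self.draft = (transcript, self.generate(transcript))
        elif kind == "TurnResumed":
            # The user kept talking, so the draft is stale.
            self.draft = None
        elif kind == "EndOfTurn":
            draft, self.draft = self.draft, None
            if draft is not None and draft[0] == transcript:
                return draft[1]  # reuse the speculative reply
            return self.generate(transcript)  # no usable draft, e.g. eager events disabled
        return None


calls = []


def fake_generate(transcript: str) -> str:
    """Stand-in for the real response generator; records each invocation."""
    calls.append(transcript)
    return f"reply to: {transcript}"


# The state-changing events from the example, plus one Update in between.
final = "Hi I need to cancel my subscription please."
sequence = [
    {"event": "EagerEndOfTurn", "transcript": "Hi I need to cancel my subscription."},
    {"event": "TurnResumed", "transcript": "Hi I need to cancel my subscription please"},
    {"event": "EagerEndOfTurn", "transcript": final},
    {"event": "Update", "transcript": final},
    {"event": "EndOfTurn", "transcript": final},
]
responder = EagerResponder(fake_generate)
replies = [responder.on_event(e) for e in sequence]
print(replies[-1], len(calls))
```

In this replay the generator runs twice, once per EagerEndOfTurn. The first draft is dropped at TurnResumed, and the second is reused at EndOfTurn, so no further generation is needed once the turn ends.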