Streaming sends data to the client in small pieces, called chunks, as soon as they're ready. Unlike a buffered response, which is sent only after the server has generated all of it, streaming lets users start seeing and using content sooner. For example, a server can send the first part of an HTML page immediately, then stream in additional content as it's generated.
Vercel Functions support streaming responses, so you can render and display parts of the UI as soon as they are ready. This approach reduces perceived wait times and allows users to interact with your app before the entire page loads.
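As a rough illustration, the sketch below returns a streamed body from a function handler using the standard Web `ReadableStream` and `Response` APIs. The `streamText` helper and the `generateParts` data source are hypothetical names invented for this example, and the exported `GET` handler shape is an assumption about how your function is set up, so adapt it to your framework.

```typescript
// Hypothetical helper: wraps an async iterable of text chunks in a streaming Response.
function streamText(chunks: AsyncIterable<string>): Response {
  const encoder = new TextEncoder();
  const iterator = chunks[Symbol.asyncIterator]();

  const body = new ReadableStream<Uint8Array>({
    // pull() runs whenever the consumer is ready for more data,
    // so each chunk is sent as soon as it has been produced.
    async pull(controller) {
      const { value, done } = await iterator.next();
      if (done) {
        controller.close();
      } else {
        controller.enqueue(encoder.encode(value));
      }
    },
    // Stop the producer if the client disconnects early.
    async cancel() {
      await iterator.return?.();
    },
  });

  return new Response(body, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}

// Placeholder data source (an assumption for this sketch): the second
// chunk is delayed to stand in for slower work on the server.
async function* generateParts(): AsyncGenerator<string> {
  yield "Hello, ";
  await new Promise((resolve) => setTimeout(resolve, 10));
  yield "world";
}

// Assumed handler shape: a function that returns a Web Response.
export function GET(): Response {
  return streamText(generateParts());
}
```

Using `pull` rather than writing every chunk in `start` means the stream produces data only when the consumer asks for it, which keeps memory use bounded for long responses.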
Common use cases include:
- Ecommerce: Render the most important product and account data early, letting customers shop sooner
- AI applications: Stream output from large language models (LLMs) so you can display response text as it arrives rather than waiting for the full result
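On the receiving side, a client can read a streamed body incrementally rather than waiting for it to finish. The following is a minimal sketch using the standard `ReadableStream` reader and `TextDecoder`; the `readStream` helper and its `onChunk` callback are hypothetical names introduced for this example.

```typescript
// Hypothetical helper: reads a streamed response chunk by chunk,
// calls onChunk for each decoded piece, and resolves with the full text.
async function readStream(
  response: Response,
  onChunk: (text: string) => void,
): Promise<string> {
  if (!response.body) {
    throw new Error("Response has no body to stream");
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let full = "";

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters intact when they are
    // split across two chunks.
    const text = decoder.decode(value, { stream: true });
    full += text;
    onChunk(text);
  }

  // Flush any bytes the decoder is still holding.
  const tail = decoder.decode();
  if (tail) {
    full += tail;
    onChunk(tail);
  }
  return full;
}
```

In a browser you might pass the result of a `fetch` call to `readStream` and append each chunk to the page inside `onChunk`, so text appears while the rest of the response is still being generated.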
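HTTP responses typically send the entire payload to the client at once, after the server has finished generating it. Because nothing is displayed until the whole response arrives, this approach can lead to noticeable delays for users, especially when handling large datasets or complex computations.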