Overview
Deep Research is designed for open-ended research questions where you don’t have structured input data to enrich. Instead of bringing data to enhance, you bring a research question or topic, and the Task API conducts multi-step web exploration to deliver analyst-grade intelligence, compressing hours of manual research into minutes. Optimized within the pro and ultra processor families, Deep Research transforms natural language research queries into comprehensive reports complete with inline citations and verification.
This guide focuses on Deep Research. If you have structured data you want to enrich with web intelligence (like adding columns to a spreadsheet), see our Enrichment Quickstart.
How Deep Research Works
With Deep Research, the system automatically:
- Interprets your research intent from natural language
- Conducts multi-step web exploration across authoritative sources
- Synthesizes findings into structured data or markdown reports
- Provides citations and confidence levels for verification
Key Features
- Natural Language Input: Simply describe what you want to research in plain language—no need for structured data or predefined schemas.
- Declarative Approach: Specify what intelligence you need, and the system handles the complex orchestration of research, exploration, and synthesis.
- Flexible Output Structure: Choose between auto schema mode (automatically structured JSON), text mode (markdown reports), or a pre-specified structured JSON schema based on your needs.
- Comprehensive Intelligence: Multi-step research across authoritative sources with granular citations, reasoning, and confidence levels for every finding.
Long-Running Tasks: Deep Research can take up to 45 minutes to complete. See Polling vs Webhooks vs SSE below for how to handle async results.
Creating a Deep Research Task
Deep Research accepts any input schema, including plain-text strings. The more specific and detailed your input, the better the research results will be.
Input size restriction: Deep Research is optimized for concise research prompts and is not meant for long context inputs. Keep your input under 15,000 characters for optimal performance and results.
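The input guidance above can be checked before a task is submitted. The following is a minimal sketch rather than the SDK's own validation: `build_task_request` is a hypothetical helper, and the `input` and `processor` keys are assumed field names for the request body.

```python
MAX_INPUT_CHARS = 15_000  # "under 15,000 characters", per the guide


def build_task_request(research_input: str, processor: str = "ultra") -> dict:
    """Validate a research prompt and wrap it in a request body.

    The "input" and "processor" keys are assumed field names, not a
    confirmed request schema.
    """
    prompt = research_input.strip()
    if not prompt:
        raise ValueError("research input must not be empty")
    if len(prompt) >= MAX_INPUT_CHARS:
        raise ValueError(
            f"input has {len(prompt)} characters; keep it under {MAX_INPUT_CHARS}"
        )
    return {"input": prompt, "processor": processor}


request = build_task_request(
    "Market size, growth drivers, and key players in the US HVAC industry"
)
print(request["processor"], len(request["input"]))
```

Rejecting oversized prompts locally avoids spending a long-running task on an input the guide says is out of range.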
Auto Schema
Specifying auto schema mode in the Task API output schema triggers Deep Research and ensures well-structured outputs, without the need to specify a desired output structure. The final schema follows the JSON Schema format and is determined by the processor automatically. Auto schema mode is the default when using the pro and ultra lines of processors.
This format is ideal for programmatic processing, data analysis, and integration with other systems.
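As a rough illustration, the three output options listed under Key Features could be expressed as small dictionaries. This is a sketch under stated assumptions: `build_output_schema` is a hypothetical helper, and the `type`, `description`, and `json_schema` keys are assumed names for the output schema fields (text mode and its description field are covered in the next section).

```python
from typing import Optional


def build_output_schema(
    mode: str = "auto",
    description: Optional[str] = None,
    json_schema: Optional[dict] = None,
) -> dict:
    """Return an output-schema dict for one of the three output options.

    Key names ("type", "description", "json_schema") are assumptions.
    """
    if mode == "auto":
        # The processor decides the structure; nothing else to specify.
        return {"type": "auto"}
    if mode == "text":
        schema = {"type": "text"}
        if description:
            # Optional guidance that steers the markdown report.
            schema["description"] = description
        return schema
    if mode == "json":
        if json_schema is None:
            raise ValueError("json mode needs an explicit JSON Schema")
        return {"type": "json", "json_schema": json_schema}
    raise ValueError(f"unknown output mode: {mode!r}")


print(build_output_schema())
print(build_output_schema("text", description="Keep the report under 1,000 words"))
```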
Text Schema
Specifying text schema mode in the Task API output schema triggers Deep Research with a markdown report output format. The generated result contains extensive research formatted as a markdown report with inline citations. This format suits human-readable content as well as LLM ingestion. To guide the output, use the description field when specifying text schema; it lets you steer the report, for example by controlling its length or content.
Sample Response
Important: The response below shows the final completed result after Deep Research has finished. When you first create a task, you’ll receive an immediate response with "status": "running". You’ll need to poll the task or use webhooks to get the final structured research output shown below.
The sample response uses auto schema. The complete response contained 124 content fields, with 610 total citations for this Task.
The output contains the content and the basis, as with other Task API executions. The key difference is that the basis object in an auto mode output contains Nested FieldBasis.
Nested FieldBasis
In text mode, FieldBasis is not nested. It contains a list of citations (with URLs and excerpts) for all sites visited during research. The most relevant citations are included at the base of the report itself, with inline references.
In auto mode, the Basis object maps each output field (including nested fields) to its supporting evidence. This ensures that every output, including nested output fields, has citations, excerpts, confidence levels and reasoning.
For nested fields, the basis uses dot notation for indexing:
- key_players.0 for the first item in a key players array
- industry_overview.growth_cagr for nested object fields
- market_trends.2.description for nested arrays with objects
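The dot notation above can be resolved against a nested output with a few lines of code. The sketch below is illustrative only: `resolve_field` and `find_basis` are hypothetical helpers, the sample values are made up, and the assumption that each basis entry carries a `field` key is not confirmed by this guide.

```python
from typing import Any, Optional


def resolve_field(output: Any, path: str) -> Any:
    """Follow a dot-notation path such as 'market_trends.2.description'."""
    current = output
    for part in path.split("."):
        if isinstance(current, list):
            current = current[int(part)]  # numeric segment indexes a list
        else:
            current = current[part]  # any other segment is a dict key
    return current


def find_basis(basis: list, field: str) -> Optional[dict]:
    """Return the basis entry for a field, assuming entries have a 'field' key."""
    for entry in basis:
        if entry.get("field") == field:
            return entry
    return None


# Made-up sample data, shaped only to match the paths listed above.
sample_output = {
    "key_players": ["Acme Corp", "Globex"],
    "industry_overview": {"growth_cagr": "5.2%"},
    "market_trends": [
        {"description": "trend A"},
        {"description": "trend B"},
        {"description": "trend C"},
    ],
}
sample_basis = [{"field": "industry_overview.growth_cagr", "confidence": "high"}]

print(resolve_field(sample_output, "market_trends.2.description"))
print(find_basis(sample_basis, "industry_overview.growth_cagr"))
```

Using the same path string for both lookups makes it straightforward to show a value next to the evidence that supports it.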
Example: Market Research Assistant
Here’s how to build a market research tool with Deep Research, showing different approaches for handling the async nature of the Task API.
Polling vs Webhooks vs SSE
The Task API is asynchronous: when you create a task, it returns immediately with a run_id while processing continues in the background. There are three ways to get results:
| Method | What It Does | Best For |
|---|---|---|
| Polling | Your code repeatedly calls the API to check if the task is done | Simple integrations, scripts, testing |
| Webhooks | Parallel sends an HTTP request to your server when the task completes | Production apps with backend servers |
| SSE | Stream real-time progress updates as the task runs | Interactive UIs, monitoring progress |
Polling
How it works: After creating a task, repeatedly check its status until it completes.
- Simplest approach: no infrastructure needed
- Use retrieve() to check status, then result() when complete
- The result() method also blocks until complete if you prefer a one-liner: client.task_run.result(run_id, api_timeout=3600)
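The polling approach above might look like the loop below. It is a sketch, not the SDK's implementation: `get_status` is a hypothetical callable standing in for a wrapper around retrieve(), the "running" status comes from this guide, and "queued" as a second non-terminal status is an assumption. The interval doubles after each check to keep long tasks from generating many requests.

```python
import time
from typing import Callable, Tuple


def wait_until_done(
    get_status: Callable[[], str],
    interval: float = 5.0,
    max_interval: float = 60.0,
    timeout: float = 3600.0,
    pending: Tuple[str, ...] = ("queued", "running"),  # "queued" is assumed
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns a status outside `pending`."""
    waited = 0.0
    while True:
        status = get_status()
        if status not in pending:
            return status  # terminal status; the caller decides what to do
        if waited >= timeout:
            raise TimeoutError(f"task still {status!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval
        interval = min(interval * 2, max_interval)  # exponential backoff


# Usage with a fake status source instead of a real API call.
fake_statuses = iter(["running", "running", "completed"])
print(wait_until_done(lambda: next(fake_statuses), interval=0.01))
```

The `sleep` parameter exists so the loop can be exercised in tests without real delays.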
Webhooks
How it works: Provide a webhook URL when creating the task. Parallel sends a POST request to your URL when the task finishes.
- Webhooks notify you when the task completes; they don’t send the actual results
- After receiving the webhook, call result() to retrieve the output data
- Requires a publicly accessible HTTPS endpoint
- See Webhooks documentation for setup and verification
Important: Webhooks are a notification mechanism, not a data delivery mechanism. The webhook payload contains the task status and metadata, but you must make a separate API call to retrieve the actual research results.
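The notify-then-fetch pattern can be sketched as a handler that reads the notification and only then requests the result. The payload shape used here (a `data` object holding `run_id` and `status`, with "completed" as the success status) is an assumption, and `fetch_result` is a hypothetical callable standing in for a call to result().

```python
from typing import Any, Callable, Optional


def handle_webhook(
    payload: dict, fetch_result: Callable[[str], Any]
) -> Optional[Any]:
    """React to a completion notification by fetching the result separately.

    Assumes the payload looks like {"data": {"run_id": ..., "status": ...}}.
    """
    data = payload.get("data", {})
    run_id = data.get("run_id")
    if not run_id:
        raise ValueError("webhook payload has no run_id")
    if data.get("status") != "completed":
        return None  # not a successful completion: nothing to fetch
    # The webhook carries status and metadata only, so fetch the output now.
    return fetch_result(run_id)


# Usage with a stand-in fetcher instead of a real API call.
result = handle_webhook(
    {"data": {"run_id": "run_123", "status": "completed"}},
    fetch_result=lambda run_id: {"run_id": run_id, "content": "report text"},
)
print(result)
```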
Server-Sent Events (SSE)
How it works: Connect to a streaming endpoint to receive real-time progress updates as the task runs.
- See real-time progress: research plan, sources being explored, intermediate findings
- The final task_run.state event includes the complete output
- Ideal for showing users what’s happening during long research tasks
- See Streaming Events documentation for event types and examples
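Consuming such a stream comes down to parsing Server-Sent Events and keeping the last task_run.state event. The sketch below parses the standard `data:` line format from any iterable of text lines; it assumes each event's data is a JSON object whose `type` field names the event, and the "progress" event type and `output` key in the sample are made up for illustration.

```python
import json
from typing import Iterable, Iterator, Optional


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each SSE event from an iterable of lines."""
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # SSE allows one optional space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_lines:
            # A blank line ends the event; multi-line data joins with "\n".
            yield json.loads("\n".join(data_lines))
            data_lines = []
    if data_lines:  # lenient: accept a final event with no trailing blank line
        yield json.loads("\n".join(data_lines))


def last_state_event(events: Iterable[dict]) -> Optional[dict]:
    """Return the last event whose assumed 'type' field is task_run.state."""
    final = None
    for event in events:
        if event.get("type") == "task_run.state":
            final = event
    return final


# Made-up stream for illustration; a real stream comes from the HTTP response.
stream = [
    'data: {"type": "progress", "message": "exploring sources"}\n',
    "\n",
    'data: {"type": "task_run.state", "status": "running"}\n',
    "\n",
    'data: {"type": "task_run.state", "status": "completed", "output": "report"}\n',
    "\n",
]
print(last_state_event(iter_sse_events(stream)))
```

Because the parser takes plain lines, it can sit behind whichever HTTP client is used to open the stream.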
Which Should I Use?
| Scenario | Recommended Method |
|---|---|
| Testing or one-off scripts | Polling |
| Backend service processing many tasks | Webhooks |
| User-facing app showing research progress | SSE |
| Simple integration without a server | Polling |
| Production system needing reliability | Webhooks + Polling fallback |
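For the last row, one way to combine webhooks with a polling fallback is to wait on the webhook signal and check the status directly whenever the webhook stays silent for too long. This is a sketch of that idea only: `notified` is assumed to be set by your webhook handler, `poll_status` is a hypothetical callable wrapping a status check, and the grace period and check count are arbitrary.

```python
import threading
from typing import Callable


def wait_with_fallback(
    notified: threading.Event,
    poll_status: Callable[[], str],
    grace: float = 60.0,
    max_checks: int = 60,
) -> str:
    """Return 'webhook' or 'poll', depending on which path saw the task finish."""
    for _ in range(max_checks):
        # Preferred path: the webhook handler sets the event.
        if notified.wait(timeout=grace):
            return "webhook"
        # Fallback: the webhook was silent for `grace` seconds, so ask directly.
        if poll_status() != "running":
            return "poll"
    raise TimeoutError("task did not finish within the allowed checks")


# Usage with a pre-set event, as if the webhook had already arrived.
arrived = threading.Event()
arrived.set()
print(wait_with_fallback(arrived, poll_status=lambda: "running"))
```

Either path should end with the same call to fetch the result, so a missed webhook delays the result without losing it.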
Next Steps
- Choose a Processor: Deep Research works best with pro or ultra processors; use fast variants (pro-fast, ultra-fast) for quicker turnaround
- Task Spec Best Practices: Craft effective research queries and output specifications
- Task Groups: Run multiple research queries in parallel for batch intelligence gathering
- Access Research Basis: Understand nested FieldBasis structure for auto schema outputs
- Streaming Events: Monitor long-running research tasks with real-time progress updates
- Webhooks: Configure HTTP callbacks for research completion notifications
- Enrichment: Learn about enriching structured data instead of open-ended research
- API Reference: Complete endpoint documentation for the Task API