§Spider Agent
A concurrent-safe multimodal agent for web automation and research.
§Features
- Concurrent-safe: Designed to be wrapped in Arc for multi-task access
- Feature-gated: Only include dependencies you need
- Multiple LLM providers: OpenAI, OpenAI-compatible APIs
- Multiple search providers: Serper, Brave, Bing, Tavily
- Browser automation: Chrome support via chromiumoxide
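As a rough illustration of the "wrapped in Arc" pattern named in the feature list, the sketch below shares one value across several workers. `MockAgent`, its `search` method, and `run_queries` are hypothetical stand-ins written for this example and are not part of spider_agent; standard-library threads are used in place of async tasks only to keep the sketch self-contained.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for an agent; NOT the spider_agent::Agent type.
struct MockAgent {
    calls: AtomicUsize,
}

impl MockAgent {
    fn new() -> Self {
        Self { calls: AtomicUsize::new(0) }
    }

    /// Takes `&self`, so many workers can call it through one shared Arc.
    /// The "result" is just the query length, to keep the sketch simple.
    fn search(&self, query: &str) -> usize {
        self.calls.fetch_add(1, Ordering::Relaxed);
        query.len()
    }
}

/// Runs one worker thread per query against a single shared agent.
/// Returns the per-query results (in query order) and the total call count.
fn run_queries(queries: &[&str]) -> (Vec<usize>, usize) {
    let agent = Arc::new(MockAgent::new());
    let handles: Vec<_> = queries
        .iter()
        .map(|q| {
            // Each worker gets its own reference-counted handle.
            let agent = Arc::clone(&agent);
            let q = q.to_string();
            thread::spawn(move || agent.search(&q))
        })
        .collect();
    let lens: Vec<usize> = handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .collect();
    (lens, agent.calls.load(Ordering::Relaxed))
}

fn main() {
    let (lens, calls) = run_queries(&["rust async", "rust databases"]);
    println!("{:?} from {} calls", lens, calls);
}
```

Cloning the `Arc` copies only a pointer and bumps a reference count, so every worker talks to the same underlying value.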
§Quick Start
use spider_agent::Agent;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let agent = Arc::new(Agent::builder()
        .with_openai("sk-...", "gpt-4o-mini")
        .with_search_serper("serper-key")
        .build()?);

    // Search
    let results = agent.search("rust web frameworks").await?;
    println!("Found {} results", results.results.len());

    // Extract from first result (return an error instead of panicking if there is none)
    let first = results.results.first().ok_or("no search results")?;
    let html = agent.fetch(&first.url).await?.html;
    let data = agent.extract(&html, "Extract framework name and features").await?;
    println!("{}", serde_json::to_string_pretty(&data)?);

    Ok(())
}
§Concurrent Execution
use spider_agent::Agent;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let agent = Arc::new(Agent::builder()
        .with_openai("sk-...", "gpt-4o")
        .with_search_serper("serper-key")
        .with_max_concurrent_llm_calls(10)
        .build()?);

    // Execute multiple searches concurrently
    let queries = vec!["rust async", "rust web frameworks", "rust databases"];
    let mut handles = Vec::new();
    for query in queries {
        // Each task gets its own handle to the shared agent
        let agent = agent.clone();
        let query = query.to_string();
        handles.push(tokio::spawn(async move {
            agent.search(&query).await
        }));
    }

    // Collect results: the first `?` handles the join error, the second the search error
    for handle in handles {
        let result = handle.await??;
        println!("Found {} results", result.results.len());
    }

    Ok(())
}
§Feature Flags
- openai - OpenAI/OpenAI-compatible LLM provider
- chrome - Browser automation via chromiumoxide
- search - Base search functionality
- search_serper - Serper.dev search provider
- search_brave - Brave Search provider
- search_bing - Bing Search provider
- search_tavily - Tavily AI Search provider
- full - All features
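Feature flags are selected in Cargo.toml. The fragment below is a sketch of a manifest for an OpenAI plus Serper setup. The version requirement is a placeholder, and it is an assumption here that `search_serper` is sufficient on its own without also listing `search`; check the crate's manifest before relying on either.

```toml
[dependencies]
# Placeholder version; pin this to the release you actually use.
spider_agent = { version = "*", features = ["openai", "search_serper"] }
# Needed for #[tokio::main] and tokio::spawn in application code.
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
# Needed only if you pretty-print extracted data with serde_json.
serde_json = "1"
```

Using `features = ["full"]` instead would enable every flag in the list above, at the cost of pulling in all optional dependencies.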
Re-exports§
- pub use tools::AuthConfig;
- pub use tools::CustomTool;
- pub use tools::CustomToolRegistry;
- pub use tools::CustomToolResult;
- pub use tools::HttpMethod;
- pub use tools::SpiderCloudToolConfig;
- pub use automation::RemoteMultimodalConfigs;
- pub use automation::RemoteMultimodalEngine;
- pub use automation::EngineError;
- pub use automation::EngineResult;
- pub use automation::best_effort_parse_json_object;
- pub use automation::execute_graph;
- pub use automation::cache::CacheStats;
- pub use automation::cache::CacheValue;
- pub use automation::cache::SmartCache;
- pub use automation::executor::BatchExecutor;
- pub use automation::executor::ChainExecutor;
- pub use automation::executor::PrefetchManager;
- pub use automation::router::auto_policy;
- pub use automation::router::estimate_tokens;
- pub use automation::router::ModelRequirements;
- pub use automation::router::ModelRouter;
- pub use automation::router::ModelSelector;
- pub use automation::router::RoutingDecision;
- pub use automation::router::ScoredModel;
- pub use automation::router::SelectionStrategy;
- pub use automation::router::TaskAnalysis;
- pub use automation::router::TaskCategory;
Modules§
- automation - Automation module for spider_agent.
- categories - URL category constants.
- tools - Custom tool support for external API calls.
Structs§
- ActResult - Result of a single action execution via act().
- ActionRecord - Record of an action taken during automation.
- ActionResult - Result of an action execution.
- ActionToolSchemas - Generator for automation action tool schemas.
- Agent - Multimodal agent for web automation and research.
- AgentBuilder - Agent builder for configuring and creating agents.
- AgentConfig - Agent configuration.
- AgentMemory - Session memory for storing state across operations.
- Alternative - An alternative action with its confidence score.
- AutomationConfig - Main automation configuration.
- AutomationMemory - In-memory storage for agentic automation sessions.
- AutomationResult - Result of an automation operation.
- AutomationUsage - Token usage tracking for automation operations with granular call tracking.
- CaptureProfile - Capture profile for screenshots and HTML.
- ChainBuilder - Builder for creating action chains.
- ChainContext - Context for evaluating chain conditions.
- ChainResult - Result of an action chain execution.
- ChainStep - A single step in an action chain.
- ChainStepResult - Result of a single step in an action chain.
- Checkpoint - A checkpoint condition to verify after a step.
- CheckpointResult - Result of a checkpoint verification.
- ClipViewport - Clip viewport for screenshots.
- CompletionOptions - Options for completion requests.
- CompletionResponse - Response from a completion request.
- ConcurrentChainConfig - Configuration for concurrent chain execution.
- ConcurrentChainResult - Result of executing a concurrent chain.
- ConfidenceRetryStrategy - Strategy for retrying based on confidence.
- ConfidenceSummary - Summary of confidence statistics.
- ConfidenceTracker - Tracker for confidence statistics across an automation session.
- ConfidentStep - A step with confidence score and alternatives.
- ContentAnalysis - Result of analyzing HTML content.
- DependencyGraph - Dependency graph for managing step execution order.
- DependentStep - A step in a dependency chain.
- DiffStats - Statistics about HTML diff performance.
- DiscoveredUrl - A discovered URL with AI-generated metadata.
- ElementChange - A single element change.
- ExecutionPlan - An execution plan from the LLM.
- ExtractionSchema - Schema for structured data extraction.
- FetchResult - Result from fetching a URL.
- FormField - A field in a form.
- FormInfo - Information about a form on the page.
- FunctionCall - A function call from the LLM.
- FunctionDefinition - OpenAI-compatible function definition.
- GeneratedSchema - A generated JSON schema.
- HealedSelectorCache - Cache for healed selectors.
- HealingDiagnosis - Diagnosis and suggested fix from the LLM.
- HealingRequest - A request to heal a failed selector.
- HealingResult - Result of a healing attempt.
- HealingStats - Statistics about healing operations.
- HtmlDiffResult - Result of computing an HTML diff.
- InteractiveElement - An interactive element on the page.
- MapResult - Result of the map() API call for page discovery.
- Message - A message in a conversation.
- ModelCapabilities - Model capabilities struct.
- ModelEndpoint - A model endpoint override for dual-model routing.
- ModelInfoEntry - Detailed model information entry.
- ModelPolicy - Policy for selecting models based on cost/quality tradeoffs.
- ModelPricing - Pricing in USD.
- ModelProfile - Full model profile: capabilities, ranks, pricing, and context window.
- ModelRanks - Arena and task-specific rankings, normalized to 0.0-100.0.
- MultiPageContext - Multi-page context for synthesis.
- NavigationOption - A navigation option on the page.
- PageContext - Context for a single page in multi-page synthesis.
- PageContribution - Contribution of a single page to the synthesis.
- PageExtraction - Extraction from a single page.
- PageObservation - Observation of a page’s current state.
- PageState - Current page state for re-planning context.
- PageStateDiff - Tracker for page state changes across rounds.
- PlanExecutionState - State of plan execution.
- PlannedStep - A single step in an execution plan.
- PlanningModeConfig - Configuration for planning mode.
- PromptUrlGate - URL-based prompt gating for per-URL config overrides.
- RemoteMultimodalConfig - Runtime configuration for RemoteMultimodalEngine.
- ReplanContext - Context for re-planning after a failure.
- ResearchOptions - Research options for research tasks.
- RetryConfig - Retry configuration.
- RetryPolicy - Retry policy for automation operations.
- SchemaCache - Cache for generated schemas.
- SchemaGenerationRequest - Request to generate a schema from examples.
- SearchOptions - Search options for web search.
- SelectorCache - Self-healing selector cache.
- SelectorCacheEntry - A single entry in the selector cache.
- SelfHealingConfig - Configuration for self-healing behavior.
- StepResult - Result of executing a single step.
- StructuredOutputConfig - Configuration for structured output mode.
- SynthesisConfig - Configuration for multi-page synthesis.
- SynthesisResult - Result of multi-page synthesis.
- TokenUsage - Token usage from a completion.
- ToolCall - A tool call from the LLM response.
- ToolDefinition - OpenAI-compatible tool definition.
- UsageLimits - Usage limits for controlling agent resource consumption.
- UsageSnapshot - Snapshot of usage statistics.
- UsageStats - Usage statistics for tracking agent operations.
- Verification - Verification to run after an action.
Enums§
- ActionType - Types of actions that can be performed.
- AgentError - Agent error types.
- ChainCondition - Condition for conditional execution in action chains.
- ChangeType - Type of change to an element.
- CheckpointType - Type of checkpoint verification.
- CleaningIntent - Intent for HTML cleaning decisions.
- CostTier - Cost tier for model selection.
- HtmlCleaningMode - HTML cleaning mode.
- HtmlCleaningProfile - HTML cleaning profile for content processing.
- HtmlDiffMode - Mode for HTML diffing.
- LimitType - Type of limit that was exceeded.
- MemoryOperation - Memory operation requested by the LLM.
- MessageContent - Message content: either text or multi-part.
- ReasoningEffort - Reasoning effort level for models that support explicit reasoning controls.
- RecoveryStrategy - Recovery strategy for handling failures during automation.
- SearchError - Search-specific error types.
- SelectorIssueType - Types of selector issues.
- TimeRange - Time range for filtering search results.
- ToolCallingMode - Mode for how actions should be formatted in LLM requests.
- VerificationType - Types of verification checks.
- VisionRouteMode - Routing mode that decides when to use the vision vs text model.
Constants§
- ACT_SYSTEM_PROMPT - System prompt for the act() single-action API.
- CONFIGURATION_SYSTEM_PROMPT - System prompt for configuring a web crawler from natural language.
- DEFAULT_SYSTEM_PROMPT - Default system prompt for web automation (iterative). This is the foundation for all web automation tasks, kept lean with core action bindings and agentic reasoning only. Challenge-specific strategies should be injected via system_prompt_extra or skill modules.
- EXTRACT_SYSTEM_PROMPT - System prompt for the extract() data extraction API.
- MAP_SYSTEM_PROMPT - System prompt for the map() URL discovery API.
- MODEL_INFO - Sorted by name for binary search lookup.
- OBSERVE_SYSTEM_PROMPT - System prompt for the observe() page understanding API.
Traits§
- LLMProvider - LLM provider trait for abstracting different LLM APIs.
Functions§
- build_schema_generation_prompt - Build a prompt for LLM-assisted schema generation.
- clean_html - Default cleaner (base level).
- clean_html_base - Clean the HTML by removing CSS and JS (base level).
- clean_html_full - Full/aggressive HTML cleaning.
- clean_html_raw - Raw passthrough: no cleaning.
- clean_html_slim - Slim HTML cleaning: removes heavy elements.
- clean_html_with_profile - Clean HTML using a specific profile.
- clean_html_with_profile_and_intent - Clean HTML with a specific profile and intent.
- extract_assistant_content - Extract the assistant’s text content from an OpenAI-compatible response.
- extract_html_context - Extract HTML context around a selector’s expected location.
- extract_last_code_block - Extract the last ```json or ``` code block from text.
- extract_last_json_array - Extract the last JSON array from text with proper brace matching.
- extract_last_json_boundaries - Extract the last balanced JSON object or array from text.
- extract_last_json_object - Extract the last JSON object from text with proper brace matching.
- extract_usage - Extract token usage from an OpenAI-compatible response.
- fnv1a64 - FNV-1a 64-bit hash function for cheap content hashing.
- generate_schema - Generate a schema from a request.
- infer_schema - Infer a JSON schema from a value.
- infer_schema_from_examples - Infer a schema from multiple examples, merging field information.
- parse_tool_calls - Parse tool calls from an OpenAI-compatible response.
- refine_schema - Refine a schema by adding more examples.
- smart_clean_html - Smart HTML cleaner that automatically determines the best cleaning level.
- tool_calls_to_steps - Convert tool calls to automation step actions.
- truncate_utf8_tail - Take the last max_bytes of a UTF-8 string without splitting code points.
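Two of the helpers listed above describe well-known techniques: FNV-1a 64-bit hashing and taking a byte-limited tail of a UTF-8 string. The sketch below shows one way each technique can be written using only the standard library. The names `fnv1a64_sketch` and `truncate_utf8_tail_sketch`, and their signatures, are assumptions made for this example; the crate's own functions may differ in signature and edge-case behavior.

```rust
/// FNV-1a, 64-bit variant: XOR each byte into the hash, then multiply
/// by the FNV prime. Illustrative only; not the crate's implementation.
fn fnv1a64_sketch(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf29ce484222325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Returns at most the last `max_bytes` bytes of `s`, moving the cut
/// point forward to the next character boundary so that no code point
/// is split. Illustrative only; not the crate's implementation.
fn truncate_utf8_tail_sketch(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut start = s.len() - max_bytes;
    // `is_char_boundary(s.len())` is always true, so this loop terminates.
    while !s.is_char_boundary(start) {
        start += 1;
    }
    &s[start..]
}

fn main() {
    // Equal content hashes to the same value, which makes cheap change checks possible.
    let a = fnv1a64_sketch(b"<html>same</html>");
    let b = fnv1a64_sketch(b"<html>same</html>");
    println!("hashes equal: {}", a == b);

    // "é" is two bytes in UTF-8, so a 4-byte tail cannot start in the middle of it.
    println!("{}", truncate_utf8_tail_sketch("héllo", 4));
}
```

Moving the cut forward rather than backward means the result can be shorter than `max_bytes`, but it never exceeds the limit.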
Type Aliases§
- AgentResult - Result type for agent operations.