Crate spider_agent

Expand description

§Spider Agent

A concurrent-safe multimodal agent for web automation and research.

§Features

Concurrent-safe: Designed to be wrapped in Arc for multi-task access
Feature-gated: Only include dependencies you need
Multiple LLM providers: OpenAI, OpenAI-compatible APIs
Multiple search providers: Serper, Brave, Bing, Tavily
Browser automation: Chrome support via chromiumoxide

§Quick Start

use spider_agent::{Agent, AgentConfig};
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let agent = Arc::new(Agent::builder()
        .with_openai("sk-...", "gpt-4o-mini")
        .with_search_serper("serper-key")
        .build()?);

    // Search
    let results = agent.search("rust web frameworks").await?;
    println!("Found {} results", results.len());

    // Extract from first result
    let html = agent.fetch(&results.results[0].url).await?.html;
    let data = agent.extract(&html, "Extract framework name and features").await?;
    println!("{}", serde_json::to_string_pretty(&data)?);

    Ok(())
}

§Concurrent Execution

use spider_agent::Agent;
use std::sync::Arc;

let agent = Arc::new(Agent::builder()
    .with_openai("sk-...", "gpt-4o")
    .with_search_serper("serper-key")
    .with_max_concurrent_llm_calls(10)
    .build()?);

// Execute multiple searches concurrently
let queries = vec!["rust async", "rust web frameworks", "rust databases"];
let mut handles = Vec::new();

for query in queries {
    let agent = agent.clone();
    let query = query.to_string();
    handles.push(tokio::spawn(async move {
        agent.search(&query).await
    }));
}

// Collect results
for handle in handles {
    let result = handle.await??;
    println!("Found {} results", result.results.len());
}

§Feature Flags

openai - OpenAI/OpenAI-compatible LLM provider
chrome - Browser automation via chromiumoxide
search - Base search functionality
search_serper - Serper.dev search provider
search_brave - Brave Search provider
search_bing - Bing Search provider
search_tavily - Tavily AI Search provider
full - All features

Re-exports§

pub use tools::AuthConfig;
pub use tools::CustomTool;
pub use tools::CustomToolRegistry;
pub use tools::CustomToolResult;
pub use tools::HttpMethod;
pub use tools::SpiderCloudToolConfig;
pub use automation::RemoteMultimodalConfigs;
pub use automation::RemoteMultimodalEngine;
pub use automation::EngineError;
pub use automation::EngineResult;
pub use automation::best_effort_parse_json_object;
pub use automation::execute_graph;
pub use automation::cache::CacheStats;
pub use automation::cache::CacheValue;
pub use automation::cache::SmartCache;
pub use automation::executor::BatchExecutor;
pub use automation::executor::ChainExecutor;
pub use automation::executor::PrefetchManager;
pub use automation::router::auto_policy;
pub use automation::router::estimate_tokens;
pub use automation::router::ModelRequirements;
pub use automation::router::ModelRouter;
pub use automation::router::ModelSelector;
pub use automation::router::RoutingDecision;
pub use automation::router::ScoredModel;
pub use automation::router::SelectionStrategy;
pub use automation::router::TaskAnalysis;
pub use automation::router::TaskCategory;

Modules§

automation: Automation module for spider_agent.
categories: URL category constants.
tools: Custom tool support for external API calls.

Structs§

ActResult: Result of a single action execution via act().
ActionRecord: Record of an action taken during automation.
ActionResult: Result of an action execution.
ActionToolSchemas: Generator for automation action tool schemas.
Agent: Multimodal agent for web automation and research.
AgentBuilder: Agent builder for configuring and creating agents.
AgentConfig: Agent configuration.
AgentMemory: Session memory for storing state across operations.
Alternative: An alternative action with its confidence score.
AutomationConfig: Main automation configuration.
AutomationMemory: In-memory storage for agentic automation sessions.
AutomationResult: Result of an automation operation.
AutomationUsage: Token usage tracking for automation operations with granular call tracking.
CaptureProfile: Capture profile for screenshots and HTML.
ChainBuilder: Builder for creating action chains.
ChainContext: Context for evaluating chain conditions.
ChainResult: Result of an action chain execution.
ChainStep: A single step in an action chain.
ChainStepResult: Result of a single step in an action chain.
Checkpoint: A checkpoint condition to verify after a step.
CheckpointResult: Result of a checkpoint verification.
ClipViewport: Clip viewport for screenshots.
CompletionOptions: Options for completion requests.
CompletionResponse: Response from a completion request.
ConcurrentChainConfig: Configuration for concurrent chain execution.
ConcurrentChainResult: Result of executing a concurrent chain.
ConfidenceRetryStrategy: Strategy for retrying based on confidence.
ConfidenceSummary: Summary of confidence statistics.
ConfidenceTracker: Tracker for confidence statistics across an automation session.
ConfidentStep: A step with confidence score and alternatives.
ContentAnalysis: Result of analyzing HTML content.
DependencyGraph: Dependency graph for managing step execution order.
DependentStep: A step in a dependency chain.
DiffStats: Statistics about HTML diff performance.
DiscoveredUrl: A discovered URL with AI-generated metadata.
ElementChange: A single element change.
ExecutionPlan: An execution plan from the LLM.
ExtractionSchema: Schema for structured data extraction.
FetchResult: Result from fetching a URL.
FormField: A field in a form.
FormInfo: Information about a form on the page.
FunctionCall: A function call from the LLM.
FunctionDefinition: OpenAI-compatible function definition.
GeneratedSchema: A generated JSON schema.
HealedSelectorCache: Cache for healed selectors.
HealingDiagnosis: Diagnosis and suggested fix from the LLM.
HealingRequest: A request to heal a failed selector.
HealingResult: Result of a healing attempt.
HealingStats: Statistics about healing operations.
HtmlDiffResult: Result of computing an HTML diff.
InteractiveElement: An interactive element on the page.
MapResult: Result of the map() API call for page discovery.
Message: A message in a conversation.
ModelCapabilities: Model capabilities struct.
ModelEndpoint: A model endpoint override for dual-model routing.
ModelInfoEntry: Detailed model information entry.
ModelPolicy: Policy for selecting models based on cost/quality tradeoffs.
ModelPricing: Pricing in USD.
ModelProfile: Full model profile: capabilities, ranks, pricing, and context window.
ModelRanks: Arena and task-specific rankings, normalized to 0.0-100.0.
MultiPageContext: Multi-page context for synthesis.
NavigationOption: A navigation option on the page.
PageContext: Context for a single page in multi-page synthesis.
PageContribution: Contribution of a single page to the synthesis.
PageExtraction: Extraction from a single page.
PageObservation: Observation of a page’s current state.
PageState: Current page state for re-planning context.
PageStateDiff: Tracker for page state changes across rounds.
PlanExecutionState: State of plan execution.
PlannedStep: A single step in an execution plan.
PlanningModeConfig: Configuration for planning mode.
PromptUrlGate: URL-based prompt gating for per-URL config overrides.
RemoteMultimodalConfig: Runtime configuration for RemoteMultimodalEngine.
ReplanContext: Context for re-planning after a failure.
ResearchOptions: Research options for research tasks.
RetryConfig: Retry configuration.
RetryPolicy: Retry policy for automation operations.
SchemaCache: Cache for generated schemas.
SchemaGenerationRequest: Request to generate a schema from examples.
SearchOptions: Search options for web search.
SelectorCache: Self-healing selector cache.
SelectorCacheEntry: A single entry in the selector cache.
SelfHealingConfig: Configuration for self-healing behavior.
StepResult: Result of executing a single step.
StructuredOutputConfig: Configuration for structured output mode.
SynthesisConfig: Configuration for multi-page synthesis.
SynthesisResult: Result of multi-page synthesis.
TokenUsage: Token usage from a completion.
ToolCall: A tool call from the LLM response.
ToolDefinition: OpenAI-compatible tool definition.
UsageLimits: Usage limits for controlling agent resource consumption.
UsageSnapshot: Snapshot of usage statistics.
UsageStats: Usage statistics for tracking agent operations.
Verification: Verification to run after an action.

Enums§

ActionType: Types of actions that can be performed.
AgentError: Agent error types.
ChainCondition: Condition for conditional execution in action chains.
ChangeType: Type of change to an element.
CheckpointType: Type of checkpoint verification.
CleaningIntent: Intent for HTML cleaning decisions.
CostTier: Cost tier for model selection.
HtmlCleaningMode: HTML cleaning mode.
HtmlCleaningProfile: HTML cleaning profile for content processing.
HtmlDiffMode: Mode for HTML diffing.
LimitType: Type of limit that was exceeded.
MemoryOperation: Memory operation requested by the LLM.
MessageContent: Message content - either text or multi-part.
ReasoningEffort: Reasoning effort level for models that support explicit reasoning controls.
RecoveryStrategy: Recovery strategy for handling failures during automation.
SearchError: Search-specific error types.
SelectorIssueType: Types of selector issues.
TimeRange: Time range for filtering search results.
ToolCallingMode: Mode for how actions should be formatted in LLM requests.
VerificationType: Types of verification checks.
VisionRouteMode: Routing mode that decides when to use the vision vs text model.

Constants§

ACT_SYSTEM_PROMPT: System prompt for the act() single-action API.
CONFIGURATION_SYSTEM_PROMPT: System prompt for configuring a web crawler from natural language.
DEFAULT_SYSTEM_PROMPT: Default system prompt for web automation (iterative). This is the foundation for all web automation tasks - kept lean with core action bindings and agentic reasoning only. Challenge-specific strategies should be injected via system_prompt_extra or skill modules.
EXTRACT_SYSTEM_PROMPT: System prompt for the extract() data extraction API.
MAP_SYSTEM_PROMPT: System prompt for the map() URL discovery API.
MODEL_INFO: Sorted by name for binary search lookup.
OBSERVE_SYSTEM_PROMPT: System prompt for the observe() page understanding API.

Traits§

LLMProvider: LLM provider trait for abstracting different LLM APIs.

Functions§

build_schema_generation_prompt: Build a prompt for LLM-assisted schema generation.
clean_html: Default cleaner (base level).
clean_html_base: Clean the HTML removing CSS and JS (base level).
clean_html_full: Full/aggressive HTML cleaning.
clean_html_raw: Raw passthrough - no cleaning.
clean_html_slim: Slim HTML cleaning - removes heavy elements.
clean_html_with_profile: Clean HTML using a specific profile.
clean_html_with_profile_and_intent: Clean HTML with a specific profile and intent.
extract_assistant_content: Extract the assistant’s text content from an OpenAI-compatible response.
extract_html_context: Extract HTML context around a selector’s expected location.
extract_last_code_block: Extract the LAST json or ``` code block from text.
extract_last_json_array: Extract the last JSON array from text with proper brace matching.
extract_last_json_boundaries: Extract the last balanced JSON object or array from text.
extract_last_json_object: Extract the last JSON object from text with proper brace matching.
extract_usage: Extract token usage from an OpenAI-compatible response.
fnv1a64: FNV-1a 64-bit hash function for cheap content hashing.
generate_schema: Generate a schema from a request.
infer_schema: Infer a JSON schema from a value.
infer_schema_from_examples: Infer a schema from multiple examples, merging field information.
parse_tool_calls: Parse tool calls from an OpenAI-compatible response.
refine_schema: Refine a schema by adding more examples.
smart_clean_html: Smart HTML cleaner that automatically determines the best cleaning level.
tool_calls_to_steps: Convert tool calls to automation step actions.
truncate_utf8_tail: Take the last max_bytes of a UTF-8 string without splitting code points.

Type Aliases§

AgentResult: Result type for agent operations.

Crate spider_agent

Crate spider_agent Copy item path

§Spider Agent

§Features

§Quick Start

§Concurrent Execution

§Feature Flags

Re-exports§

Modules§

Structs§

Enums§

Constants§

Traits§

Functions§

Type Aliases§

Crate spider_agent