The Web Actor Programming Model Whitepaper

This whitepaper describes a new concept for building serverless microapps called Actors, which are easy to develop, share, integrate, and build upon. Actors are a reincarnation of the UNIX philosophy for programs running in the cloud.

By Jan Čurn, Marek Trunkát, Ondra Urban, and the entire Apify team.

Version 0.999 (February 2025)

Introduction

This whitepaper introduces Actors, a new language-agnostic model for building general-purpose web computing and automation programs (also known as agents, functions, or apps). The main goal for Actors is to make it easy for developers to build and ship reusable software tools, which are easy to run, integrate, and build upon. Actors are useful for building web scrapers, crawlers, automations, and AI agents.

Background

Actors were first introduced by Apify in late 2017, as a way to easily build, package, and ship web scraping and web automation jobs to customers. Over the years, Apify has continued to develop the concept and applied it successfully to thousands of real-world use cases in many business areas, well beyond the domain of web scraping.

Building on this experience, we’re releasing this whitepaper to introduce the philosophy of Actors to other developers and to gather feedback on it. We aim to establish the Actor programming model as an open standard, which will help the community build and ship reusable software automation tools more effectively, as well as encourage new implementations of the model in other programming languages.

This whitepaper aims to be the North Star that shows what the Actor programming model is and what operations it should support. However, this document is not an official specification. The specification will be an OpenAPI schema of the Actor system interface, which will enable new independent implementations of both the client libraries and backend systems. The specification is currently a work in progress.

Currently, the most complete implementation of the Actor model is provided by the Apify platform, with SDKs for Node.js and Python, and a command-line interface (CLI). Be aware that these tools might not yet implement all the features of the Actor programming model described in this whitepaper.

Overview

Actors are serverless programs that run in the cloud. They can perform anything from simple actions, such as filling out a web form or sending an email, to complex operations, such as crawling an entire website or removing duplicates from a large dataset. Because Actors can persist their state and be restarted, they can run for as short or as long as necessary, from seconds to hours, or even indefinitely.
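To illustrate what persisting state across restarts can look like, here is a minimal sketch in plain Python. It is not the Actor API: the function `process_items`, the JSON state file, and the `max_steps` parameter (used only to simulate an interrupted run) are all hypothetical names chosen for this example.

```python
import json
import tempfile
from pathlib import Path


def process_items(items, state_path, max_steps=None):
    """Sum the items in order, checkpointing progress so a restart can resume."""
    state_path = Path(state_path)
    # Resume from a previous checkpoint if one exists; otherwise start fresh.
    if state_path.exists():
        state = json.loads(state_path.read_text())
    else:
        state = {"next_index": 0, "total": 0}

    steps = 0
    while state["next_index"] < len(items):
        if max_steps is not None and steps >= max_steps:
            break  # Hypothetical: simulates the run being stopped early.
        state["total"] += items[state["next_index"]]
        state["next_index"] += 1
        # Persist after every item so a restarted run continues from here.
        state_path.write_text(json.dumps(state))
        steps += 1
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "state.json"
        process_items([1, 2, 3, 4], path, max_steps=2)  # stopped after two items
        print(process_items([1, 2, 3, 4], path))  # resumes at the third item
```

The second call does not repeat the first two items, because it reads the checkpoint written by the first call. A real implementation would store the checkpoint in durable storage rather than a local temporary file.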

Basically, Actors are programs packaged as Docker images, which accept a well-defined JSON input, perform an action, and optionally produce a well-defined JSON output.
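The input, action, output shape described above can be sketched in a few lines of Python. This is only an illustration under assumed names: the `run` function and the `urls`, `uniqueUrls`, and `removedCount` fields are hypothetical and are not part of any Actor specification.

```python
import json


def run(input_json: str) -> str:
    """Parse JSON input, perform an action, and return JSON output."""
    actor_input = json.loads(input_json)
    urls = actor_input["urls"]
    # The "action": remove duplicate URLs while keeping first-seen order.
    unique_urls = list(dict.fromkeys(urls))
    output = {
        "uniqueUrls": unique_urls,
        "removedCount": len(urls) - len(unique_urls),
    }
    return json.dumps(output)


if __name__ == "__main__":
    print(run('{"urls": ["https://a.example", "https://b.example", "https://a.example"]}'))
```

Because both sides of the program are plain JSON, the same logic can be called from a user interface, an API, or another program without any changes.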

Actors have the following elements:

A Dockerfile that specifies where the Actor’s source code is, how to build it, and how to run it.
Documentation in the form of a README.md file.
Input and output schemas that describe what input the Actor requires, and what results it produces.
Access to an out-of-the-box storage system for Actor data, results, and files.
Metadata such as the Actor name, description, author, and version.

The documentation and the input/output schemas make it easy for people to understand what the Actor does, enter the required inputs through either the user interface or the API, and integrate the Actor’s results into their other workflows. Actors can easily call and interact with each other, which makes it possible to build more complex systems on top of simple ones.
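One reason a schema makes inputs easy to enter is that it can be checked mechanically before a run starts. The sketch below assumes a deliberately simplified, hypothetical schema layout (a `required` list and a `properties` map with a `type` per field); it is not the actual Actor input schema format, and `validate_input` is a name invented for this example.

```python
# Maps the hypothetical schema type names to Python types.
TYPE_MAP = {"string": str, "integer": int, "boolean": bool, "array": list}


def matches(value, type_name):
    """Check a value against a schema type name."""
    if type_name == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    return isinstance(value, TYPE_MAP[type_name])


def validate_input(actor_input, schema):
    """Return a list of problems; an empty list means the input is valid."""
    problems = []
    for name in schema.get("required", []):
        if name not in actor_input:
            problems.append(f"missing required field: {name}")
    for name, value in actor_input.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unknown field: {name}")
        elif not matches(value, spec["type"]):
            problems.append(f"field {name} must be of type {spec['type']}")
    return problems


if __name__ == "__main__":
    schema = {
        "required": ["startUrl"],
        "properties": {
            "startUrl": {"type": "string"},
            "maxPages": {"type": "integer"},
        },
    }
    print(validate_input({"maxPages": "ten"}, schema))
```

The same schema that drives this check could also be used to generate an input form or API documentation, which is what lets people use a program without reading its source code.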