How We Built a Screenshot API with Headless Chrome

When we started building SavePage.io, the core task was easy to state: take a URL, render it in a browser, capture a screenshot, and return the image. The hard part was doing this reliably at scale.

Browser management

The foundation of the system is a pool of headless Chromium instances. Each instance runs in an isolated container with its own memory allocation. When an API request comes in, we assign it to an available browser instance from the pool.
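As a rough illustration of the pool idea, here is a minimal sketch in Python. The names BrowserInstance and BrowserPool are hypothetical, and the instance is a plain stand-in rather than a real containerized Chromium; the sketch only shows how a request borrows an idle instance and hands it back.

```python
import queue
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class BrowserInstance:
    """Stand-in for one containerized headless Chromium (hypothetical)."""
    instance_id: int
    requests_served: int = 0


class BrowserPool:
    """Fixed-size pool; idle instances wait in a thread-safe queue."""

    def __init__(self, size: int) -> None:
        self._idle: "queue.Queue[BrowserInstance]" = queue.Queue()
        for i in range(size):
            self._idle.put(BrowserInstance(instance_id=i))

    @contextmanager
    def acquire(self, timeout: float = 5.0):
        # Blocks until an instance is free; raises queue.Empty on timeout.
        browser = self._idle.get(timeout=timeout)
        try:
            browser.requests_served += 1
            yield browser
        finally:
            # Always return the instance, even if rendering raised.
            self._idle.put(browser)

    def idle_count(self) -> int:
        return self._idle.qsize()
```

A worker would wrap each render in "with pool.acquire() as browser:", so the instance goes back to the pool whether or not the render succeeds.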

We use the Chrome DevTools Protocol (CDP) to control the browser. This gives us fine-grained control over viewport dimensions, device emulation, page loading, and screenshot capture.
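To make this concrete: CDP commands are JSON messages with an id, a method, and params, sent over a WebSocket to the browser. The sketch below only builds the messages for one screenshot and does not open a connection. The helper names cdp_command and screenshot_commands are hypothetical; the three method names are standard CDP methods.

```python
import itertools
import json
from typing import Iterator, List


def cdp_command(ids: Iterator[int], method: str, params: dict) -> str:
    """Serialize one CDP command; each command needs a unique id."""
    return json.dumps({"id": next(ids), "method": method, "params": params})


def screenshot_commands(url: str, width: int, height: int,
                        image_format: str = "png") -> List[str]:
    """Build the command sequence for one capture (illustrative only)."""
    ids = itertools.count(1)
    return [
        # Set the viewport before navigating so layout uses the right size.
        cdp_command(ids, "Emulation.setDeviceMetricsOverride", {
            "width": width,
            "height": height,
            "deviceScaleFactor": 1,
            "mobile": False,
        }),
        cdp_command(ids, "Page.navigate", {"url": url}),
        # The response to this command carries the image as base64 data.
        cdp_command(ids, "Page.captureScreenshot", {"format": image_format}),
    ]
```

In a real client the capture command would be sent only after the page has finished loading, not immediately after navigation.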

The render pipeline

A typical screenshot request follows this path:

1. API server validates the request and checks rate limits
2. Request enters the render queue
3. A worker picks up the request and acquires a browser instance
4. The browser navigates to the URL and waits for the page to load
5. If a delay is specified, the worker waits the additional time
6. The screenshot is captured at the specified dimensions and format
7. The image is uploaded to object storage
8. The CDN URL is returned to the caller
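The path above can be sketched as a single function. This is a simplified illustration, not our production code: the queue and worker hand-off are left out, and the rate limiter, browser, and storage are passed in as callables. All names here (ScreenshotRequest, handle_request, and the error strings) are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class ScreenshotRequest:
    url: str
    width: int = 1280
    height: int = 800
    image_format: str = "png"
    delay_seconds: float = 0.0


def handle_request(
    req: ScreenshotRequest,
    *,
    allow: Callable[[ScreenshotRequest], bool],      # rate-limit check
    navigate: Callable[[ScreenshotRequest], None],   # load page, wait for it
    capture: Callable[[ScreenshotRequest], bytes],   # take the screenshot
    upload: Callable[[bytes], str],                  # store image, return URL
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    # Validate the request and check rate limits.
    if not req.url.startswith(("http://", "https://")):
        return {"ok": False, "error": "invalid_url"}
    if not allow(req):
        return {"ok": False, "error": "rate_limited"}
    # Navigate and wait for the page to load.
    navigate(req)
    # Optional extra delay requested by the caller.
    if req.delay_seconds > 0:
        sleep(req.delay_seconds)
    # Capture, upload, and return the CDN URL.
    image = capture(req)
    return {"ok": True, "url": upload(image)}
```

Passing the collaborators in as callables keeps the control flow testable without a real browser.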

Page load detection

Knowing when a page is "done loading" is harder than it sounds. We use a combination of signals:

The load event fires (the document and the resources it references, such as stylesheets and images, have loaded)
The networkidle0 heuristic (no in-flight network requests for at least 500 ms)
A maximum timeout as a safety net

JavaScript-heavy single-page applications can keep rendering after these signals fire, so the delay parameter gives users explicit control over when the screenshot is taken.
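One way to combine these signals is sketched below. How the signals interact is an assumption made for illustration: the page counts as ready once the load event has fired and the network has been idle for the window, the optional delay is added on top, and the maximum timeout forces a capture regardless. The names LoadState and should_capture and the 30-second default are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoadState:
    started_at: float                  # when navigation began (seconds)
    last_network_activity_at: float    # last request start or finish
    load_fired_at: Optional[float] = None
    inflight_requests: int = 0


def should_capture(state: LoadState, now: float, *,
                   idle_window: float = 0.5,
                   max_timeout: float = 30.0,
                   delay: float = 0.0) -> bool:
    # Safety net: never wait longer than max_timeout.
    if now - state.started_at >= max_timeout:
        return True
    # Both signals are required: load event fired, nothing in flight.
    if state.load_fired_at is None or state.inflight_requests > 0:
        return False
    # Ready once the network has stayed quiet for the whole idle window.
    ready_at = max(state.load_fired_at,
                   state.last_network_activity_at + idle_window)
    # The caller's delay is counted from the moment the page is ready.
    return now >= ready_at + delay
```

A worker would poll this function (or recompute it on each browser event) with a monotonic clock as the time source.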

Image delivery

Screenshots are stored in distributed object storage and served through a global CDN. This means the captured image is available at low latency regardless of where the API caller is located.

Free plan images expire after 24 hours. Pro plan images are retained for 30 days. The CDN handles cache invalidation automatically.
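The retention rule is simple enough to express as a lookup. In this sketch the plan identifiers "free" and "pro" and the function names are hypothetical; only the two retention periods come from the text above.

```python
from datetime import datetime, timedelta, timezone

# Retention periods per plan, as described above.
RETENTION = {
    "free": timedelta(hours=24),
    "pro": timedelta(days=30),
}


def expires_at(plan: str, captured_at: datetime) -> datetime:
    """Return the moment a screenshot stops being available."""
    return captured_at + RETENTION[plan]


def is_expired(plan: str, captured_at: datetime, now: datetime) -> bool:
    return now >= expires_at(plan, captured_at)
```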

Error handling

Not every URL renders cleanly. Common failure modes include:

DNS resolution failures (domain does not exist)
Connection timeouts (server unreachable)
SSL certificate errors
Pages that redirect infinitely
Pages that crash the renderer (out-of-memory)

Each of these cases returns a specific error code and message so the API caller knows what went wrong.
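A sketch of how such a mapping could look follows. Chromium reports navigation failures as net:: error strings, and the ones used here are real Chromium error names; the API-side codes and messages, and the function name classify_failure, are hypothetical and chosen only for illustration.

```python
# Chromium net error string -> (hypothetical API error code, message)
NET_ERROR_TO_API_ERROR = {
    "net::ERR_NAME_NOT_RESOLVED": (
        "dns_resolution_failed", "The domain could not be resolved."),
    "net::ERR_CONNECTION_TIMED_OUT": (
        "connection_timeout", "The server did not respond in time."),
    "net::ERR_TOO_MANY_REDIRECTS": (
        "too_many_redirects", "The page redirected too many times."),
}


def classify_failure(error_text: str, renderer_crashed: bool = False) -> dict:
    """Translate a browser-level failure into an API error payload."""
    if renderer_crashed:
        return {"code": "renderer_crashed",
                "message": "The page crashed the renderer."}
    # Certificate problems share a common prefix (date, authority, name...).
    if error_text.startswith("net::ERR_CERT_"):
        return {"code": "ssl_certificate_error",
                "message": "The site's SSL certificate is not valid."}
    code, message = NET_ERROR_TO_API_ERROR.get(
        error_text, ("render_failed", "The page could not be rendered."))
    return {"code": code, "message": message}
```

The fallback entry matters: an unrecognized failure should still produce a well-formed error rather than an unhandled exception.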

What we learned

The biggest lesson was about browser stability. Chromium processes can leak memory, hang on certain pages, or crash unexpectedly. Our solution is aggressive recycling: each browser instance handles a fixed number of requests before being replaced with a fresh one.
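The recycling rule can be sketched in a few lines. This is an illustration under assumptions: RecyclingSlot is a hypothetical name, launch is any callable that starts a fresh browser, and the browser object only needs a close() method.

```python
from typing import Any, Callable


class RecyclingSlot:
    """Holds one browser and replaces it after max_requests uses."""

    def __init__(self, launch: Callable[[], Any], max_requests: int) -> None:
        self._launch = launch
        self._max_requests = max_requests
        self._browser = launch()
        self._served = 0

    def checkout(self) -> Any:
        # Replace the browser before it serves more than its quota, so
        # leaked memory and stuck state are discarded with the old process.
        if self._served >= self._max_requests:
            self._browser.close()
            self._browser = self._launch()
            self._served = 0
        self._served += 1
        return self._browser
```

Recycling on a request count is a blunt tool, but it bounds how much damage a slow leak can do without needing to detect the leak itself.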

The second lesson was about queue management. Without backpressure, a spike in requests can overwhelm the browser pool. We use a bounded queue with configurable concurrency limits per API key tier.
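A minimal sketch of this kind of backpressure is shown below. The tier names, the limits, and the RenderQueue class are hypothetical, and the sketch assumes the tier limit caps how many jobs a single API key may have queued or rendering at once. A rejected submit would typically become an HTTP 429 response.

```python
import queue
import threading
from typing import Any, Dict, Tuple


class RenderQueue:
    """Bounded queue that rejects work instead of growing without limit."""

    def __init__(self, max_pending: int, tier_limits: Dict[str, int]) -> None:
        self._pending: "queue.Queue[Tuple[str, Any]]" = queue.Queue(
            maxsize=max_pending)
        self._tier_limits = tier_limits
        self._in_flight: Dict[str, int] = {}
        self._lock = threading.Lock()

    def submit(self, api_key: str, tier: str, job: Any) -> bool:
        with self._lock:
            # Per-key concurrency limit, set by the key's tier.
            if self._in_flight.get(api_key, 0) >= self._tier_limits[tier]:
                return False
            try:
                # Non-blocking put: a full queue rejects the request.
                self._pending.put_nowait((api_key, job))
            except queue.Full:
                return False
            self._in_flight[api_key] = self._in_flight.get(api_key, 0) + 1
            return True

    def take(self) -> Tuple[str, Any]:
        # Called by workers; raises queue.Empty when there is no work.
        return self._pending.get_nowait()

    def finish(self, api_key: str) -> None:
        # Called when a render completes, freeing one slot for the key.
        with self._lock:
            self._in_flight[api_key] -= 1
```

Rejecting early keeps latency predictable for accepted requests: the browser pool never sees more work than the queue bound allows.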

These patterns have held up well as the service has grown from a few hundred requests per day to tens of thousands.