PinchTab is a 12MB Go binary that wraps Chrome DevTools Protocol (CDP) to provide AI agents with browser control via a simple REST API.

Architecture Overview

  • Self-hosted mode: PinchTab launches and manages its own Chrome instance
  • Remote Chrome mode: PinchTab connects to an existing Chrome instance via CDP_URL

Self-hosted Mode (Default)

┌─────────────┐     HTTP      ┌──────────────┐      CDP       ┌──────────────┐
│   AI Agent  │ ────────────▶ │   PinchTab   │ ─────────────▶ │    Chrome    │
│  (any LLM)  │ ◀──────────── │  (Go binary) │ ◀───────────── │ self-launched│
└─────────────┘    JSON/text  └──────────────┘   WebSocket    └──────────────┘

Remote Chrome Mode (CDP_URL)

┌─────────────┐     HTTP      ┌──────────────┐      CDP       ┌──────────────┐
│  Multiple   │ ────────────▶ │  Multiple    │ ─────────────▶ │  Shared      │
│  Agents     │ ◀──────────── │  PinchTab    │ ◀───────────── │  Chrome      │
│             │    JSON/text  │  instances   │   WebSocket    │  instance    │
└─────────────┘               └──────────────┘                └──────────────┘
Agents never touch CDP directly. They send HTTP requests and receive JSON responses. The accessibility tree (a11y) is the primary interface — not screenshots, not raw DOM.
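From the agent's side, the whole exchange is plain HTTP. The sketch below shows what that could look like in Go. The /snapshot and /action paths and the {"ref", "kind"} body come from this page; the HTTP methods, the shape of the snapshot JSON, and the stub server that stands in for a running PinchTab instance are assumptions made so the example runs offline.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// action mirrors the request body shown on this page: {"ref": "e5", "kind": "click"}.
type action struct {
	Ref  string `json:"ref"`
	Kind string `json:"kind"`
}

// snapshot fetches the accessibility tree as raw bytes.
// Assumption: /snapshot answers GET requests.
func snapshot(baseURL string) ([]byte, error) {
	resp, err := http.Get(baseURL + "/snapshot")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("snapshot: unexpected status %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// act posts an action that targets a ref from the last snapshot and returns the HTTP status.
// Assumption: /action accepts a JSON POST.
func act(baseURL string, a action) (int, error) {
	body, err := json.Marshal(a)
	if err != nil {
		return 0, err
	}
	resp, err := http.Post(baseURL+"/action", "application/json", bytes.NewReader(body))
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// newStubServer stands in for a PinchTab instance; its responses are made up.
func newStubServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/snapshot", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `[{"ref":"e5","role":"button","name":"Submit"}]`)
	})
	mux.HandleFunc("/action", func(w http.ResponseWriter, r *http.Request) {
		var a action
		if err := json.NewDecoder(r.Body).Decode(&a); err != nil || a.Ref == "" {
			http.Error(w, "bad action", http.StatusBadRequest)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newStubServer()
	defer srv.Close()

	tree, err := snapshot(srv.URL)
	if err != nil {
		panic(err)
	}
	fmt.Println("snapshot:", string(tree))

	status, err := act(srv.URL, action{Ref: "e5", Kind: "click"})
	if err != nil {
		panic(err)
	}
	fmt.Println("action status:", status)
}
```

Against a real instance, only the base URL would change; the agent never needs a WebSocket or any CDP types.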

Design Principles

1. A11y tree over screenshots: 4x cheaper in tokens, works with any LLM
2. HTTP over WebSocket: stateless requests, no connection management for agents
3. Ref stability: snapshot refs (e0, e1…) are cached and reused by action endpoints
4. Self-contained: launches its own Chrome, manages its own state, zero config needed
5. Decoupled architecture: interface-driven design for testability and maintainability

Project Layout

The project follows the standard Go internal/ pattern to ensure encapsulation and clean boundaries:
pinchtab/
├── cmd/pinchtab/        # Application entry points and CLI commands
├── internal/
│   ├── bridge/          # Core CDP logic, tab management, and state
│   ├── handlers/        # HTTP API handlers and middleware
│   ├── orchestrator/    # Multi-instance lifecycle and process management
│   ├── profiles/        # Chrome profile management and identity discovery
│   ├── dashboard/       # Backend logic and static assets for web UI
│   ├── assets/          # Centralized embedded files (stealth scripts, HTML)
│   ├── human/           # Human-like interaction simulation (Bezier mouse, typing)
│   ├── config/          # Centralized configuration management
│   └── web/             # Shared HTTP and JSON utilities
├── Dockerfile           # Alpine + Chromium container image
└── scripts/             # Deployment and automation scripts

Core Components

Bridge (internal/bridge)

The central state holder. Owns the Chrome browser context, tab registry, and snapshot caches. It implements the BridgeAPI interface. Key responsibilities:
  • Tab lifecycle: CreateTab, CloseTab, TabContext (an empty tab ID resolves to the first tab)
  • Ref caching: each tab’s last snapshot is cached. When /action receives ref: "e5", it looks up the cached BackendDOMNodeID without re-fetching the a11y tree
  • State logic: diffing snapshots and managing session persistence (SaveState/RestoreState)
// Create a new tab
tabID := bridge.CreateTab(ctx, "https://example.com")

// Get tab context (auto-resolves to first tab if ID is empty)
tabCtx := bridge.TabContext(ctx, "")

// Close tab
bridge.CloseTab(ctx, tabID)
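To make the "empty ID resolves to the first tab" rule concrete, here is a minimal sketch of a tab registry. It is not the Bridge's actual code: the tabRegistry type, its methods, and the generated tab IDs are hypothetical, and the tab limit simply mirrors the BRIDGE_MAX_TABS default of 20 mentioned elsewhere on this page.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errTabLimit = errors.New("tab limit reached")
	errNoTab    = errors.New("tab not found")
)

// tabRegistry is a hypothetical stand-in for the Bridge's tab registry.
type tabRegistry struct {
	order   []string          // tab IDs in creation order
	urls    map[string]string // tab ID -> URL the tab was opened with
	maxTabs int
	nextID  int
}

func newTabRegistry(maxTabs int) *tabRegistry {
	return &tabRegistry{urls: make(map[string]string), maxTabs: maxTabs}
}

// create registers a new tab and refuses once the limit is reached.
func (r *tabRegistry) create(url string) (string, error) {
	if len(r.order) >= r.maxTabs {
		return "", errTabLimit
	}
	r.nextID++
	id := fmt.Sprintf("tab%d", r.nextID)
	r.order = append(r.order, id)
	r.urls[id] = url
	return id, nil
}

// resolve maps an empty ID to the oldest open tab; otherwise it checks that the ID exists.
func (r *tabRegistry) resolve(id string) (string, error) {
	if id == "" {
		if len(r.order) == 0 {
			return "", errNoTab
		}
		return r.order[0], nil
	}
	if _, ok := r.urls[id]; !ok {
		return "", errNoTab
	}
	return id, nil
}

// remove drops a tab from the registry.
func (r *tabRegistry) remove(id string) error {
	if _, ok := r.urls[id]; !ok {
		return errNoTab
	}
	delete(r.urls, id)
	for i, t := range r.order {
		if t == id {
			r.order = append(r.order[:i], r.order[i+1:]...)
			break
		}
	}
	return nil
}

func main() {
	reg := newTabRegistry(20)
	first, _ := reg.create("https://example.com")
	second, _ := reg.create("https://example.org")

	got, _ := reg.resolve("")
	fmt.Println("empty ID resolves to first tab:", got == first)

	_ = reg.remove(first)
	got, _ = reg.resolve("")
	fmt.Println("after closing it, resolves to second tab:", got == second)
}
```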

Orchestrator (internal/orchestrator)

Manages multiple isolated browser instances. Uses a HostRunner interface to decouple business logic from OS process management. Key responsibilities:
  • Instance registry — Tracking running instances, their ports, and statuses
  • Process management — Spawning, signaling, and stopping instances
  • Health monitoring — Probing instance health via HTTP
See: internal/orchestrator/orchestrator.go
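Health probing over HTTP can be sketched as a poll loop with a deadline. The code below is an illustration, not the code in orchestrator.go: waitHealthy, probeStub, the 10ms interval, and the "200 OK means ready" rule are all assumptions, and a stub server plays the role of an instance that is still starting up.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// waitHealthy polls url until it answers 200 OK or ctx expires.
func waitHealthy(ctx context.Context, url string, interval time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
		if err != nil {
			return err
		}
		resp, err := http.DefaultClient.Do(req)
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return nil
			}
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("instance did not become healthy: %w", ctx.Err())
		case <-ticker.C:
			// Try again on the next tick.
		}
	}
}

// probeStub starts a fake instance whose health endpoint fails failCount times
// before succeeding, then waits for it with the given timeout in milliseconds.
func probeStub(failCount int, timeoutMs int) error {
	var calls int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.AddInt32(&calls, 1) <= int32(failCount) {
			http.Error(w, "starting", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMs)*time.Millisecond)
	defer cancel()
	return waitHealthy(ctx, srv.URL+"/health", 10*time.Millisecond)
}

func main() {
	fmt.Println("ready after a few failed probes:", probeStub(3, 2000) == nil)
	fmt.Println("gives up when the deadline passes:", probeStub(1000, 100) != nil)
}
```

Tying the loop to a context keeps the caller in charge of how long a slow instance is allowed to take.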

Profiles (internal/profiles)

Handles Chrome user data directories and metadata. Key responsibilities:
  • CRUD operations — Creating, importing, and resetting profiles
  • Identity discovery — Parsing internal Chrome JSON files to find user identity info
  • Activity tracking — Recording and analyzing agent actions per profile
See: internal/profiles/profiles.go
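This page does not say which Chrome files identity discovery reads. As one plausible illustration, the sketch below parses a JSON document shaped like Chrome's "Local State" file, which keeps per-profile metadata under profile.info_cache. That layout is Chrome-internal and may change between versions, the sample data is invented, and parseIdentities is a hypothetical helper rather than a PinchTab function.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// profileInfo holds the two identity fields this sketch cares about.
type profileInfo struct {
	Name     string `json:"name"`      // display name of the profile
	UserName string `json:"user_name"` // signed-in account, empty if none
}

// localState models only the part of the file that is read here.
type localState struct {
	Profile struct {
		InfoCache map[string]profileInfo `json:"info_cache"`
	} `json:"profile"`
}

// parseIdentities returns identity info keyed by profile directory name.
func parseIdentities(data []byte) (map[string]profileInfo, error) {
	var ls localState
	if err := json.Unmarshal(data, &ls); err != nil {
		return nil, fmt.Errorf("parse Local State: %w", err)
	}
	return ls.Profile.InfoCache, nil
}

// sampleLocalState is invented test data, not output captured from Chrome.
const sampleLocalState = `{
  "profile": {
    "info_cache": {
      "Default":   {"name": "Work",    "user_name": "agent@example.com"},
      "Profile 1": {"name": "Scratch", "user_name": ""}
    }
  }
}`

func main() {
	ids, err := parseIdentities([]byte(sampleLocalState))
	if err != nil {
		panic(err)
	}
	for dir, info := range ids {
		fmt.Printf("%s: name=%q user=%q\n", dir, info.Name, info.UserName)
	}
}
```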

Snapshot Pipeline (internal/bridge/snapshot.go)

The accessibility tree is PinchTab’s core abstraction. Flow:
Chrome a11y tree (CDP)
       │
       ▼
  Raw JSON parse (RawAXNode)     ← Manual parsing to avoid cdproto crash
       │                            on "uninteresting" PropertyName values
       ▼
  Flatten to []A11yNode           ← DFS walk, assign refs (e0, e1, e2...)
       │
       ├──▶ JSON (default)        ← Full structured output
       ├──▶ Text (indented tree)  ← Low-token format for agents
       └──▶ YAML                  ← Alternative structured format
Ref caching: When /snapshot is called, the ref→nodeID mapping is stored per tab. When /action receives {"ref": "e5", "kind": "click"}, it looks up e5 in the cache.
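The flatten step and the ref cache can be sketched together. This is a simplified illustration, not the code in snapshot.go: axNode and flatNode are hypothetical types with far fewer fields than RawAXNode and A11yNode, every node gets a ref here (the real pipeline may filter nodes first), and the text format is invented.

```go
package main

import (
	"fmt"
	"strings"
)

// axNode is a simplified, hypothetical accessibility tree node.
type axNode struct {
	Role          string
	Name          string
	BackendNodeID int64
	Children      []*axNode
}

// flatNode is one entry of the flattened snapshot.
type flatNode struct {
	Ref   string
	Role  string
	Name  string
	Depth int
}

// flatten walks the tree depth-first in pre-order, assigns refs e0, e1, ...
// and records the ref -> backend node ID mapping that later actions look up.
func flatten(root *axNode) ([]flatNode, map[string]int64) {
	var nodes []flatNode
	refs := make(map[string]int64)
	var walk func(n *axNode, depth int)
	walk = func(n *axNode, depth int) {
		if n == nil {
			return
		}
		ref := fmt.Sprintf("e%d", len(nodes))
		nodes = append(nodes, flatNode{Ref: ref, Role: n.Role, Name: n.Name, Depth: depth})
		refs[ref] = n.BackendNodeID
		for _, child := range n.Children {
			walk(child, depth+1)
		}
	}
	walk(root, 0)
	return nodes, refs
}

// renderText produces an indented, low-token view of the flattened nodes.
func renderText(nodes []flatNode) string {
	var b strings.Builder
	for _, n := range nodes {
		fmt.Fprintf(&b, "%s%s %s %q\n", strings.Repeat("  ", n.Depth), n.Ref, n.Role, n.Name)
	}
	return b.String()
}

// sampleTree builds a tiny page: a heading and a form containing one button.
func sampleTree() *axNode {
	return &axNode{Role: "WebArea", Name: "Example", BackendNodeID: 1, Children: []*axNode{
		{Role: "heading", Name: "Welcome", BackendNodeID: 2},
		{Role: "form", BackendNodeID: 3, Children: []*axNode{
			{Role: "button", Name: "Submit", BackendNodeID: 4},
		}},
	}}
}

func main() {
	nodes, refs := flatten(sampleTree())
	fmt.Print(renderText(nodes))
	fmt.Println("e3 resolves to backend node", refs["e3"])
}
```

Because the map is built during the same walk that assigns refs, a later click on e3 needs only a map lookup, not another tree fetch.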

Human Interaction (internal/human)

Two main simulation engines for anti-detection:
MouseMove: cubic Bézier curve from A to B
  • Random control points for natural curvature
  • Step count scales with distance (5-30 steps)
  • Per-step jitter and variable timing
Type: keystroke-level simulation
  • Base delay: 80ms/char (40ms in fast mode)
  • Random long pauses (“thinking”)
  • Simulated typos and backspace corrections
// Natural curved mouse movement from current position to target
human.MouseMove(ctx, targetX, targetY)
// Generates bezier curve with random control points
// Simulates human-like movement with jitter
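The path generation behind a call like this can be sketched with the standard cubic Bézier formula. The sketch is an approximation under stated assumptions: the 5-30 step clamp comes from the list above, but the "one step per 20 pixels" scaling, the size of the random control-point offsets, and all function names are invented, and per-step jitter and timing are left out so the endpoints stay exact.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

type point struct{ X, Y float64 }

// stepCount scales with distance and is clamped to the 5-30 range.
// Assumption: one step per 20 pixels of travel.
func stepCount(from, to point) int {
	steps := int(math.Hypot(to.X-from.X, to.Y-from.Y) / 20)
	if steps < 5 {
		return 5
	}
	if steps > 30 {
		return 30
	}
	return steps
}

// cubicBezier evaluates the curve with control points p0..p3 at t in [0, 1].
func cubicBezier(p0, p1, p2, p3 point, t float64) point {
	u := 1 - t
	a, b, c, d := u*u*u, 3*u*u*t, 3*u*t*t, t*t*t
	return point{
		X: a*p0.X + b*p1.X + c*p2.X + d*p3.X,
		Y: a*p0.Y + b*p1.Y + c*p2.Y + d*p3.Y,
	}
}

// mousePath returns the points to visit, starting at from and ending at to.
func mousePath(rng *rand.Rand, from, to point) []point {
	// Random offsets (up to 50px each way) pull the control points off the straight line.
	offset := func() float64 { return (rng.Float64() - 0.5) * 100 }
	c1 := point{from.X + (to.X-from.X)/3 + offset(), from.Y + (to.Y-from.Y)/3 + offset()}
	c2 := point{from.X + 2*(to.X-from.X)/3 + offset(), from.Y + 2*(to.Y-from.Y)/3 + offset()}

	steps := stepCount(from, to)
	path := make([]point, 0, steps+1)
	for i := 0; i <= steps; i++ {
		path = append(path, cubicBezier(from, c1, c2, to, float64(i)/float64(steps)))
	}
	return path
}

// demoPath uses a fixed seed so repeated runs produce the same curve.
func demoPath(from, to point) []point {
	return mousePath(rand.New(rand.NewSource(1)), from, to)
}

func main() {
	path := demoPath(point{0, 0}, point{410, 300})
	fmt.Println("points on path:", len(path))
	fmt.Printf("midpoint: (%.1f, %.1f)\n", path[len(path)/2].X, path[len(path)/2].Y)
}
```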

Deployment Modes

go build -o pinchtab ./cmd/pinchtab

Docker

docker build -t pinchtab .

CDP Architecture

PinchTab sits between your tools/agents and Chrome:
┌─────────────────────────────────────────┐
│         Your Tool/Agent                 │
│   (CLI, curl, Python, Node.js, etc.)    │
└──────────────┬──────────────────────────┘
               │
               │ HTTP
               ▼
┌─────────────────────────────────────────┐
│    PinchTab HTTP Server (Go)            │
│  ┌─────────────────────────────────┐    │
│  │  Tab Manager                    │    │
│  │  (tracks tabs + sessions)       │    │
│  └─────────────────────────────────┘    │
│  ┌─────────────────────────────────┐    │
│  │  Chrome DevTools Protocol (CDP) │    │
│  └─────────────────────────────────┘    │
└──────────────┬──────────────────────────┘

               │ CDP WebSocket

┌─────────────────────────────────────────┐
│        Chrome Browser                   │
│  (Headless, headed, or external)        │
└─────────────────────────────────────────┘
PinchTab wraps Chrome’s DevTools Protocol (CDP) to translate HTTP requests into CDP commands, manage browser state, and deliver structured responses (accessibility trees, screenshots, PDFs) back to your agents.
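On the server side, the HTTP layer can stay ignorant of CDP by depending on a small interface. The sketch below illustrates that shape only: snapshotter is a hypothetical slice of the bridge interface, fakeBridge replaces the CDP-backed implementation, and the tabId query parameter is an invented name rather than part of the documented API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// node is a simplified snapshot entry.
type node struct {
	Ref  string `json:"ref"`
	Role string `json:"role"`
	Name string `json:"name"`
}

// snapshotter is a hypothetical interface holding just what this handler needs.
type snapshotter interface {
	Snapshot(tabID string) ([]node, error)
}

// fakeBridge satisfies snapshotter without a browser; a real one would issue CDP calls.
type fakeBridge struct{ nodes []node }

func (f fakeBridge) Snapshot(tabID string) ([]node, error) {
	if tabID == "missing" {
		return nil, fmt.Errorf("tab %q not found", tabID)
	}
	return f.nodes, nil
}

// snapshotHandler turns an HTTP request into a bridge call and a JSON response.
func snapshotHandler(b snapshotter) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		nodes, err := b.Snapshot(r.URL.Query().Get("tabId"))
		if err != nil {
			http.Error(w, err.Error(), http.StatusNotFound)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(nodes)
	}
}

// serve runs one GET request through the handler in memory.
func serve(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	h := snapshotHandler(fakeBridge{nodes: []node{{Ref: "e0", Role: "button", Name: "Submit"}}})

	status, body := serve(h, "/snapshot")
	fmt.Printf("%d %s", status, body)

	status, _ = serve(h, "/snapshot?tabId=missing")
	fmt.Println(status)
}
```

Swapping fakeBridge for a CDP-backed type would not change the handler, which is what makes the HTTP layer testable without Chrome.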

Instance Orchestration

The Orchestrator handles the lifecycle of multiple independent Chrome processes:

  • Process isolation: each instance runs as a separate OS process with its own PID
  • Health monitoring: after launch, the Orchestrator polls the instance’s /health endpoint until it is ready
  • Port management: each instance is assigned a unique port (9868-9968)
  • Resilience: Chrome crashes are handled with lock file cleanup and retry logic
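Unique port assignment within a fixed range can be sketched as a small pool. The 9868-9968 range is from this page; the portPool type and its lowest-free-port policy are assumptions. The sketch is not safe for concurrent use and does not check whether the operating system has the port free, both of which a real orchestrator would likely need.

```go
package main

import (
	"errors"
	"fmt"
)

var errNoFreePort = errors.New("no free port in range")

// portPool hands out each port in [first, last] to at most one instance at a time.
type portPool struct {
	first, last int
	inUse       map[int]bool
}

func newPortPool(first, last int) *portPool {
	return &portPool{first: first, last: last, inUse: make(map[int]bool)}
}

// acquire returns the lowest port that is not currently assigned.
func (p *portPool) acquire() (int, error) {
	for port := p.first; port <= p.last; port++ {
		if !p.inUse[port] {
			p.inUse[port] = true
			return port, nil
		}
	}
	return 0, errNoFreePort
}

// release makes a port available again, for example after an instance stops.
func (p *portPool) release(port int) {
	delete(p.inUse, port)
}

func main() {
	pool := newPortPool(9868, 9968)
	a, _ := pool.acquire()
	b, _ := pool.acquire()
	fmt.Println("assigned ports:", a, b)

	pool.release(a)
	c, _ := pool.acquire()
	fmt.Println("released port is reused:", c == a)
}
```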

Pre-Flight Stealth Injection

Before any website loads, PinchTab performs pre-flight injection:
1. AddScriptToEvaluateOnNewDocument: the stealth.js script is registered to execute before any other script on a page
2. Mask navigator.webdriver: hides automation markers before websites can detect them
3. Spoof hardware identifiers: randomizes CPU cores, memory, and other hardware fingerprints
4. Environment spoofing: applies timezone overrides and locale settings immediately after startup
See: internal/bridge/init.go
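At the protocol level, step 1 corresponds to the CDP command Page.addScriptToEvaluateOnNewDocument, which takes the script text in its source parameter. The sketch below only builds that JSON message with the standard library; it does not send it to a browser. The one-line script is an illustrative stand-in for step 2, not the contents of stealth.js, and the helper names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cdpCommand is the generic envelope of a CDP request sent over the WebSocket.
type cdpCommand struct {
	ID     int               `json:"id"`
	Method string            `json:"method"`
	Params map[string]string `json:"params"`
}

// stealthSnippet is an illustrative script that hides navigator.webdriver.
const stealthSnippet = `Object.defineProperty(navigator, 'webdriver', { get: () => undefined });`

// preflightCommand encodes the request that registers script to run
// before any page script in every new document.
func preflightCommand(id int, script string) ([]byte, error) {
	return json.Marshal(cdpCommand{
		ID:     id,
		Method: "Page.addScriptToEvaluateOnNewDocument",
		Params: map[string]string{"source": script},
	})
}

// decodeCommand parses an encoded command back into its fields.
func decodeCommand(data []byte) (cdpCommand, error) {
	var cmd cdpCommand
	err := json.Unmarshal(data, &cmd)
	return cmd, err
}

func main() {
	msg, err := preflightCommand(1, stealthSnippet)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(msg))
}
```

Registering the script once per target is what lets it run ahead of a site's own detection code on every navigation.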

Resilience & Self-Healing

PinchTab includes logic to handle common browser startup failures:
  • If Chrome previously crashed, it may leave behind SingletonLock or SingletonSocket files that prevent it from restarting. PinchTab detects the “unclean exit” and deletes these lock files.
  • If Chrome fails to start within chromeStartTimeout (15s), PinchTab clears the session data and retries once.
  • PinchTab enforces BRIDGE_MAX_TABS (default 20) to prevent runaway agents from consuming all memory.
  • PinchTab periodically drops tracked tabs that no longer exist in Chrome.
See: internal/bridge/crash.go
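The lock cleanup and single retry can be sketched as follows. This is a simplification, not the logic in crash.go: it folds both behaviours into one "clean up, then retry once" helper, uses a temporary directory with a fake lock file instead of a real Chrome profile, and all function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// staleLocks are the files named above that a crashed Chrome can leave behind.
var staleLocks = []string{"SingletonLock", "SingletonSocket"}

// removeStaleLocks deletes leftover lock files and reports which ones it removed.
func removeStaleLocks(profileDir string) ([]string, error) {
	var removed []string
	for _, name := range staleLocks {
		err := os.Remove(filepath.Join(profileDir, name))
		switch {
		case err == nil:
			removed = append(removed, name)
		case errors.Is(err, os.ErrNotExist):
			// Nothing to clean up for this file.
		default:
			return removed, err
		}
	}
	return removed, nil
}

// startWithRetry calls start; on failure it runs cleanup and tries exactly once more.
func startWithRetry(start, cleanup func() error) error {
	if err := start(); err == nil {
		return nil
	}
	if err := cleanup(); err != nil {
		return fmt.Errorf("cleanup before retry: %w", err)
	}
	return start()
}

// demo simulates a profile left locked by a crash and a launcher that
// refuses to start while the lock file exists.
func demo() (removed []string, attempts int, err error) {
	dir, err := os.MkdirTemp("", "profile-")
	if err != nil {
		return nil, 0, err
	}
	defer os.RemoveAll(dir)

	lock := filepath.Join(dir, "SingletonLock")
	if err := os.WriteFile(lock, nil, 0600); err != nil {
		return nil, 0, err
	}

	start := func() error {
		attempts++
		if _, statErr := os.Lstat(lock); statErr == nil {
			return errors.New("profile is locked")
		}
		return nil
	}
	cleanup := func() error {
		var cleanupErr error
		removed, cleanupErr = removeStaleLocks(dir)
		return cleanupErr
	}
	err = startWithRetry(start, cleanup)
	return removed, attempts, err
}

func main() {
	removed, attempts, err := demo()
	fmt.Println("removed:", removed, "attempts:", attempts, "error:", err)
}
```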

Performance Characteristics

| Metric | Value | Notes |
| --- | --- | --- |
| Binary size | 12MB | Statically compiled Go binary |
| Memory per instance | ~300MB | Headless Chrome with 1 tab |
| Memory per tab | ~50-100MB | Depends on page complexity |
| Startup time | 2-5s | Chrome initialization |
| Snapshot latency | 50-200ms | Depends on page size |
| Token efficiency | 800 tokens/page | vs 10,000+ for screenshots |
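For rough capacity planning, the memory figures can be combined into a back-of-the-envelope estimate. The sketch assumes the ~300MB instance figure already includes the first tab and that each additional tab adds 50-100MB; that additive model is an assumption, not a measured result, and estimateMB is a hypothetical helper.

```go
package main

import "fmt"

const (
	instanceBaseMB = 300 // headless Chrome with 1 tab
	tabLowMB       = 50  // low end per additional tab
	tabHighMB      = 100 // high end per additional tab
)

// estimateMB returns a low and high memory estimate for one instance.
// Tab counts below 1 are treated as 1, since the base figure includes one tab.
func estimateMB(tabs int) (low, high int) {
	if tabs < 1 {
		tabs = 1
	}
	extra := tabs - 1
	return instanceBaseMB + extra*tabLowMB, instanceBaseMB + extra*tabHighMB
}

func main() {
	for _, tabs := range []int{1, 5, 20} {
		low, high := estimateMB(tabs)
		fmt.Printf("%2d tabs: %d-%d MB\n", tabs, low, high)
	}
}
```

Under these assumptions, an instance at the default 20-tab limit lands between 1250 and 2200 MB.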

Next Steps

  • CDP Integration: learn about remote Chrome mode and CDP_URL
  • Configuration: complete environment variable reference
  • Multi-Instance Guide: running multiple browser instances
  • API Reference: complete HTTP API documentation