EP echo-pdf docs
Docs only, not a service

Local-first, VL-first PDF context for agents.

echo-pdf turns local PDFs into reusable package APIs, CLI outputs, and workspace artifacts. The mainline path is render, page understanding, semantic structure, and reusable local artifacts. This site explains that boundary and does not expose hosted PDF processing.

What an agent gets back

  • document.json with source snapshot and page count
  • structure.json with a stable document -> pages[] index
  • semantic-structure.json with heading tree + cross-page tables/formulas/figures
  • page JSON, render PNG, and per-page tables/, formulas/, understanding/ artifacts

Non-goals in this phase

  • hosted SaaS or multi-tenant platform work
  • treating MCP as the main entrypoint
  • turning the site into an online playground
  • datasheet / EDA or other domain-specific logic
8

local primitives: document, structure, semantic, page, render, tables, formulas, understanding.

3

main product surfaces: package, CLI, workspace artifacts.

0

hosted promises. This site documents a local product, not a remote service.

Zero-config local run

npm i -g @echofiles/echo-pdf

echo-pdf document ./sample.pdf
echo-pdf structure ./sample.pdf
echo-pdf page ./sample.pdf --page 1
echo-pdf render ./sample.pdf --page 1

Provider-backed extraction

# Configure provider once
echo-pdf provider set --provider openai --api-key $OPENAI_API_KEY
echo-pdf model set --provider openai --model gpt-4.1-mini

# Then use any provider-required primitive
echo-pdf semantic ./sample.pdf
echo-pdf tables ./sample.pdf --page 1
echo-pdf formulas ./sample.pdf --page 1
echo-pdf understanding ./sample.pdf --page 1

Provider-required primitives are not zero-config.

document, structure, page, and render work without provider setup. semantic, tables, formulas, and understanding require a configured local provider and model.

That provider may be a cloud API or a local OpenAI-compatible server such as Ollama, LM Studio, vLLM, llama.cpp, or LocalAI. Slow local vision models may need a higher provider timeoutMs.

Copyable output example

{
  "documentId": "b1d2f6a8e9c4f013",
  "artifactPaths": {
    "documentJsonPath": "...documents/b1d2f6a8e9c4f013/document.json",
    "structureJsonPath": "...documents/b1d2f6a8e9c4f013/structure.json",
    "semanticStructureJsonPath": "...documents/b1d2f6a8e9c4f013/semantic-structure.json",
    "pagesDir": "...documents/b1d2f6a8e9c4f013/pages",
    "rendersDir": "...documents/b1d2f6a8e9c4f013/renders"
  },
  "pageCount": 12,
  "cacheStatus": "fresh"
}

Artifact tree a human can read fast

  • document.jsonsource path, snapshot, page count, artifact roots
  • structure.jsonstable page index for deterministic traversal
  • semantic-structure.jsonheading tree + cross-page merged tables/formulas/figures
  • renders/page render PNGs, ready for VL reuse
  • tables/per-page LaTeX tabular artifacts
  • formulas/per-page LaTeX math artifacts
  • understanding/combined per-page tables + formulas + figures

Why this is useful

  • the agent gets structured files, not just prose
  • the human sees concrete outputs before reading deep docs
  • downstream local tools can reuse the same workspace instead of reparsing PDFs
  • cache reuse stays explicit through source snapshot and strategy metadata

API

Read the mainline VL-first primitives and how they map to the artifact model.

CLI

Use the same VL-first primitives from the command line, with a clear source-checkout workflow.

What this site is for

Explain install, contracts, local-first workflows, and the shape of the public product.

What this site is not

An online PDF processor, hosted dashboard, account surface, or remote API playground.

How downstream products should read it

Use it as the docs layer for a local component. Build product-specific logic outside echo-pdf.

Local surfaces are the whole product boundary.

The supported boundary is local library, local CLI, and workspace artifacts. OCR is no longer a first-class product surface, and worker, HTTP, and MCP runtimes are not part of the current product shape.