EPecho-pdf docs
Concept

Workspace artifacts are part of the product surface.

.echo-pdf-workspace/ is not incidental cache dust. It is an inspectable local contract for downstream tools that need document, page, render, and semantic artifacts.

Directory layout

.echo-pdf-workspace/
  documents/<documentId>/
    document.json
    structure.json
    semantic-structure.json
    pages/
      0001.json
    renders/
      0001.scale-2.json
      0001.scale-2.png

Legacy OCR artifacts are migration-only.

Older workspaces may still contain ocr/ directories from pre-VL-first builds. They are not part of the current first-class workspace contract for new downstream integrations.

Reuse boundary

Artifacts reuse only when source snapshot and strategy metadata still match the current document state.

Traceability

Artifacts link back to source files, sibling artifacts, and page-level origins so downstream readers can audit what they consume.

What not to assume

documentId is path-derived, not a portable content hash. Optional artifacts appear only after their primitive runs.

Local artifact contract, not remote storage.

The workspace is documented as a local filesystem contract. This site does not promise a hosted artifact browser or cloud artifact retention surface.