Concept

Page index stays stable.

get_document_structure() is the stable page-index contract. It returns a flat document -> pages[] mapping and never folds semantic hierarchy (headings, sections) into that index.

Why it exists

Iterate pages in a deterministic order.
Locate per-page artifacts under a single document root.
Support incremental downstream reads without assuming any semantic structure.
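
As a rough illustration of the first two points, the sketch below walks the page files under one document root in ascending page order. The helper name `iter_pages` is hypothetical, and it assumes page files are named by their zero-padded page number, as in the artifact layout on this page. It reads the files directly rather than going through get_document_structure(), whose signature is not shown here.

```python
import json
from pathlib import Path
from typing import Any, Iterator


def iter_pages(document_root: Path) -> Iterator[tuple[int, dict[str, Any]]]:
    """Yield (page_number, page_payload) pairs in ascending page order.

    Hypothetical helper: assumes pages live in <document_root>/pages/ and
    that each file stem is the page number (0001.json -> page 1).
    """
    pages_dir = document_root / "pages"
    # Sort on the numeric stem so the order never depends on how the
    # filesystem happens to list directory entries.
    for path in sorted(pages_dir.glob("*.json"), key=lambda p: int(p.stem)):
        with path.open(encoding="utf-8") as fh:
            yield int(path.stem), json.load(fh)
```

Because the order comes from the file names alone, a reader that stopped at page N can resume at page N + 1 without consulting any semantic structure.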

Artifact mapping

documents/<documentId>/
  document.json
  structure.json
  pages/
    0001.json
    0002.json
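
The layout above suggests that a page's artifact path can be derived from the document ID and page number alone. A minimal sketch, assuming 1-based page numbers and four-digit zero padding (both inferred from `0001.json`, not confirmed by the source); `page_artifact_path` and `artifacts_root` are hypothetical names:

```python
from pathlib import Path


def page_artifact_path(artifacts_root: Path, document_id: str, page_number: int) -> Path:
    """Build the path to one page's JSON artifact.

    Assumptions (not confirmed by the source): pages are numbered from 1
    and file names are zero-padded to four digits.
    """
    if page_number < 1:
        raise ValueError("page numbers start at 1")
    return artifacts_root / "documents" / document_id / "pages" / f"{page_number:04d}.json"


# Example: resolve page 2 of a document with the made-up ID "abc".
print(page_artifact_path(Path("artifacts"), "abc", 2).as_posix())
```

How file names behave beyond page 9999 is not specified here, so the sketch makes no attempt to handle that case specially.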

Stable contract

| Artifact | Purpose | Safe downstream assumption |
| --- | --- | --- |
| `document.json` | source metadata | tracks source path, snapshot, page count, artifact roots |
| `structure.json` | page index | `root.children` stays a page list, not a semantic tree |
| `pages/0001.json` | page content | contains page text, preview, and artifact path |
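
A downstream consumer can guard the `root.children` assumption with a small check. The sketch below is hypothetical: the only schema detail taken from this page is that `structure.json` exposes `root.children` as a page list. The rule that a page entry must not carry non-empty `children` of its own is an assumption about what a leaked semantic tree would look like.

```python
from typing import Any


def assert_flat_page_index(structure: dict[str, Any]) -> int:
    """Return the page count, or raise if the index looks like a tree.

    Hypothetical check: treats any entry with non-empty `children` of its
    own as evidence that semantic hierarchy leaked into the page index.
    """
    children = structure["root"]["children"]
    if not isinstance(children, list):
        raise TypeError("root.children must be a list of pages")
    for position, entry in enumerate(children, start=1):
        if isinstance(entry, dict) and entry.get("children"):
            raise ValueError(
                f"entry {position} has nested children; expected a flat page list"
            )
    return len(children)
```

Running a check like this once at load time keeps the rest of the consumer free of defensive branching.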

No hidden promotion to semantics.

If you need headings or sections, use the semantic layer explicitly. The page index is intentionally flatter and more boring, because downstream tooling depends on it being mechanically stable.