
Tagged PDFs and PDF/UA

A PDF without tags is unusable with assistive technology. A PDF with the wrong tags can be worse than untagged. Both look fine on screen.

By Levi Whitted

What "tagged" means

A tagged PDF includes a structure tree: a hidden map of the document's semantic structure, parallel to (but not the same as) its visual layout. Tags identify which on-page text is a heading, which is a paragraph, which is a list item, which cell belongs to which row in a table, and in what order content should be read.

The visual rendering of a PDF (what you see when you open it) does not depend on tags. The accessibility of a PDF to assistive technology depends almost entirely on tags. An untagged PDF presents to a screen reader as a flat stream of characters with no headings, no list structure, no table semantics, and no reliable reading order.
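To make the idea of a structure tree concrete, here is a minimal sketch in Python. The `Tag` class and the sample document are hypothetical and greatly simplified; a real structure tree lives inside the PDF file and is read through a PDF library. The sketch only shows how a tree of roles yields both a reading order and a heading outline.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    """One node in a simplified, hypothetical model of a PDF structure tree."""
    role: str                      # structure type, e.g. "H1", "P", "L", "LI"
    text: str = ""                 # text carried by this node, if any
    children: list["Tag"] = field(default_factory=list)


def reading_order(node: Tag) -> list[str]:
    """Return text in depth-first tree order: the logical order the tags define."""
    out = [node.text] if node.text else []
    for child in node.children:
        out.extend(reading_order(child))
    return out


def heading_outline(node: Tag) -> list[tuple[str, str]]:
    """Collect (role, text) for every heading, the basis for heading navigation."""
    found = []
    if node.role in {"H1", "H2", "H3", "H4", "H5", "H6"}:
        found.append((node.role, node.text))
    for child in node.children:
        found.extend(heading_outline(child))
    return found


# Hypothetical sample document.
doc = Tag("Document", children=[
    Tag("H1", "Annual Report"),
    Tag("P", "Summary of the year."),
    Tag("H2", "Finances"),
    Tag("L", children=[
        Tag("LI", "Revenue"),
        Tag("LI", "Expenses"),
    ]),
])

print(reading_order(doc))
print(heading_outline(doc))
```

An untagged PDF has no equivalent of `doc` at all, so neither function has anything to work from.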

Why tagging matters

Screen readers and other assistive technologies navigate documents structurally. A user listening to a long document needs to jump from heading to heading, skim sections, return to a specific list, and navigate a table by row and column. None of that works without tags.

Without tags, a screen reader user encounters the document as a single uninterrupted block of text. There is no way to skim. There is no way to find a specific section. Multi-column layouts often read as garbage because reading order in an untagged PDF is unreliable. Tables collapse into rows of unlabeled cells.
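The multi-column problem can be illustrated with a small sketch. It assumes a hypothetical page of four text fragments in two columns and one naive way software might guess at order when no tags exist (top to bottom, then left to right). Real tools use their own heuristics, so this is an illustration of the failure mode, not a description of any particular screen reader.

```python
# Each fragment is (x, y, text); y grows downward. Two columns at x=0 and x=300.
# The coordinates and text are invented for illustration.
fragments = [
    (0, 0, "Col A line 1"),
    (300, 0, "Col B line 1"),
    (0, 20, "Col A line 2"),
    (300, 20, "Col B line 2"),
]


def naive_order(frags):
    """Guess order from position alone: top to bottom, then left to right."""
    return [text for _, _, text in sorted(frags, key=lambda f: (f[1], f[0]))]


def tagged_order(frags, tag_sequence):
    """Read fragments in the order an explicit tag sequence (list of indices) gives."""
    return [frags[i][2] for i in tag_sequence]


# Without tags, the two columns interleave line by line.
print(naive_order(fragments))

# With tags, the author states the order: all of column A, then all of column B.
print(tagged_order(fragments, [0, 2, 1, 3]))
```

The positional guess reads across both columns on every line, which is exactly the garbled output described above; the tagged order is correct because it was declared rather than inferred.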

Tagging is what makes a PDF behave as a document rather than a picture of one.

What PDF/UA means

PDF/UA (ISO 14289) is the international standard for accessible PDFs. It defines the technical requirements for a PDF to be considered accessible: a complete structure tree with semantically correct tags, embedded fonts, language metadata, properly tagged tables, alternative text on images, and so on.
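A few of these requirements are visible at the document level, before any individual tag is inspected. The sketch below assumes the PDF's catalog has already been read into a plain Python dict by some PDF library (that step is not shown, and the function name `catalog_preflight` is hypothetical). The keys checked are standard PDF catalog entries, but this is only a small partial preflight, not a PDF/UA conformance test.

```python
def catalog_preflight(catalog: dict) -> list[str]:
    """Return document-level problems found in a catalog-like dict.

    `catalog` is a stand-in for the PDF document catalog, with PDF names
    as string keys. Passing these checks does not establish conformance.
    """
    problems = []
    if "/StructTreeRoot" not in catalog:
        problems.append("no structure tree (/StructTreeRoot missing)")
    if catalog.get("/MarkInfo", {}).get("/Marked") is not True:
        problems.append("not marked as tagged (/MarkInfo /Marked is not true)")
    if not catalog.get("/Lang"):
        problems.append("no document language (/Lang missing)")
    if catalog.get("/ViewerPreferences", {}).get("/DisplayDocTitle") is not True:
        problems.append("title not displayed (/DisplayDocTitle is not true)")
    return problems


# Hypothetical catalogs: one from a print-to-PDF export, one from a tagged export.
untagged = {}
tagged = {
    "/StructTreeRoot": {"/K": []},
    "/MarkInfo": {"/Marked": True},
    "/Lang": "en-US",
    "/ViewerPreferences": {"/DisplayDocTitle": True},
}

print(catalog_preflight(untagged))
print(catalog_preflight(tagged))
```

A file that clears this preflight can still have an incorrect or incomplete structure tree, which is why the tag-level checks matter.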

PDF/UA and WCAG 2.1 AA are related but not identical. WCAG 2.1 AA is the regulatory standard under the DOJ rule. PDF/UA is the format-specific standard that, in practice, covers most of what WCAG 2.1 AA asks of PDF content. A document that conforms to PDF/UA generally conforms to WCAG 2.1 AA's PDF-relevant criteria for structure and semantics, but PDF/UA does not address every WCAG criterion (it sets no color contrast ratios, for example), so those still need a separate check. Many automated checkers test against a PDF/UA-derived rule set.

"Has tags" vs. "has correct tags"

A PDF can be technically tagged and still functionally inaccessible. This is the source of much confusion in accessibility audits. The automated checker reports "tagged: yes," the file shows a structure tree when inspected, and yet a screen reader user cannot navigate the document.

The reason is that tag presence and tag correctness are different. A PDF may have:

  • Every paragraph tagged as <P> but headings also tagged as <P> instead of <H1> / <H2>. Screen readers cannot navigate by heading.
  • Tables tagged as <Table> but cells flattened so row and column relationships are lost.
  • Headings tagged but at the wrong level (H1 followed by H3 with no H2, multiple H1s where the hierarchy needs nesting).
  • Reading order tagged in the order content was added to the PDF rather than the order it should be read.
  • Decorative images tagged as content (or content images tagged as decorative).
  • List items tagged individually rather than wrapped in a parent <L> structure.

A file in any of these states passes the "tagged?" question and fails the "accessible?" question. A badly tagged file is often the harder remediation case, because the bad tags have to be cleaned up before correct ones can be applied.
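Several of the failures listed above are mechanical enough to detect in code. The sketch below runs over a hypothetical, simplified tag tree written as `(role, children)` tuples; the `audit` function and its messages are illustrative and are not taken from any real checker. It flags skipped heading levels, list items with no `L` parent, table cells with no `TR` row, and documents with no headings at all.

```python
import re


def walk(node, parent_role=None):
    """Yield (role, parent_role) for every node in document order."""
    role, children = node
    yield role, parent_role
    for child in children:
        yield from walk(child, role)


def audit(tree) -> list[str]:
    """Return findings for a few structural errors that 'tagged: yes' hides."""
    findings = []
    last_level = 0
    saw_heading = False
    for role, parent in walk(tree):
        match = re.fullmatch(r"H([1-6])", role)
        if match:
            saw_heading = True
            level = int(match.group(1))
            if level > last_level + 1:
                prev = f"H{last_level}" if last_level else "start of document"
                findings.append(f"heading jumps from {prev} to H{level}")
            last_level = level
        elif role == "LI" and parent != "L":
            findings.append("LI outside an L parent")
        elif role in {"TH", "TD"} and parent != "TR":
            findings.append(f"{role} outside a TR row")
    if not saw_heading:
        findings.append("no heading tags at all")
    return findings


# Hypothetical badly tagged document.
bad = ("Document", [
    ("H1", []),
    ("H3", []),                            # skips H2
    ("LI", []),                            # list item with no L wrapper
    ("Table", [("TD", []), ("TD", [])]),   # cells flattened, no TR rows
])

# Hypothetical document where every block, headings included, is a P.
all_paragraphs = ("Document", [("P", []), ("P", [])])

print(audit(bad))
print(audit(all_paragraphs))
```

Checks like these cannot tell whether a `P` should have been a heading or whether an image is really decorative; those judgments still need a human looking at the page.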

Common tagging failures by source

Tagging quality is usually a function of where the PDF came from.

Word "Save As PDF" without the tagged-PDF option

Word produces tagged PDFs when "Document structure tags for accessibility" is checked in the export dialog (or when the equivalent setting is on by default). When this option is off, the resulting PDF has no tags. Many older templates and many quick-export workflows produce untagged files.

"Print to PDF"

Printing to PDF (via a printer driver like "Microsoft Print to PDF" or "Adobe PDF") almost always produces an untagged file. The print pipeline does not preserve semantic structure. This is one of the most common sources of inaccessible born-digital PDFs.

Older financial and ERP systems

Many financial systems, student information systems, and ERPs export PDFs through legacy report writers that do not produce tagged output. Statements, transcripts, and reports generated through these systems are typically untagged.

Design tools (InDesign, Illustrator) without accessibility workflow

InDesign can produce tagged PDFs but requires deliberate setup: paragraph styles mapped to PDF tags, articles defined, alt text added to images. Without that setup, the export is visually polished and structurally empty.

Scanned documents

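Pure scanned PDFs have no tags because they have no semantic content: each page is only an image. After OCR, they have text but still no structural tags. A scanned PDF therefore needs both layers added: a text layer through OCR, then structural tags on top of it.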
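The source-by-source picture above reduces to two questions about any file: does it have a text layer, and does it have tags? A minimal sketch of that triage follows. The function name and step wording are hypothetical, and how the two flags are detected for a real file is left to whatever PDF tooling is in use.

```python
def remediation_steps(has_text_layer: bool, has_tags: bool) -> list[str]:
    """List the remediation work a file needs, in the order it has to happen."""
    steps = []
    if not has_text_layer:
        # A pure scan: nothing to tag until OCR has produced text.
        steps.append("run OCR to add a text layer")
    if not has_tags:
        steps.append("add structure tags")
    # Tag presence is not tag correctness, so review is never skipped.
    steps.append("review tags and reading order")
    return steps


print(remediation_steps(has_text_layer=False, has_tags=False))  # pure scan
print(remediation_steps(has_text_layer=True, has_tags=False))   # print-to-PDF, or OCR'd scan
print(remediation_steps(has_text_layer=True, has_tags=True))    # tagged export
```

Even the best case keeps the review step, because a tagged export can still carry the wrong tags.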

What tag review actually involves

A defensible PDF accessibility review goes beyond running an automated checker.

  1. Automated check. Acrobat's built-in accessibility checker or a third-party tool catches the obvious issues: untagged content, missing language attribute, missing alt text. This is a floor, not a ceiling.
  2. Structure tree inspection. Open the tag tree and walk through it. Verify that headings are tagged as headings at correct levels, that list items are wrapped in list parents, that tables have row and column structure, that figures have alt text.
  3. Reading order verification. Use the reading order tool to confirm content reads in the intended sequence, especially across multi-column layouts, sidebars, and pull quotes.
  4. Screen reader pass. Open the document in NVDA (Windows) or VoiceOver (macOS). Navigate by heading, by list, by table. Verify the document is actually usable, not just technically tagged.
  5. Document the result. A defensible compliance posture includes a record of who reviewed each document, when, and against what checklist.
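The record in step 5 can be as simple as one structured entry per document. The sketch below is one possible shape, assuming a JSON log; the `ReviewRecord` class, its field names, and all sample values are hypothetical and should be adapted to whatever checklist an organization actually uses.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class ReviewRecord:
    """Who reviewed which document, when, against what checklist, with what result."""
    document: str
    reviewer: str
    reviewed_on: str      # ISO 8601 date
    checklist: str
    steps_passed: dict    # review step name -> True/False

    def passed(self) -> bool:
        """A document passes only if every review step passed."""
        return all(self.steps_passed.values())


# Hypothetical example entry.
record = ReviewRecord(
    document="example-report.pdf",
    reviewer="J. Reviewer",
    reviewed_on=date(2025, 1, 15).isoformat(),
    checklist="internal-pdf-checklist-v1",
    steps_passed={
        "automated_check": True,
        "structure_tree": True,
        "reading_order": True,
        "screen_reader_pass": False,
    },
)

# One JSON object per line is easy to append to a log and to query later.
line = json.dumps(asdict(record), sort_keys=True)
print(line)
print("passed:", record.passed())
```

Recording each step separately, rather than a single pass/fail flag, preserves the distinction between a file that cleared the automated check and one that was actually usable with a screen reader.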

For high-volume document estates, this work is what triage and prioritization buy time for. See Which documents to prioritize.