wcag21aa.org

How many documents do we actually have to fix?

This is the panicked first question after an accessibility audit. The answer is fewer than the raw document count but probably more than the optimistic estimate, and the path to a real number starts with an inventory.

By Levi Whitted

The panic question

This page walks through how to move from "we have N thousand documents and a deadline" to "we have X documents to remediate now, Y to remediate on cycle, and Z under exception analysis."

What is clearly in scope

Some documents are unambiguously within the rule's scope and require remediation:

  • Documents linked from current public-facing pages
  • Documents distributed in current programs, services, or activities (forms, handbooks, applications)
  • Documents that are republished on a recurring cycle (agendas, minutes, schedules, catalogs)
  • Documents that establish current official position (policies, AP/BP manuals, board resolutions)
  • Documents posted within the last 12 months
  • Documents on internal-but-digital systems used by employees (HR portals, staff handbooks, intranet)

For a typical entity, this category is a meaningful fraction of the total document count but not the majority. It is, however, the fraction that matters most: these are the documents that people actually rely on, that drive complaint exposure, and that need to work for assistive technology users.

What may qualify for exceptions

Two of the rule's exceptions may apply to portions of the document estate. Both are narrower than they sound, and both require a documented analysis per document or per document class.

Archived web content (28 CFR § 35.201)

Content qualifies as "archived web content" only if all of the following are true:

  • The content is maintained exclusively for reference, research, or recordkeeping
  • The content is in an archive section of the website (or clearly identified as archived)
  • The content is not altered or updated after the date of archival
  • The content is not currently being used for activities such as applying for, gaining access to, or participating in the entity's programs, services, or activities

A board minute from 2018 that is still linked from a "Board Minutes Archive" page, has not been altered, and is not used in current operations may qualify. A board minute from 2018 that is still linked as supporting material from a current policy page does not.
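
The all-or-nothing character of this test can be made concrete in a short sketch. The Python below is illustrative only: the `ArchiveCheck` fields are hypothetical names for the four criteria as this page lists them, and a reviewer would still answer each one by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveCheck:
    """Reviewer's answers for one document (hypothetical field names)."""
    reference_only: bool            # kept exclusively for reference, research, or recordkeeping
    in_archive_section: bool        # in an archive section or clearly identified as archived
    unaltered_since_archival: bool  # no edits after the archival date
    used_in_current_activity: bool  # e.g. supporting material on a current policy page


def failed_criteria(check: ArchiveCheck) -> list[str]:
    """Return the criteria the document fails; an empty list means it may qualify."""
    failures = []
    if not check.reference_only:
        failures.append("not maintained exclusively for reference")
    if not check.in_archive_section:
        failures.append("not in or identified as an archive")
    if not check.unaltered_since_archival:
        failures.append("altered after archival")
    if check.used_in_current_activity:
        failures.append("used in a current program, service, or activity")
    return failures


# The two 2018 board minutes from the example above.
archived_minutes = ArchiveCheck(True, True, True, False)
policy_attachment = ArchiveCheck(False, True, True, True)

print(failed_criteria(archived_minutes))   # prints []
print(failed_criteria(policy_attachment))  # lists the two failed criteria
```

Returning the failed criteria rather than a bare yes or no gives the reviewer the wording that goes into the documented analysis.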

Preexisting conventional electronic documents (28 CFR § 35.201(b))

The "preexisting conventional electronic documents" exception applies only if all of the following are true:

  • The document is a "conventional electronic document" (PDF, Word, PowerPoint, Excel)
  • The document was created before the entity's compliance deadline
  • The document is not currently used by the entity for activities such as applying for, gaining access to, or participating in its programs, services, or activities

The "not currently used" language is the operative limit. A registration form created before the deadline but still in use for current registrations does not qualify. A 2017 grant report that no current activity references and that exists only for retrospective inspection may.
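
As a rough sketch of how the three conditions combine, the check below treats them as a single conjunction. Everything named here is an assumption for illustration: the extension set stands in for "PDF, Word, PowerPoint, Excel", the paths are invented, and `COMPLIANCE_DEADLINE` is a placeholder to be replaced with the date that applies to your entity.

```python
from datetime import date
from pathlib import PurePosixPath

# Assumption: extensions standing in for "PDF, Word, PowerPoint, Excel".
CONVENTIONAL_EXTENSIONS = {".pdf", ".doc", ".docx", ".ppt", ".pptx", ".xls", ".xlsx"}

# Placeholder only: substitute the compliance date that applies to your entity.
COMPLIANCE_DEADLINE = date(2026, 4, 24)


def may_be_preexisting(path: str, created: date, currently_used: bool,
                       deadline: date = COMPLIANCE_DEADLINE) -> bool:
    """True only when all three conditions listed above hold."""
    is_conventional = PurePosixPath(path).suffix.lower() in CONVENTIONAL_EXTENSIONS
    return is_conventional and created < deadline and not currently_used


# A retrospective grant report versus a form still used for registrations.
print(may_be_preexisting("/grants/2017-report.pdf", date(2017, 6, 30), currently_used=False))  # True
print(may_be_preexisting("/forms/registration.pdf", date(2019, 1, 15), currently_used=True))   # False
```

The `currently_used` flag cannot be derived from file metadata; it is a judgment a person records after checking how the document is used.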

What still needs review

A large category of documents falls between "clearly in scope" and "clearly exempt." These need per-document or per-class review:

  • Documents linked from old pages that may or may not still be current
  • Documents in topical folders ("/budget/2019/") with no explicit archive designation
  • Recurring documents from past cycles (last year's catalog, last year's annual report)
  • Documents referenced indirectly (search results return them; navigation does not)
  • Documents that are duplicated across multiple URLs

A typical triage workflow handles these in classes rather than one by one: "all board minutes from before 2022 in the archive section: archived-content exception, document the analysis, move on." This is much faster than per-document analysis and produces a defensible record.
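
Class-based triage of this kind can be expressed as an ordered list of rules applied to the inventory. The following is a minimal sketch under stated assumptions: the `Doc` and `ClassRule` types, the URL prefixes, and the decision strings are all hypothetical, and the first rule simply mirrors the board-minutes example above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Doc:
    url: str
    year: int  # year the document was published


@dataclass(frozen=True)
class ClassRule:
    label: str
    matches: Callable[[Doc], bool]
    decision: str


# Hypothetical rules; order matters because the first match wins.
RULES = [
    ClassRule(
        label="board minutes before 2022 in the archive section",
        matches=lambda d: d.url.startswith("/board/archive/") and d.year < 2022,
        decision="archived-content exception: document the analysis",
    ),
    ClassRule(
        label="topical budget folders with no archive designation",
        matches=lambda d: d.url.startswith("/budget/"),
        decision="needs class review",
    ),
]


def triage(doc: Doc) -> str:
    """Return the decision of the first matching class rule."""
    for rule in RULES:
        if rule.matches(doc):
            return rule.decision
    return "needs per-document review"


print(triage(Doc("/board/archive/2018-03-minutes.pdf", 2018)))
print(triage(Doc("/budget/2019/adopted-budget.pdf", 2019)))
print(triage(Doc("/misc/flyer.pdf", 2023)))
```

Keeping the rules in one list means the list itself doubles as the outline of the analysis memo: one entry per document class, each with its conclusion.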

How to estimate the volume

A workable first-pass estimate takes hours, not weeks.

  1. Crawl the site for PDF and Office file extensions. Use a site crawler (Screaming Frog, Sitebulb) or a directory listing of the file system. Output: a list of all linked or hosted documents with URLs and file sizes.
  2. Bucket by directory. Most public entity sites organize content by department or function. Bucket the inventory by top-level directory to surface concentrations (e.g., "/board/" has 1,200 documents, "/budget/" has 450, "/student-services/" has 380).
  3. Sample-classify by age and type. Pull 30 to 50 random documents and classify each: born-digital vs. scanned, tagged vs. untagged, current-use vs. likely-archival. This is calibration data, not full coverage.
  4. Extrapolate. Apply the sample classification rates to the full count to get a rough estate composition: e.g., 30% likely in scope and current, 45% candidates for the archived-content exception, 25% candidates for the preexisting-documents exception.
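
Steps 2 and 3 can be sketched in a few lines once the crawler has exported a list of URLs. This is an illustration, not a finished tool: the `example.gov` URLs are invented, the extension list is an assumption, and a real export from Screaming Frog or Sitebulb would be read from a file rather than typed inline.

```python
import random
from collections import Counter
from urllib.parse import urlparse

# Assumption: extensions treated as PDF and Office documents.
DOC_EXTENSIONS = (".pdf", ".doc", ".docx", ".ppt", ".pptx", ".xls", ".xlsx")


def is_document(url: str) -> bool:
    return urlparse(url).path.lower().endswith(DOC_EXTENSIONS)


def top_level_dir(url: str) -> str:
    """Return '/board/' for '/board/2018/minutes.pdf', or '/' for a root-level file."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    return f"/{parts[0]}/" if len(parts) > 1 else "/"


def bucket_by_directory(urls: list[str]) -> Counter:
    """Step 2: count documents per top-level directory."""
    return Counter(top_level_dir(u) for u in urls if is_document(u))


def draw_sample(urls: list[str], size: int = 40, seed: int = 0) -> list[str]:
    """Step 3: pick a reproducible random sample of documents to classify by hand."""
    docs = [u for u in urls if is_document(u)]
    return random.Random(seed).sample(docs, min(size, len(docs)))


# Hypothetical crawl output.
crawl = [
    "https://example.gov/board/2018/minutes-03.pdf",
    "https://example.gov/board/agenda.PDF",
    "https://example.gov/budget/2019/adopted.xlsx",
    "https://example.gov/student-services/index.html",
    "https://example.gov/handbook.docx",
]

print(bucket_by_directory(crawl).most_common())
print(draw_sample(crawl, size=2))
```

Fixing the seed makes the sample reproducible, which helps if the classification is later questioned and someone needs to see which documents were reviewed.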

The output is not exact. It is a planning estimate that turns "we have an unmanageable problem" into "we have approximately 1,800 documents to remediate now and 4,200 to evaluate for exceptions over the next six months."
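
The arithmetic behind such an estimate is simple scaling. The sketch below uses hypothetical tallies, chosen so that a 40-document sample of a 6,000-document inventory reproduces the 1,800 and 4,200 figures above. The margin-of-error helper is an addition for illustration, using the standard normal approximation for a sample proportion; it is not part of the workflow as described.

```python
import math


def extrapolate(sample_counts: dict[str, int], total_documents: int) -> dict[str, int]:
    """Scale sample tallies up to the full inventory."""
    sample_size = sum(sample_counts.values())
    return {
        category: round(total_documents * count / sample_size)
        for category, count in sample_counts.items()
    }


def margin_of_error(count: int, sample_size: int, z: float = 1.96) -> float:
    """Approximate 95% margin on a sample proportion (normal approximation)."""
    p = count / sample_size
    return z * math.sqrt(p * (1 - p) / sample_size)


# Hypothetical tallies from a 40-document sample of a 6,000-document inventory.
sample = {"in_scope_now": 12, "archived_candidate": 18, "preexisting_candidate": 10}

estimate = extrapolate(sample, total_documents=6000)
print(estimate)
# {'in_scope_now': 1800, 'archived_candidate': 2700, 'preexisting_candidate': 1500}
print(round(margin_of_error(12, 40), 2))  # 0.14
```

A margin of about 14 percentage points on a 30% rate means the in-scope count could plausibly fall anywhere from roughly 950 to 2,650 documents, which is why this is a planning figure and why a larger sample is worth drawing before committing a budget.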

How to turn the count into a plan

The estimated count drives three planning decisions:

  1. Tier 1 budget. Multiply the in-scope-now count by an estimated per-document remediation cost (varies widely; see Scanned vs. born-digital). This is the budget number that needs to be raised, allocated, or contracted.
  2. Tier 2 workflow. The recurring-publication documents (agendas, minutes, schedules) require workflow changes, not budget. Identify the upstream authoring sources and target them for accessibility-by-default before catching up retroactively.
  3. Tier 3 documentation. The exception-candidate categories need a documented analysis. The deliverable is a memo, by document class, that captures the analysis and the conclusion. This is the artifact that supports a "we relied on this exception" claim if challenged.
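
The Tier 1 multiplication is easier to defend as a range than as a single number, since per-document cost varies widely. A minimal sketch follows; every input is a placeholder rather than a market rate, and the split between scanned and born-digital documents is assumed to come from the sample classification.

```python
def tier1_budget(in_scope: int, scanned_share: float,
                 born_digital_cost: tuple[int, int],
                 scanned_cost: tuple[int, int]) -> tuple[int, int]:
    """Return a (low, high) budget range for the in-scope-now documents."""
    scanned = round(in_scope * scanned_share)
    born_digital = in_scope - scanned
    low = born_digital * born_digital_cost[0] + scanned * scanned_cost[0]
    high = born_digital * born_digital_cost[1] + scanned * scanned_cost[1]
    return low, high


# Placeholder inputs, not market rates: replace with vendor quotes and sample results.
low, high = tier1_budget(
    in_scope=1800,
    scanned_share=0.30,
    born_digital_cost=(10, 30),   # (low, high) cost per born-digital document
    scanned_cost=(40, 100),       # (low, high) cost per scanned document
)
print(f"Tier 1 budget: {low:,} to {high:,}")  # Tier 1 budget: 34,200 to 91,800
```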

Next steps