
LaTeX Tagging Project

Accessible STEM documents


Listening to untagged and tagged PDF

Accurate reading is critical; even small mistakes in reading STEM content can result in an entirely incorrect understanding.

There are many ways to try to make PDF documents accessible. To highlight the importance of accuracy in representing the author’s intent we recorded and analyzed a screen-reader reading the results from various attempts by software to understand a “realistic” demonstration PDF.

The recordings were made on Windows 11 using the test release of NVDA 2025 (which enables reading of MathML) and version 0.6.8-rc.9 of the MathCat plugin. Testing included two PDF viewers, Foxit and Adobe Acrobat.

For the recording of a tagged PDF generated from LaTeX, we used a PDF 2.0 file, which allows us to include MathML in an accessible manner. These recordings showcase two distinct routes to including MathML in PDF 2.0: PDF’s Associated Files feature and MathML structure elements in the tags tree.

Access the LaTeX source of the test file used to make these recordings.

Listening to untagged PDF

Everyone might accept that math is hard. But surely commonplace elements, such as a table of contents or the symbols in a code block, must produce reasonable results when vocalized, regardless of the system? Not necessarily.

Foxit/NVDA reading untagged PDF

Observations

It is very difficult to spot all the issues with this document if you are looking at the text in parallel, so close your eyes and try to understand what is read to you. Here is a (possibly incomplete) list of issues:

Summary

The untagged PDF is basically incomprehensible.

Acrobat/NVDA reading untagged PDF

Observations

Summary

The same untagged PDF gets a different reading compared to the previous one, but overall the results are equally incomprehensible.

Reading PDF Auto Tagged by Acrobat

Foxit/NVDA reading PDF tagged by Acrobat Pro auto-tagging (macOS version)

The untagged PDF was auto-tagged by Acrobat Pro on a Mac; the resulting PDF was then read by Foxit/NVDA. A similar readout is produced when the file is passed to Acrobat Reader/NVDA.

Observations

Summary

For normal text structures the auto-tagging heuristics make reasonable guesses and seldom fail (in this document, the misinterpretation of a TOC row being one exception). However, the quality varies with the complexity of the document structure, as we saw when using different documents. With respect to mathematics and graphics the reading always fails severely: basically only the text characters contained in the formulas or graphics are read, and everything else is ignored. This makes auto-tagging unsuitable for STEM documents.

Foxit/NVDA reading PDF tagged by Acrobat Pro auto-tagging (Windows version)

The untagged PDF was auto-tagged by Adobe Acrobat Pro on Windows; the resulting PDF was then read by Foxit/NVDA. A similar readout is produced when the file is passed to Acrobat Reader/NVDA.

Observations

Summary

Auto-tagging using the Windows software gives worse results than the corresponding version on macOS. This is a bit surprising, but it illustrates the general problem that auto-tagging faces: it has to interpret visual clues that by themselves allow for several interpretations, and it is often not clear to the software whether alignments (e.g., a shared baseline) indicate a reading order or whether other aspects (e.g., the size of spaces) should take precedence. On the marginal note in this document the software failed spectacularly. With respect to mathematics and graphics the reading always fails severely: basically only the text characters contained in the formulas or graphics are read, and everything else is ignored. This alone makes auto-tagging unsuitable for STEM documents.

Listening to Tagged PDF generated directly from LaTeX

When the LaTeX source contains math formulas there are two ways to generate tagged PDF directly from LaTeX: attaching the MathML for each formula through PDF's Associated Files (AF) mechanism, or placing MathML structure elements (SE) directly in the tags tree.

Both are valid approaches in PDF 2.0 but, unfortunately, as of today PDF consumer applications differ in their support. We hope that the majority of PDF readers will soon fully support PDF 2.0, including both of the above methods.

LaTeX can automatically produce the necessary MathML for either method if the LuaTeX engine is used. If pdfTeX is used, only the AF method is supported and the data for the AF files has to be prepared in a separate step, as explained elsewhere.

Configuration possibilities

As shown in the example page, LaTeX may be configured to use Associated Files (AF) or structure elements (SE) to provide MathML tagging for mathematics.
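
The exact keys evolve as the tagging project progresses, so the following is only a minimal sketch of the kind of preamble used for such a test document, assuming a recent LaTeX release with the tagging test-phase code; the switch between the AF and SE variants is indicated only by a comment, because the corresponding key name depends on the project version in use.

\DocumentMetadata{
  pdfversion = 2.0,               % MathML inclusion requires PDF 2.0
  lang       = en,
  testphase  = {phase-III, math}, % load the tagging test-phase code incl. math
  % a further project-documented key (not shown here) selects whether the
  % MathML is stored as Associated Files (AF) or as structure elements (SE)
}
\documentclass{article}
\begin{document}
For example: \[ e^{i\pi} + 1 = 0 \]
\end{document}

Compiled with the LuaTeX engine, such a file produces a tagged PDF 2.0 document in which the formula carries MathML; with pdfTeX only the AF route is available, as noted above.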

Foxit/NVDA reading PDF with MathML AF

The sample document shown in the video was compiled with the above configuration (AF variant) using the LuaTeX engine. The resulting PDF was then displayed in Foxit with NVDA generating the speech.

Observations

Remaining issues

Summary

The example shows that the accessibility of STEM documents produced by LaTeX is very high and there are no problems with complex material.

Adobe Acrobat Reader/NVDA reading PDF with MathML SE

Observations

Remaining issues

Summary

As with the AF variant, the example shows that the accessibility of STEM documents produced by LaTeX is very high and there are no problems with complex material.

The use of structure elements instead of AF files gives identical results for math. The reading of the rest of the document is similar, with slight differences due to the use of different PDF consumer applications. Some of these differences are due to bugs, others to different decisions on what is or should be passed on to the speech generator (e.g., the handling of tables, or the announcing of links and graphics); some of this is configurable in the consumer application.

Listening to ChatGPT’s interpretation

Foxit reading GitHub display of markdown extracted by ChatGPT 3

Notes

This document is shown as one possible alternative; ChatGPT 3 was used. The untagged PDF was uploaded and the following question was posed:

Please show full markdown source for an accessible document suitable for a blind reader extracted from this PDF

The supplied markdown was not edited other than changing \( and \[ to $ and $$ to match the default MathJax configuration at GitHub. The markdown was then viewed in the GitHub markdown preview, and Foxit was used to read the rendered web page.
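
For illustration, the substitution was of the following kind (the formula here is a placeholder, not one taken from the test document):

\( e^{i\pi} + 1 = 0 \)   becomes   $ e^{i\pi} + 1 = 0 $
\[ e^{i\pi} + 1 = 0 \]   becomes   $$ e^{i\pi} + 1 = 0 $$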

Observations

Summary

ChatGPT 3 produces a fairly reasonable result for a large portion of the document, but fails in several critical areas by:

The sample document is too short to assess how severe these limitations are in longer and more complex documents. It is likely, though, that this approach to accessibility, while appearing on the surface as a good representation, is in fact producing a distorted and incorrect variant of the information that the author tries to convey.

Foxit Reading ChatGPT 4 display

Using ChatGPT 4, a similar query produced a markdown document that was displayed immediately rather than shown as source.

Observations

Summary

ChatGPT 4 does some things better than in our trials with ChatGPT 3, and on the surface this appears to be a workable path to make an untagged PDF accessible. However, the tendency to rewrite the document content (which is in the nature of LLMs) and the dropping of important information (such as graphics and labels) mean that this approach is questionable: the fact that it “reads well” while at the same time presenting corrupted information is a dangerous combination.

Foxit Reading GitHub display of ChatGPT 4 markdown

ChatGPT was then queried to show the markdown source, which (as with ChatGPT 3) was then rendered on GitHub. Note that here the document text has been extensively re-worded by ChatGPT.

Observations

Summary

In many places important information in the original document is completely lost (e.g., the note stating that the table syntax is temporary, etc.). None of the supporting cross-references to other places in the document are preserved (the text containing them was thrown away), and all footnotes, marginals, and graphics in the document have been eliminated. The result clearly shows the unpredictability of the approach: there is no way for consumers to understand that what is read to them is not what was written in the original.


NVDA pronunciation settings

Pronunciation of some technical words was improved by using the following settings in the NVDA speech dictionary:

Comment              Pattern                   Replacement  Case  Type
                     LaTeX                     lay-tech     on    Whole word
                     alignat                   align-at     on    Whole word
                     flalign                   f-l-align    on    Whole word
                     notag                     no-tag       on    Whole word
                     bibTEX                    bib-tech     off   Whole word
                     unordered                 un-ordered   off   Whole word
                     TeX                       tech         on    Whole word
                     tugboat                   tug-boat     off   Anywhere
Ignore softhyphens   ­ (soft hyphen, U+00AD)   (nothing)    off   Anywhere
Hyphenation help     cluding                   clueding     off   Whole word

The “Ignore softhyphens” row contains an invisible soft hyphen character (U+00AD) in the Pattern column, replaced by nothing; this improved the reading of hyphenated words in the Foxit/NVDA workflow.