

I am not an inherent fan of PDF documents at all (hah! You should see my earliest writings about PDF, and if you were there for my first PhD student presentation at my first WAIS away day, you know how it went when I mentioned in one slide how horrible PDF is), but it has to be recognised that PDF is the de facto standard for academic publishing. It also has the benefit of being frozen, which has real value for certain uses (archiving, and greater trust in the authenticity of the document).

Doug Engelbart said WYSIWYG should be WYSIAYG (What You See Is All You Get), and I agree.

I am trying, with Visual-Meta, to make PDF a Trojan horse with a rich payload.

Published in Thoughts


  1. It is not frozen, you know that. Anybody can easily unzip it, change the PostScript and re-pack the result. Any other format can be frozen just as well as PDF can, at the point of publication, by printing, hashing or diffing; this is not unique to PDF (nor even inherently supported by the format or its infrastructure). The de facto standard for academic publishing was, and is, the printed book, and before that it might have been stone tablets or whatever.

    For embedding dynamic things into PDF, I continue to favor a general resource-management container that bootstraps semantics (like OPS/OPF/OEBPS in EPUB). Such a container may then hold many specific “frozen”, static renderings, potentially including PDF, images, dynamic configurations and so on. These are, and can be, generated from the actual source data, which is the master for the content of the “document”. From it, more renderings can be generated and stored or preserved, or custom ViewSpecs can be applied ad hoc, based on the capabilities and preferences of the user’s hypertext system. Granted, technically it doesn’t matter too much whether something else is embedded in PDF or something else embeds PDF (although it is probably no fun, technically, to abuse PDF as a container format, and I assume it lacks the semantics to bootstrap the meaning/type of its contents). But whichever it is, in any real practical use case one has to unpack the container anyway, throw away the PDF rendering and go straight for the text content in order to do reasonable things with it, beyond a reader just looking at things or printing (and how would you do ViewSpecs for print: embed multiple PDF versions in PDF?).
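The remark above about freezing any format by hashing can be illustrated with a minimal Python sketch. It assumes the published document is available as raw bytes; the function names `fingerprint` and `is_unchanged` and the sample content are hypothetical, chosen only for illustration, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of a document's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, published_digest: str) -> bool:
    """Check a copy of the document against the digest recorded at publication."""
    return fingerprint(data) == published_digest


# Hypothetical example content: the format does not matter, only the bytes do.
original = b"<article><p>Source text of the document.</p></article>"
published_digest = fingerprint(original)  # recorded at the point of publication

# Any later edit, however small, yields a different digest.
tampered = original.replace(b"Source", b"Altered")

print(is_unchanged(original, published_digest))
print(is_unchanged(tampered, published_digest))
```

The digest only detects change; publishing or signing the digest somewhere trustworthy is a separate step that this sketch does not cover.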
