How can we provide useful computer magic when publishing a document, losing as little of the richness of the process of producing the document as the author chooses, while supporting active reading and analysis of the document by others? These are important questions, and ones we are looking into at the University of Southampton and with the Author project.
• It should be possible to add annotations by whatever means the user prefers, such as underlines, highlights or drawings. These annotations should then be attached to the meta-data of the document so that the user can, for example, choose to search only highlighted text.
• Server knowledge of the content of the document, to allow analysis of a single document or of documents in bulk, by clearly tagging the data in the document and surfacing this meta-data to other applications or servers.
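To make the annotation idea concrete, here is a minimal sketch in Python of annotations stored as meta-data alongside the text, with a search that can be limited to highlighted passages. The names Annotation, Document, annotate and search_annotated are hypothetical, invented for this illustration, and character offsets are only one assumed way of attaching an annotation to a span of text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Annotation:
    kind: str   # e.g. "highlight", "underline", "drawing" (assumed vocabulary)
    start: int  # character offset where the annotated span begins
    end: int    # character offset just past the end of the span


@dataclass
class Document:
    text: str
    # Annotations live beside the text as meta-data, not inside it.
    annotations: List[Annotation] = field(default_factory=list)

    def annotate(self, kind: str, phrase: str) -> None:
        """Attach an annotation to the first occurrence of phrase."""
        start = self.text.index(phrase)  # raises ValueError if phrase is absent
        self.annotations.append(Annotation(kind, start, start + len(phrase)))


def search_annotated(doc: Document, term: str, kind: Optional[str] = None) -> List[str]:
    """Return annotated spans containing term, optionally only of one kind."""
    hits = []
    for note in doc.annotations:
        if kind is not None and note.kind != kind:
            continue
        span = doc.text[note.start:note.end]
        if term.lower() in span.lower():
            hits.append(span)
    return hits


doc = Document("Socratic publishing keeps structure. Readers can annotate freely.")
doc.annotate("highlight", "keeps structure")
doc.annotate("underline", "annotate freely")
highlighted = search_annotated(doc, "structure", kind="highlight")
```

Because the annotations are plain structured data, the same records could also be surfaced to a server or to other applications, which is what the second point asks for.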
The big aim is to produce a document reading, writing and publishing system which will let the reader have a richer interaction with the author’s work than interacting with the author him or herself would allow. That is why we are calling this Socratic Publishing.
This needs to be possible within legacy systems, through a publishing process that keeps the data structured so that it can be put to better use when someone reads the document or interacts with it, whether as a whole or in pieces. Our solution is a process of encapsulation: the original document and an XML version are embedded inside a PDF document. If the reader only has a PDF reader, the PDF version is shown; if the reader has the original system, the original document is presented; and if the reader has software which understands the XML, that version is shown. In the DKR world I propose that these become Doug’s Xfiles.
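The fallback behaviour of the encapsulated document can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the package is modelled as a plain dictionary rather than a real PDF with embedded files, the function name choose_representation is hypothetical, and the order of preference between the original document and the XML version is assumed here, since it has not been fixed yet.

```python
# Assumed preference order: richest form first, PDF as the universal fallback.
PREFERENCE = ("original", "xml", "pdf")


def choose_representation(package, capabilities):
    """Pick which embedded version to show to a given reader.

    package      -- dict mapping a form name to its bytes (stand-in for the PDF wrapper)
    capabilities -- set of form names the reader's software can open
    """
    for form in PREFERENCE:
        if form in package and form in capabilities:
            return form
    raise ValueError("reader cannot open any embedded representation")


# A stand-in package; the byte strings are placeholders, not real file contents.
package = {
    "pdf": b"placeholder PDF rendering",
    "original": b"placeholder original document",
    "xml": b"<document/>",
}

pdf_only_reader = choose_representation(package, {"pdf"})
xml_aware_reader = choose_representation(package, {"pdf", "xml"})
```

A reader with only PDF software still gets something readable, while richer software finds the structured version without the author having to publish it separately.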
This is the model as we see it so far: