
Category: Future Of Text

Continuing symposium on the future of text.

Making Information Self Aware

We can fight fake news and find more useful information in the tsunami of academic and scientific publishing if we make the information self-aware – if the information knows what it is. This is not a suggestion of Harry Potter-level magical fantasy but a concrete act we can start on today, laying the foundation for massive future improvement.

the intelligent environment

Many years ago I read an interview with one of the developers of the computer game Crysis, in which he was praised for the quality of the AI of the opponents in the game. He said that making the AI was not really the hard part; the key was making the different parts of the environment aware of their attributes. If a tree trunk is thick, then the enemy can hide behind it. If it is dense, then it will also serve as a shield, up to a point.

the self aware document

This is what we can and must do for documents. We must encode the meaning in documents as clearly as possible so that the document may be read by both software and humans. The document must be aware of who authored it, when, what its title is and so on, to provide at least the minimal context for useful citations.

It should also know what citations it contains, what any charts and graphs mean, what glossary terms are used and how they connect. Of course, we call this ‘metadata’ – information about information – and the term has been used in many ways for many years now, but the metadata has so far been hidden inside the document, away from direct human and system interaction. We should perhaps instead call it ‘hiddendata’. For some media this is actively used, such as the EXIF data in photographs, but it is lost when the photograph changes format, is inserted into other media or is printed. For text-based documents this is currently possible but seldom actually used, not usefully read by the reader software, and lost on printing.

bibtex foundation

You may well feel that this is simply a call for yet another document format, but it is not. It is a call for a new way to add academic, ‘industry-standard’ BibTeX-style metadata to any document, starting with PDFs, in a robust, useful and legacy-friendly way: simply add a final appendix to the document which follows a visually human-readable (hence BibTeX) and therefore also machine-parseable format.

This appendix will include who authored the information, which the reading software can ‘understand’, making it possible for the user to simply copy text from the document and paste it as a full citation into a new document in one operation – making citations easier, quicker and more robust. Further information can be described for reader-software parsing, such as how the headings are formatted (so that the reader software can re-format the document if required, showing citations in the academic style the reader prefers if it differs from the preference of the author), what citations are used, what glossary terms are used, what the data in tables etc. contains, and more.
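
To make this concrete, such an appendix might look something like the following sketch. This is illustrative only, not the authoritative specification (which is at liquid.info/visual-meta.html); the fields beyond the standard BibTeX citation fields are assumptions:

    @article{hegland2019,
      author = {Hegland, Frode},
      title = {Making Information Self Aware},
      year = {2019},
    }
    @visualmeta{
      headings = {level1: bold 18pt, level2: bold 14pt},
      glossary = {Visual-Meta: metadata printed visually in the document},
    }

A legacy PDF reader simply shows this as an ordinary plain-text appendix; a Visual-Meta aware reader parses it and acts on it.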

more connected texts

This is making the document say what it is, where it comes from, how it’s connected, what it means, and what data it contains. This is, in effect, making the document self aware and able to communicate with the world. These are truly augmented documents.

This will power simple parsing today and, by making documents machine readable, enable more powerful AI in the future to much better ‘understand’ the ‘intention’ of the author who produced the document.

This explicitly applies to documents and has the added benefit that even if they are converted into different formats, and even if they are printed and scanned, they will still retain their metadata. The concept is extensible to other textual media, but that is beyond this proposal.

visual-meta

I call this approach Visual-Meta and it is presented in more detail at liquid.info/visual-meta.html. I believe this is important, so I have started hosting a dialogue with industry and have produced two proof-of-concept applications, one for authoring Visual-Meta documents and one for reading and parsing them: Liquid | Author and Liquid | Reader: www.liquid.info

paper

Digital capabilities run deeper than those of previous substrates, but even in the pursuit of more liquid information environments we should not ignore the power of the visual symbolic layer. We hide the meta at our peril; we reveal it and include it in the visual document, and we gain robustness through document format changes and even printing and scanning – archival strength without any loss of deep digital interactivity. This matters more and more as we discover how brittle our digital data is, and how important rich interactivity is to enable the deeper literacy required to fight propaganda and to propagate academic discoveries often lost in the sheer volume of documents.

Furthermore, with the goal of more robust formats and support for reading printed books and documents, addressing information (as discussed in the Visual-Meta addressability post) can be printed on each page in the footer, allowing hand-annotated texts to be easily scanned, OCR’d and entered into the user’s digital workflow automatically. Digital is magic. Paper is also magic. One day they will merge, but until then there is value in using both to their strengths.


As we make our information aware,
we increase the potential of our own awareness


ELO 2019

flier for sponsorship of elo2019.ucc.ie

The premise of my work is that the written word is a fundamental unit of knowledge, and therefore the more richly we can interact with our text, the more richly we can interact with our knowledge. This is why I host the annual Future of Text Symposium and curate the largest collection of perspectives on the future of text ever undertaken, in the book ‘The Future of Text : A 2020 Vision’, due out next year: futureoftext.org (and for which I would greatly appreciate suggestions for further contributions).

Liquid | Author

This is also why I produce software to learn what the possibilities actually are, rather than only what they might be. All the software is for macOS (with iOS versions planned), available from www.liquid.info 

Liquid | Author is a minimalist workspace word processor with powerful gestures and commands, such as the ability to pinch the document into an outline, a Find command which shows you the full sentences of the text you search for instead of yellow dots out of view, a quickly accessible full-screen mode (ESC to enter and leave), Cuttings, which stores everything you cut, and more. Your final work can be exported with academic formatting of citations, including an appended References section, or posted to WordPress.

Author also features a Dynamic View, which is similar to a mind map or concept map but remains part of, and connected to, the text in the word processing view: youtu.be/bCpJTRd0hrE

Liquid | Flow (universal text tool companion)

Liquid | Flow is a companion to Author which allows you to select text and search, look up references, translate, convert and more, in less than a second once you are familiar with it. 

Liquid | Reader (supporting Visual-Meta from Author)

Liquid | Reader is a visually lightweight PDF reader which supports the Visual-Meta system, where citation information is visually ‘printed’ at the end of the document in a final appendix, added automatically by software such as Liquid | Author, or which you can add manually to downloaded documents by pasting their BibTeX export format text.

The Liquid | Reader ‘reads’ the Visual-Meta in the document so that when you copy text from the PDF and paste it, the citation will be pasted as a citation, not just as text. This means that even if the document changes format, or is printed out, scanned and OCR’d again, it will still retain its metadata – including citation information, but also information about the document in general, such as the contents of tables, glossaries and more: wordpress.liquid.info/printed-meta

Onwards

My friend and mentor Doug Engelbart presented the vision and mission in his 2002 address ‘Improving Our Ability To Improve’:

The thing that amazed me – even humbled me – about the digital computer when I first encountered it over fifty years ago – was that, in the computer, I saw that we have a tool that does not just move earth or bend steel, but we have a tool that actually can manipulate symbols and, even more importantly, portray symbols in new ways, so that we can interact with them and learn. We have a tool that radically extends our capabilities in the very area that makes us most human, and most powerful.

There is a native American myth about the coyote, a native dog of the American prairies – how the coyote incurred the wrath of the gods by bringing fire down from heaven for the use of mankind, making man more powerful than the gods ever intended. My sense is that computer science has brought us a gift of even greater power, the ability to amplify and extend our ability to manipulate symbols.

It seems to me that the established sources of power and wealth understand, in some dim way, that the new power that the computer has brought from the heavens is dangerous to the existing structure of ownership and wealth in that, like fire, it has the power to transform and to make things new.

I must say that, despite the cynicism that comes with fifty years of professional life as a computer scientist, inventor, and observer of the ways of power, I am absolutely stunned at the ferocious strength of the efforts of the American music industry, entertainment industry, and other established interests to resist the new ability that the coyote in the computer has brought from the heavens. I am even more surprised by the ability of these established interests to pass laws that promise punishment to those who would experiment and learn to use the new fire.

As the recipient of my country’s National Medal of Technology, I am committed to raising these issues and questions within my own country, but I am also canny enough to understand that, in the short term, it is the nations with emerging economies that are most likely to understand the critical importance and enormous value in learning to use this new kind of fire.

We need to become better at being humans. Learning to use symbols and knowledge in new ways, across groups, across cultures, is a powerful, valuable, and very human goal. And it is also one that is obtainable, if we only begin to open our minds to full, complete use of computers to augment our most human of capabilities.


Engelbart, 2002

Dialogue

I would greatly appreciate your perspective and feedback, which is why I am here at this festival and proud to play a very small role in it by being a sponsor. 

Frode Hegland

London 2019


Visual-Meta Introduction


Visual-Meta is an approach to making a document’s metadata machine and human readable by adding an appendix to the end of the document, based on BibTeX, with all the information needed to cite the document (author, title, date etc.) as well as clearly stating the values of any data (such as tables, lists, advanced layouts etc.) and glossary terms.

This visually present metadata (plain text in the document) can then be parsed by a Visual-Meta aware PDF reader to enable functionality such as copying text and pasting it as a citation in one step.

Putting the metadata visually into the document means that even if the document format is changed, or the document is printed and scanned, the data will still be part of the document, and compatibility with legacy readers is maintained since they will simply see the metadata as plain text.

Adding human-readable appendices to a PDF document which usefully describe the semantics of the document, while also making it machine readable, offers many benefits and workflow improvements in the academic document space, while adding no document overhead beyond a few plain-text pages at the end of the document. This approach keeps compatibility with legacy PDF reader software while opening up rich opportunities for augmented readers; legacy readers will simply show a normal PDF with an appendix of BibTeX-style information.
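
As a minimal sketch of the parsing side (the @visualmeta tag and the one-field-per-line ‘key = {value}’ convention here are assumptions for illustration; the authoritative format is described at liquid.info/visual-meta.html), a reader application could scan the extracted plain text of a PDF for the appendix and turn it into a dictionary:

    import re

    def parse_visual_meta(document_text: str) -> dict:
        """Extract BibTeX-style fields from a Visual-Meta appendix.

        Assumes a block of the form '@visualmeta{ ... }' with one
        'key = {value},' pair per line (an illustrative convention).
        """
        block = re.search(r"@visualmeta\{(.*?)\n\s*\}", document_text, re.DOTALL)
        if not block:
            return {}
        return {
            key: value.strip()
            for key, value in re.findall(r"(\w+)\s*=\s*\{(.*?)\}", block.group(1))
        }

    sample = """Body of the document...

    @visualmeta{
      author = {Douglas Engelbart},
      title = {Augmenting Human Intellect},
      year = {1962},
    }
    """
    print(parse_visual_meta(sample))
    # {'author': 'Douglas Engelbart', 'title': 'Augmenting Human Intellect', 'year': '1962'}

Since legacy readers only ever see plain text, nothing here requires changes to the PDF format itself.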


Augmentations

Visual-Meta augmented readers can provide the user with interactions as rich as those of a custom authoring environment – publishing and freezing onto PDF is no longer a limitation. Advanced interactions can include:

  • Copy As Citation using a simple copy command, with all citation information added to the clipboard payload for use by Visual-Meta aware applications on Paste (see the sketch after this list).
  • Instant Outline based on the document specifying heading formatting.
  • Dynamic Views, such as the one implemented in Liquid | Author, could be stored as data, not only as images.
  • Server Access. Repositories can extract information for large scale analysis.
  • Glossary Support. Glossary terms could be added to the appendix.
  • High Resolution, Document Based Addressing. The Name of the document is not the same as the Title, and this can be used to address by document rather than by location, supporting High-Resolution Addressing.
  • & more, to be discovered.
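
As an illustration of the first item in the list above, and only a sketch (the payload shape and field names are assumptions, not the shipping behaviour of Liquid | Reader), a copy command could bundle the parsed Visual-Meta with the selected text:

    def copy_as_citation(selected_text: str, meta: dict) -> dict:
        """Build a clipboard payload carrying citation data with the text.

        'meta' is the dictionary parsed from the document's Visual-Meta
        appendix. A Visual-Meta aware application pastes the quote as a
        full citation; a plain-text paste simply uses 'text'.
        """
        return {
            "text": selected_text,
            "citation": {
                "author": meta.get("author", ""),
                "title": meta.get("title", ""),
                "year": meta.get("year", ""),
                "quote": selected_text,
            },
        }

    payload = copy_as_citation(
        "The document must be aware of who authored it.",
        {"author": "Frode Hegland", "title": "Making Information Self Aware", "year": "2019"},
    )
    # An aware word processor could render the paste as, for example:
    # "The document must be aware of who authored it." (Hegland, 2019)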


Benefits

For an author, this approach means that they can embed richer information in their document with a minimum of effort and be sure of the robustness of that information.

It allows the reader a much faster way to cite, with a higher degree of accuracy and more access to the original data and interactions.

Augmented textual communication. Using the appendices to describe the document content, such as the formatting of headings and citations as well as the use of glossaries, can allow the reading software to present the document to the reader’s preference without losing the creator’s semantics.

Server friendly, allowing for large-scale citation and other document-element analysis. The University of Southampton’s Christopher Gutteridge, one of the people behind the university repository, elaborates on this.

Institutions can worry less about the cosmetics of citations and benefit from more cited documents being checked and read.

This could put an end to the absurd academic time-waste of nit-picking over how citations should be displayed: let the teacher/examiner/reader specify how the citations should be displayed, based on the document having described in the appendix how they are used, so that the reader software can re-format them to the reader’s tastes.

Universities still get to dictate the default hand-in formatting, but the same document could be displayed in any format the reader chooses.
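
To illustrate this: since the appendix describes how citations are used rather than how they must look, the reading software can render the same citation data in whichever display style the reader chooses. A toy sketch (real style definitions such as APA or IEEE are of course more involved):

    def format_citation(meta: dict, style: str = "author-year") -> str:
        """Render parsed citation fields in the reader's preferred style.

        Two toy styles stand in for real citation styles here.
        """
        author = meta.get("author", "Unknown")
        title = meta.get("title", "Untitled")
        year = meta.get("year", "n.d.")
        if style == "author-year":
            return f"{author} ({year}). {title}."
        if style == "numeric":
            return f'[1] {author}, "{title}," {year}.'
        raise ValueError(f"unknown citation style: {style}")

The same document can then be handed in under a university’s default style yet be read, by anyone, in the style they prefer.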


Demonstration

Visual-Meta export is built into the Liquid | Author word processor, and parsing can be done by the Liquid | Reader PDF reader application, both produced by the author of this article, Frode Hegland: www.liquid.info

Video demonstration of the concept (less than two minutes long): youtube.com/watch?v=Q-LnkuI2Qx8&feature=youtu.be


Example

Examples and a description of the format are posted here: Visible-Meta Examples


Document Name

Note that the ‘document_name’ is distinct from the title and can be set automatically by the authoring software to help identify the document through search later. The unique name will be the first 10 characters of the title, the author’s name, the time in condensed form and a random 4-digit number. For example:

augmentinghu_douglas_engelbart_19621021231532_6396.pdf

  • 1962 | 10 | 21 | 23 | 15 | 32
  • year | month | day | hour | min | seconds
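
A minimal sketch of how authoring software could generate such a name (note the text above says the first 10 characters of the title while the example shows 12; this sketch follows the example):

    import random
    from datetime import datetime

    def make_document_name(title: str, author: str, when: datetime) -> str:
        """Generate a 'document_name' following the scheme described above."""
        title_part = "".join(ch for ch in title.lower() if ch.isalnum())[:12]
        author_part = author.lower().replace(" ", "_")
        time_part = when.strftime("%Y%m%d%H%M%S")  # year month day hour min sec
        rand_part = f"{random.randint(0, 9999):04d}"
        return f"{title_part}_{author_part}_{time_part}_{rand_part}.pdf"

    make_document_name(
        "Augmenting Human Intellect",
        "Douglas Engelbart",
        datetime(1962, 10, 21, 23, 15, 32),
    )
    # e.g. 'augmentinghu_douglas_engelbart_19621021231532_6396.pdf'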

Document Based Addressability

This approach allows the user to click on a citation and have the PDF open, if it is available to the user, rather than simply loading a download page. If the document is not found, an opportunity to search for it will be presented.

High Resolution Addressing

Following a link in this style is an active process initiated by the Reader software, so adding an internal ‘search’ to the process will allow the software not only to load the document but to open it at the section cited.
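
A rough sketch of how those two steps could look in a reader application (all names here are assumptions for illustration; page_texts is one extracted text string per page, produced by whatever PDF library the reader already uses):

    from pathlib import Path

    def resolve_citation(document_name: str, library: Path) -> Path | None:
        """Find a locally stored PDF by its unique document_name."""
        matches = list(library.rglob(document_name))
        return matches[0] if matches else None

    def find_cited_page(page_texts: list[str], cited_text: str) -> int | None:
        """Return the first page (0-based) containing the cited passage."""
        needle = " ".join(cited_text.split()).lower()
        for page_number, text in enumerate(page_texts):
            if needle in " ".join(text.split()).lower():
                return page_number
        return None

If resolve_citation returns nothing, the reader falls back to offering a search, as described above.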


Adoption Support

The first implementations will include links to actual code showing how to add this to other developers’ projects, dramatically reducing the implementation overhead.


Legacy Support


Manual

When using a supported Reader, the user can download a PDF and copy the BibTeX export format on the download page, then open the PDF in Reader and click ‘Assign BibTeX’; it will be applied as an appendix and saved, just as if it had been natively exported with Visual-Meta. Only the citation information will be provided in this way – formatting etc. will not be available.

Server

Reader applications can also send non-Visual-Meta PDFs to a server, such as Scholarcy, to have the Visual-Meta extracted and appended.

Background

This work grew out of work on Liquid | Author: Visible-Meta Origins.


How This Relates To My PhD

This work has grown out of my PhD work at the University of Southampton under Dame Wendy Hall and Les Carr. It aims to solve infrastructure issues which hamper citation interaction and visualisations: Visual-Meta & my PhD.

