
Category: Notes On…

This category is for writings I consider fuller articles than the very brief glossary terms or other posts.

The Importance of Semantic Export

With inspiration from my friend and mentor Doug Engelbart’s opening remarks of his famous 1968 demo, re-wrought for this post:


“The research program that I’m going to describe to you is quickly characterisable by saying: If, in your office, you as an intellectual worker, were supplied with a” document editing system backed up by full semantic embedding and access, “which was alive for you all day, and was instantly responsive to every action you had, how much value could you derive from that?”


The use of this quote is both for emotional resonance and for the analogy of how he saw the computer as a type of intellectual ‘fire’, a richly interactive system accessible through a computer display. We, however, are living in the shadow of his vision: academia sees the display but not the interactive power of the computer behind it, and micromanages the cosmetics of esoteric citation styles on the visual display without investing in robust and flexible document standards that maintain rich document semantics and afford rich interactions.

The academic publishing world currently takes documents and squeezes every bit of useful semantics out of them by flattening them into PDFs to fetishise their particular wrinkle of paper-based citation styles, instead of building a semantic document standard where the basic metadata of the document, such as the authors’ names, publication date and title, is stored for any reader software to extract. Such a digital-native document format would also allow for digital-only experiences and controls, such as high-resolution addressability for linking to exact passages (please note that digital documents generally do not even allow for linking to pages) and link types that let the author express that a citation is not supportive, and for this to have meaning in a concept or document space or map.

Liquid | Author is moving to a PDF export in which what the document contains is made semantically explicit, both for queries via server repositories and other bulk operations and so that when a user copies text from such a document and pastes it into their own document, all the relevant citation information is carried over. This makes citing a copy-and-paste operation. It includes high-resolution support, meaning that the pasted text links (via a hyperlink address or identifier) not only to the origin document but to the particular section within that document, giving the reader a quick way to check a citation’s veracity and relevance.
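To make the copy-and-paste idea concrete, here is a minimal sketch of how copied text could carry its citation along with it. It is not how Author implements this: the `SourceDocument` fields, the JSON payload shape and the `identifier#section` link form are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class SourceDocument:
    """Basic metadata a semantically exported document might expose (hypothetical fields)."""
    title: str
    authors: List[str]
    published: str   # ISO date, e.g. "2019-05-01"
    identifier: str  # any stable address or identifier for the document


def copy_with_citation(doc: SourceDocument, section_id: str, text: str) -> str:
    """Build a clipboard payload: the copied text plus where it came from."""
    citation = asdict(doc)
    citation["section"] = section_id
    # High-resolution link: points at the section, not just the document.
    citation["link"] = f"{doc.identifier}#{section_id}"
    return json.dumps({"text": text, "citation": citation})


def paste_as_quote(payload: str) -> str:
    """Turn a clipboard payload into a quoted passage with an inline citation."""
    data = json.loads(payload)
    citation = data["citation"]
    authors = ", ".join(citation["authors"])
    year = citation["published"][:4]
    return f'"{data["text"]}" ({authors}, {year}, {citation["link"]})'
```

The receiving document only needs the payload, not access to the original file, which is the point: the citation travels with the text, and the reader can follow the link back to the exact section.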

This level of semantic export via the ubiquitous PDF format will allow for innovations in citation analysis, where it becomes relatively easy to build software that gives users views of their documents resulting in greater insights, such as keyword connection maps, author analysis and much more. This is frequently discussed in computer science and there is a rich literature of ideas to support it, but it is let down by the paucity and varying quality of the available citation data. Even in the best instances, this data is made available to the reader only as a separate piece of downloadable data, such as a BibTeX file, or through third-party, proprietary ‘Citation Management Systems’, which all have their own ‘magic sauce’ for allocating the correct citation information to PDF documents, and only within their own system, not in a generally accessible way.
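As a small illustration of how little code such analysis needs once citation data is reliably embedded, the sketch below counts how often each author is cited across a set of documents. The input shape (a list of documents, each with a `citations` list whose entries have an `authors` list) is an assumption for the example, not an existing export format.

```python
from collections import Counter
from typing import Dict, List


def author_citation_counts(documents: List[Dict]) -> Counter:
    """Count citations per author across all documents (hypothetical input shape)."""
    counts: Counter = Counter()
    for doc in documents:
        for citation in doc["citations"]:
            # Each cited work credits every one of its authors once.
            counts.update(citation["authors"])
    return counts
```

Calling `author_citation_counts(docs).most_common(10)` would then give a simple ‘most cited authors’ view of a user’s library.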

JATS is a promising approach for this, but the ‘tread lightly’ approach that everyone in the industry takes towards everyone else, and the resulting lack of backbone to build robust standards, means that different parts of the industry currently use their own custom JATS dialects.

Let us come together to support a rich document interchange format, whether a richly exported PDF or a clearly and uniformly tagged JATS, and let the reader choose whichever way they prefer to see the citation styles, as paper or as advanced digital, depending on their need.

Digital documents are currently at the stage of development where TV was when it showed plays with the camera fixed on a tripod and no edits. Let’s liberate our academic dialog by embracing the richness of the medium and truly augmenting our academic discourse.

My first step is simply to develop Author so that any paper authored can be exported as richly as possible, so that another Author user can gain these benefits and, hopefully, so that its utility for the industry is demonstrated. But first, we are completing Dynamic Views, which are more visually exciting and which will hopefully help gain more users and thus more resources for future development.


Socrates and Text


Much has been discussed about the concerns Socrates had about reading. Here I make the point that the act of authorship is Socratic in ways that reading is not, and that this has implications for how we design writing systems.


Socrates argued against text because he felt that reading was a superficial process where the reader does not have an opportunity to question or interact with the author. This argument has held water over the last two millennia, but with the advent of digital text there is a call to make text a more Socratic medium, with richer interactions to support a deeper and more active reading, as Alan Kay illuminates in The Future of Reading Depends on the Future of Learning Difficult to Learn Things (2013).

So yes, reading on an analog substrate has to answer to Socrates’ issues. Digital reading can be improved to reach a deeper level of interaction.


The act of writing is the opposite of the act of reading, not only for the obvious reason, but also because the act of writing is highly interactive with the author: the author interacts with their own thoughts when they write, and the very act of writing is an interrogation of the author’s own text. If neuroscience has confirmed anything over the last few decades it is that the human mind is a storymaking machine; a highly creative one but also a lazy one. Our richly connected synapses allow us to think of myriads of things, but we only fill in the details and see the connections when we have to. Our brains present these details to us as aspects of what we already knew, when in reality they are inventions made up on the spot. For an excellent overview of this I can recommend Nick Chater’s The Mind Is Flat (2018).


Authorship, on the other hand, is a seriously interactive process of linearising and connecting initial thoughts with those which develop over the course of the process of authoring, resulting in a coherent ‘authored’ whole.

The idea that simply writing something down as one continuous transcription will suffice for anything of substantial length or depth is a fiction only those who have never tried it believe. The truth is that even a shopping list can require revisions once it’s seen by the optical eyes and not only by the mind’s eye. The mind’s eye has but the smallest canvas compared with what the optical eye can offer, and little of the permanence.

Authoring by thinking with text is similar to thinking with another person: Something is stated and once it’s out of the author’s mind it can be examined and commented on, either immediately or at a later date.

Augmented Authorship

What the mind’s eye has in abundance is the ability to associate between richly woven connections accessible to us through our association cortices, as described in Elastic (2019) by the renowned scientist Leonard Mlodinow. To truly leverage the power of connecting our minds to an external workspace, the challenge becomes how we can also make the optical view as flexible and interactive as our mental view while keeping its permanence and exploring its potentially capacious size. This, I believe, is the central challenge of interactive, digital text: the ability to fix thoughts has been possible since the dawn of writing, but now we have the opportunity of also allowing the author’s mind to fluidly move around the text and re-arrange it at will.

Writing, then, has always been Socratic: the dialog Socrates said was missing from the written word is there while it is being written. It is the dialog with yourself.

Considering the volume of dialog we have through the written word, the repercussions when we do not or cannot interrogate what we read, and the brain capacity lost when struggling to author clearly, we should, as a society and as a species, invest in augmenting text for truly powerful reading and authoring.
