
Category: Deep Literacy

An end of summer update

I have now settled into my new home with my beautiful wife and wondrous baby boy, who is now in nursery 3 days a week. The shock of losing my father has turned into a deep sorrow which, I understand from friends who have been through it, I will simply have to learn to live with, and I accept that to a degree. This all means that I have 3 days a week of relative calm to work in my new home office, without domestic or other obligations pulling at me.

Liquid | Author

I am continuing to develop my Mac word processor Liquid | Author, but now with a more commercial angle, since I can no longer expect the odd, random investment from my father. Now that his estate is being broken up and starting to come out of probate, I feel that the money he left for me, and thus my family, is like holding a bag of sugar; I am very concerned that it will run out, little by little, and my focus in life now is of course to provide for Edgar. Author is receiving some work to make sure that citations are handled as cleanly and clearly as possible, and then work on Author will pause so that I can market it and make it at least some kind of sales success. The only other thing we are adding to Author is the ability to post to WordPress, and this is because of the jrnl project:


I am co-hosting the 50th Anniversary of Doug Engelbart’s demo with ‘The Engelbart Symposium’ on the 9th of December in Silicon Valley (Wendy, I think we’ll be asking you to be on a panel; it’s almost all panels, no real presentations, but we have not finalised the program yet). The other hosts are The Engelbart Institute under Christina Engelbart, Vint Cerf of course, and The Computer History Museum, which is also our venue. I will be hosting a panel discussion that day, and we will have an area set up for demos, of which my work will be one station. We will also produce a book of the day, which will contain all the contributors’ transcribed presentations/panel discussions and a short pre- or post-written statement by them, along with monochromatic pictures. It’ll be nice. We will also be posting these contents online in a way which honours and demonstrates some of Doug’s ideas, primarily high resolution addressing and viewspecs:

For this I started the jrnl project (pronounced Journal; I just could not get a good domain name with all the letters showing up for duty), which recalls Doug’s Journal and is based on WordPress for rapid opportunities to actually make something useful. There was an earlier project, started when I hosted, which I hoped could produce something wonderful from a diverse group of people, but it became a 100% knowledge graph effort due to the passion of the strongest contributors. So while I am still involved in some of their dialogue, and they will exhibit on the 9th, this is not something I feel strongly connected with or able to fully understand. There is one important overlap between their work and the jrnl project, and that is hyperGlossaries.

I am working with Chris Gutteridge of Southampton on the jrnl project, and with Gyuri Layos, who is also actively involved with the knowledge graph work, on the hyperGlossaries. Also on the jrnl project are my coders in Ukraine, who built the current version of Liquid | Flow, my Mac text-interaction application. I am also hiring external WordPress expertise to make sure we are going about the plugin system in the best way.

I further had a meeting with professional, full-time WordPress developer Shane Gibson, who is considering donating some time to the project (I told him I have no money for this). He is excited to be working on a project with Vint’s name attached, which I think shows commitment to the ideals here.

The interaction method is a blue dot which appears when the user selects text. Point to it to get a menu of options. This is what used to be called the Hyperwords Project and was developed for Mozilla and Chrome as well as for WordPress, which is the code we will be using now:

What I feel would be the minimum features inspired by Doug to demonstrate (as in show fully working, and let people download and install as robust plugins for their own WordPress setup) are high resolution addressability and some ViewSpecs. High resolution addressability would take the form of the ability to Copy As Citation, which provides at least a paragraph-level address in the form of an anchor, so that anyone can use the link in any context and it will work. The ViewSpecs would likely be an Author-style Find In Page (which re-draws the page to show only sentences containing the selected text) and/or Flow (which redraws the text with line breaks after ',' and double line breaks after '.'), as discussed in:


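To make these concrete, here is a minimal sketch in Python of the three features: Find In Page, Flow, and a paragraph-level Copy As Citation address. The function names, the `#para-` anchor scheme and the naive sentence splitting are my own illustrative assumptions; a real plugin would operate on the rendered page rather than on plain strings.

```python
# Illustrative sketches only: names and the anchor scheme are invented.
import re

def find_in_page(text, selection):
    """Find In Page viewspec: keep only sentences containing the selection."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return " ".join(s for s in sentences if selection.lower() in s.lower())

def flow(text):
    """Flow viewspec: line break after ',' and double line break after '.'."""
    text = text.replace(", ", ",\n")
    return text.replace(". ", ".\n\n")

def copy_as_citation(url, paragraph_index):
    """High resolution addressing: a paragraph-level anchor anyone can link to."""
    return f"{url}#para-{paragraph_index}"
```

For example, `find_in_page("Doug built NLS. The demo was in 1968. Lunch was good.", "demo")` keeps only the middle sentence, and `copy_as_citation("https://example.com/post", 3)` yields a link addressing the fourth paragraph.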
In an advisor meeting with Les when I was still fully active we agreed that my interest in what Doug Engelbart called ViewSpecs should be the focus of my research but with a focus on the act of changing views, not just on the views themselves.

The act of changing a view should certainly not be a removed act like clicking a button, but rather an action where the user is manipulating the shape of the information. I came up with the notion of Compressed Scrolling this week. Currently, when the user starts to scroll past a certain speed (going from positional scrolling to navigational scrolling, in Chris Gutteridge’s words), all the user sees is a grey block of illegible text. What I propose is that at this threshold the view changes to hide generic text and highlight useful text, since the user is now in seeking mode. The user does not start to scroll because they want to move the document up and down; they start to scroll to see another part of the document.

So what could happen is that:

  • All the body text becomes greyed out and reduced in size making headings more prominent
  • All names in the document stay black and the rest of the text greys out
  • All the names in the document become icons if they are for companies and pictures if for people
  • Doug Engelbart suggested colour coding text based on the category of words. For example, we did a fun test together where all the words about tech were coloured yellow, companies blue, people green and so on. It really did give an insight into which sections discussed what, since it didn’t take long to learn the colours, but it was ugly and not so readable when you stopped to read. In this scenario, however, the colours would only be applied when scrolling, not when reading
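The threshold behaviour behind these scrolling views can be sketched as follows. This is a minimal illustration, assuming an invented velocity threshold and viewspec names; nothing here comes from a shipping implementation.

```python
# Sketch of the Compressed Scrolling threshold: classify scrolling as
# 'positional' (reading) or 'navigational' (seeking) from recent events.
# The threshold value and viewspec names are illustrative assumptions.
from collections import deque

NAVIGATIONAL_THRESHOLD = 2000.0  # pixels per second; an assumed tuning value

class ScrollModeTracker:
    def __init__(self, window=5):
        self.samples = deque(maxlen=window)  # recent (time, offset) pairs

    def on_scroll(self, timestamp, offset):
        """Record a scroll event and return the current scrolling mode."""
        self.samples.append((timestamp, offset))
        if len(self.samples) < 2:
            return "positional"
        (t0, y0), (t1, y1) = self.samples[0], self.samples[-1]
        if t1 == t0:
            return "positional"
        velocity = abs(y1 - y0) / (t1 - t0)
        return "navigational" if velocity > NAVIGATIONAL_THRESHOLD else "positional"

def viewspec_for(mode):
    """Map a scroll mode to one of the (hypothetical) compressed views above."""
    if mode == "navigational":
        return {"body_text": "greyed, reduced", "headings": "prominent", "names": "black"}
    return {"body_text": "normal"}
```

Fast scrolling (here, 400 pixels in a tenth of a second) trips the threshold and switches the view to seeking mode; slow scrolling leaves the reading view untouched.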

We are further working on realising a powerful hyperGlossary in this environment:

These are potentially powerful views to help the user move around their own or others’ documents. I have been in touch with Howard Oakley, who blogged about macOS Mojave’s text analysis APIs, and we’ll try to look at some of this for Author, while Shane might look into it for the jrnl WordPress work and Chris is interested in both. I met Howard through Twitter, having Mark Anderson as a mutual friend. This is the post I read which opened my eyes and prompted me to get in touch with him:

I think this line of enquiry could be useful and a good PhD, but it would need real user testing, so I will need to get some research funding, which I am working on.



Voice Interaction for Text

I have long argued against voice interfaces for information manipulation, since they interfere with the visual-dexterity operations of reading and writing. There is good reason why no-one has asked for a system where you speak ‘turn the page’ to turn the page: it would take you out of your internal mental world and break your flow.

However, interesting developments are opening up, such as the linguistic support in macOS Mojave which Howard Oakley mentions in his blog, and Apple’s machine learning APIs included in iOS and macOS as Core ML, which provide opportunities for very rich text manipulation.

What it is now reasonable to design systems for includes interactions such as:

  • Show me only sentences with names; remove headings; no, not these names over here, ignore them; highlight ‘Steve’ in red.

Designing the interfaces for this can quickly come to require a lot of buttons, or commands to memorise, and that is an issue.

However, I recently splurged and bought an Apple iPhone XS Max, which is very, very fast (capable of 5 trillion operations per second!) and which processes speech commands near perfectly and near instantly.

It is becoming clear that we must start experimenting with flexible views based not just on linear commands but also on analysis (Core ML). The interactions for this can benefit from speech, where the system does not need a ‘hey Siri’ prompt and is aware of the on-screen/in-document data it is working on, including the results of previous operations, allowing views to be built up continually.

This could be backed up with a tokenised command bar, where spoken commands are added as tokens and acted upon: both so the user can confirm the commands were correctly interpreted, and so the user can then visually edit them as desired and share or save the command set as a ViewSpec if useful.
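The tokenised command bar could be sketched like this. The command vocabulary, the semicolon splitting and the class names are all invented for illustration; spoken commands become visible tokens, tokens can be edited, and the sequence can be saved as a named ViewSpec.

```python
# Hedged sketch of a tokenised command bar: spoken commands become
# editable tokens, and the set can be saved as a reusable ViewSpec.
# All names and the command format here are illustrative assumptions.
def tokenise(utterance):
    """Split a spoken utterance into individual command tokens."""
    return [part.strip() for part in utterance.split(";") if part.strip()]

class CommandBar:
    def __init__(self):
        self.tokens = []           # commands shown to the user for editing
        self.saved_viewspecs = {}  # named, reusable command sets

    def speak(self, utterance):
        """Add the spoken commands as visible, editable tokens."""
        self.tokens.extend(tokenise(utterance))

    def remove(self, index):
        """Visually edit the command set by deleting one token."""
        del self.tokens[index]

    def save_as_viewspec(self, name):
        """Persist the current command set as a named ViewSpec."""
        self.saved_viewspecs[name] = list(self.tokens)
```

So speaking “show me only sentences with names; remove headings; highlight ‘Steve’ in red” yields three tokens, any of which can be deleted before saving the remainder as a ViewSpec.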

I feel this warrants serious consideration.



Digital Substrate

Digital interactive text shares its orthography, the lines and shapes, with all previous substrates. But where pre-digital substrates limited or expanded the utility, the interactability, of text through the materiality of the substrate itself (paper, for example, making text easier to annotate and carry than stone or clay), the digital ‘substrate’ (“to spread underneath”) is uniquely powerful and useful:

• On non-digital substrates the textual meaning is contained on the surface of the substrate–in the orthography–the shapes of the text.

• With digital text the orthography is only a representation of the text–the text itself is stored within the computer system.

It’s important to note that digital text is inherently interactable, since the very act of summoning the text to be displayed is an interaction (at runtime there is not necessarily a human specifying the steps of the interaction; many are pre-set and pre-assumed). Thus, in the same way that the computer can and must interact with the symbols to display them (it must have an address for the text in storage somewhere, and a specification for how the text should be displayed), the user can, with appropriate software systems, further interact with the text: changing the specification for what text should be displayed by somehow providing an address for it, changing the way the text is displayed (there is no inherent reason, only a legacy one, that documents on computers look like virtual copies of text on paper), and changing what operations are done on the text’s symbolic meaning.

This difference goes far beyond the philosophical and into the core opportunity of digital text: the phenomenal potential of vastly increasing our ability to interact with knowledge through digital text.

This is crucial because we get the surface meaning from reading the surface, and we have always needed to further interact with the text in order to go beneath the surface: by reading several reports or books on the same subject, or by annotating and scribbling down our insights. We have always interacted with text to the best of the ability of the substrate; digital text provides a whole new, powerful set of dimensions through which we can go deeper. It is through interaction that we ‘get a handle’ on the meaning behind the text, and this is how we can ‘change our perspective’ and ‘gain deeper insights’, all terms which reflect our age-old experience of physically being able to move around our environment. Now that we increasingly live and work in a digital environment, accessed in large part through digital text, we need to create the means to interact with text as fluidly as we can pick up an object with our hands, but as richly as only digital technologies can allow.

This is not just a philosophical point, and I am not just playing with words about how interactivity gives us deeper insights, helps us get a handle on things and helps us change our perspective; it is deeply rooted in who we are. An illustrative example is what happens with ‘tactile vision substitution systems’, where someone who is blind has a camera connected to actuators on their skin which allows them, with some experience, to ‘see’ through their skin. What is revealing is that this ‘vision’ is only achieved when the user can move the camera, interacting with what they are seeing by changing their perspective.

If ‘seeing is believing’ then ‘controlling what you are seeing goes beyond superficial belief and generates understanding’.