With Visual-Meta we now have the means to dream bigger. Here are a few thoughts on authorship of documents as part of the private and public life of text. This is an example of something I’d prefer to write more hypertextually, in a smaller unit and have flexible views to see the collection of thoughts, but that is for another day.
(initial note: This was written in Author with paragraph indents which were removed in WordPress–just another reminder of how primitive our digital text systems still are)
[start ramble / rant]
We can imagine something grand and crazy but when we make it real, it will be manifestly different from how we imagined it. Similar to writing down a thought. We will find edge cases and inconsistencies. So we must build and experiment.
‘Inventing the future’ of anything easily becomes the projection of a trend (speed increases etc.) or the combination of two known functions (flying cars), but rarely an understanding of how genuinely useful functions (as defined by the user) evolve–such as text messages on a smartphone, or social media (defined by the user but shaped by the companies providing the service).
It is easy to set criteria and priorities based on a relatively abstract understanding of the present–for example, to say that the future of text should involve high-resolution linking, be robust, provide flexible views and be highly interactive. These criteria were not abstract for Doug Engelbart and his team–they were real and really useful–but how would they exist today, with today’s interaction conventions and infrastructures?
Innovation is of course always about figure and ground; it is never about just one thing. The device/tool/tech/system/app evolves, and so do its use by the user and the economics around it. We have augmentation infrastructures, and these are tied to financial infrastructures. The financial side is often written out of the equation by those who want to ‘do good’ and augment our intellect and communication, but it will be primary for those who want to ‘earn more’–and those who want to earn more focus on users in a very different way. Rarely do we get a situation where a deep investigation into the real needs of the user goes along with a deep understanding of the economics of how this can be viable. Rarely, but not never.
That was my rant that we need a way to fund the research and product building.
What I do know is that we will be working inside a rectangle the size of a modern laptop for a long while to come. Something like 13-16 inches, with high resolution and soon passive lighting when required (larger than this and we are beyond eyeball movement and into full head movement, which takes a lot more effort over time). When AR is up to it, we will also be able to put text anywhere in our room, but this will bring its own issues of what is useful and what becomes a mess. After all, it is not that expensive or time consuming today to print out A3 sheets and put them on a wall, either as single sheets or in groups. It is also easy to connect them with string, as in the ‘murder walls’ of TV. But for what work will this be useful? For solving a crime, probably. I had heart surgery on Friday and when they went in through my leg to access my heart, they had a pretty huge flat screen monitor to work with. It still didn’t show everything and they needed to talk about what it should show.
AI will be useful, but we also need to augment the user’s brain, not only let systems think outside it. When voice interacting with an AI (Siri, Cortana, Bixby etc.) the reply will be ‘correct’ and not easily questioned, so the user is outsourcing their subconscious–and I think Socrates would find this voice interaction less interactive than text! :-)
That was my rant about the obvious ‘bigger screen/AR/AI magically solve it all’
[end ramble / rant]
I am a visual guy but I think interactivity needs to be what we focus on.
Physical engineers appreciate a good display but require rich interaction: It’s not enough to see the object they are planning to build, they have to see it in different ways, including rotated and with different layers of opacity and highlighting different parts. If we think of document editing as ‘engineering’ documents then this may be a good mental model for developing future directions of text interaction.
So we write and we write and we write, with fingers on keyboard or speech on screen. Then we need to start engineering. We can even today choose to make some adjustments to our view; we can choose to see only sentences with certain text, we can fold the text to only see the headings and we can highlight and bold text to make it more prominent. We can pull text and notes to the side. We can flip our view into a non-linear ‘dynamic view’, we can add metadata for how the text connects with external information through links and citations and we can (soon, I hope) define glossary terms for the reader to allow us to write more tersely without fear of missing out information. We can also instantly translate the text in-line and look up any text we are not certain of and find citations to support any claim in seconds–or find our claims have been refuted.
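Two of these view adjustments–seeing only sentences containing certain text, and folding to headings–can be sketched in a few lines. This is a minimal illustration, not any product’s actual behaviour; the function names and the convention that headings start with “#” are my assumptions:

```python
# Sketch of two reversible text "views" (illustrative assumptions:
# headings are lines starting with "#", sentences end in . ! or ?).
import re

def sentences_containing(text, term):
    """Show only the sentences that mention a given term."""
    sentences = re.split(r'(?<=[.!?])\s+', text)
    return [s for s in sentences if term.lower() in s.lower()]

def fold_to_headings(text):
    """Collapse the document to its headings only."""
    return [line for line in text.splitlines() if line.startswith("#")]

doc = "# Views\nText can be folded. Views help focus! Folding hides detail."
print(sentences_containing(doc, "fold"))
print(fold_to_headings(doc))
```

The point is that the underlying text is untouched; each “view” is just a different selection over it.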
But further… (my own personal dreams)
When typing, it would be interesting to look into different kinds of focus mode–for a paragraph or a sentence, with other text either completely hidden or greyed out–and into what kind of interaction should toggle this view mode, as is currently done to some extent in some word processors.
I’d like to see focus with flow mode, where the sentences automatically get a line break on every ? ! : ; . to make the text even easier and clearer to see. We have this in Liquid but not in a reversible way. Can it be a reversible view?…
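The flow-mode break and its reversal can be sketched as a pair of functions. This round-trip suggests the view can indeed be reversible–at least under this sketch’s assumption that sentences were originally separated by single spaces:

```python
# Sketch of a reversible "flow mode": break lines after . ? ! : ;
# for reading, then restore the original flow. Assumes single spaces
# between sentences in the original text.
import re

def to_flow(text):
    """Insert a line break after every . ? ! : ;"""
    return re.sub(r'(?<=[.?!:;])\s+', '\n', text)

def from_flow(text):
    """Reverse the view: rejoin the broken lines with single spaces."""
    return re.sub(r'(?<=[.?!:;])\n', ' ', text)

original = "First point; second point. A question? Yes!"
assert from_flow(to_flow(original)) == original
```

Text with mixed spacing would need the view layer to remember the original whitespace rather than reconstruct it–which is exactly the difference between a destructive transformation and a view.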
We further need to be able to hide and make prominent different aspects of the text to be able to really focus. We need to be able to hide sections we are done with in order to ‘crack that nut’ of this particular text. Can we even reflow our layouts safely to escape the grey block of text, and can we re-pack it? How can we step out and see that we have discussed a keyword in different sections without realising it? How can we quickly skim through our documents and see what’s discussed where? Coloured glossary terms, maybe?
Saving & Finding
How can we better save our documents to more easily find them later when searching or browsing? Better tagging maybe–quicker and more convenient, since tagging is not seen as valuable work while doing it? What about more visually descriptive document icons to let the user know at a glance if it’s a large or small document, whether it’s been published and so on?
We now have the ability to employ ML to analyse our documents. What will be actually useful? How about a view to see all the names in the document, with repeated use shown through darker or larger text (we have the first bit)? Should we be able to grey out common words quickly and easily (I agree with Doug that prioritising ease of use for first-timers is a waste, but a trained user should be able to fly through the information with minimal interaction effort)? Also, we should be able to show only the first sentence (or the last) of every paragraph, as Doug had the ability to do.
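Engelbart’s first-sentence (or last-sentence) view is simple to sketch, assuming paragraphs are separated by blank lines (names and conventions here are illustrative only):

```python
# Sketch of the Engelbart-style skim view: show only the first
# (or last) sentence of each paragraph. Assumes blank-line-separated
# paragraphs and . ! ? sentence endings.
import re

def skim_view(text, last=False):
    """Return the first (or last) sentence of every paragraph."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    out = []
    for p in paragraphs:
        sentences = re.split(r'(?<=[.!?])\s+', p)
        out.append(sentences[-1] if last else sentences[0])
    return out

doc = "One. Two. Three.\n\nAlpha! Beta? Gamma."
print(skim_view(doc))
print(skim_view(doc, last=True))
```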
How about seeing only citations or only links (plus headings), or only company names, product names or technologies? How can we make such interactions actually accessible to the user? That’s the real issue: How can we make the interaction actually useful? Already today almost any of this is possible–if you are a programmer willing to spend the time building the views.
This is similar to how libraries existed before the internet: anyone could go to the library and look things up (or in that book on their shelf), but mental effort is a real thing and often it would not be deemed worth it. With an instant Google search the effort is so much lower. We need the effort to change views to be similarly low enough that they actually get used.
Maybe through keyboard shortcuts or maybe through an interaction panel spawned in a similar way to Liquid and working in a similar way? Liquid is accessible through selecting text and ctrl-clicking or cmd-space on selected text to launch (for me, I don’t know what you have chosen). How about cmd-enter (though currently the OS does not see this combination) and up comes a Liquid like panel where the user can quickly enact views and also store new keyboard shortcuts in the same way they can in Liquid? If you have not seen how Liquid works please have a look: http://www.liquid.info/liquid.html
Note that in Author currently only paste shows on ctrl-click when the user has not selected text, kind of like a placeholder. We can easily remove the paste option since cmd-v is universally known and make ctrl-click only show document-wide view options (spawning a Liquid-like interface)… That would be nice to have on your side (should we call this the Washington view?). We could also make the Liquid command key shortcut produce a view panel if the user invokes it without selecting text.
Doug had different command options for holding down a modifier while double clicking, and in the middle of a double click I think. I therefore think it’s reasonable to have different meanings for the same commands depending on whether they are used on selected text or not. Cmd-f, for example, shows a dialog when no text is selected. Many such combinations simply ‘beep’. So let’s make them gainfully employed!
Connected Graph Views
What kind of graph views can we let the user manually create and curate, and which can be automatically extracted from the text based on grammar, analysis, citations and glossaries? And, maybe most importantly from the perspective of authoring, how can this be made useful for authoring rather than reading?
How can we add computational text? How can the author specify the life of text for advanced interactions later? Ctrl-click and add maybe?
What other things could we do? Let the headings fade the body text out, instead of just folding it to view only them? How about tagging paragraphs with a level of heading that does not get exported, to help the author with overview and navigation without cluttering the reader’s view–though of course ideally this would also be available to the reader?
What about embedding time and place stamps in all sentences and edits, so that the author can find what was written in a certain coffeeshop at a certain time of day? Or simply when it was raining, or during a meeting? These are timeline questions and can also be embodied in the document, but at the risk of making the formatting code messy. Maybe limit it to save states, or add it to exports?
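One way such per-sentence context stamps might be stored is as a side list of records rather than inline markup, keeping the formatting code clean. The structure and field names below are pure assumption, not an existing format:

```python
# Sketch of per-sentence context stamps kept alongside the text
# (field names "sentence", "written", "place" are assumptions).
from datetime import datetime

stamps = [
    {"sentence": "Views help focus.",
     "written": datetime(2020, 11, 2, 9, 30).isoformat(),
     "place": "coffeeshop"},
    {"sentence": "Folding hides detail.",
     "written": datetime(2020, 11, 2, 14, 5).isoformat(),
     "place": "office"},
]

def written_at(stamps, place):
    """Find everything written in a given place."""
    return [s["sentence"] for s in stamps if s["place"] == place]

print(written_at(stamps, "coffeeshop"))
```

Keeping the stamps out of the text body means they can be limited to save states or exports, as suggested above.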
How about adding a ton of context when saving? Add what was on the calendar then, what calls were made and such, in addition to time and place? How about building a landscape of documents whose views are so easy to change interactively that ‘open’ and ‘save’ will seem like what they really resemble: extracting a mummy’s brain through its nose (as was actually done)?
All the extra data mentioned above can fit into Visual-Meta. What else can we usefully add? A record of some sort of the user’s typing pattern, for identification of different texts by the same author? Many things can be encoded.
We can look at the process of authorship–writing, citing, linking, connecting internally, editing-sculpting-engineering, and saving, storing and exporting–as nodes in a network, and we can attack all of these, and find more. We have the start of a technical infrastructure. When we do this, we will truly not simply be inventing A future of text–we will be engineering it. This is in line with Doug Engelbart’s notions of A, B, and C levels of activity and networked communities. Let’s get going. All that’s at stake is the future of humankind’s symbol manipulation–which may sound boring to some, but it is largely through text that we think and communicate with each other on many important issues.
Above all: How do we build powerful interactions with interfaces that make them accessible and useful for the production of documents, with a focus on augmenting the author’s ability to communicate their intent?
What do you think? Please let me know in the comments below. (I always wanted to type that…)