An overview of some of the main interfaces in use now and questions about future possibilities. This section covers a very wide area and doesn’t attempt to go into depth on any one interface; it’s an attempt to remind us what a variety of interfaces there are, and to shake up our thinking a bit.
The thing about technological progress is that, apart from Moore’s Law, it’s not altogether smooth. Innovation comes in jumps and not always when expected. The liquid information environment must therefore use what’s available and have an eye to possible future innovation and invention. What’s available technologically (hardware, as well as software algorithms and protocols) is the landscape in which the environment is built, so it is much more important to understand this landscape (and work to change it if necessary) than it is to complain that it should be otherwise and do nothing.
Engelbartian Mouse and Keyboard Controls
The mouse and keyboard interfaces have come to represent the traditional desktop/laptop environment dominated by Windows, Mac OS and Linux. The layout below on the left is Doug Engelbart’s original mouse with a keyboard and a chorded keyset, and on the right is a modern Mac keyboard and mouse. The modern set is much sleeker, but the original had more controls (the chorded keyset could enter all ASCII characters and modify what the mouse did).
The traditional “GUI”, or Graphical User Interface, is very efficient for the knowledge worker who is able to sit in a controlled environment and work with nice, large screens. Other interfaces are replacing some of these traditional interfaces (track pads on laptops), but in general they extend the opportunities of the traditional system (drawing tablets) or add interfaces to entirely new media (multitouch), such as mobile devices and gaming. An outline is provided next. This is where the future lies.
Most interfaces require live interaction to be useful. Agents are hands-off for much of their useful lifespan. Users set an agent to do something and at some point it returns. During the mid 1990s it was very fashionable in the industry (with General Magic, for example) to set agents to work and have them come back to the user later. At the start of the 21st century immediate feedback through human-style queries (Siri, bought by Apple, for example) is becoming available. What is it we seem to value about agents? Is it their anthropomorphism? Or is it their utility? Does it matter how long it takes for a result to come or is it the manner in which the query is made? I think it’s fair to say that Google Search is an agent, but a very fast one, where the user’s query is very intelligently handled but the results are presented in list form. Ask Jeeves tried the anthropomorphic approach and that didn’t work better than Google’s clever under-the-hood approach.
What are the most effective and inspirational uses of agents? Compiling search results in documents? Finding the nicest nearest coffee shop? Monitoring your communications so it’ll know what messages to highlight to you and which to block? Should agents be human-smart (with or without human foibles) or machine clever?
Let’s not forget that making the agent more powerful should not be the end result. I read this exchange, attributed to Marvin Minsky and Douglas Engelbart when they met at MIT in the ’50s, in Kevin Kelly’s book Out of Control:
Minsky: “We’re going to make machines intelligent. We are going to make them conscious!” Engelbart: “You’re going to do all that for computers? But what are you going to do for people?”
Doug says of the quote: “I don’t know about meeting him in the ’50s; it wasn’t until ’62 that I connected at all with the AI community – a week (or two?) workshop on AI held at RAND, with Allen Newell and Herb Simon as major part of the ‘faculty.’ Minsky wasn’t there. Probably didn’t meet Minsky until summer ’63 at a 6-week working group at MIT on timesharing, sponsored by ARPA. I don’t remember much/any interaction directly with Minsky. Might have had that interchange. Did have a fair amount of interaction with Newell over those early years – some of them were quite definitely having two different visions not aligning; but friends anyway.”
At the time of writing motion control for console gaming is a hot topic.
The Wii has sold 76 million units and Microsoft and Sony have added motion control to their systems. Edge magazine writes in issue 224 that this has certainly widened the market for games, making more people interested – but it turns out that hard-core gamers actively resist the motion control games. This is in contrast to 3D mice, pressure sensitive tablets and other non-gaming interfaces beyond the mouse and keyboard, where the hard-core desktop computer users have been the only ones to embrace the richer interaction offered.
The question is not whether physical interfaces will become more used, but what will make the most sense and whether users will have a problem with the computer system ‘looking’ at them. What we have as touch today we’ll have with touching in thin air. But when will the computer know to interpret the gestures? Perhaps we will speak to the computer, press a button or do something else to get its attention. It’ll get interesting.
High Resolution Displays
With the advent of the Retina display in the iPhone 4 (2010) the bar has been set for super-resolution displays at 326 pixels per inch, or 128 pixels per cm. Higher resolution demands more from the graphics processing capability of the machines, but we are seeing an evolution in GPUs which is very nicely keeping up with the increase in resolution. What difference will it make to interactivity when the display is crystal clear and high resolution?
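As a quick check on the figures above, the conversion between pixels per inch and pixels per cm can be sketched in a few lines. The function names here are illustrative only and not taken from any library; the only fact relied on is that one inch is 2.54 cm.

```python
CM_PER_INCH = 2.54  # one inch is exactly 2.54 cm


def ppi_to_pixels_per_cm(ppi: float) -> float:
    """Convert a pixel density in pixels per inch to pixels per cm."""
    return ppi / CM_PER_INCH


def pixels_per_cm_to_ppi(pixels_per_cm: float) -> float:
    """Convert a pixel density in pixels per cm to pixels per inch."""
    return pixels_per_cm * CM_PER_INCH


# The 326 ppi display comes out at roughly 128 pixels per cm.
print(round(ppi_to_pixels_per_cm(326)))
```

The same helper makes it easy to compare densities quoted in different units, for example a figure given in pixels per cm against one given in pixels per inch.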
“Acuity information is useful in determining what is needed to produce an adequate or optimal visual display. A modern high resolution monitor has about 40 pixels per cm. This translates to 40 cycles per degree at normal viewing distances. Given that the human eye has receptors packed into the fovea at roughly 180 per degree of visual angle, we can claim that in linear resolution, we are about a factor of four from having monitors that match the resolving power of the human retina in each direction. A 4000-x-4000-resolution monitor should be adequate for any conceivable visual task, leaving aside, for the moment, the problem of superacuities. Such a monitor would require 16 million pixels.” Ware
Another issue, and a problem with the first iPad models, is that the screen is reflective, and putting on an anti-glare cover significantly improves the reading experience. This lowers the perceived resolution slightly since it adds a small amount of ‘grain’ to the screen. It seems likely that technological breakthroughs will solve this though.
Non-transmissive displays, also referred to as e-ink, promise to take displays in new directions and new forms. Harry Potter style newspapers are only the beginning.
Ambient & Anywhere Displays
At the start of the 21st century we seem to think we can be just as clued in by looking at small rectangular screens as by looking at the world directly. That’s almost humorous. Imagine hunting in the jungle with an augmented reality iPad app. You can only look through the app – you are not allowed to look outside it. It will have some wonderful enhanced views but wouldn’t it be useful to look around with your eyes directly at the jungle as well?
What will it be like when anything can be a display? What will it mean to ‘display’? Will we have more ambient glanceability or will everything become a ‘loud’ TV? In the same way a transparent glass of water shows how much water is in the glass, will we be able to show the state of something with a similar glanceable impression? Tablets will continue to impress, as will wall-size displays, but what about displays potentially everywhere at low cost – what will be useful or fun to display?
The interfaces to our digital, liquid world will have to become more powerful and we will also need more of them. We will need ambient information such as gadgets which light up on the back with basic information, and walls which provide ambient colors depending on outside temperature, stock markets or what friends are doing on social networks. Have a look at the Corning Glass video ‘A Day Made of Glass’ on YouTube. It’s a surprisingly inspirational corporate promo piece.
Non-glasses-needing 3D displays are around the corner. Will all content benefit from 3D? No, definitely not. Is 3D clunky and not as high quality as 2D? Absolutely, but with market opportunity and Moore’s Law pushing it forward, 3D will be here soon. In most cases it will not add any value, when reading plain text for example, but it will add real value for immersion and data presentation. But where will it add value? Can your project benefit from a deeper third dimension?
Touchable interfaces are also around the corner. Phones will change texture, keyboards morph, pads unfold, glasses warp and we have vibrating game controllers – what’s next? Can science fiction keep up with the possibilities? Can you?
A fellow student at UCLIC in London, Udy Ravid, worked on mobile devices which could communicate by controlled vibration – instead of simply sending a text message which would cause the receiving phone to vibrate when in ‘silent mode’, the phone could be squeezed at different levels of intensity and length, and such pulses would be recreated on the receiving phone. It was originally a way for blind people to communicate but I think many people would develop their own tactile signaling systems. Quite an interesting thought.
Have you noticed that before you had a phone with a vibrate on silent or on ring mechanism, you didn’t pay attention to vibrations but now that you do, sometimes you feel a vibration and wonder if it’s your phone? That’s a whole new channel for your body and mind to expect information to arrive.
“THE US army is testing a navigation device that allows soldiers to feel their way, literally, through the fog of war. The device, a haptic belt, feeds information to the wearer through coded vibrations and can also relay orders given as hand signals via a glove that recognises gestures.
Navigation can be extremely difficult for soldiers, especially at night, says Elmar Schmeisser, who has been leading the work at the Army Research Office in North Carolina. GPS devices are not ideal as they require soldiers to take their eyes off their surroundings and their hand off their weapon. The illuminated displays can give away their position at night, too.
So Schmeisser has spent the last few years working with different companies and research groups to find an alternative. He and his colleagues have now developed a range of vibrating mini electric motors known as tactile actuators, or “tactors”, and tested them in various configurations. “What’s best is a belt around the torso with eight tactors signifying the eight cardinal directions,” says Linda Elliott, a psychologist who has been testing the systems on soldiers during training exercises at the Army Research Laboratory at Fort Benning in Georgia.
The tactors vibrate at 250 hertz, which is just enough to give a gentle but noticeable buzz around the torso at regular intervals indicating the direction in which the soldier needs to travel to reach the next waypoint.
The belts are hooked up to a regular GPS device to access directional information, as well as an accelerometer and digital compass. These mean the device knows which way the soldier is facing, even if they are lying down. “As long as you are going in the right direction you will feel it on your front,” says Elliott, who will be presenting the technology at the Human-Computer Interaction conference in Orlando, Florida, in July. “As you get to within 50 metres of the waypoint all the tactors start to go off, and within 15 metres they will quicken.”
Besides directions, the tactors can communicate commands such as “halt”, signified by the front, back and side tactors pulsing simultaneously, or “move out”, when they pulse from back to front, almost as if they were pushing the soldier forward.”
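The navigation behaviour described in the quoted article can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Army’s implementation: the numbering of tactors (0 at the front, counting clockwise), the function names and the way heading and bearing are supplied are all hypothetical. Only the eight tactors and the 50 metre and 15 metre thresholds come from the article.

```python
from dataclasses import dataclass
from typing import List

TACTOR_COUNT = 8                 # eight tactors around the torso (from the article)
SECTOR_DEG = 360 / TACTOR_COUNT  # each tactor covers a 45 degree sector
ALL_TACTORS_M = 50               # within 50 m all tactors go off (from the article)
QUICKEN_M = 15                   # within 15 m the pulses quicken (from the article)


@dataclass
class BeltSignal:
    tactors: List[int]  # indices of tactors to pulse; 0 = front, clockwise (assumed)
    quickened: bool     # True when the pulses should speed up


def tactor_for(heading_deg: float, bearing_deg: float) -> int:
    """Pick the tactor closest to the waypoint direction, relative to the wearer."""
    relative = (bearing_deg - heading_deg) % 360
    # Shift by half a sector so each tactor is centred on its direction.
    return int((relative + SECTOR_DEG / 2) // SECTOR_DEG) % TACTOR_COUNT


def belt_signal(heading_deg: float, bearing_deg: float, distance_m: float) -> BeltSignal:
    """Decide which tactors pulse for a given heading, waypoint bearing and distance."""
    if distance_m < ALL_TACTORS_M:
        return BeltSignal(list(range(TACTOR_COUNT)), quickened=distance_m < QUICKEN_M)
    return BeltSignal([tactor_for(heading_deg, bearing_deg)], quickened=False)
```

With this numbering, a wearer walking straight towards the waypoint feels tactor 0 on the front, which matches Elliott’s description; a waypoint directly to the right maps to tactor 2 and one directly behind to tactor 4.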
So how about interfaces you only feel?
“Imagine, Michael Chorost proposes, that four police officers on a drug raid are connected mentally in a way that allows them to sense what their colleagues are seeing and feeling. Tony Vittorio, the captain, is in the center room of the three-room drug den.
He can sense that his partner Wilson, in the room on his left, is not feeling danger or arousal and thus has encountered no one. But suddenly Vittorio feels a distant thump on his chest. Sarsen, in the room on the right, has been hit with something, possibly a bullet fired from a gun with a silencer.
Vittorio glimpses a flickering image of a metallic barrel pointed at Sarsen, who is projecting overwhelming shock and alarm. By deducing how far Sarsen might have gone into the room and where the gunman is likely to be standing, Vittorio fires shots into the wall that will, at the very least, distract the gunman and allow Sarsen to shoot back. Sarsen is saved; the gunman is dead.”
NYT Review of World Wide Mind by Michael Chorost
The seemingly magical ability to look at something right in front of you, through some sort of a device or enchanted “window” has snuck around the corner and become quite commonplace without many people even noticing.
iPhone and Android apps allow users to look through their phone and see overlaid Wikipedia entries and much more. This has not taken off yet, as it’s just not a practical way to get information at this point. The opportunities though, are inspirational; what dress is that? Who is that? What building is that? Where is the nearest exit? Does it need to get integrated in glasses or other devices to not be so obtrusive? Or do they simply need to feel like they are almost on?
Audio-augmented reality in the form of recorded narrations for museums has been around for a long time, but it’s not until visual AR arrived that we would frame it as such. Similarly, would you call a book’s index a form of hyperlink? New technologies give new perspectives on the old.
Word Lens for iPhone is an amazing peek into the future. It launches and turns the phone’s camera on, and whatever you point it at, it translates from English to Spanish and vice versa – putting the translated text into the image – instead of the original text! http://www.youtube.com/watch?v=h2OfQdYrHRs&feature=player_embedded
Have a look at the demo videos from The Astonishing Tribe for inspiration: http://www.tat.se/videos It’s neat, it’s different and non-obvious and adds a bit of fun to what really should be a bit fun.
Will it make a difference when you won’t need to cable devices together? What will it be like when all your data is reachable from any of your devices without even thinking about where it is stored?
Will the term ‘storage’ seem as antiquated as ‘horseless carriage’ when the cloud becomes a robust reality?
Mobile devices are classed by some as devices you can comfortably use with one hand, so they are different from tablets.
What is the best use for such a small-screen device? Even with Moore’s Law and always-on, high speed internet, the device will be constrained by its size. Is it a good input device for when on the go – requiring great synchronizing systems for other devices? Is it a good device for personal communication? Maybe it’s the ultimate Agent device? Or is it better “dumb”?
The Future is Tiny
During the 1990s there was a lot of talk of nanotechnology. This seems to have died down of late, but the research and development has not. Make no mistake about it, while ever faster chips and networks will continue to bring dramatic change, the next revolution will be in the hardware you interact with. It will be almost as if software systems enter the real world. Shirts which know if you are about to have a cold. Hats which change transparency depending on the light. Mobile devices which are completely covered in screens, tactile feedback, cameras and other sensors. Keyboards will extrude from flat surfaces when needed. The opportunities will be near-limitless with only our prejudice holding us back.
Computers will be analyzing your mood based on your facial expression (already there are settings on cameras which only take a picture when the subject is smiling). How far will this go? How far should it go?
Where you are is already coming into the computer system’s context through technologies like the SRI-originated, Apple-released Siri, which knows where you are when you ask for the nearest Starbucks, though how it’s possible not to know the nearest Starbucks is beyond me…
I predict that the iOS of the iPhone, iPod and iPad will become the most commonly used operating system within 5 years of this book being published (early 2012), primarily when run on the iPad family and its descendants. The iPad will expand to include innovative docking systems, with keyboard and charging for when this is desirable, but also very much be used on its own.
The interaction afforded by multi-touch devices, such as the iPad, presents a new challenge for liquid interaction. On the desktop metaphor of the Mac, Windows and Linux, design was in many cases all about the intelligent placement of buttons, use of keyboard commands and so on. With the “direct” interaction multi-touch devices give the user, the user’s options are much simpler, and therefore perhaps harder to design for.
I started the LiVE Globe project (www.live-globe.info) to have a better understanding of developing for a multi-touch environment for the writing of this book, and it should be available in the iOS App Store early February 2012. I look forward to hearing comments from you.
Multi-touch interfaces confer a whole new level of immediacy and intimacy with the information, but also make sophisticated interactions more difficult to design: people want to interact “directly” with the data, not by clicking on buttons.
There is a big plus with the new multi-touch interactions for tablet computers: they feel natural and immediate. A big plus point for keyboard and cursor based interaction is that you have many more controls at your disposal – you can right click, command click, and so on.
Will Thimbleby, the son of my MSc HCI thesis advisor Harold Thimbleby, devised a calculator which works by the user hand-writing numbers and symbols, which the system then immediately interprets and makes active – 2+2 immediately receives a “4” when the user writes “=”. It really feels like magic.
An old tablet interaction (though not multi-touch) worth considering bringing back is the Apple Newton’s way of copying: Drag something to the side of the screen and it’ll stay there, until you drag it back. If you haven’t had a chance to play with the original Newton I strongly suggest you track one down and play for a while. Massively less powerful than even the original iPad, but with some very powerful interaction ideas for copying, as discussed, as well as “agent” like behavior for interpreting selected text and more.
Further Multi-Touch Inspiration
Jeff Han and team at the Media Research Lab at NYU provide some thoughtful examples of where multi-touch interaction can go: http://cs.nyu.edu/~jhan/
There is also an interesting general video on YouTube I recommend: http://www.youtube.com/watch?v=-2Kn2HKCWqs&feature=related
Sounds can provide immersion (tanks rumbling nearby in a game), immediate feedback (click sound when clicking a button), confirmations (sending a message in Apple’s Mail) and indicate a change of state (Doppler effects etc.). Sound can also provide ambient environment augmentations beyond simply music – Julian Treasure said that a background sound of birdsong for working and concentrating is basically hard-wired into us to make us feel relaxed – there are obviously no predators around.
Sound – Speaking Commands
Voice interaction is quite a specialized interaction. It’s sexy to demo but not always efficient, apart from scenarios where the user cannot use a screen. The “Interactive Text” aspect of electronic text is inherent in the text itself. The interfaces used give the user access to these inherent qualities of manipulation and connections. We have chosen manual manipulation rather than other, more exotic interfaces such as speech. Voice is about words so a spoken interface would be the most “natural”, right? Science fiction movies love to show computers which respond to voice commands and sometimes this is a useful situation. If you are driving a car and want to change the radio station that could be valuable. This is mostly because your hands and feet and eyes are engaged in the act of getting where you want to go while avoiding crashing into anything. When you are at your workstation, however, you have a different context and mental focus.
There is an iconic scene in the great Blade Runner where Harrison Ford’s character Rick Deckard manipulates a digital image by talking to the computer:
“Enhance 224 to 126” … “Enhance” … “Stop” … “Move in” … “Stop” … “Pull out, track right” … “Stop” … “Center and pull back” … “ah.. stop” … “Track 45 right” … “Stop” … “Center and step” … “Enhance 34 to 36” … “Pan right and go back” … “Stop” … “Enhance 34 to 46” … “Pull back” … “Wait a minute, go right” … “Stop” … “Enhance 5719” … “Turn 45 left” … “Stop” … “Enhance 15 to 23” … “Give me a hard copy right there”.
Would it not be quicker and easier to move around with a manual interface, such as a mouse? It is an exaggeration to say that all human thought is with inner voices, but conscious thought is a combination of words, images and sounds. Having to verbalize commands blocks thinking time for the duration of the verbalization in a way that manipulation through the user’s body does not. For example, talking does not interfere with walking, driving or skiing. Similarly, using verbalization to walk, drive or ski would be much less effective than manual control: “Left 10º, then forward 5 steps, no wait, stop!” Speech can seem easy since it’s “natural”. The example above shows that it’s not all that natural if you want to interact with a computer, which needs relatively precise commands. The final command “Give me a hard copy right there” would be very easy for the software to either not understand if the rules were tight, or would be prone to commands being issued
Sound – Voice Alerts
Computers communicating with the user through speech can be intrusive or useful depending on the situation. In cases where the user cannot see, or does not want to see, a screen it’s a great solution. In such cases immediate spoken feedback is useful. However, spoken feedback can also be a way to help the user focus, since the user knows that important news will be announced.
I built LiSA (the Liquid Information Speaking Assistant, the lower case “i” used since the Liquid Information company’s logo featured a capital “L” and a lower-case “i”) as an assistant who would speak in a real human voice when the user received emails from anyone the user deemed important, and also announce phone calls. This lets the user focus on work, without being as tempted to check email frequently, or being prompted by a “ping” every time a message arrives.
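The filtering idea behind such an assistant can be sketched as below. This is not LiSA’s actual code, which is not shown here; it is a minimal illustration assuming that messages arrive as simple sender/subject pairs and that the user keeps a list of important senders. The names Message and announcements are hypothetical, and the sketch only produces the text to be spoken, leaving the speech output itself aside.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Message:
    sender: str   # e-mail address of the sender (assumed message shape)
    subject: str


def announcements(messages: Iterable[Message], important_senders: Iterable[str]) -> List[str]:
    """Return spoken-style announcements only for senders the user marked as important."""
    important = {address.lower() for address in important_senders}
    return [
        f"New email from {message.sender}: {message.subject}"
        for message in messages
        if message.sender.lower() in important
    ]


inbox = [
    Message("doug@example.org", "Notes on augmentation"),
    Message("newsletter@example.com", "Weekly offers"),
]
# Only the message from the important sender is announced; the rest stay silent.
for line in announcements(inbox, ["Doug@example.org"]):
    print(line)
```

The design choice worth noting is that everything not on the list produces no sound at all, which is what lets the user stay focused instead of reacting to every arrival.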