
Complementary Activity: Virtual and Augmented Reality with Tangible Interfaces and Surface Based Interactions

Interaction Models and Imaging: This activity is dedicated to surface-based interactions, an important paradigm that captures how humans navigate complex natural environments. While humans see, recognize, and interact with objects around them, most of the visual cues and feedback can be associated with specific surfaces. We will thus discuss some surface-oriented interaction models that employ surface views for image extraction, analysis, and subsequent localization based on embedded digital codes. This approach allows for direct recognition of, and interaction with, physical objects based on surface coordinates and object identifiers derived from digitally encoded surfaces.

Carpet Encoding Approaches: Humans are capable of navigating their surroundings based on human-oriented cues and information embedded in the environment. Such cues and information, however, tend to be difficult for computers, automated systems, and robots to extract and analyze. We will thus consider different approaches for enhancing physical surfaces with specialized carpet codes that encapsulate identification and localization information suitable for digital processing. Information from such digitally encoded surfaces can be reliably extracted by a simple video camera connected to a computer or by a camera embedded in a smartphone or another portable device.
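To make the idea of a surface code that carries both identification and localization information more concrete, here is a minimal sketch in Python. It is not the carpet code used in this activity: the 5x5 tile layout, the 8-bit fields, the parity bit, and the names `encode_tile` and `decode_tile` are all assumptions chosen for illustration. Each tile stores an object identifier together with the tile's own column and row on the surface, so reading any single tile gives both "what" and "where".

```python
from typing import List, Tuple

# Hypothetical tile layout (an assumption, not the actual carpet code):
# 8 bits object id + 8 bits column + 8 bits row + 1 even-parity bit,
# stored row-major in a 5x5 grid of black (1) / white (0) cells.
FIELD_BITS = 8
GRID = 5


def encode_tile(object_id: int, col: int, row: int) -> List[List[int]]:
    """Pack an object id and tile coordinates into a 5x5 bit grid."""
    for value in (object_id, col, row):
        if not 0 <= value < 2 ** FIELD_BITS:
            raise ValueError("each field must fit in 8 bits")
    payload = (object_id << 16) | (col << 8) | row
    bits = [(payload >> i) & 1 for i in range(23, -1, -1)]
    bits.append(sum(bits) % 2)  # even parity over the 24 payload bits
    return [bits[r * GRID:(r + 1) * GRID] for r in range(GRID)]


def decode_tile(tile: List[List[int]]) -> Tuple[int, int, int]:
    """Recover (object_id, col, row) from a tile, checking the parity bit."""
    bits = [b for tile_row in tile for b in tile_row]
    if len(bits) != GRID * GRID:
        raise ValueError("tile must be 5x5")
    if sum(bits[:24]) % 2 != bits[24]:
        raise ValueError("parity check failed")
    payload = 0
    for b in bits[:24]:
        payload = (payload << 1) | b
    return (payload >> 16) & 0xFF, (payload >> 8) & 0xFF, payload & 0xFF


if __name__ == "__main__":
    tile = encode_tile(object_id=7, col=12, row=3)
    print(decode_tile(tile))
```

A real surface code would also need orientation markers and stronger error correction than a single parity bit, and a camera pipeline to turn a surface view into the bit grid. Those parts are omitted here.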

Interfaces and Applications: We introduce some techniques for physically embedding digital codes into surfaces, and present some of their applications. With respect to the digital surface encodings employed, we discuss various options for optical input, ranging from specialized code readers that we have developed, to standard, stand-alone, and embedded cameras. We will give examples of digital enhancements of product labels and illustrate possible interactions for information discovery and referencing. We will also touch on the Flexible Scanning Tablet (FST) for testing with electromagnetic acoustic transducers, and on the experimental parking simulation and guidance system that we have developed.

Multi-resolution Codes: Signs and billboards around us are meant to be seen from certain distances: if we are too close we miss the big picture, and if we are too far we miss the small details. When it comes to machine vision, however, the meaning of "too close" and "too far" varies with the parameters of the employed optical system, which are not known in advance. In this context we introduce and discuss various methods and approaches for multi-resolution encoding, which allows reliable access to embedded information irrespective of the distance and the optical system employed. This approach is particularly suitable for digitally encoded information designed for computerized image processing and robot vision. It can also be employed for creating human-oriented emergency signs and announcements with enhanced visibility, e.g. under severe conditions and for people with reduced vision.
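One simple way to picture multi-resolution encoding is a two-level pattern in which large cells are readable from far away and each large cell also contains a small pattern readable from close up. The Python sketch below is an illustrative assumption only, not the encoding discussed in this activity: the border width `BORDER`, the grid sizes, and the function names `render`, `read_far`, and `read_near` are all hypothetical. Each coarse cell is filled with its coarse bit, and a copy of the fine pattern is placed in the cell's center. The border must cover more than half of the cell so that a blurred (majority-vote) reading still recovers the coarse bit.

```python
from typing import List

Grid = List[List[int]]
BORDER = 2  # border width in fine cells (assumed value)


def render(coarse: Grid, fine: Grid) -> Grid:
    """Draw a square coarse grid whose cells each embed the fine pattern."""
    n = len(fine)
    cell = n + 2 * BORDER
    size = len(coarse) * cell
    image = [[0] * size for _ in range(size)]
    for ci, coarse_row in enumerate(coarse):
        for cj, coarse_bit in enumerate(coarse_row):
            for a in range(cell):
                for b in range(cell):
                    inside = (BORDER <= a < BORDER + n
                              and BORDER <= b < BORDER + n)
                    value = fine[a - BORDER][b - BORDER] if inside else coarse_bit
                    image[ci * cell + a][cj * cell + b] = value
    return image


def read_far(image: Grid, cells: int) -> Grid:
    """Simulate a distant camera: majority vote over each coarse cell."""
    cell = len(image) // cells
    result = []
    for ci in range(cells):
        row = []
        for cj in range(cells):
            ones = sum(image[ci * cell + a][cj * cell + b]
                       for a in range(cell) for b in range(cell))
            row.append(1 if 2 * ones > cell * cell else 0)
        result.append(row)
    return result


def read_near(image: Grid, cells: int, ci: int, cj: int) -> Grid:
    """Simulate a close camera: crop the fine pattern from one coarse cell."""
    cell = len(image) // cells
    n = cell - 2 * BORDER
    top, left = ci * cell + BORDER, cj * cell + BORDER
    return [image[top + a][left:left + n] for a in range(n)]


if __name__ == "__main__":
    coarse = [[1, 0], [0, 1]]
    fine = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
    image = render(coarse, fine)
    print(read_far(image, 2) == coarse, read_near(image, 2, 0, 1) == fine)
```

With a 3x3 fine pattern and a border of 2, each cell has 49 pixels of which 40 belong to the border, so the majority vote is never overturned by the embedded fine pattern.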

Possible Projects

A picture dictionary that illustrates meaning through multimedia, e.g. scanning a picture could i) play a sound file and/or ii) show a video
A Sign Language dictionary that illustrates the signing process through multimedia and/or CG animation, e.g. scanning a picture (key frame) could i) play a video from the corresponding frame to illustrate the sign and/or ii) execute a script to animate an avatar showing the corresponding sign
An augmented tangible-interface dictionary/manual that employs physical objects for interactions and access to (online) content, e.g. taking a picture of a physical object could i) play an explanatory sound file and show an illustrative video of the object and/or ii) visualize CG and VR models to clarify the meaning and employ AR to demonstrate the object's use and functionality

Questions to consider

How do you use the pictures/frames from the dictionaries for direct recognition and linking?
How does the recognition rate change as the number of registered pictures/frames increases?
How does picture/frame-based linking compare to barcodes, surface codes, and multi-resolution codes?
How will the above approaches work with a screen instead of printed materials and physical objects?
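
As a starting point for the questions above, picture/frame-based linking can be sketched as a lookup from a compact image fingerprint to an action. The Python example below uses an average hash with a Hamming-distance threshold. This is one common technique chosen here as an assumption, not a method prescribed by this activity, and the names `average_hash`, `lookup`, `max_distance`, and the action strings are hypothetical. Images are assumed to be already converted to small grayscale grids.

```python
from typing import Dict, List, Optional

Image = List[List[int]]  # grayscale values 0..255, already downscaled


def average_hash(image: Image) -> int:
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    fingerprint = 0
    for p in pixels:
        fingerprint = (fingerprint << 1) | (1 if p > mean else 0)
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def lookup(query: Image, registry: Dict[int, str],
           max_distance: int = 1) -> Optional[str]:
    """Return the action of the closest registered picture, or None."""
    if not registry:
        return None
    q = average_hash(query)
    best_hash = min(registry, key=lambda h: hamming(q, h))
    if hamming(q, best_hash) > max_distance:
        return None
    return registry[best_hash]


if __name__ == "__main__":
    cat = [[10, 200], [220, 30]]
    dog = [[200, 10], [30, 220]]
    registry = {
        average_hash(cat): "play cat.wav",   # hypothetical action strings
        average_hash(dog): "play dog.wav",
    }
    noisy_cat = [[25, 190], [210, 40]]       # same picture, different lighting
    print(lookup(noisy_cat, registry))
    print(lookup([[200, 200], [10, 10]], registry))
```

As more pictures are registered, their fingerprints crowd closer together, so a fixed threshold produces more false matches. This is one reason to compare the approach with explicit codes, whose identifiers do not depend on image content.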