Main objectives & driving scenarios
Owing to the rapid development of computer graphics and audio-visual display technologies over the past two decades, numerous sophisticated systems such as the CAVE™ [Cruz-Neira93] now exist that can immerse a person in a highly realistic virtual environment. Unfortunately, nearly all of these systems treat the user merely as a passive observer. Many applications and systems also use haptic interfaces to improve human-computer interaction, yet truly immersive setups remain rare: the quality of the haptic feedback is limited and does not increase the sense of presence. Even elementary tasks such as grasping and manipulating a simple object by hand are beyond the capabilities of existing systems.
The main objective of this Integrated Project is to fundamentally change this very restrictive situation. While aiming at full multimodal feedback of all relevant information, the IMMERSENCE project will focus on hand-based (i.e. manual) tasks when dealing with interaction. Three threads will be built around prototypical systems, which will serve as driving scenarios and testbeds covering the full spectrum of problems that arise when implementing interactive multimodal virtual or augmented reality environments:
The first thread, person-to-object (P2O), is concerned with the most basic form of interaction: the handling of an object by a human. A novel, data-driven algorithmic paradigm, in which the data are recorded from real objects and situations (comparable to a haptic movie sequence), will be investigated for generating sensory feedback. Combined with classical model-driven approaches, this will enable the understanding, modeling, and high-fidelity synthesis of multimodal P2O interactions in virtual environments (VE), so that a person can, for example, realistically manipulate a complex deformable object in an advanced multimodal virtual or augmented reality environment. This scenario relies on the passivity of the “interaction partner” (the object), which only reacts in a physically predictable manner to the actions of the user.
This constraint will be relaxed in the second thread, person-to-person (P2P), which is concerned with two people engaged in multimodal interaction that involves personal contact (such as handshaking). Note that the dividing line between these two types of scenario is not sharp. A simplified palpation scenario, for example, may be regarded as handling a deformable object (P2O), whereas fully accounting for the patient’s reactions to the palpation turns it into a typical P2P problem.
Finally, the third thread, person-object-person (POP), is concerned with collaborative, multimodal interaction between two persons mediated by an object, for example jointly inserting a heavy component during an assembly process.
In principle, all three threads will be investigated according to the same basic methodology, which allows the underlying scientific theories, technological components and even integrated subsystems to be fully shared, and experience and results to be carried over between them. From the technological point of view, the major components for implementing such systems would in principle be modeling, rendering and display.
This corresponds exactly to the classical approach followed in computer graphics. In recent years, however, data-driven methods such as image-based rendering have become increasingly popular in visualization because of the high-fidelity rendering they can potentially offer.
The IMMERSENCE project will systematically explore the potential of such approaches for the creation of immersive, multimodal environments. To implement them, the above processing chain will have to be replaced by a new procedure that can be described as Recording, followed by either a pure Replay or the interpolation or extrapolation of perceptual feedback for situations that have never been observed before (i.e. to Create completely new scenarios based on the collected data).
During the first step, Recording, all relevant data characterizing multimodal interaction are collected by multimodal sensing suites (some already available, others still to be developed). The recorded data will then be processed to convert the raw signals into abstract descriptors, for example based on semantics (a multimodal language), parameter identification of object or person properties, or database-type storage. Such abstraction will facilitate the understanding and quantitative characterization of the underlying multimodal interaction. The approaches in this phase are interdisciplinary, ranging from neuroscience through psychophysics and computer science to control theory.
This abstraction will make it possible to faithfully Replay the interactive session using the collected data, while replacing the real counterpart by a virtual (simulated) object or a multimodal, haptically enhanced avatar, respectively. Finally, more classical modeling methods will be integrated into the interpolation-based prediction scheme to ensure robust performance when sensory feedback must be created beyond the limits within which extrapolation is applicable. We will also explore how the experience gained can be carried over, and how the developed technology can be used, in the context of remote and mixed environments.
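The chain of Recording, Replay and Create with a model-based fallback can be illustrated by a deliberately small sketch. It is not the project's method. It assumes a hypothetical one-dimensional case in which the recording consists of (displacement, force) pairs measured on a real object, and it uses a linear spring fitted to the recording as a stand-in for the "more classical" model. The names `fit_stiffness` and `make_renderer` are invented for illustration.

```python
from bisect import bisect_left


def fit_stiffness(samples):
    """Least-squares stiffness k of the assumed fallback model F = k * x.

    Assumes at least one sample has a non-zero displacement.
    """
    num = sum(x * f for x, f in samples)
    den = sum(x * x for x, _ in samples)
    return num / den


def make_renderer(samples):
    """Build a force-feedback function from recorded (displacement, force) pairs."""
    samples = sorted(samples)
    xs = [x for x, _ in samples]
    fs = [f for _, f in samples]
    k = fit_stiffness(samples)

    def force(x):
        # Outside the recorded range: fall back to the model-driven estimate.
        if x < xs[0] or x > xs[-1]:
            return k * x
        i = bisect_left(xs, x)
        # Exactly recorded situation: pure replay of the stored value.
        if xs[i] == x:
            return fs[i]
        # Unobserved situation inside the range: interpolate between neighbours.
        x0, x1 = xs[i - 1], xs[i]
        t = (x - x0) / (x1 - x0)
        return fs[i - 1] + t * (fs[i] - fs[i - 1])

    return force


if __name__ == "__main__":
    # Hypothetical recording from a stiffening (non-linear) object.
    render = make_renderer([(0.0, 0.0), (1.0, 1.0), (2.0, 4.0)])
    print(render(1.0))  # replayed sample
    print(render(1.5))  # interpolated between two recordings
    print(render(3.0))  # beyond the recording: spring-model fallback
```

In this sketch the data-driven path is preferred wherever recordings exist, and the model is only consulted outside the recorded range. A real system would have to handle multidimensional, multimodal data and a smoother transition between the two regimes.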
These investigations will be made possible by relying on the substantial existing resources of the partners and by fully exploiting synergies with established large-scale national research initiatives. In an evaluation work package, systematic comparisons with the real situation will be made, relying on a novel theory of presence measurement that combines available presence measurement methods, engineering approaches, and new presence metrics. The related issues and problems will be addressed in full depth and breadth in order to reach the main objectives of the project: to significantly enhance our current psychophysical and neurological understanding of multimodal interaction; to develop the necessary novel technological components and solutions for sensing, modeling and display; and to investigate how complex systems integrating those components can lead to improved immersion and presence in virtual and augmented reality environments.