Wednesday, 8 March 2017

Automated capture and delivery of assistive task guidance with an eyewear computer: The GlaciAR system

An approach that allows both automatic capture and delivery of mixed reality guidance, running fully onboard Google Glass.

From the paper:

Teesid Leelasawassuk, Dima Damen, Walterio Mayol-Cuevas, Automated capture and delivery of assistive task guidance with an eyewear computer: The GlaciAR system. Augmented Human 2017.

In this paper we describe and evaluate an assistive mixed reality system that aims to augment users in tasks by combining automated and unsupervised information collection with minimally invasive video guides. The result is a fully self-contained system that we call GlaciAR (Glass-enabled Contextual Interactions for Augmented Reality). It operates by extracting contextual interactions from observing users performing actions. GlaciAR is able to i) automatically determine moments of relevance based on a head motion attention model, ii) automatically produce video guidance information, iii) trigger these guides based on an object detection method, iv) learn without supervision from observing multiple users and v) operate fully on-board a current eyewear computer (Google Glass). We describe the components of GlaciAR together with user evaluations on three tasks. We see this work as a first step toward scaling up the notoriously difficult authoring problem in guidance systems and an exploration of enhancing user natural abilities via minimally invasive visual cues.
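
The head-motion attention model behind point i) can be pictured with a small sketch. This is a hypothetical illustration, not GlaciAR's actual implementation: it flags a "moment of relevance" whenever the gyroscope magnitude stays below a stillness threshold for a minimum run of samples; the threshold and window length here are invented.

```python
# Hypothetical sketch of a head-motion attention model: flag "moments
# of relevance" when the gyroscope magnitude stays below a stillness
# threshold for a minimum number of samples. The threshold and window
# length are illustrative, not the values used by GlaciAR.

def attention_moments(gyro_mags, still_thresh=0.2, min_len=5):
    """Return (start, end) index spans where the head is near-still."""
    spans, start = [], None
    for i, m in enumerate(gyro_mags):
        if m < still_thresh:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                spans.append((start, i))
            start = None
    if start is not None and len(gyro_mags) - start >= min_len:
        spans.append((start, len(gyro_mags)))
    return spans
```

Sustained near-still spans would then trigger capture or playback of a video guide.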

Wednesday, 4 November 2015

Investigating Spatial Guidance for a Cooperative Handheld Robot

How should we provide feedback to guide users of a handheld robotic device performing a spatial exploration task? We are investigating this via various feedback methods for communicating a 5-degree-of-freedom target to the user. We compare against a non-robotic handheld wand and use several display alternatives, including a stereoscopic VR display (Oculus Rift), a monocular see-through AR display and a 2D screen, as well as simple robot arm gesturing. Our results indicate a significant improvement when using the handheld robot, plus some interesting conclusions on the effect of various popular display technologies. More to come in a follow-up publication. Watch this space.

Improving MAV Control by Predicting Aerodynamic Effects of Obstacles

Building on our previous work, we demonstrate how it is possible to improve flight control of a MAV that experiences aerodynamic disturbances caused by objects on its path. Predictions based on low-resolution depth images taken at a distance are incorporated into the flight control loop on the throttle channel, which is adjusted to target undisrupted level flight. We demonstrate that a statistically significant improvement (p < 0.001) is possible for some common obstacles, such as boxes and steps, compared to using conventional feedback-only control. Our approach and results are encouraging steps toward more autonomous MAV exploration strategies.
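
One way to picture folding the prediction into the throttle channel is as a feedforward term alongside conventional feedback. This is a minimal sketch under assumed gains and sign conventions, not the paper's actual control law; `predicted_accel` stands in for the learned disturbance prediction.

```python
# Illustrative sketch of combining altitude feedback with a feedforward
# term that cancels a predicted aerodynamic disturbance. The gains and
# hover setpoint are placeholder values, not the paper's control law.

def throttle_command(alt_error, predicted_accel, kp=0.5, k_ff=0.3, hover=0.6):
    """Return a throttle value targeting undisrupted level flight."""
    feedback = kp * alt_error               # conventional feedback term
    feedforward = -k_ff * predicted_accel   # cancel anticipated disturbance
    return hover + feedback + feedforward
```

With a predicted upward push (e.g. ground effect over a box), the feedforward term lowers the throttle before the disturbance is felt, rather than waiting for feedback to react.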

Beautiful vs Useful Maps

We are developing methods for compressing maps more intelligently than by looking at geometry alone. Specifically, we use relocalisation performance as a more useful metric. As the slide above shows, our methods are much better at relocalisation than traditional compression approaches, and some of them also preserve geometry well.
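
The idea of compressing by relocalisation utility rather than geometry can be sketched as follows. This is a hypothetical stand-in for the actual criteria: landmarks are ranked by how often they were matched during past relocalisations, and the smallest prefix covering a target share of matches is kept.

```python
# Hypothetical sketch of relocalisation-driven map compression: rank
# landmarks by past match counts and keep the smallest prefix that
# covers a target fraction of all matches. The scoring is a stand-in
# for the selection criteria used in the actual work.

def compress_map(match_counts, keep_fraction=0.9):
    """Return indices of landmarks to keep, most useful first."""
    order = sorted(range(len(match_counts)),
                   key=lambda i: match_counts[i], reverse=True)
    total = sum(match_counts)
    kept, acc = [], 0
    for i in order:
        if total and acc / total >= keep_fraction:
            break
        kept.append(i)
        acc += match_counts[i]
    return kept
```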

Tuesday, 1 September 2015

Estimating Visual Attention from a Head Mounted IMU

How can we estimate attention on an eyewear computer without a gaze tracker? We developed this method for Google Glass.
This paper concerns the evaluation of methods for estimating both temporal and spatial visual attention using a head-worn inertial measurement unit (IMU). Aimed at tasks involving wearer-object interaction, we estimate both when and where the wearer's interest lies. We evaluate various methods on a new egocentric dataset from 8 volunteers and compare our results with those achievable with a commercial gaze tracker used as ground truth. Our approach is primarily geared toward sensor-minimal EyeWear computing.
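
A toy version of the temporal-plus-spatial idea can be sketched in a few lines. This is an invented illustration, not the paper's model: temporal attention comes from gyro stillness, and spatial attention assumes the gaze sits at a fixed pixel offset below the optical axis during still periods; all constants are made up.

```python
# Illustrative sketch, under the invented assumption that during
# near-still periods the wearer's gaze sits at a fixed offset below the
# camera's optical axis. Temporal attention: gyro stillness. Spatial
# attention: a fixed pixel offset. All constants are placeholders.

def attention_point(gyro_mag, img_w=640, img_h=480,
                    still_thresh=0.15, pitch_offset_px=60):
    """Return the (x, y) attended pixel, or None while the head moves."""
    if gyro_mag >= still_thresh:
        return None
    return (img_w // 2, img_h // 2 + pitch_offset_px)
```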

Wednesday, 6 May 2015

Accurate Photometric and Geometric Error Minimisation with Inverse Depth & What to landmark?

Two papers related to RGB-D mapping from a nice collaboration with Daniel Gutierrez and Josechu Guerrero from the University of Zaragoza. Both to be presented at ICRA 2015; one of them nominated for an award:
  •  D. Gutiérrez-Gómez, W. Mayol-Cuevas, J.J. Guerrero. "Inverse Depth for Accurate Photometric and Geometric Error Minimisation in RGB-D Dense Visual Odometry", In IEEE International Conference on Robotics and Automation (ICRA), 2015. Nominated for Best Robotic Vision Paper Award. [pdf][video][code available] 

  •  D. Gutiérrez-Gómez, W. Mayol-Cuevas, J.J. Guerrero. "What Should I Landmark? Entropy of Normals in Depth Juts for Place Recognition in Changing Environments Using RGB-D Data", In IEEE International Conference on Robotics and Automation (ICRA), 2015.[pdf] 

Monday, 1 September 2014

Cognitive Handheld Robots

We have been working on prototypes for 2.5 years (and on the concept since 2006!) of what we think is a new, extended type of robot. Handheld robots have the shape of tools and are intended to have cognition and act while cooperating with people. This video is from our first prototype, back in November 2013. We are also offering details of its construction and 3D CAD models. We are currently developing a new prototype; more on this soon. Austin Gregg-Smith is sponsored by the James Dyson Foundation.

  • Austin Gregg-Smith and Walterio Mayol. The Design and Evaluation of a Cooperative Handheld Robot. IEEE  International Conference on Robotics and Automation (ICRA). Seattle, Washington, USA. May 25th-30th, 2015. [PDF] Nominated for Best Cognitive Robotics Paper Award.

Saturday, 9 August 2014

Recognition and Reconstruction of Transparent Objects for Augmented Reality

Dealing with real transparent objects for AR is challenging due to their lack of texture and visual features as well as the drastic changes in appearance as the background, illumination and camera pose change. In this work, we explore the use of a learning approach for classifying transparent objects from multiple images with the aim of both discovering such objects and building a 3D reconstruction to support convincing augmentations. We extract, classify and group small image patches using a fast graph-based segmentation and employ a probabilistic formulation for aggregating spatially consistent glass regions. We demonstrate our approach via analysis of the performance of glass region detection and example 3D reconstructions that allow virtual objects to interact with them.
From our paper: Alan Francisco Torres-Gomez, Walterio Mayol-Cuevas, Recognition and Reconstruction of Transparent Objects for Augmented Reality. ISMAR 2014. PDF available here.
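
The aggregation step can be pictured with a small sketch. This is a hypothetical simplification: per-patch glass probabilities are assumed given (the real system classifies image patches and groups them with graph-based segmentation), and a region is accepted as glass if its mean probability clears a confidence threshold.

```python
# Illustrative sketch of aggregating per-patch glass probabilities into
# region-level decisions. The grouping is assumed given here; the real
# system uses graph-based segmentation and a probabilistic formulation.

def glass_regions(groups, thresh=0.6):
    """groups: list of lists of per-patch glass probabilities.
    Return indices of groups accepted as spatially consistent glass."""
    return [i for i, g in enumerate(groups)
            if g and sum(g) / len(g) >= thresh]
```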

Friday, 1 August 2014

Discovering Task Relevant Objects, their Usage and Providing Video Guides from Multi-User Egocentric Video

Using a wearable gaze tracking setup, we have developed a fully unsupervised method for the discovery of task relevant objects (TRO) and how these objects have been used. A TRO is an object, or part of an object that a person interacts with during a task. We combine visual appearance, motion and user attention to discover TROs. Both static objects such as a coffee machine as well as movable ones such as a cup are discovered. We introduce the term Mode of Interaction to refer to the different ways in which TROs are used. Say, a cup can be lifted, washed, or poured into. When harvesting interactions with the same object from multiple operators, common modes of interaction can be found. We also developed an online fully unsupervised prototype for automatically extracting video guides of how the objects are used. The method automatically selects suitable video segments that indicate how others have used that object before.
  • Damen, Dima and Leelasawassuk, Teesid and Haines, Osian and Calway, Andrew and Mayol-Cuevas, Walterio (2014). You-Do, I-Learn: Discovering Task Relevant Objects and their Modes of Interaction from Multi-User Egocentric Video. British Machine Vision Conference (BMVC), Nottingham, UK. [pdf]
  • Damen, Dima and Haines, Osian and Leelasawassuk, Teesid and Calway, Andrew and Mayol-Cuevas, Walterio (2014). Multi-user egocentric Online System for Unsupervised Assistance on Object Usage. ECCV Workshop on Assistive Computer Vision and Robotics (ACVR), Zurich, Switzerland. [preprint]  
We also have released the dataset available on this project webpage.
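
The cue combination for discovering TROs can be sketched as a weighted score. This is an invented illustration, not the method in the papers: each candidate region carries normalised attention, motion and appearance-consistency scores, and the weights and threshold are placeholders.

```python
# Hypothetical sketch of scoring candidate regions as task-relevant
# objects (TROs) by combining normalised attention, motion and
# appearance cues. Weights and threshold are illustrative only.

def discover_tros(candidates, w_att=0.5, w_mot=0.3, w_app=0.2, thresh=0.5):
    """candidates: list of dicts with 'attention', 'motion' and
    'appearance' scores in [0, 1]. Return indices judged task-relevant."""
    tros = []
    for i, c in enumerate(candidates):
        score = (w_att * c["attention"] + w_mot * c["motion"]
                 + w_app * c["appearance"])
        if score >= thresh:
            tros.append(i)
    return tros
```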

Wednesday, 23 April 2014

Learning to Predict Obstacle Aerodynamics from Depth Images for Micro Air Vehicles

This work develops a method to anticipate the aerodynamic ground effects a Micro Air Vehicle will have when going above different obstacles. This works by learning a mapping from depth images to the acceleration experienced from flying above a variety of objects. Computing full 3D aerodynamic effects using air flow simulation onboard a MAV is currently unfeasible. This work uses the alternative approach of learning the visual appearance of objects that produce a given effect which therefore turns the problem into one of regression rather than raw computation.
With the ease with which 3D maps can now be constructed, this work in a way aims to enrich maps with information beyond the purely geometric. We have also closed the control loop so we correct for the anticipated deviation, but that is for another paper.
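
The regression framing can be sketched very simply. This is a minimal stand-in for the paper's learner: each depth image is summarised with crude features and the disturbance is predicted by nearest-neighbour lookup over training pairs; the feature choice and learner are assumptions.

```python
# Minimal sketch (not the paper's learner) of regressing the expected
# disturbance from a depth image: summarise the image with two coarse
# features and predict with a 1-nearest-neighbour lookup over
# (features, acceleration) training pairs.

def depth_features(depth_img):
    """Mean and minimum depth as a crude 2-feature obstacle summary."""
    flat = [d for row in depth_img for d in row]
    return (sum(flat) / len(flat), min(flat))

def predict_accel(depth_img, training):
    """training: list of (features, accel) pairs. 1-NN lookup."""
    f = depth_features(depth_img)
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    return min(training, key=lambda t: dist(t[0], f))[1]
```

The point is that "what will the air do here?" becomes a regression over visual appearance rather than an onboard fluid simulation.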

  • John Bartholomew, Andrew Calway and Walterio Mayol-Cuevas, Learning to Predict Obstacle Aerodynamics from Depth Images for Micro Air Vehicles, IEEE ICRA 2014. [PDF]

Monday, 1 July 2013

Real-time 3D simultaneous localization and map-building for a dynamic walking humanoid robot

 Image from our Advanced Robotics Journal paper

This work is with our Samsung Electronics colleagues, where our real-time SLAM system has been used to provide Samsung's RoboRay with mapping capability. RoboRay is one of the most advanced humanoid robots and uses dynamic walking, which imposes higher demands on the SLAM system as there is more agility and motion overall. More details in the paper:

Real-time 3D simultaneous localization and map-building for a dynamic walking humanoid robot
S Yoon, S Hyung, M Lee, KS Roh, SH Ahn, A Gee, P Bunnun, A Calway, & WW Mayol-Cuevas,
Advanced Robotics, published online on May 1st, 2013.

Friday, 28 June 2013

Real-Time Continuous 6D Relocalisation for Depth Cameras

In this work we present our fast (50 Hz) relocalisation method, based on simple visual descriptors plus a 3D geometric test, for a system performing 6D visual relocalisation at every single frame and in real time. Continuous relocalisation is useful when re-exploring scenes or for loop closure. Our experiments suggest the feasibility of this novel approach, which benefits from depth camera data, with a relocalisation performance of 73% while running on a single core onboard a moving platform over trajectory segments of about 120 m. The system also reduces the memory footprint by 95% compared to a system using conventional SIFT-like descriptors.
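
The 3D geometric test can be pictured with a sketch. This is a hypothetical illustration of the idea rather than the paper's implementation: after descriptor matching, a candidate frame is accepted only if pairwise 3D distances between matched points agree between the frame and the map (a rigidity check); the tolerance is invented.

```python
# Illustrative sketch of a 3D geometric consistency test applied after
# descriptor matching: matched point sets must have agreeing pairwise
# 3D distances (rigidity). The tolerance is a placeholder choice.

def geometric_test(frame_pts, map_pts, tol=0.05):
    """Check pairwise-distance agreement between matched 3D point sets."""
    def d(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    for i in range(len(frame_pts)):
        for j in range(i + 1, len(frame_pts)):
            if abs(d(frame_pts[i], frame_pts[j])
                   - d(map_pts[i], map_pts[j])) > tol:
                return False
    return True
```

Because the test uses only relative distances, it is invariant to the unknown 6D pose being solved for, which is what makes it usable as a cheap match filter.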

  • J. Martinez-Carranza, Walterio Mayol-Cuevas. Real-Time Continuous 6D Relocalisation for Depth Cameras. Workshop on Multi VIew Geometry in RObotics (MVIGRO), in conjunction with Robotics Science and Systems RSS. Berlin, Germany. June, 2013. PDF 
  • J. Martinez Carranza, A. Calway, W. Mayol-Cuevas, Enhancing 6D visual relocalisation with depth cameras. International Conference on Intelligent Robots and Systems IROS. November 2013.

Wednesday, 1 May 2013

Mapping and Auto Retrieval for Micro Air Vehicles (MAV) using RGBD Visual Odometry

These videos show work we have been doing with our partners at Blue Bear on onboard visual mapping for MAVs. The system is based on visual odometry mapping for working over large areas, and builds maps onboard the MAV using an Asus Xtion Pro RGB-D camera mounted on the vehicle. One of the videos shows auto-retrieval of the vehicle: a human pilot first flies the vehicle through the space, and the map built on the way is then used for relocalisation on the way back. The other video is from a nuclear reactor installation. This is work we have been doing for a while on the use of our methods for industrial inspection.

Monday, 25 March 2013

March 2013: New Robotics MSc. I redesigned our joint MSc in Robotics, which is now aimed at supporting students from various backgrounds in Engineering, Physics and Maths. I am looking forward to supervising MSc projects here. Have a look at the programme here. The application deadline is August 31st.

Thursday, 1 November 2012

Integrating 3D Object Detection, Modelling and Tracking on a Mobile Phone
Click image for video
This is from a paper at ISMAR 2012 where we combine three things on a mobile phone: 1) in-situ interactive object modelling, 2) real-time tracking based on the wireframe model of the object, and 3) re-detection of a lost object using our textureless object detector. It works on a Nokia N900, but look out for another post about the Android app.
  • Pished Bunnun, Dima Damen, Andrew Calway, Walterio Mayol-Cuevas, Integrating 3D Object Detection, Modelling and Tracking on a Mobile Phone. International Symposium on Mixed and Augmented Reality (ISMAR). November 2012. PDF.
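
The interplay between tracking and re-detection suggests a simple switching logic. This is a hypothetical sketch of that logic, not the paper's system: track while enough wireframe edges match; on loss, fall back to the detector until the object is found again. The state names and threshold are invented.

```python
# Hypothetical sketch of track / re-detect switching: stay in TRACKING
# while enough model edges match the image; on loss, rely on the object
# detector to recover. The edge threshold is an invented placeholder.

def step(state, matched_edges, detector_fired, min_edges=20):
    """state: 'TRACKING' or 'LOST'. Return the next state."""
    if state == "TRACKING":
        return "TRACKING" if matched_edges >= min_edges else "LOST"
    return "TRACKING" if detector_fired else "LOST"
```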

Sunday, 7 October 2012

Predicting Micro Air Vehicle Landing Behaviour from Visual Texture

How can a UAV decide where it is best to land, and what should it expect when landing on a particular material? Here we develop a framework to predict the landing behaviour of a Micro Air Vehicle (MAV) from the visual appearance of the landing surface. We approach this problem by learning a mapping from the visual texture observed by an onboard camera to the landing behaviour on a set of sample materials. Here we exemplify the framework by predicting the yaw angle of the MAV after landing.
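
A toy version of the texture-to-behaviour mapping can be sketched as follows. Everything here is invented for illustration: the surface patch is summarised by grey-level variance as a crude "texture" feature, and variance bands are mapped to an expected post-landing yaw change, standing in for the learned mapping.

```python
# Hypothetical sketch: summarise the landing surface with a grey-level
# variance feature and map variance bands to an expected post-landing
# yaw change. The bands and angles are made-up stand-ins for the
# mapping learned from sample materials.

def texture_variance(patch):
    """Grey-level variance of a 2D intensity patch."""
    flat = [p for row in patch for p in row]
    mean = sum(flat) / len(flat)
    return sum((p - mean) ** 2 for p in flat) / len(flat)

def predict_yaw_change(patch, bands=((10.0, 2.0), (100.0, 8.0))):
    """bands: (variance upper bound, expected |yaw| in degrees) in
    ascending order; rougher texture -> larger expected disturbance."""
    v = texture_variance(patch)
    for upper, yaw in bands:
        if v < upper:
            return yaw
    return 15.0
```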

Egocentric Real-time Workspace Monitoring using an RGB-D Camera

We developed an integrated system for personal workspace monitoring based around an RGB-D sensor. The approach is egocentric, facilitating full flexibility, and operates in real time, providing object detection and recognition, and 3D trajectory estimation, whilst the user undertakes tasks in the workspace. A prototype on-body system developed in the context of work-flow analysis for industrial manipulation and assembly tasks is described. We evaluated the system on two tasks with multiple users, and the results indicate that the method is effective, with good accuracy.
  • Dima Damen, Andrew Gee, Walterio Mayol-Cuevas and Andrew Calway. Egocentric Real-time Workspace Monitoring using an RGB-D Camera, IROS 2012. PDF, Video available from here.

Monday, 3 September 2012

6D RGBD relocalisation

Relocalisation is about finding where the camera is, in translation and rotation (6D), when it revisits a space after a map has been created, or if it gets lost during tracking due to occlusion. This is also known as the "kidnapped robot" problem in robotics and appears frequently in SLAM at the loop-closing stage. Here we develop a fast relocalisation method for RGB-D cameras that operates in workplaces where low texture and some occlusion can be present. Videos are available here.