Video Friday: Robot Dance Teacher, Transformer Drone, and Pneumatic Reel Actuator

The best robot videos of the week, ICRA edition

16 min read

Erico Guizzo is IEEE Spectrum's Digital Innovation Director.

Robot Dance Teacher
Image: Tohoku University

The week is almost over, and so is the 2017 IEEE International Conference on Robotics and Automation (ICRA) in Singapore. We hope you’ve been enjoying our coverage, which has featured aquatic drones, stone-stacking manipulators, and self-folding soft robots. We’ll have lots more from the conference over the next few weeks, but for you impatient types, we’re cramming Video Friday this week with a special selection of ICRA videos.

We tried to include videos from many different subareas of robotics: control, vision, locomotion, machine learning, aerial vehicles, humanoids, actuators, manipulation, and human-robot interaction. We’re posting the abstracts along with the videos, but if you have any questions about these projects, let us know and we’ll get more details from the authors.

We’ll return to normal Video Friday next week. Have a great weekend, everyone!

“Dance Teaching by a Robot: Combining Cognitive and Physical Human–Robot Interaction for Supporting the Skill Learning Process,” by Diego Felipe Paez Granados, Breno A. Yamamoto, Hiroko Kamide, Jun Kinugawa, and Kazuhiro Kosuge from Tohoku University, Federal University of Uberlandia, and Nagoya University.

This letter presents a physical human–robot interaction scenario in which a robot guides and performs the role of a teacher within a defined dance training framework. A combined cognitive and physical feedback of performance is proposed for assisting the skill learning process. Direct contact cooperation has been designed through an adaptive impedance–based controller that adjusts according to the partner’s performance in the task. In measuring performance, a scoring system has been designed using the concept of progressive teaching (PT). The system adjusts the difficulty based on the user’s number of practices and performance history. Using the proposed method and a baseline constant controller, comparative experiments have shown that the PT presents better performance in the initial stage of skill learning. An analysis of the subjects’ perception of comfort, peace of mind, and robot performance have shown significant difference at the p < .01 level, favoring the PT algorithm.

“Whole-body Aerial Manipulation by Transformable Multirotor with Two-dimensional Multilinks,” by Moju Zhao, Koji Kawasaki, Xiangyu Chen, Shintaro Noda, Kei Okada, and Masayuki Inaba from the University of Tokyo.

In this paper, we introduce the achievement of the aerial manipulation by using the whole body of a transformable aerial robot, instead of attaching an additional manipulator. The aerial robot in our work is composed by two-dimensional multilinks which enable a stable aerial transformation and can be employed as an entire gripper. We propose a planning method to find the optimized grasping form for the multilinks while they are on the air, which is based on the original planar enveloping algorithm, along with the optimization of the internal force and joint torque for the force-closure. We then propose the aerial approach and grasp motion strategy, which is devoted to the determination of the form and position of the aerial robot to approach and grasp effectively the object from the air. Finally we present the experimental results of the aerial manipulation which involves grasping, carrying and dropping different types of object. These results validate the performance of aerial grasping based on our proposed whole-body grasp planning and motion control method.

“Blade-type Crawler Vehicle With Gyro Wheel for Stably Traversing Uneven Terrain at High Speed,” by Yasuyuki Yamada, Hirotaka Sawada, Takashi Kubota, and Taro Nakamura from Chuo University and Japan Aerospace Exploration Agency (JAXA).

Unmanned rescue, observation, and/or research vehicles with high terrain adaptability, high speed, and high reliability are needed in difficult-to-reach locations. However, for most vehicles, high performance over rough terrain reduces the travel speed and/or requires complex mechanisms. We have developed a blade-type crawler robot with a very simple and reliable mechanism, which traverses uneven terrain at high speed. Moreover, the gyro wheel design stabilizes the success of this approach in improving the motion, ensuring robust traversal. The improvement in traveling speed and robustness over uneven terrain by our approach was confirmed by experiment.

“Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning,” by Yuke Zhu, Roozbeh Mottaghi, Eric Kolve, Joseph J. Lim, Abhinav Gupta, Li Fei-Fei, and Ali Farhadi from Stanford University, Allen Institute for AI, Carnegie Mellon University, University of Washington, and University of Southern California.

Two less addressed issues of deep reinforcement learning are (1) lack of generalization capability to new goals, and (2) data inefficiency, i.e., the model requires several (and often costly) episodes of trial and error to converge, which makes it impractical to be applied to real-world scenarios. In this paper, we address these two issues and apply our model to target-driven visual navigation. To address the first issue, we propose an actor-critic model whose policy is a function of the goal as well as the current state, which allows better generalization. To address the second issue, we propose the AI2-THOR framework, which provides an environment with high-quality 3D scenes and a physics engine. Our framework enables agents to take actions and interact with objects. Hence, we can collect a huge number of training samples efficiently. We show that our proposed method (1) converges faster than the state-of-the-art deep reinforcement learning methods, (2) generalizes across targets and scenes, (3) generalizes to a real robot scenario with a small amount of fine-tuning (although the model is trained in simulation), (4) is end-to-end trainable and does not need feature engineering, feature matching between frames or 3D reconstruction of the environment.

“A 3 Wire Body Weight Support System for a Large Treadmill,” by Pouya Sabetian and John M. Hollerbach from University of Utah.

A 3 DoF parallel cable driven body weight support (BWS) system has been developed for the University of Utah’s Treadport Locomotion Interface, for purposes of rehabilitation, simulation of steep slopes, and display of reduced gravity environments. The Treadport’s large belt (6 by 10 feet) requires a multi-cable support system to ensure that the unloading forces are close to vertical. This paper presents the design and experimental validation, including the system model and force control.

“Path Integral Guided Policy Search,” by Yevgen Chebotar, Mrinal Kalakrishnan, Ali Yahya, Adrian Li, Stefan Schaal, and Sergey Levine from University of Southern California, X, and Google Brain.

We present a policy search method for learning complex feedback control policies that map from high-dimensional sensory inputs to motor torques, for manipulation tasks with discontinuous contact dynamics. We build on a prior technique called guided policy search (GPS), which iteratively optimizes a set of local policies for specific instances of a task, and uses these to train a complex, high-dimensional global policy that generalizes across task instances. We extend GPS in the following ways: (1) we propose the use of a model-free local optimizer based on path integral stochastic optimal control (PI2), which enables us to learn local policies for tasks with highly discontinuous contact dynamics; and (2) we enable GPS to train on a new set of task instances in every iteration by using on-policy sampling: this increases the diversity of the instances that the policy is trained on, and is crucial for achieving good generalization. We show that these contributions enable us to learn deep neural network policies that can directly perform torque control from visual input. We validate the method on a challenging door opening task and a pick-and-place task, and we demonstrate that our approach substantially outperforms the prior LQR-based local policy optimizer on these tasks. Furthermore, we show that on-policy sampling significantly increases the generalization ability of these policies.

“Crazyswarm: A Large Nano-Quadcopter Swarm,” by James A. Preiss, Wolfgang Honig, Gaurav S. Sukhatme, and Nora Ayanian from University of Southern California.

We define a system architecture for a large swarm of miniature quadcopters flying in dense formation indoors. The large number of small vehicles motivates novel design choices for state estimation and communication. For state estimation, we develop a method to reliably track many small rigid bodies with identical motion-capture marker arrangements. Our communication infrastructure uses compressed one-way data flow and supports a large number of vehicles per radio. We achieve reliable flight with accurate tracking (< 2cm mean position error) by implementing the majority of computation onboard, including sensor fusion, control, and some trajectory planning. We provide various examples and empirically determine latency and tracking performance for swarms with up to 49 vehicles.

“Development of a Block Machine for Volleyball Attack Training,” by Kosuke Sato, Keita Watanabe, Shuichi Mizuno, Masayoshi Manabe, Hiroaki Yano, and Hiroo Iwata from University of Tsukuba and Japan Volleyball Association.

This paper presents a system that consists of three robots to imitate the motion of top volleyball blockers. In a volleyball match, in order to score by spiking, it is essential to improve the spike decision rate of each spiker. To increase the spike decision rates, iterative spiking training with actual blockers is required. Therefore, in this study, a block machine system was developed that can be continuously used in an actual practice field to improve attack practice. In order to achieve the required operating speed and mechanical strength each robot has five degrees of freedom. This robot performs high speed movement on 9 m rails that are arranged in parallel with the volleyball net. In addition, an application with a graphical user interface to enable a coach to manipulate these robots was developed. It enables the coach to control block motions and change the parameters such as the robots’ positions and operation timing. Through practical use in the practice field, the effectiveness of this system was confirmed.

“Quasi-Static and Dynamic Mismatch for Door Opening and Stair Climbing With a Legged Robot,” by T. Turner Topping, Gavin Kenneally, and D. E. Koditschek from University of Pennsylvania and Ghost Robotics.

This paper contributes to quantifying the notion of robotic fitness by developing a set of necessary conditions that determine whether a small quadruped has the ability to open a class of doors or climb a class of stairs using only quasi-static maneuvers. After verifying that several such machines from the recent robotics literature are mismatched in this sense to the common human scale environment, we present empirical workarounds for the Minitaur quadrupedal platform that enable it to leap up, force the door handle and push through the door, as well as bound up the stairs, thereby accomplishing through dynamical maneuvers otherwise (i.e., quasi-statically) unachievable tasks.

“Robust Sensor Fusion for Finding HRI Partners in a Crowd,” by Shokoofeh Pourmehr, Jack Thomas, Jake Bruce, Jens Wawerla, and Richard Vaughan from Simon Fraser University.

We present a simple probabilistic framework for multimodal sensor fusion that allows a mobile robot to reliably locate and approach the most promising interaction partner among a group of people, in an uncontrolled environment. Our demonstration integrates three complementary sensor modalities, each of which detects features of nearby people. The output is an occupancy grid approximation of a probability density function over the locations of people that are actively seeking interaction with the robot. We show empirically that simply driving towards the peak of this distribution is sufficient to allow the robot to correctly engage an interested user in a crowd of bystanders.
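
As a rough illustration of the idea in this abstract, the sketch below fuses per-sensor evidence grids cell by cell and picks the peak cell as the place to drive toward. The grid values, the multiply-then-normalize fusion rule, and the names `fuse_grids` and `peak_cell` are assumptions made for illustration; the abstract does not specify how the authors combine their three modalities.

```python
from typing import List, Tuple

Grid = List[List[float]]


def fuse_grids(grids: List[Grid]) -> Grid:
    """Multiply per-sensor evidence cell by cell, then normalize to sum to 1."""
    rows, cols = len(grids[0]), len(grids[0][0])
    fused = [[1.0] * cols for _ in range(rows)]
    for grid in grids:
        for r in range(rows):
            for c in range(cols):
                fused[r][c] *= grid[r][c]
    total = sum(sum(row) for row in fused)
    if total == 0.0:
        raise ValueError("no cell is supported by every sensor")
    return [[value / total for value in row] for row in fused]


def peak_cell(grid: Grid) -> Tuple[int, int]:
    """Return the (row, col) index of the highest-valued cell."""
    cells = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    return max(cells, key=lambda rc: grid[rc[0]][rc[1]])


# Hypothetical evidence grids from three sensors over a 2 x 3 floor area.
sensor_a = [[0.1, 0.6, 0.3], [0.2, 0.2, 0.2]]
sensor_b = [[0.2, 0.5, 0.1], [0.3, 0.1, 0.4]]
sensor_c = [[0.3, 0.4, 0.2], [0.1, 0.1, 0.5]]

fused = fuse_grids([sensor_a, sensor_b, sensor_c])
target = peak_cell(fused)  # the cell the robot would drive toward
```

Multiplying the grids rewards cells that all sensors agree on, which is one simple way to make a single noisy detector less likely to pull the robot toward a bystander.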

“Quadrotor Collision Characterization and Recovery Control,” by Gareth Dicker, Fiona Chui, and Inna Sharf from McGill University.

Collisions between quadrotor UAVs and the environment often occur, for instance, under faulty piloting, from wind gusts, or when obstacle avoidance fails. Airspace regulations are forcing drone companies to build safer drones; many quadrotor drones now incorporate propeller protection. However, propeller protected quadrotors still do not detect or react to collisions with objects such as walls, poles and cables. In this paper, we present a collision recovery pipeline which controls propeller protected quadrotors to recover from collisions. This pipeline combines concepts from impact dynamics, fuzzy logic, and aggressive quadrotor attitude control. The strategy is validated via a comprehensive Monte Carlo simulation of collisions against a wall, showing the feasibility of recovery from challenging collision scenarios. The pipeline is implemented on a custom experimental quadrotor platform, demonstrating feasibility of real-time performance and successful recovery from a range of pre-collision conditions. The ultimate goal of the research is to implement a general collision recovery solution as a safety feature for quadrotor flight controllers.

“Overlap-based ICP Tuning for Robust Localization of a Humanoid Robot,” by Simona Nobili, Raluca Scona, Marco Caravagna, and Maurice Fallon from University of Edinburgh.

State estimation techniques for humanoid robots are typically based on proprioceptive sensing and accumulate drift over time. This drift can be corrected using exteroceptive sensors such as laser scanners via a scene registration procedure. For this procedure the common assumption of high point cloud overlap is violated when the scenario and the robot’s point-of-view are not static and the sensor’s field-of-view (FOV) is limited. In this paper we focus on the localization of a robot with limited FOV in a semi-structured environment. We analyze the effect of overlap variations on registration performance and demonstrate that where overlap varies, outlier filtering needs to be tuned accordingly. We define a novel parameter which gives a measure of this overlap. In this context, we propose a strategy for robust non-incremental registration. The pre-filtering module selects planar macro-features from the input clouds, discarding clutter. Outlier filtering is automatically tuned at run-time to allow registration to a common reference in conditions of non-uniform overlap. An extensive experimental demonstration is presented which characterizes the performance of the algorithm using two humanoids: the NASA Valkyrie, in a laboratory environment, and the Boston Dynamics Atlas, during the DARPA Robotics Challenge Finals.

“Soft Sheet Actuator Generating Traveling Waves Inspired by Gastropod’s Locomotion,” by Masahiro Watanabe and Hideyuki Tsukagoshi from Tokyo Institute of Technology.

In this paper, we propose an epoch-making soft sheet actuator called “Wavy-sheet”. Inspired by gastropod’s locomotion, Wavy-sheet can generate continuous traveling waves on the whole soft body. It aims to be applied to a mobile soft mat capable of moving and transporting without damaging the object and the ground. The actuator, driven by pneumatics, is mainly composed of a couple of flexible rubber tubes and fabrics. The advantages are: i) many traveling waves can be generated by just three tubes, ii) the whole structure can adapt its own shape to the outer environment passively, and iii) only 10 mm in thickness and can generate waves with larger than 10mm in amplitude. In this paper, first, we describe the basic concept of Wavy-sheet, and then show the configuration and the principle of wave propagation. Next, fabrication methods are illustrated and the design methods are addressed. By using a prototype actuator, several experiments are conducted. Finally, we verify the effectiveness of the proposed actuator and its design methods.

“NimbRo Picking: Versatile Part Handling for Warehouse Automation,” by Max Schwarz, Anton Milan, Christian Lenz, Aura Munoz, Arul Selvam Periyasamy, Michael Schreiber, Sebastian Schuller, and Sven Behnke from University of Bonn.

Part handling in warehouse automation is challenging if a large variety of items must be accommodated and items are stored in unordered piles. To foster research in this domain, Amazon holds picking challenges. We present our system which achieved second and third place in the Amazon Picking Challenge 2016 tasks. The challenge required participants to pick a list of items from a shelf or to stow items into the shelf. Using two deep-learning approaches for object detection and semantic segmentation and one item model registration method, our system localizes the requested item. Manipulation occurs using suction on points determined heuristically or from 6D item model registration. Parametrized motion primitives are chained to generate motions. We present a full-system evaluation during the APC 2016 and component-level evaluations of the perception system on an annotated dataset.

“CoSTAR: Instructing Collaborative Robots with Behavior Trees and Vision,” by Chris Paxton, Andrew Hundt, Felix Jonathan, Kelleher Guerin, and Gregory D. Hager from Johns Hopkins University.

For collaborative robots to become useful, end users who are not robotics experts must be able to instruct them to perform a variety of tasks. With this goal in mind, we developed a system for end-user creation of robust task plans with a broad range of capabilities. CoSTAR: the Collaborative System for Task Automation and Recognition is our winning entry in the 2016 KUKA Innovation Award competition at the Hannover Messe trade show, which this year focused on Flexible Manufacturing. CoSTAR is unique in how it creates natural abstractions that use perception to represent the world in a way users can both understand and utilize to author capable and robust task plans. Our Behavior Tree-based task editor integrates high-level information from known object segmentation and pose estimation with spatial reasoning and robot actions to create robust task plans. We describe the cross-platform design and implementation of this system on multiple industrial robots and evaluate its suitability for a wide variety of use cases.

“TOMM: Tactile Omnidirectional Mobile Manipulator,” by Emmanuel Dean-Leon, Brennand Pierce, Florian Bergner, Philipp Mittendorfer, Karinne Ramirez-Amaro, Wolfgang Burger, and Gordon Cheng from Technical University of Munich.

In this paper, we present the mechatronic design of our Tactile Omnidirectional Robot Manipulator (TOMM), which is a dual arm wheeled humanoid robot with 6DoF on each arm, 4 omnidirectional wheels and 2 switchable end-effectors (1 DoF grippers and 12 DoF Hands). The main feature of TOMM is its arms and hands which are covered with robot skin. We exploit the multi-modal tactile information of our robot skin to provide a rich tactile interaction system for robots. In particular, for the robot TOMM, we provide a general control framework, capable of modifying the dynamic behavior of the entire robot, e.g., producing compliance in a non-compliant system. We present the hardware, software and middleware components of the robot and provide a compendium of the base technologies deployed in it. Furthermore, we show some applications and results that we have obtained using this robot.

“Tracking Objects with Point Clouds from Vision and Touch,” by Gregory Izatt, Geronimo Mirano, Edward Adelson, and Russ Tedrake from the Massachusetts Institute of Technology.

We present an object-tracking framework that fuses point cloud information from an RGB-D camera with tactile information from a GelSight contact sensor. GelSight can be treated as a source of dense local geometric information, which we incorporate directly into a conventional point-cloud-based articulated object tracker based on signed-distance functions. Our implementation runs at 12 Hz using an online depth reconstruction algorithm for GelSight and a modified second-order update for the tracking algorithm. We present data from hardware experiments demonstrating that the addition of contact-based geometric information significantly improves the pose accuracy during contact, and provides robustness to occlusions of small objects by the robot’s end effector.

“A Two-Level Approach for Solving the Inverse Kinematics of an Extensible Soft Arm Considering Viscoelastic Behavior,” by Hao Jiang, Zhanchi Wang, Xinghua Liu, Xiaotong Chen, Yusong Jin, Xuanke You, and Xiaoping Chen from University of Science and Technology of China.

Soft compliant materials and novel actuation mechanisms ensure flexible motions and high adaptability for soft robots, but also increase the difficulty and complexity of constructing control systems. In this work, we provide an efficient control algorithm for a multi-segment extensible soft arm in 2D plane. The algorithm separates the inverse kinematics into two levels. The first level employs gradient descent to select optimized arm’s pose (from task space to configuration space) according to designed cost functions. With consideration of viscoelasticity, the second level utilizes neural networks to figure out the pressures from each segment’s pose (from configuration space to actuation space). In experiments with a physical prototype, the control accuracy and effectiveness are validated, where the control algorithm is further improved by an optional feedback strategy.
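
To make the first level of this abstract more concrete, here is a minimal sketch of gradient descent from task space to configuration space. It assumes a much simpler arm than the paper's: two rigid planar segments of fixed length, with a cost equal to half the squared distance between the tip and the target. The functions `forward` and `solve_ik`, the step size, and the iteration count are all hypothetical choices, and the second level (neural networks mapping pose to pressures) is not covered.

```python
import math
from typing import List, Sequence, Tuple


def forward(q: Sequence[float], lengths: Sequence[float]) -> Tuple[float, float]:
    """Tip position of a planar chain; q holds relative joint angles in radians."""
    x = y = angle = 0.0
    for qi, li in zip(q, lengths):
        angle += qi
        x += li * math.cos(angle)
        y += li * math.sin(angle)
    return x, y


def solve_ik(target: Tuple[float, float], lengths: Sequence[float],
             q0: Sequence[float], step: float = 0.1, iters: int = 2000) -> List[float]:
    """Gradient descent on 0.5 * ||forward(q) - target||^2."""
    q = list(q0)
    n = len(q)
    for _ in range(iters):
        # Absolute angle of each segment (running sum of relative angles).
        angles, total = [], 0.0
        for qi in q:
            total += qi
            angles.append(total)
        x, y = forward(q, lengths)
        ex, ey = x - target[0], y - target[1]
        grad = []
        for j in range(n):
            # Joint j rotates every segment from j onward.
            dx = sum(-lengths[i] * math.sin(angles[i]) for i in range(j, n))
            dy = sum(lengths[i] * math.cos(angles[i]) for i in range(j, n))
            grad.append(ex * dx + ey * dy)
        q = [qi - step * g for qi, g in zip(q, grad)]
    return q


lengths = (1.0, 1.0)
target = (1.0, 1.0)
q = solve_ik(target, lengths, q0=(0.3, 0.3))
tip = forward(q, lengths)  # should land very close to the target
```

A real soft arm would add segment extension to the configuration and extra terms to the cost, but the descent loop would keep the same shape.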

“Underwater Cave Mapping using Stereo Vision,” by Nick Weidner, Sharmin Rahman, Alberto Quattrini Li, and Ioannis Rekleitis from University of South Carolina.

This paper presents a systematic approach for the 3-D mapping of underwater caves. Exploration of underwater caves is very important for furthering our understanding of hydrogeology, managing efficiently water resources, and advancing our knowledge in marine archaeology. Underwater cave exploration by human divers however, is a tedious, labor intensive, extremely dangerous operation, and requires highly skilled people. As such, it is an excellent fit for robotic technology, which has never before been addressed. In addition to the underwater vision constraints, cave mapping presents extra challenges in the form of lack of natural illumination and harsh contrasts, resulting in failure for most of the state-of-the-art visual based state estimation packages. A new approach employing a stereo camera and a video-light is presented. Our approach utilizes the intersection of the cone of the video-light with the cave boundaries: walls, floor, and ceiling, resulting in the construction of a wire frame outline of the cave. Successive frames are combined using a state of the art visual odometry algorithm while simultaneously inferring scale through the stereo reconstruction. Results from experiments at a cave, part of the Sistema Camilo, Quintana Roo, Mexico, validate our approach. The cave wall reconstruction presented provides an immersive experience in 3-D.

“Backchannel Opportunity Prediction for Social Robot Listeners,” by Hae Won Park, Mirko Gelsomini, Jin Joo Lee, Tonghui Zhu, and Cynthia Breazeal from MIT Media Lab and Politecnico di Milano.

This paper investigates how a robot that can produce contingent listener response, i.e., backchannel, can deeply engage children as a storyteller. We propose a backchannel opportunity prediction (BOP) model trained from a dataset of children’s dyad storytelling and listening activities. Using this dataset, we gain better understanding of what speaker cues children can decode to find backchannel timing, and what type of nonverbal behaviors they produce to indicate engagement status as a listener. Applying our BOP model, we conducted two studies, within- and between-subjects, using our social robot platform, Tega. Behavioral and self-reported analyses from the two studies consistently suggest that children are more engaged with a contingent backchanneling robot listener. Children perceived the contingent robot as more attentive and more interested in their story compared to a non-contingent robot. We find that children significantly gaze more at the contingent robot while storytelling and speak more with higher energy to a contingent robot.

“Visual Servoing in an Optimization Framework for the Whole-body Control of Humanoid Robots,” by Don Joven Agravante, Giovanni Claudio, Fabien Spindler, and Francois Chaumette from Inria Rennes.

In this paper, we show that visual servoing can be formulated as an acceleration-resolved, quadratic optimization problem. This allows us to handle visual constraints, such as field of view and occlusion avoidance, as inequalities. Furthermore, it allows us to easily integrate visual servoing tasks into existing whole-body control frameworks for humanoid robots, which simplifies prioritization and requires only a posture task as a regularization term. Finally, we show this method working on simulations with HRP-4 and real tests on Romeo.

“UAV with Two Passive Rotating Hemispherical Shells for Physical Interaction and Power Tethering in a Complex Environment,” by Carl John Salaan, Kenjiro Tadakuma, Yoshito Okada, Eri Takane, Kazunori Ohno, and Satoshi Tadokoro from Tohoku University.

For the past few years, unmanned aerial vehicles (UAVs) have been successfully employed in several investigations and exploration tasks such as aerial inspection and manipulations. However, most of these UAVs are limited to open spaces distant from any obstacles because of the high risk of falling as a result of an exposed propeller or not enough protection. On the other hand, a UAV with a passive rotating spherical shell can fly over a complex environment but cannot engage in physical interaction and perform power tethering because of the passive rotation of the spherical shell. In this study, we propose a new mechanism that allows physical interaction and power tethering while the UAV is well-protected and has a good flight stability, which enables exploration in a complex environment such as disaster sites. We address the current problem by dividing the whole shell into two separate hemispherical shells that provide a gap unaffected by passive rotation. In this paper, we mainly discuss the concept, general applications, and design of the proposed system. The capabilities of the proposed system for physical interaction and power tethering in a complex space were initially verified through laboratory-based test flights of our experimental prototype.

“Find Your Own Way: Weakly-Supervised Segmentation of Path Proposals for Urban Autonomy,” by Dan Barnes, Will Maddern, and Ingmar Posner from University of Oxford.

We present a weakly-supervised approach to segmenting proposed drivable paths in images with the goal of autonomous driving in complex urban environments. Using recorded routes from a data collection vehicle, our proposed method generates vast quantities of labelled images containing proposed paths and obstacles without requiring manual annotation, which we then use to train a deep semantic segmentation network. With the trained network we can segment proposed paths and obstacles at run-time using a vehicle equipped with only a monocular camera without relying on explicit modelling of road or lane markings. We evaluate our method on the large-scale KITTI and Oxford RobotCar datasets and demonstrate reliable path proposal and obstacle segmentation in a wide variety of environments under a range of lighting, weather and traffic conditions. We illustrate how the method can generalise to multiple path proposals at intersections and outline plans to incorporate the system into a framework for autonomous urban driving.

“Pneumatic Reel Actuator: Design, Modeling, and Implementation,” by Zachary M. Hammond, Nathan S. Usevitch, Elliot W. Hawkes, and Sean Follmer from Stanford University.

We present the design, modeling, and implementation of a novel pneumatic actuator, the Pneumatic Reel Actuator (PRA). The PRA is highly extensible, lightweight, capable of operating in compression and tension, compliant, and inexpensive. An initial prototype of the PRA can reach extension ratios greater than 16:1, has a force-to-weight ratio over 28:1, reaches speeds of 0.87 meters per second, and can be constructed with parts totaling less than $4 USD. We have developed a model describing the actuator and have conducted experiments characterizing the actuator’s performance in regards to force, extension, pressure, and speed. We have implemented two parallel robotic applications in the form of a three degree of freedom robot arm and a tetrahedral robot.

“Angular Momentum Compensation in Yaw Direction using Upper Body based on Human Running,” by T. Otani, K. Hashimoto, S. Miyamae, H. Ueta, M. Sakaguchi, Y. Kawakami, H.O. Lim, and A. Takanishi from Waseda University, University of Calgary, and Kanagawa University.

Humans utilize their torsos and arms while running to compensate for the angular momentum generated by the lower-body movement during the flight phase. To enable this capability in a humanoid robot, the robot should have human-like mass, a center of mass position, and inertial moment of each link. To mimic this characteristic, we developed an angular momentum control method using a humanoid upper body based on human motion. In this method, the angular momentum generated by the movement of the humanoid lower body is calculated, and the torso and arm motions are calculated to compensate for the angular momentum of the lower body. We additionally developed the humanoid upper-body mechanism that mimics the human link length and mass property by using carbon fiber reinforced plastic and a symmetric structure. As a result, the developed humanoid robot could generate almost the same angular momentum as that of human through human-like running motion. Furthermore, when suspended in midair, the humanoid robot produced the angular momentum compensation in the yaw direction.
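
The compensation step described in this abstract can be sketched with a very simplified model. The code below assumes each lower-body link contributes a yaw angular momentum of inertia times yaw rate about a shared vertical axis, and that the upper body acts as a single rigid body with one yaw inertia. The function names and the numbers are hypothetical, and the authors' method works from full link motions rather than this one-axis approximation.

```python
from typing import List, Tuple

# Each link is (yaw inertia in kg*m^2, yaw rate in rad/s).
Link = Tuple[float, float]


def yaw_momentum(links: List[Link]) -> float:
    """Total yaw angular momentum of the links, summing inertia * rate."""
    return sum(inertia * rate for inertia, rate in links)


def compensating_yaw_rate(lower_links: List[Link], upper_inertia: float) -> float:
    """Upper-body yaw rate that cancels the lower-body yaw momentum."""
    if upper_inertia <= 0.0:
        raise ValueError("upper_inertia must be positive")
    return -yaw_momentum(lower_links) / upper_inertia


# Hypothetical flight-phase snapshot: two legs swinging in opposite directions.
lower_body = [(0.25, 2.0), (0.25, -0.5)]
upper_inertia = 0.75

upper_rate = compensating_yaw_rate(lower_body, upper_inertia)
net = yaw_momentum(lower_body) + upper_inertia * upper_rate  # net yaw momentum
```

With the upper body turning at `upper_rate`, the net yaw momentum in this toy model is zero, which is the condition the torso and arm motions are meant to satisfy during the flight phase.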
