Yorai Shaoul, Zhe Chen*, Naveed Gul Mohamed*,
Federico Pecora, Maxim Likhachev, and Jiaoyang Li.
Explore multi-robot collaboration.
Collaboration is:
Collaboration
Robots work together, and depend on each other, to complete a global task.
Collaboration is not:
Coordination
Robots execute learned behaviors in a shared
environment to complete private tasks.
Given
We want to
Compute a set of motions \(\mathrm{T} := \{\tau^i\}_{i=1}^N\) such that, upon execution, all objects arrive at their goals.
The online problem
Compute short-horizon motions \(\mathrm{T} := \{\tau^i\}_{i=1}^N\), execute them, and repeat. Eventually, the objects must arrive at their goals.
Plan motions
to contact points
Learn physical
interactions
Plan motions
for objects
We interleave planning and learning in a generative collaboration framework \(\text{GC}\scriptsize{\text{O}}\).
Learn what we "must," and plan what we can.
Difficult to model
Available geometric models
Agenda
Plan motions
to contact points
Learn physical
interactions
Plan motions
for objects
We seek to learn two things:
In this work, we do so jointly with imitation learning.
Many combinations of manipulation trajectories and contact points are valid, but not all are useful.
Easy to demonstrate
Observe manipulation dynamics under various contact formations.
Flow-Matching
Easy to generate.
(Sample noisy interactions.)
We cannot
sample from this!
We know how to
sample from this!
Flow-matching generates new samples from noise by learning a velocity field and integrating it.
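For concreteness, the learning side of this recipe can be sketched in a few lines (a minimal numpy sketch assuming the common straight-line interpolation path; `u_theta`, the network to be trained, is only referenced in a comment):

```python
import numpy as np

# Flow-matching training target (sketch): draw noise x0 and data x1,
# interpolate them, and regress the constant velocity x1 - x0 at (x_t, t).
def training_pair(x0, x1, t):
    x_t = (1.0 - t) * x0 + t * x1   # state on the straight-line path at time t
    v = x1 - x0                     # velocity the model should predict there
    return x_t, v

rng = np.random.default_rng(0)
x0 = rng.normal(size=2)             # noise sample
x1 = np.array([2.0, -1.0])          # data sample
x_t, v = training_pair(x0, x1, t=0.5)
# training loss for a model u_theta: mean((u_theta(x_t, t) - v) ** 2)
```

Generation then integrates the learned field from \(t=0\) (noise) to \(t=1\) (data).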
Flow-Matching
\(\Phi(^0\mathcal{T})\)
We can learn contacts and manipulation trajectories with "vanilla" flow matching.
Dataset of \(\langle O, T, \mathcal{K}, \mathcal{T} \rangle\)
Initial Contact
Points
Manipulation
Trajectories
Transform
Observation
Continuous flow matching for contact modeling.
Continuous flow-matching was unstable.
Solution:
Split the representation, but generate concurrently.
Continuous flow matching.
Contact Points
\[\mathcal{K} \in \mathbb{R}^{B \times 2}\]
Manipulation Trajectories
\[\mathcal{T} \in \mathbb{R}^{B \times H \times 2}\]
(always rooted at the origin)
Learn two velocity fields, \(u^{\theta}_{\mathcal{K},t}( ^t\mathcal{K})\) and \(u^{\theta}_{\mathcal{T},t}( ^t\mathcal{T})\), and concurrently generate both contact points and manipulation trajectories:
\[^{t+\Delta t}\mathcal{K} \gets ^t\mathcal{K} + u_{\mathcal{K},t}^\theta( ^t\mathcal{K}) \cdot \Delta t\] \[^{t+\Delta t}\mathcal{T} \gets ^t\mathcal{T} + u_{\mathcal{T},t}^\theta( ^t\mathcal{T}) \cdot \Delta t\]
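The update rule above is a plain Euler loop over both representations. A hedged sketch, with closed-form stand-in fields flowing toward hypothetical targets `K_star` and `T_star` in place of the trained networks \(u^{\theta}_{\mathcal{K},t}\) and \(u^{\theta}_{\mathcal{T},t}\):

```python
import numpy as np

# Concurrent Euler co-generation of contact points K (B x 2) and
# manipulation trajectories T (B x H x 2) from noise.
B, H, n_steps = 4, 8, 100
rng = np.random.default_rng(0)
K_star = rng.normal(size=(B, 2))              # hypothetical target contacts
T_star = rng.normal(size=(B, H, 2))           # hypothetical target trajectories

u_K = lambda K, t: (K_star - K) / (1.0 - t)   # stand-in for u^theta_{K,t}
u_T = lambda T, t: (T_star - T) / (1.0 - t)   # stand-in for u^theta_{T,t}

K = rng.normal(size=(B, 2))                   # ^0K: noise
T = rng.normal(size=(B, H, 2))                # ^0T: noise
dt = 1.0 / n_steps
for k in range(n_steps):                      # both advance together each step
    t = k * dt
    K, T = K + u_K(K, t) * dt, T + u_T(T, t) * dt
```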
Continuous-Continuous.
Continuous flow-matching co-generation was better, but still unstable.
We fundamentally have two decisions to make:
(Diagram: noisy samples are transported to clean samples.)
\(u^\theta_\text{dis}\)
\(u^\phi_\text{cont}\)
Generate continuous manipulation trajectories.
Tie contact points to the discrete observation space.
We can pose this as a discrete problem, asking: which pixels in our image observation should be used as contact points?
Flow
Diffusion
Discrete Flow
Let's look at this 2D state space as an example:
As before, we require two components:
Getting an initial value is easy. We could
For the velocity field, things are a bit different in the discrete case.
For some state \(x=(x^1, x^2)\), we want
That is, we could define a factorized velocity \(u^i(\cdot, x) = \delta_{X^i_1}\).
In continuous flow matching, we asked a model to predict the velocity at some interpolated state \(x_t\).
Given \(X_0\) and \(X_1\), one way to obtain \(X_t\) in the discrete case is to sample each element \(X_t^i\) as \(X_0^i\) with probability \(1-t\) and as \(X_1^i\) with probability \(t\).
In essence, a similar process to the continuous case.
Starting with \(X_t = X_0\):
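A hedged sketch of this generation loop (assuming the mixture path above; an oracle `X1` stands in for the trained model's clean-state prediction, and each coordinate jumps to it with rate \(\Delta t / (1-t)\)):

```python
import numpy as np

# Discrete flow-matching sampling (sketch): start from uniform noise X_0 and
# resolve coordinates toward the predicted clean state as t goes from 0 to 1.
rng = np.random.default_rng(0)
D, n_coords, n_steps = 16, 2, 50        # D-symbol vocabulary per coordinate
X1 = rng.integers(0, D, size=n_coords)  # "clean" target state (oracle here)
X = rng.integers(0, D, size=n_coords)   # X_0: uniform noise over the vocabulary
dt = 1.0 / n_steps
for k in range(n_steps):
    t = k * dt
    p = 1.0 if k == n_steps - 1 else dt / (1.0 - t)  # last step resolves all
    jump = rng.random(n_coords) <= p
    X = np.where(jump, X1, X)           # factorized per-coordinate jump
```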
\(u^\theta_\text{dis}\)
\(u^\phi_\text{cont}\)
Observation
Transform
Contacts
\(K_t \in \mathscr{D}^{2\cdot N}\)
Trajectories
\(X_t \in \mathbb{R}^{2\cdot N \cdot H} \)
The models flexibly select the number of robots, given a budget.
Let's assume, for a moment, that we know how to plan for robots and objects.
Plan motions
to contact points
Learn physical
interactions
Plan motions
for objects
\(\text{GC}\scriptsize{\text{O}}\) composes per-object models, scaling to nine robots and five objects, including cases where objects outnumber robots.
Despite being trained only on rectangles and circles, the models handle arbitrary polygons thanks to the flexibility of the observation representation.
In distribution.
Out of distribution.
Gradually increasing test difficulty.
Agenda
Plan motions
to contact points
Learn physical
interactions
Plan motions
for objects
Object- and robot-planning fall under the same algorithmic umbrella.
Plan compositions of short actions
Problem Formulation
Go from here to there (it does not matter who goes where).
Minimizing
Anonymous motion planning has been explored in the Multi-Agent Path Finding (MAPF) community.
kei18.github.io
Most algorithms, like TSWAP, operate on regular grids.
But we want to operate in "continuous" spaces.
We take an intermediate approach, using motion primitives.
Key idea: swap goals between robots when it benefits the system.
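The swapping rule can be sketched as a greedy pairwise exchange (an illustrative Euclidean distance-to-go cost, not the exact criterion used here):

```python
import numpy as np

# Anonymous goals: two robots may trade assignments whenever that
# reduces the total distance-to-go. Repeat until no swap helps.
def swap_goals_greedily(robots, goals):
    improved = True
    while improved:
        improved = False
        for i in range(len(robots)):
            for j in range(i + 1, len(robots)):
                keep = (np.linalg.norm(robots[i] - goals[i])
                        + np.linalg.norm(robots[j] - goals[j]))
                swap = (np.linalg.norm(robots[i] - goals[j])
                        + np.linalg.norm(robots[j] - goals[i]))
                if swap < keep:
                    goals[[i, j]] = goals[[j, i]]   # trade the two goals
                    improved = True
    return goals

robots = np.array([[0.0, 0.0], [1.0, 0.0]])
goals = swap_goals_greedily(robots, np.array([[1.0, 1.0], [0.0, 1.0]]))
```

Here the crossed initial assignment is uncrossed, so the robots move in straight lines instead of crossing paths.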
Anonymous MAPF, traditionally, is solved on regular grids.
Regular grid.
Motion primitives.
Swapping goals between robots when their next-best steps conflict with each other works... sometimes.
Other times it leads to deadlocks.
An extremely scalable turn-by-turn procedure for labeled MAPF.
In a nutshell, high priority robots can push away lower priority ones.
A single robot (loosely) operates as follows:
Then, in the order of priorities:
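Loosely, one PIBT timestep can be sketched as follows (simplified: vertex conflicts and parent-swap avoidance only, static priorities, greedy Manhattan-distance candidate ordering; all names are illustrative):

```python
# One PIBT timestep: in priority order, each robot claims its best adjacent
# vertex; if a lower-priority robot occupies it, that robot inherits the
# priority and is recursively asked to move away ("pushed").
def pibt_step(positions, goals, priorities, neighbors):
    nxt = {}

    def dist(a, b):                      # Manhattan distance on the grid
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    def pibt(i, parent=None):
        cands = sorted([positions[i]] + neighbors(positions[i]),
                       key=lambda v: dist(v, goals[i]))
        for v in cands:
            if v in nxt.values():        # vertex already claimed this step
                continue
            if parent is not None and v == positions[parent]:
                continue                 # would swap positions with parent
            nxt[i] = v
            blocker = next((j for j, p in positions.items()
                            if j != i and p == v and j not in nxt), None)
            if blocker is None or pibt(blocker, parent=i):
                return True              # priority inheritance succeeded
            del nxt[i]                   # blocker is stuck; try another vertex
        nxt[i] = positions[i]            # nowhere to go: stay put
        return False

    for i in sorted(priorities, key=priorities.get, reverse=True):
        if i not in nxt:
            pibt(i)
    return nxt

# Toy example on a 3x2 grid: high-priority A pushes B off its path.
def neighbors(v):
    x, y = v
    return [(a, b) for a, b in [(x+1, y), (x-1, y), (x, y+1), (x, y-1)]
            if 0 <= a < 3 and 0 <= b < 2]

nxt = pibt_step(positions={'A': (0, 0), 'B': (1, 0)},
                goals={'A': (2, 0), 'B': (1, 0)},
                priorities={'A': 2, 'B': 1},
                neighbors=neighbors)
```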
Actually, this is not only a continuous-space issue: PIBT guarantees only goal reachability, not that robots remain at their goals.
Works... sometimes.
What we want
Livelocks
Deadlocks
What we get
(from naively applying existing AMAPF ideas)
To mitigate these issues, we developed the \(\text{G\scriptsize SPI}\) algorithm, a "continuous-space" hybrid adaptation of PIBT with goal swapping inspired by TSWAP and C-UNAV.
PIBT
+ Goal Swapping
= \(\text{G\scriptsize SPI}\) (Goal Swapping with Priority Inheritance)
Suffers from livelocks
Suffers from deadlocks
In a nutshell:
Setup:
Loop:
\(\text{G\scriptsize SPI}\)
\(\text{{G\scriptsize SPI}}\) is efficient and scales to 300 robots in our experiments.
Straight lines are the result of effective goal swapping.
\(\text{{G\scriptsize SPI}}\) handles stress tests that often break other algorithms:
Results
C-UNAV / ORCA
Failure cases for baselines
PIBT
TSWAP
Move robots, manipulate a single object.
Agenda
Plan motions
to contact points
Learn physical
interactions
Plan motions
for objects