Yorai Shaoul
February 11, 2026
Thesis Committee
YouTube: DanielLaBelle
We want to enable systems with
Mobile Aloha
PerAct [Shridhar 2022]
VLP [Du 2025]
DG-MAMP [Parimi 2025]
MOSAIC [Mishani 2026]
RT-1 [Brohan 2022]
Diffuser [Janner 2022]
Diffusion Policy [Chi 2023]
RT-2 [Zitkovich 2023]
ACT [Zhao 2023]
UMI [Chi 2024]
MPD [Carvalho 2024]
\(\pi_0\) [Physical Intelligence 2024]
DP3 [Ze 2024]
\(\pi_{0.5}\) [Physical Intelligence 2025]
A* [Hart 1968]
D* [Stentz 1994]
PRM [Kavraki 1996]
RRT [LaValle 1998]
RRT-Connect [Kuffner 2000]
D* Lite [Koenig 2002]
CHOMP [Ratliff 2009]
RRT* [Karaman 2011]
STOMP [Kalakrishnan 2011]
TrajOpt [Schulman 2014]
CBS [Sharon 2015]
dRRT* [Shome 2019]
MAPF-LNS [Li 2022]
For multi-robot systems, much less work has been done at the boundary between planning and learning.
We can enable multi-arm manipulation if we plan what we can model, and learn what we cannot.
To capitalize on the complementary strengths, establish algorithms that:
Multi-Robot-Arm Motion Planning
Interleaving Planning and Learning on the Plane
Interleaving Planning and Learning for Multi-Arm Manipulation
Plan what we can, and learn what we must.
Based on
Our goal: find a collision-free solution for all arms that minimizes an objective (sum of costs).
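One way to write this goal down, as a sketch only: the symbols below (\(N\) arms, per-arm path \(\pi^i\), cost \(c\), start and goal configurations \(q^i_{\text{start}}, q^i_{\text{goal}}\), final time \(T\)) are assumed notation, not taken from the slides.

```latex
\begin{aligned}
\min_{\pi^1,\dots,\pi^N}\quad & \sum_{i=1}^{N} c(\pi^i) \\
\text{s.t.}\quad & \pi^i(0) = q^i_{\text{start}},\quad \pi^i(T) = q^i_{\text{goal}}, && i = 1,\dots,N, \\
& \text{arm } i \text{ at } \pi^i(t) \text{ collides with neither obstacles nor arm } j \text{ at } \pi^j(t), && \forall t,\ \forall j \neq i.
\end{aligned}
```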
The common approach: directly apply your favorite popular motion planner!
4-Robot Bin-Picking
8-Robot Shelf Rearrangement
Sampling in high dimensions (56 DoF for 8 arms) is hard!
Kuffner, J.J. and LaValle, S.M. RRT-connect: An efficient approach to single-query path planning. ICRA 2000.
Kavraki, L.E., Svestka, P., Latombe, J.C. and Overmars, M.H. Probabilistic roadmaps for path planning in high-dimensional configuration spaces. IEEE Transactions on Robotics and Automation, 1996.
Multi-Agent Path Finding (MAPF) algorithms are known for their scalability. Can they help?
Recently, exciting work on MAPF, a simplified abstract planning problem, has shown impressive results.
Check out the work at
!
In particular, the CBS framework has been especially influential.
Plan for all agents individually.
Find collisions (conflicts) between their current paths.
Impose a single constraint on each participating agent and replan.
Repeat.
Sharon, G., Stern, R., Felner, A. and Sturtevant, N.R. Conflict-based search for optimal multi-agent pathfinding. Artificial Intelligence, 2015.
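The four CBS steps above can be sketched on a toy grid world. This is a minimal illustration under stated assumptions, not the planner used in this work: agents are points on a 4-connected grid with a wait action, the low-level planner is a plain space-time A*, and all names (`low_level`, `first_conflict`, `cbs`, `max_t`) are hypothetical.

```python
import heapq
from itertools import count

MOVES = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # wait + 4-connected moves


def low_level(grid, start, goal, constraints, max_t=50):
    """Space-time A* for one agent.

    constraints holds vertex constraints (cell, t) and
    edge constraints (cell_from, cell_to, t).
    """
    h = lambda c: abs(c[0] - goal[0]) + abs(c[1] - goal[1])
    last_constrained = max((c[-1] for c in constraints), default=0)
    open_list = [(h(start), 0, start, [start])]
    seen = set()
    while open_list:
        _, t, cell, path = heapq.heappop(open_list)
        # Only stop once no later constraint can touch the agent resting at its goal.
        if cell == goal and t >= last_constrained:
            return path
        if (cell, t) in seen or t >= max_t:
            continue
        seen.add((cell, t))
        for dx, dy in MOVES:
            nxt = (cell[0] + dx, cell[1] + dy)
            if not (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])) or grid[nxt[0]][nxt[1]]:
                continue
            if (nxt, t + 1) in constraints or (cell, nxt, t + 1) in constraints:
                continue
            heapq.heappush(open_list, (t + 1 + h(nxt), t + 1, nxt, path + [nxt]))
    return None


def at(path, t):
    """Agents wait at their goal after finishing their path."""
    return path[min(t, len(path) - 1)]


def first_conflict(paths):
    """Return one constraint per participating agent for the earliest conflict, or None."""
    for t in range(max(len(p) for p in paths)):
        for i in range(len(paths)):
            for j in range(i + 1, len(paths)):
                if at(paths[i], t) == at(paths[j], t):  # vertex conflict
                    return [(i, (at(paths[i], t), t)), (j, (at(paths[j], t), t))]
                if t > 0 and at(paths[i], t) == at(paths[j], t - 1) \
                        and at(paths[i], t - 1) == at(paths[j], t):  # edge (swap) conflict
                    return [(i, (at(paths[i], t - 1), at(paths[i], t), t)),
                            (j, (at(paths[j], t - 1), at(paths[j], t), t))]
    return None


def cbs(grid, starts, goals):
    """High-level best-first search over constraint sets, ordered by sum of costs."""
    tie = count()
    cons = [frozenset() for _ in starts]
    paths = [low_level(grid, s, g, c) for s, g, c in zip(starts, goals, cons)]
    if any(p is None for p in paths):
        return None
    open_list = [(sum(len(p) - 1 for p in paths), next(tie), cons, paths)]
    while open_list:
        _, _, cons, paths = heapq.heappop(open_list)
        conflict = first_conflict(paths)
        if conflict is None:
            return paths
        for agent, constraint in conflict:  # one child node per participating agent
            new_cons = list(cons)
            new_cons[agent] = cons[agent] | {constraint}
            new_path = low_level(grid, starts[agent], goals[agent], new_cons[agent])
            if new_path is None:
                continue
            new_paths = list(paths)
            new_paths[agent] = new_path
            cost = sum(len(p) - 1 for p in new_paths)
            heapq.heappush(open_list, (cost, next(tie), new_cons, new_paths))
    return None
```

For example, `cbs([[0] * 3 for _ in range(3)], [(1, 0), (1, 2)], [(1, 2), (1, 0)])` asks two agents to swap places on an empty 3x3 grid. Note that every conflict triggers a fresh low-level search, which is where the replanning cost comes from.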
This framework guarantees completeness and optimality for MAPF.
A relevant improvement to CBS is ECBS. Here solution quality can be traded for computational efficiency (i.e., finding solutions more quickly).
Barer, M., Sharon, G., Stern, R. and Felner, A. Suboptimal variants of the conflict-based search algorithm for the multi-agent pathfinding problem. SoCS 2014.
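The trade-off in ECBS is usually stated as a bounded-suboptimality guarantee. In the sketch below, the symbols \(w\) (a user-chosen suboptimality factor), \(\Pi_{\text{ECBS}}\) (the returned solution), and \(\Pi^{*}\) (an optimal solution) are assumed notation.

```latex
\text{cost}(\Pi_{\text{ECBS}}) \;\le\; w \cdot \text{cost}(\Pi^{*}), \qquad w \ge 1
```

Setting \(w = 1\) asks for an optimal solution; larger values give the search more freedom to return a solution quickly.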
Joint angles
B. J. Cohen, S. Chitta and M. Likhachev, "Search-based planning for manipulation with motion primitives," ICRA 2010
The (in)efficiency of single-robot planning greatly affects CBS-like algorithms.
4-Robot Bin-Picking
8-Robot Shelf Rearrangement
The repetitive planning in CBS takes time.
CBS may require many replanning calls to find a solution.
Successive single-agent planner calls are nearly identical:
The planning graph remains the same
Start and goal states remain the same between replanning calls
The path segments before and after the constraints are still valid
The only change is the addition of a new constraint
By reusing experience, the new search effort (expanded states) is smaller.
Reusing computation helps, and it retains the bounded sub-optimality and completeness guarantees.
Reusing search efforts significantly speeds up the search.
4-Robot Bin-Picking
8-Robot Shelf Rearrangement
xECBS has failure modes.
xECBS imposes constraints that only prohibit one configuration at a single time (or transition), and may need many replanning calls before a collision is resolved.
Can we use "stronger" constraints in CBS?
Yes, but we need to be careful.
xECBS with point constraints (spheres with small radii) maintains completeness and solves more problems.
But some constraints can render problems unsolvable.
We develop Generalized-ECBS [SoCS-24], which allows for using any constraint in CBS without losing completeness guarantees.
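To make the difference between the constraint types discussed above concrete, here is a small sketch. It is illustrative only: the class names, the `violated_by` method, and the two-link planar arm in `end_effector` are hypothetical, the "sphere" is a disc because the toy arm is planar, and only the end-effector is tested, whereas a real system would check the whole robot body.

```python
import math
from dataclasses import dataclass


def end_effector(config, link_lengths=(1.0, 1.0)):
    """Forward kinematics of a hypothetical planar arm (stand-in for a real robot model)."""
    x = y = angle = 0.0
    for theta, length in zip(config, link_lengths):
        angle += theta
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return (x, y)


@dataclass(frozen=True)
class VertexConstraint:
    """Forbids exactly one configuration at one timestep."""
    config: tuple
    t: int

    def violated_by(self, config, t):
        return t == self.t and tuple(config) == self.config


@dataclass(frozen=True)
class SphereConstraint:
    """Forbids every configuration whose end-effector enters a workspace region at one timestep."""
    center: tuple
    radius: float
    t: int

    def violated_by(self, config, t):
        return t == self.t and math.dist(end_effector(config), self.center) <= self.radius


q_a, q_b = (0.0, 0.0), (0.05, 0.0)  # two nearly identical configurations
vertex = VertexConstraint(config=q_a, t=3)
sphere = SphereConstraint(center=(2.0, 0.0), radius=0.2, t=3)
print(vertex.violated_by(q_b, 3), sphere.violated_by(q_b, 3))
```

The vertex constraint rules out `q_a` but still admits the nearby `q_b`, so a replanned path can sidestep it with an almost identical motion. The region constraint rules out both at once.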
Check it out in SRMP! https://srmp.readthedocs.io/
Mishani, I.*, Shaoul, Y.*, Natarajan, R.*, Li, J. and Likhachev, M. SRMP: Search-Based Robot Motion Planning Library. arXiv 2025.
Multi-Robot-Arm Motion Planning
Interleaving Planning and Learning on the Plane
Interleaving Planning and Learning for Multi-Arm Manipulation
Plan what we can, and learn what we must.
Search-based algorithms are effective for multi-arm motion planning
[ICAPS 24, SoCS 24]
Based on
In this part, we'll consider two types of problems:
Coordination
Robots have individual tasks and negotiate space
Collaboration
Robots have a shared objective and must work together
Motion Pattern
Formally, given
Compute
[Table: approaches compared on available data, expressive modeling, scaling with agents, and scaling to large environments]
Learn directly [Carvalho et al. 2023]
Learn cost maps for classical planning
What we want: Rely on local, single-robot data, and model flexibly
Motion planning diffusion models [Carvalho 2023, Janner 2022] have shown impressive performance for generating robot trajectories under implicit objectives.
Diffusion models generate trajectories from noisy trajectories.
Carvalho et al. 2023
Dataset
Generated
Denoising Process
Beginning from a pure noise trajectory \(^K\boldsymbol{\tau}^i\),
the model de-noises it incrementally for \(K\) denoising steps.
Visualizing the guidance function gradient as a "force" applied to each trajectory point.
Allows us to design "soft" spatio-temporal constraints via guidance functions.
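The idea of guidance acting as a "force" during denoising can be sketched with a toy example. Everything here is an assumption made for illustration: `toy_denoiser` stands in for a learned model by pulling the trajectory toward a straight line, the reverse process omits the per-step noise a real diffusion sampler would add, and the guidance function is a squared hinge penalty that pushes one trajectory point out of a disc at one time index.

```python
import numpy as np


def toy_denoiser(traj, start, goal, blend=0.5):
    """Stand-in for a learned denoiser: pulls the trajectory toward a straight line."""
    line = np.linspace(start, goal, len(traj))
    return traj + blend * (line - traj)


def guidance_gradient(traj, center, radius, step_idx):
    """Gradient of max(0, radius - dist)^2 for the trajectory point at step_idx."""
    grad = np.zeros_like(traj)
    offset = traj[step_idx] - center
    dist = np.linalg.norm(offset)
    if 0.0 < dist < radius:
        grad[step_idx] = -2.0 * (radius - dist) * offset / dist
    return grad


def guided_denoise(start, goal, constraint=None, horizon=11,
                   num_steps=20, guidance_scale=0.25, seed=0):
    """Denoise a pure-noise trajectory, descending the guidance cost after every step."""
    rng = np.random.default_rng(seed)
    traj = rng.standard_normal((horizon, 2))  # pure-noise trajectory
    for _ in range(num_steps):
        traj = toy_denoiser(traj, start, goal)
        if constraint is not None:
            traj = traj - guidance_scale * guidance_gradient(traj, *constraint)
    return traj


start, goal = np.array([0.0, 0.0]), np.array([1.0, 0.0])
center = np.array([0.5, 0.0])  # the straight line passes through here at index 5
free = guided_denoise(start, goal)
guided = guided_denoise(start, goal, constraint=(center, 0.3, 5))
print(np.linalg.norm(free[5] - center), np.linalg.norm(guided[5] - center))
```

Without guidance, the point at index 5 settles at the disc center. With guidance it is held away from it, while the other trajectory points are left untouched. The constraint is "soft" in that `guidance_scale` sets how strongly it competes with the denoiser.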
Since single-robot diffusion models can admit constraints, we can use them within the CBS framework!
We must learn the motion patterns from data
And we can plan to coordinate the learned models
Learning single-robot trajectory generators and coordinating them with search proved to be an effective recipe.
"Conveyor"
"Highways"
Freespace
"Drop-Region"
MMD-CBS
MMD-x(E)CBS
New Constraint
Previous Path
Noise a little.
Denoise.
PP
MMD-ECBS
"Weak Constraints"
The higher the success rate, the better.
...without compromising data adherence.
[Plots: success rate vs. number of agents; data adherence vs. number of agents]
MMD scales much better than fixed-team-size models.
A familiar trend appears:
A naive application of CBS is insufficient, and ideas from improved algorithms help.
Easy
Hard
In this part, we'll consider two types of problems:
Coordination
Robots have individual tasks and negotiate space
Collaboration
Robots have a shared objective and must work together
When robots work together and depend on each other to complete a global task, learning single-robot models becomes harder.
Compute a set of robot motions \(\Tau := \{\tau^i\}_{i=1}^N \) such that, upon execution, all objects arrive at their goals.
Plan Object Motions
Generate Manipulation Interactions
Plan Robot Motions
Input: object images and goals. Output: short-horizon robot motions \(\Tau := \{\tau^i\}_{i=1}^N \).
Plan motions to contact points
Learn physical interactions
Plan motions for objects
A new anonymous multi-robot motion planning algorithm.
A novel flow-matching co-generation model for manipulation interactions.
We fundamentally have two decisions to make here:
\(\Phi(^0\mathcal{T})\)
Noise
~Data Distribution
Flow-Matching
Flow-matching generates new samples from noise by learning a velocity field and integrating it.
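The sampling side of flow matching, integrating a velocity field from noise at \(t=0\) to a sample at \(t=1\), can be sketched with forward Euler steps. The velocity field below is an analytic stand-in for a learned model, chosen so that the data distribution is a single hypothetical target point. The names `toy_velocity` and `integrate_flow` are illustrative assumptions.

```python
import numpy as np


def toy_velocity(x, t, target):
    """Analytic stand-in for a learned velocity field u_theta(x, t).

    For the straight-line path x_t = (1 - t) * x_0 + t * x_1, the velocity
    x_1 - x_0 can be rewritten as (x_1 - x_t) / (1 - t).
    """
    return (target - x) / (1.0 - t)


def integrate_flow(x0, velocity, num_steps=10):
    """Forward-Euler integration of dx/dt = velocity(x, t) from t = 0 to t = 1."""
    x, dt = np.array(x0, dtype=float), 1.0 / num_steps
    for k in range(num_steps):
        x = x + dt * velocity(x, k * dt)
    return x


rng = np.random.default_rng(0)
target = np.array([2.0, -1.0])
noise = rng.standard_normal(2)  # sample from the noise distribution
sample = integrate_flow(noise, lambda x, t: toy_velocity(x, t, target))
print(sample)
```

With a learned field, `velocity` would be a network evaluated at each step, and the number of Euler steps trades sample quality against compute.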
Learn \(\mathcal{K}, \mathcal{T} \gets \pi_\theta( \mathcal{I}, T \in SE(2))\) from simulated demonstrations
We fundamentally have two decisions to make here:
\(u^\theta_\text{dis}\)
\(u^\phi_\text{cont}\)
Tie contact points to the discrete observation space.
Generate continuous manipulation trajectories.
We can pose this as a discrete problem, asking: which pixels in our image observation should be used for contact points?
Flow
Diffusion
Discrete Flow
Let's look at this 2D state space for example:
As before, we require two components:
Getting an initial value is easy. We could
For the velocity field, things are a bit different in the discrete case.
For some state \(x=(x^1, x^2)\), we want
That is, we could define a factorized velocity \(u^i(\cdot, x) = \delta_{X^i_1}\).
In continuous flow matching, we asked a model to predict the velocity at some interpolated state \(x_t\).
Given \(X_0\) and \(X_1\), one way to obtain \(X_t\) in the discrete case is to have each element \(X_t^i\) as
In essence, a similar process to the continuous case.
Starting with \(X_t = X_0\):
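The slide's own expressions are not reproduced here. As a hedged illustration, one common construction in the discrete flow matching literature (assumed, and not necessarily the one used in this work) takes each element of \(X_t\) from \(X_1\) with probability \(t\) and from \(X_0\) otherwise. Sampling then starts at \(X_0\) and lets each element jump to the predicted target with probability \(\Delta t / (1 - t)\) per step. The names `interpolate`, `sample`, and `predict_x1` are hypothetical, and `predict_x1` is an oracle standing in for a learned model.

```python
import numpy as np


def interpolate(x0, x1, t, rng):
    """Element-wise mixture: each X_t^i equals X_1^i with probability t, else X_0^i."""
    take_target = rng.random(x0.shape) < t
    return np.where(take_target, x1, x0)


def sample(x0, predict_x1, num_steps, rng):
    """Start at X_t = X_0 and let elements jump toward the predicted X_1."""
    x = x0.copy()
    for k in range(num_steps):
        t = k / num_steps
        target = predict_x1(x, t)
        jump_prob = 1.0 / (num_steps - k)  # equals dt / (1 - t)
        jump = rng.random(x.shape) < jump_prob
        x = np.where(jump, target, x)
    return x


rng = np.random.default_rng(0)
x0 = np.zeros(2, dtype=int)   # initial discrete state
x1 = np.array([3, 7])         # hypothetical target state
halfway = interpolate(x0, x1, 0.5, rng)
final = sample(x0, lambda x, t: x1, num_steps=10, rng=rng)
print(halfway, final)
```

Unlike the continuous case, nothing moves gradually: each element either keeps its current value or switches to the target value, and by the last step every element has switched.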
The models flexibly select the number of robots, given a budget.
Observation
Transform
Contacts: \(K_t \in \mathscr{D}^{2\cdot N}\)
Trajectories: \(X_t \in \mathbb{R}^{2\cdot N \cdot H} \)
Object- and robot-planning fall under the same algorithmic umbrella.
Plan compositions of short actions
Anonymous Multi-Agent Path Finding (AMAPF) is traditionally solved on regular grids.
Regular grid.
Motion primitives.
kei18.github.io
What we want
What we get (from naively applying existing AMAPF ideas):
Livelocks (PIBT with motion primitives)
Deadlocks (TSWAP with motion primitives)
To mitigate these issues, we developed the "\(\text{G\scriptsize SPI}\)" algorithm, a "continuous-space" hybrid adaptation of PIBT with goal-swapping inspired by TSWAP and C-UNAV.
PIBT
+ Goal Swapping
= \(\text{G\scriptsize SPI}\) (Goal Swapping with Priority Inheritance)
Priority inheritance with backtracking adapted to non-point robots moving along motion primitives.
Goals and priorities are swapped when this benefits the higher priority robot and the system.
\(\text{{G\scriptsize SPI}}\) is efficient and scales to 300 robots in our experiments.
\(\text{{G\scriptsize SPI}}\) handles stress tests that often break algorithms:
Results
C-UNAV / ORCA
Failure cases for baselines
PIBT
TSWAP
Input: object images. Output: short-horizon robot motions \(\Tau := \{\tau^i\}_{i=1}^N \).
\(\text{GC}\scriptsize{\text{O}}\) composes per-object models, scaling to nine robots and five objects, including cases where objects outnumber robots.
Gradually increasing test difficulty.
Despite being trained only on rectangles and circles, the models handle arbitrary polygons thanks to the flexibility in the observation medium.
In distribution.
Out of distribution.
Multi-Robot-Arm Motion Planning
Interleaving Planning and Learning on the Plane
Interleaving Planning and Learning for Multi-Arm Manipulation
Single-robot and single-objective model composition scales to larger problems
[ICLR 25, In Review 2026]
Plan what we can, and learn what we must.
Search-based algorithms are effective for multi-arm motion planning
[ICAPS 24, SoCS 24]
The building blocks we have developed:
What we want is to enable flexible multi-arm manipulation systems.
AI Generated, Grok
Proposed work
Collaborative Multi-Arm Manipulation
Coordinated Multi-Arm Manipulation
Our main goal:
Plan Object Motions
Generate Manipulation Interactions
Plan Robot Motions
Follow the GCo framework, and create new methods for scaling to manipulators.
Plan Object Motions
Generate Manipulation Interactions
Embody Manipulation Interactions
Plan Robot Motions
Follow a similar approach to \(\text{G\scriptsize{CO}}\), generating demonstration data in simulation.
Collaborative Multi-Arm Manipulation: Current Ideas
Most modern manipulation policies output motions for hands, and not arms. Embodying them is not trivial.
We need a method that reasons over
Assignments
Redundant motions
We propose \(\text{OM-CBSA}\), a multi-arm embodiment approach.
Draw on:
Li, J., Tinka, A., Kiesel, S., Durham, J.W., Kumar, T.S. and Koenig, S. Lifelong multi-agent path finding in large-scale warehouses. AAAI 2021.
Constraint
"Action chunk"
MMD Repair
Multi-Robot-Arm Motion Planning
Interleaving Planning and Learning on the Plane
Interleaving Planning and Learning for Multi-Arm Manipulation
Single-robot and single-objective model composition scales to larger problems
[ICLR 25, In Review 2026]
Utilize established planning algorithms with new composable learned models for effective multi-arm manipulation.
Plan what we can, and learn what we must.
Search-based algorithms are effective for multi-arm motion planning
[ICAPS 24, SoCS 24]
Multi-Arm Policy Embodiment
~2 months: Paper writing, completing experiments, and deeper analysis.
~3 months: Work on (and support for) policy learning and coordination mechanisms.
Multi-Arm Coordination
Multi-Arm Collaboration
~3 months: Develop interaction models in simulation.
~2 months: Set up real-world robot system.
~3 months: Creating algorithmic planning+learning framework, and writing.
Max
Jiaoyang
Itamar
Shivam
Rishi
Ram
Federico
Philip
Zhe
Naveed
Thank you for your attention! Questions?