Generative Trajectory Stitching through
Diffusion Composition

1 Georgia Tech 2 Harvard University
Equal advising

Trajectory Stitching via Compositional Trajectory Generation. The proposed method, CompDiffuser, generates long-horizon plans by compositionally sampling a sequence of coherent short trajectory segments (while only being trained on short-horizon data).


Abstract

Effective trajectory stitching for long-horizon planning is a significant challenge in robotic decision-making. While diffusion models have shown promise in planning, they are limited to solving tasks similar to those seen in their training data. We propose CompDiffuser, a novel generative approach that can solve new tasks by learning to compositionally stitch together shorter trajectory chunks from previously seen tasks. Our key insight is modeling the trajectory distribution by subdividing it into overlapping chunks and learning their conditional relationships through a single bidirectional diffusion model. This allows information to propagate between segments during generation, ensuring physically consistent connections. We conduct experiments on benchmark tasks of varying difficulty, covering different environment sizes, agent state dimensions, and training data quality, and show that CompDiffuser significantly outperforms existing methods.
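The chunk-wise generation described above can be sketched as follows: at each reverse-diffusion step, every chunk is denoised conditioned on the current (still noisy) estimates of its neighbors, and the overlapping regions are reconciled so adjacent segments stay physically consistent. This is a minimal NumPy sketch under our own simplifying assumptions (a generic `denoise_fn` and simple averaging of overlaps are hypothetical stand-ins), not the exact model implementation.

```python
import numpy as np

def compositional_denoise_step(chunks, denoise_fn, t, overlap):
    """One reverse-diffusion step over K overlapping trajectory chunks.

    Each chunk is denoised conditioned on the current (still noisy)
    estimates of its neighbors, so information propagates bidirectionally
    along the chain of segments.
    """
    K = len(chunks)
    updated = []
    for i in range(K):
        left = chunks[i - 1] if i > 0 else None
        right = chunks[i + 1] if i < K - 1 else None
        updated.append(denoise_fn(chunks[i], left, right, t))
    # Average the overlapping regions so adjacent chunks agree.
    for i in range(K - 1):
        shared = 0.5 * (updated[i][-overlap:] + updated[i + 1][:overlap])
        updated[i][-overlap:] = shared
        updated[i + 1][:overlap] = shared
    return updated
```

Iterating this step from pure noise down to t = 0 yields K segments whose overlaps coincide, which can then be concatenated into one long plan.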


Compositional Trajectory Sampling


Compositional Trajectory Denoising Process. We train CompDiffuser on the OGBench PointMaze Large Stitch Dataset, which only contains short trajectories that travel at most 4 blocks. However, at test time, the agent needs to travel up to 15 blocks to reach the given goal (purple star) from the given start (blue circle). The proposed method is able to compositionally sample 5 trajectories, as shown by different colors, to construct a goal-reaching trajectory of much longer horizon than those seen in the training dataset.


Composing Different Numbers of Trajectories

Unseen Long Horizon Evaluation Task
Start (circle) & Goal (star)

# of Composed Trajectories $K$
Compose Different Numbers of Trajectories at Inference. Our method is only trained on short trajectory segments with a maximum length of 5 blocks, yet it can compositionally generate a coherent long trajectory given the start (circle) and goal (star). When $K$ is small and barely sufficient to reach the goal (e.g., 8), the overlap between consecutive segments shrinks to extend the travel distance of the overall compositional plan. Conversely, given a larger $K$ (e.g., 11), some parts of the compositional plan may travel redundant distances (back and forth) to consume the extra length.
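The interplay between $K$ and the overlap length can be illustrated with a small helper (hypothetical, for intuition only; `chunk_len` and `horizon` are measured in states): with chunks of fixed length, covering the same horizon with fewer chunks forces the overlap to shrink, while a larger $K$ leaves slack that shows up as redundant coverage.

```python
def overlap_for_horizon(K, chunk_len, horizon):
    """Overlap (in states) between consecutive chunks such that K chunks
    of fixed length cover a target horizon:
    K * chunk_len - (K - 1) * overlap >= horizon."""
    if K < 2:
        return 0
    overlap = (K * chunk_len - horizon) // (K - 1)
    # Clamp to a valid range: non-negative, strictly shorter than a chunk.
    return max(0, min(chunk_len - 1, overlap))
```

For example, with 40-state chunks and a 200-state horizon, K = 8 needs an overlap of only 17 states, while K = 11 admits 24, i.e., more redundant coverage per chunk, matching the back-and-forth behavior described above.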

OGBench AntMaze Stitch Datasets

  • AntMaze Medium Stitch

    Task 1

    Task 2

    Task 3

  • AntMaze Medium Stitch

    Task 4

    Task 5

Qualitative Results in OGBench AntMaze Medium Stitch Environments. We present the environment rollouts of CompDiffuser in 5 OGBench tasks. The blue circle indicates the start position, the pink circle indicates the goal position, and the yellow circle denotes the current subgoal waypoint for the agent.
  • AntMaze Large Stitch

    Task 1

    Task 2

    Task 3

  • AntMaze Large Stitch

    Task 4

    Task 5

Qualitative Results in OGBench AntMaze Large Stitch Environment. The blue circle denotes the start position, the pink circle denotes the goal position, and the yellow circle denotes the current subgoal waypoint for the agent.
  • AntMaze Giant Stitch

    Task 1

    Task 2

    Task 3

  • AntMaze Giant Stitch

    Task 4

    Task 5

Qualitative Results in OGBench AntMaze Giant Stitch Environment. The blue circle denotes the start position, the pink circle denotes the goal position, and the yellow circle denotes the current subgoal waypoint for the agent.

Compositional Planning in High Dimension


topdown view

agent view
Synthesized Plans in 15D and 29D OGBench AntMaze Stitch Environments. Note that our method is only trained on short trajectory segments, yet it generates the full goal-reaching trajectory (from bottom left to top right) by composing multiple trajectories (shown in different colors in the top-down view).

Stitching Extremely Low Quality Data


Example Demonstrations

Our Rollout
Example Demonstrations in OGBench AntMaze Explore Dataset and Qualitative Environment Rollout of CompDiffuser.

Compositional Planning in Complex Dynamics

Synthesized Plan: Top-down View

Synthesized Plan: Zoom-in View
Qualitative Compositional Plan in OGBench AntSoccer Stitch Dataset. We generate four trajectories compositionally, as highlighted by the color of the circle beneath the ant (color order: blue -> orange -> green -> red; the circle is for visualization purposes only). This enables our planner to complete new tasks: the ant moves to the soccer ball and then dribbles it to the goal location (indicated by the pink circle). We use inpainting to enable goal conditioning in our planner (the inpainted goal state is shown in the last frame of the video). See the Dataset Types section on the OGBench website for dataset examples.
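The goal conditioning via inpainting mentioned above can be sketched as follows: after each reverse-diffusion step, the known start and goal states are written back into the partially denoised trajectory, so the model fills in the interior consistently with both endpoints. This is a minimal sketch with a generic `denoise_step` whose signature is a hypothetical stand-in, not the actual sampler.

```python
import numpy as np

def sample_with_inpainting(init_noise, denoise_step, start, goal, num_steps):
    """Goal-conditioned sampling via inpainting: after every denoising
    step, the first and last states are overwritten with the known
    start/goal, so the model fills in a trajectory consistent with both."""
    x = init_noise.copy()
    for t in reversed(range(num_steps)):
        x = denoise_step(x, t)
        # Clamp endpoints to the known start and goal states.
        x[0], x[-1] = start, goal
    return x
```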

OGBench AntSoccer Stitch Environments

  • AntSoccer Arena Stitch

    Task 1

    Task 2

    Task 3

  • AntSoccer Arena Stitch

    Task 4

    Task 5

Environment Rollout of CompDiffuser in AntSoccer Arena Environments. In this task, the ant needs to move the ball to the goal position shown by the pink circle. The initial positions of the ant and the ball are shown by the blue and yellow circles, respectively. The model is trained on two types of trajectories: 1) the ant moves without dribbling the ball; 2) the ant moves while dribbling the ball. At test time, our method stitches these two types of trajectories to achieve the goal: the ant first moves to the ball from the far side and then dribbles it to the goal position.
  • AntSoccer Medium Stitch

    Task 1

    Task 2

    Task 3

  • AntSoccer Medium Stitch

    Task 4

    Task 5

Environment Rollout of CompDiffuser in AntSoccer Medium Environments. In this task, the ant needs to move the ball to the goal position shown by the pink circle. The initial positions of the ant and the ball are shown by the blue and yellow circles, respectively. The model is trained on two types of trajectories: 1) the ant moves without dribbling the ball; 2) the ant moves while dribbling the ball. At test time, our method stitches these two types of trajectories to achieve the goal: the ant first moves to the ball from the far side and then dribbles it to the goal position.

OGBench HumanoidMaze Stitch Datasets

  • HumanoidMaze Medium Stitch

    Task 1

    Task 2

    Task 3

  • HumanoidMaze Medium Stitch

    Task 4

    Task 5

Environment Rollout of CompDiffuser in HumanoidMaze Medium Environment. The humanoid agent is marked by a green circle and the current subgoal is indicated by a yellow circle. Compared to the ant agent, we observe that the humanoid agent is less responsive in tracking the yellow subgoal given by the planner. This is likely due to the locomotion complexity of the humanoid and might be mitigated by a more robust, specialized inverse dynamics model.
  • HumanoidMaze Large Stitch

    Task 1

    Task 2

    Task 3

  • HumanoidMaze Large Stitch

    Task 4

    Task 5

Environment Rollout of CompDiffuser in HumanoidMaze Large Environment. The humanoid agent is marked by a green circle and the current subgoal is indicated by a yellow circle.
  • HumanoidMaze Giant Stitch

    Task 1

    Task 2

    Task 3

  • HumanoidMaze Giant Stitch

    Task 4

    Task 5

Environment Rollout of CompDiffuser in HumanoidMaze Giant Environment. The humanoid agent is marked by a green circle and the current subgoal is indicated by a yellow circle. Due to the complexity of inverse dynamics modeling, the humanoid may occasionally lose track of the given subgoal. Replanning is triggered when the distance between the agent's position and the subgoal exceeds a threshold.
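The replanning trigger described above amounts to a simple distance check, sketched below. The threshold value is environment-specific; the function name and signature are hypothetical, for illustration only.

```python
import numpy as np

def should_replan(agent_pos, subgoal, threshold):
    """Trigger replanning when the agent has drifted farther from its
    current subgoal than the allowed threshold (Euclidean distance)."""
    dist = np.linalg.norm(np.asarray(agent_pos) - np.asarray(subgoal))
    return bool(dist > threshold)
```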


Conclusion

We introduce CompDiffuser, a generative trajectory stitching method that leverages the compositionality of diffusion models. We introduce a noise-conditioned score function formulation that enables autoregressive sampling of multiple short-horizon trajectory diffusion models, stitching them into a longer-horizon goal-conditioned trajectory. Our method demonstrates effective trajectory stitching, as evidenced by extensive experiments on tasks of varying difficulty, including different environment sizes, planning state dimensions, trajectory types, and training data quality.


BibTeX


      To be updated soon.