MuJoCo Robotics Lab — Part 8: Motion Planning & Collision Avoidance

Lab 4 of the MuJoCo Robotics Lab series. The UR5e controller from the previous lab can track any feasible trajectory to sub-centimetre precision — but someone still has to generate that trajectory. This post builds the full planning pipeline: HPP-FCL collision queries against the executed MuJoCo geometry, an RRT* planner in the 6-DOF joint space, random shortcutting to strip redundant waypoints, and TOPP-RA time parameterization that produces a 2.659 s time-optimal trajectory respecting velocity and acceleration limits. The whole pipeline ends with the Lab 3 PD+gravity controller tracking the result.

Problem

The gravity-compensated PD controller from Lab 3 will track anything you hand it, as long as the reference is smooth and dynamically feasible. That is a necessary condition, not a sufficient one. In a cluttered workspace — table, mounted gripper, fixed obstacles — the hard question is how to generate a reference path that is simultaneously collision-free in 6-dimensional configuration space and fast enough to be useful. Joint-space interpolation between a start and a goal almost always clips something. The planner has to reason about geometry, and then it has to hand the geometry layer to a time parameterizer that respects the robot's actual kinematic limits.

Theory

Narrow-phase collision checking

The planner asks one question repeatedly: is configuration q in collision? To answer it honestly we need the same geometry MuJoCo will eventually execute. The canonical scene is Menagerie's universal_robots_ur5e with a mounted robotiq_2f85, a table geom, and the obstacle boxes. Pinocchio is still used for FK utilities, but collision truth comes from HPP-FCL queries against that exact MuJoCo scene state.

Three things are checked for every sample:

arm-vs-environment (table, obstacles, floor)
arm-vs-gripper and inter-link self-collision
edge validity — the straight-line segment q1 → q2 in joint space is discretized at resolution 0.05 rad and every intermediate q is queried

Internal Robotiq finger-link proximity is excluded. The fingers touch each other by design when the gripper closes, and feeding that into the planner poisons every configuration where the gripper is anywhere except wide open.

RRT* sampling

Configuration space is 6D with joint limits [-2pi, 2pi] per joint. The distance metric is plain L2:

d(q1, q2) = || q1 - q2 ||_2

RRT* grows a tree rooted at q_start. At each iteration it samples a random configuration (with probability goal_bias = 0.1 it samples the goal instead), finds the nearest tree node, and steers toward the sample by at most step_size = 0.3 rad:

q_new = q_nearest + min(1, step_size / d) * (q_rand - q_nearest)

Two things make it RRT* rather than plain RRT. First, best-parent selection: before attaching q_new, search all tree nodes inside rewire_radius = 1.0 and pick whichever yields the lowest cumulative cost:

cost(q_new) = min over i in near_indices of
              tree[i].cost + d(tree[i].q, q_new)
              subject to edge(tree[i].q, q_new) collision-free

Second, rewiring: after q_new is attached, every neighbor is re-examined. If routing a neighbor through q_new would lower its cost and the edge is free, the neighbor's parent is replaced and the cost change is propagated recursively through the subtree. These two steps are what give RRT* its asymptotic optimality — the tree keeps straightening itself out as samples accumulate. The planner does not return on first goal contact; it keeps going until max_iter is reached, updating the goal node's parent whenever a shorter route appears.

Shortcutting

RRT* paths are shorter than raw RRT paths, but they still contain kinks from the sampling process. Random shortcutting is dead simple and effective: pick two non-adjacent waypoints i < j, check the straight-line edge path[i] → path[j], and if it's collision-free delete everything between them. Repeat for max_iter = 200 iterations. On the capstone scene this routinely collapses 35 raw waypoints down to 3.

TOPP-RA time parameterization

A geometric path q(s) with path parameter s in [0, 1] says nothing about timing. TOPP-RA (Time-Optimal Path Parameterization by Reachability Analysis) walks along the path and computes, at every grid point, the maximum squared path velocity u(s) = s_dot^2 such that joint velocity and acceleration constraints hold:

| dq/ds * s_dot |        <= v_max     (velocity)
| dq/ds * s_ddot + d2q/ds2 * s_dot^2 | <= a_max   (acceleration)

Forward and backward reachability sweeps produce the unique time-optimal s_dot(s) profile under those box constraints. Integrating gives the trajectory time stamps. A quintic fallback is available when TOPP-RA cannot be compiled, using the same velocity and acceleration box to stay within limits on the tested scenarios.

Implementation

The pipeline has four modules, each behind a small API:

CollisionChecker -> RRTStarPlanner -> shortcut_path -> parameterize_topp_ra -> execute_trajectory

CollisionChecker loads the canonical MuJoCo scene once, registers the arm-vs-environment and relevant self-collision geom pairs, and exposes three methods:

cc.is_collision_free(q)              # -> bool
cc.is_path_free(q1, q2, resolution)  # -> bool  (discretized edge check)
cc.compute_min_distance(q)           # -> float

Every planner query runs against the actual MuJoCo scene state, so there is zero drift between what the planner reasons about and what the controller later executes. That mismatch was the single biggest source of planning bugs in earlier iterations — a separate URDF planning model could disagree with the executed stack and the planner would confidently hand over paths that collided on playback.

The planner itself is a flat list of RRTNode records with parent references by index:

@dataclass
class RRTNode:
    q: np.ndarray          # joint configuration (6,)
    parent: int | None     # index of parent node
    cost: float            # cumulative C-space path length from root

planner = RRTStarPlanner(
    collision_checker=cc,
    step_size=0.3,
    goal_bias=0.1,
    rewire_radius=1.0,
    goal_tolerance=0.15,
)

path = planner.plan(
    q_start=Q_HOME,
    q_goal=q_target,
    max_iter=5000,
    rrt_star=True,
    seed=42,
)

The main loop is about forty lines. Sample, steer, collision-check the edge, pick the best parent inside rewire_radius, attach, rewire neighbors, propagate costs. Path extraction is a single backward walk through parent indices. seed is threaded through everything so runs are reproducible — non-reproducible planners are a debugging nightmare.

Downstream of the planner, shortcutting and time parameterization compose into a single call pair:

short = shortcut_path(path, cc, max_iter=200)
times, pos, vel, acc = parameterize_topp_ra(
    short,
    VEL_LIMITS,   # per-joint velocity box
    ACC_LIMITS,   # per-joint acceleration box
)

The returned (times, pos, vel, acc) tuple is the interface the Lab 3 controller already expects: at each simulation step, look up the desired q_d, qd_d, compute tau = Kp (q_d - q) + Kd (qd_d - qd) + g(q), map through the Menagerie actuator model, step MuJoCo. Nothing in the execution layer had to change.

Results

Capstone scene, Q_HOME to a goal across the obstacle field, seed = 42, max_iter = 5000.

Metric	Value
RRT* path found	yes
Raw waypoints from planner	35
Waypoints after shortcutting	3
Shortcut reduction	91%
TOPP-RA trajectory duration	2.659 s
Velocity limit violations	0
Acceleration limit violations	0
RMS joint tracking error (capstone)	0.0125 rad
Final position error (capstone)	0.0016 rad
RMS joint tracking error (blocked-path scene)	0.0124 rad
Final position error (blocked-path scene)	0.0041 rad
Direct start-to-goal edge collision-free	false

The headline number is the 2.659 s trajectory on a problem where a straight line from start to goal punches through an obstacle. RRT* finds a topologically correct route, shortcutting crushes it down to three waypoints, TOPP-RA squeezes the time-optimal schedule out of the remaining geometry, and the Lab 3 controller tracks the whole thing to under 0.005 rad final error. The tests in test_collision.py, test_planner.py, and test_trajectory.py all pass on the canonical stack — including a direct RRT-vs-RRT* cost comparison that verifies the rewiring actually produces shorter paths.

Key Takeaways

Plan on the executed geometry. Using a separate collision URDF is tempting and always causes silent drift. Route HPP-FCL queries through the same MuJoCo scene the controller will run.
Seed every sampling planner. Reproducibility is not a nice-to-have; without it you cannot tell whether a change fixed a bug or just got lucky.
RRT earns its keep through rewiring.* First-solution RRT paths are visibly worse; the rewire step is what gives you an asymptotically optimal tree in exchange for a modest per-iteration cost.
Shortcutting is free performance. 200 random shortcut attempts turn a 35-waypoint kinked path into a 3-waypoint near-straight one. Do it every time.
Separate geometry from time. RRT* finds a path, TOPP-RA schedules it. Coupling the two is a classic over-engineering trap — keep the geometric planner agnostic to dynamics and let a time parameterizer handle limits.
Reuse the tracking stack. The planning pipeline outputs exactly what the Lab 3 PD+gravity controller already consumes. No execution-layer changes were needed, which is how layered robotics stacks are supposed to feel.