Skip to content

Communication Protocol

As introduced in the Quick Start using fake_port, the platform uses socket-based communication to exchange structured data between the simulation controller and the inference module. You can extend fake_port by integrating your own model controller to conduct benchmarking.

Outgoing Data

At each simulation timestep, the following structured data is sent through the socket:

data = {
"camera_data": camera_data,
"instruction": instruction,
"joint_position_state": joint_position_state,
"franka_hand_pose": franka_hand_pose,
"franka_pose": franka_pose,
"timestep": step,
"reset": reset,
}

Field Definitions

Field NameTypeDescription
camera_datadictData collected from robot-mounted cameras.
instructionstringNatural language instruction or task prompt for the agent.
joint_position_statenp.ndarray (shape: (9,))Current joint positions of the robot arm (e.g., Franka).
franka_hand_posetuple(np.ndarray, np.ndarray)World-frame pose of the Franka hand: position (3,), quaternion (4,) (scalar-first).
franka_posetuple(np.ndarray, np.ndarray)World-frame pose of the Franka base.
timestepintCurrent timestep index during rollout.
resetboolWhether to reset the environment or the model.

This data is received by the model or controller, which is expected to return a structured action message in response.


Camera Data Structure

Each camera_data dictionary entry includes the following fields per camera:

Field NameTypeDescription
pnp.ndarray (shape: (3,))Camera position in world coordinates.
qnp.ndarray (shape: (4,))Camera orientation as a quaternion (scalar-first) in world coordinates.
rgbnp.ndarrayRGB image array. Resolution depends on configuration in configs/cameras/.
depthnp.ndarrayDepth image array.
intrinsics_matrixnp.ndarray (shape: (3, 3))Camera intrinsics matrix.

Incoming Data (Returned Action)

The returned data should be an action dictionary. Currently, only Franka robots are supported. The expected return is a 9-dimensional array representing joint control:

  • First 7 elements: Joint angle deltas (in radians). If your model outputs absolute joint positions, you must convert them to deltas. Example:

    while True:
    data = wait_message(receive_socket)
    if data["reset"]:
    last_joint_position = data["joint_position_state"]
    processed_data = process_data(data)
    action = model.inference(processed_data)
    delta_joint_position[:7] = action[:7] - last_joint_position[:7]
    last_joint_position = action
    action = {"action": delta_joint_position}
    send_message(send_socket, action)
  • Last 2 elements: Gripper width control. [0.04, 0.04] represents fully open, and [0.0, 0.0] represents fully closed.

Return in format:

action = {"action": nine-dimensional-array}