Data Preparation

Before launching a training job, it's critical to collect high-quality demonstration data that clearly reflects the behaviour you want the robot to learn. This section explains how to curate and structure your dataset for optimal model performance.

Sensor Index Settings

Index your sensors sequentially on the Command Console, and keep the order identical between data collection and AI inference. For example, if the base camera is index 0 and the end-effector camera is index 1 during data collection, they must keep the same indices during inference. You can add or remove extra cameras at any time, but do not change the indices of the existing cameras.
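The index rule above can be checked with a small script before launching inference. This is an illustrative sketch, not a platform API: the dictionaries, sensor names, and the `check_sensor_indices` helper are all assumptions, and the real index assignments live in the Command Console.

```python
# Hypothetical sketch: verify that camera indices used at data-collection
# time match those configured for inference. Extra cameras may be added
# or removed; only shared sensors must keep the same index.

def check_sensor_indices(collection: dict[int, str], inference: dict[int, str]) -> list[str]:
    """Return mismatch descriptions between two index -> sensor-name maps."""
    # Invert the maps so each shared sensor's index can be compared.
    coll_by_name = {name: idx for idx, name in collection.items()}
    inf_by_name = {name: idx for idx, name in inference.items()}
    problems = []
    for name in sorted(coll_by_name.keys() & inf_by_name.keys()):
        if coll_by_name[name] != inf_by_name[name]:
            problems.append(
                f"{name}: index {coll_by_name[name]} at collection "
                f"but {inf_by_name[name]} at inference"
            )
    return problems

collection_cams = {0: "base_camera", 1: "end_effector_camera"}
# Adding a wrist camera at a new index is fine; existing indices are unchanged.
inference_cams = {0: "base_camera", 1: "end_effector_camera", 2: "wrist_camera"}
print(check_sensor_indices(collection_cams, inference_cams))  # []
```

If the check returns a non-empty list, fix the index assignments before running inference rather than remapping in code.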

Demonstration Collection

You can collect demonstration data using teleoperation in the AMAS platform. To get started, follow the Data Collection Guide.

We recommend collecting multiple demonstrations of the same task (e.g. 50 or more), varying the initial conditions where possible to improve generalisation.

Data Curation

High Action Consistency

Make sure demonstrations are:

  • Smooth and repeatable.

  • Consistent in timing and intention.

  • Free from erratic or unintended control inputs.

High Initial State Variance

Include varied starting poses, object locations, or environmental conditions, such as:

  • Randomised arm position or object location and/or orientation.

  • Different lighting conditions.
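One way to achieve the variance described above is to sample a fresh initial condition before each demonstration instead of choosing one by hand. The sketch below is an assumption-heavy illustration: the parameter names, value ranges, and lighting presets are placeholders for whatever your workspace allows.

```python
# Hypothetical sketch: sample randomised initial conditions per episode
# so the dataset covers varied starting states. All ranges are examples.
import random

def sample_initial_condition(seed=None):
    """Return one randomised starting configuration for a demonstration."""
    rng = random.Random(seed)
    return {
        # Object offset from a nominal position, in centimetres (assumed range).
        "object_xy_cm": (rng.uniform(-10.0, 10.0), rng.uniform(-10.0, 10.0)),
        # Object orientation about the vertical axis.
        "object_yaw_deg": rng.uniform(0.0, 360.0),
        # Lighting preset names are placeholders.
        "lighting": rng.choice(["bright", "dim", "natural"]),
    }

# Before each episode, sample and physically set up the scene accordingly.
condition = sample_initial_condition()
print(condition)
```

Logging the sampled condition alongside each episode also makes it easier to spot gaps in coverage later.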

Sufficient Demonstration Coverage

  • At least 50 high-quality episodes per task is a good starting point.

  • Avoid short or incomplete demonstrations.

  • Make sure each episode completes the full task from start to finish.
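The coverage criteria above can be applied programmatically as a final filter over recorded episodes. This is a sketch under assumptions: the `Episode` fields and the minimum-length threshold are illustrative, and real episode metadata comes from your recording pipeline.

```python
# Hypothetical sketch: drop episodes that are too short or incomplete
# before training. Field names and thresholds are assumptions.
from dataclasses import dataclass

@dataclass
class Episode:
    name: str
    num_frames: int
    task_completed: bool  # True if the episode ran start to finish

def usable_episodes(episodes, min_frames=100):
    """Keep episodes that complete the full task and meet a minimum length."""
    return [e for e in episodes if e.task_completed and e.num_frames >= min_frames]

episodes = [
    Episode("ep_000", 350, True),
    Episode("ep_001", 40, True),    # too short
    Episode("ep_002", 420, False),  # task not completed
]
kept = usable_episodes(episodes)
print([e.name for e in kept])  # ['ep_000']
```

If filtering leaves you well under 50 episodes for a task, collect more demonstrations rather than relaxing the criteria.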

Exclude Bad Data

During data collection, cancel the episode recording if you think the data is unusable. After data collection, you can review the collected data on the Web Console and flag data you want to exclude from model training. See the Data Visualisation Guide for how to flag data.
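Downstream of flagging, the exclusion is just a set-membership filter when assembling the training set. The flagging itself happens in the Web Console; the episode names and the shape of the exclusion list below are assumptions for illustration.

```python
# Hypothetical sketch: apply a set of flagged episode names when building
# the training split. Episode names are placeholders.
flagged = {"ep_001", "ep_007"}
all_episodes = ["ep_000", "ep_001", "ep_002", "ep_007"]

training_set = [name for name in all_episodes if name not in flagged]
print(training_set)  # ['ep_000', 'ep_002']
```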
