Data Collection
Overview
Collecting high-quality training data is crucial for building effective AI models. This guide covers best practices for data collection in Qualia based on LeRobotDataset v3.0.
Qualia supports various data collection methods for training vision-language-action (VLA) models:
- Manual data collection - Record demonstrations manually
- Automated collection - Use scripts to gather data at scale
- Import existing datasets - Bring your own data
Installation
LeRobotDataset v3.0 will be included in lerobot >= 0.4.0. Until that stable release, you can use the main branch by following the build-from-source instructions.
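As a rough sketch of the choice described above, the shell snippet below looks up the installed lerobot version, if any, and reports whether it meets the 0.4.0 threshold. The `version_ge` helper is a hypothetical name introduced here, not part of lerobot, and it assumes a `sort` that supports `-V` (version sort, as in GNU coreutils). The version lookup relies only on Python's standard `importlib.metadata`.

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Ask Python for the installed lerobot version; the result is empty if
# Python or the package is missing.
installed="$(python -c 'from importlib.metadata import version; print(version("lerobot"))' 2>/dev/null || true)"

if [ -n "$installed" ] && version_ge "$installed" "0.4.0"; then
  echo "lerobot $installed should include LeRobotDataset v3.0"
else
  echo "lerobot >= 0.4.0 not found; build from the main branch instead"
fi
```

If the second message appears, the build-from-source route is the one to follow.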
LeRobotDataset v3.0 is a standardized format for robot learning data. It provides unified access to multi-modal time-series data, sensorimotor signals and multi‑camera video, as well as rich metadata for indexing, search, and visualization on the Hugging Face Hub.
Record a dataset
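The record command in this section pushes to `${HF_USER}/record-test`, so an unset `HF_USER` would expand to a malformed repository ID. Here is a minimal sketch of a pre-flight check, assuming `HF_USER` is meant to hold your Hugging Face username. `require_hf_user` is a hypothetical helper introduced for illustration and is not part of lerobot.

```shell
# Hypothetical helper: fail early when HF_USER is unset or empty,
# otherwise print the repository the dataset would be pushed to.
require_hf_user() {
  if [ -z "${HF_USER:-}" ]; then
    echo "HF_USER is not set; export it to your Hugging Face username" >&2
    return 1
  fi
  echo "Dataset will be pushed to ${HF_USER}/record-test"
}
```

One way to use it is to chain it in front of the recording command, for example `require_hf_user && lerobot-record ...`, so recording only starts once the variable is set.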
Run the command below to record a dataset with the SO-101 and push it to the Hub:
lerobot-record \
  --robot.type=so101_follower \
  --robot.port=/dev/tty.usbmodem585A0076841 \
  --robot.id=my_awesome_follower_arm \
  --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 1920, height: 1080, fps: 30}}" \
  --teleop.type=so101_leader \
  --teleop.port=/dev/tty.usbmodem58760431551 \
  --teleop.id=my_awesome_leader_arm \
  --display_data=true \
  --dataset.repo_id=${HF_USER}/record-test \
  --dataset.num_episodes=5 \
  --dataset.single_task="Grab the black cube"
More information about LeRobotDataset v3.0
For a more detailed explanation of the characteristics of LeRobotDataset v3.0, we encourage you to read the official Hugging Face page.