Generate Your Own Data / Benchmark
In this chapter, we will walk through how to use GenManip to generate your own datasets or custom benchmarks.
You can directly edit USD scenes and write the corresponding Config file, then run GenManip to generate data. A video demonstrating the full process accompanies this chapter; we recommend watching it on YouTube in its original quality.
Data Generation vs. Benchmark Generation
Benchmark generation follows the same process as data generation. Typically, we first perform a closed-loop validation based on the current layout to ensure that our Oracle solver is capable of completing the task.
- If the validation succeeds, it means the task is solvable.
- If you are already confident that the task is solvable, you can set `mode` in the Config file to `Benchmark` to skip closed-loop validation and directly save the layout.
Generating Data
Once you have written the Config file, use the following commands to generate data:
    # Iterate over each dictionary in demonstration_configs and generate num_episode data samples
    python demogen_V4.py -cfg configs/tasks/xxx.yml
    # Iterate over each dictionary in demonstration_configs and render num_episode data samples
    python render_V3.py -cfg configs/tasks/xxx.yml
The generated data will be saved in:
saved/demonstrations/<task_name>/
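If you prefer to drive both steps from one script, the two commands above can be wrapped as in the minimal sketch below. It assumes that rendering should start only after generation has exited successfully, which the source does not state explicitly. The helper names `build_commands` and `run_pipeline` are hypothetical and not part of GenManip; only the script names and the `-cfg` flag come from this chapter.

```python
import subprocess
import sys


def build_commands(cfg_path):
    """Return the two commands from this section, in the order they are listed."""
    return [
        [sys.executable, "demogen_V4.py", "-cfg", cfg_path],  # generate episodes
        [sys.executable, "render_V3.py", "-cfg", cfg_path],   # render episodes
    ]


def run_pipeline(cfg_path, runner=subprocess.run):
    """Run generation, then rendering (hypothetical helper, not a GenManip API)."""
    for cmd in build_commands(cfg_path):
        # check=True raises CalledProcessError on a non-zero exit code,
        # so rendering is skipped if generation fails.
        runner(cmd, check=True)
```

Calling `run_pipeline("configs/tasks/xxx.yml")` from the repository root would then run both scripts in sequence with the same interpreter that runs the wrapper.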
Generating Test Cases
To generate evaluation test cases, run:
    # Iterate over each dictionary in evaluation_configs and generate num_test test cases
    python demogen_V4.py -cfg configs/tasks/xxx.yml --eval
    # Standardize the folder names (000, 001, 002, ...)
    python standalone_tools/collect_eval_folder.py -l saved/tasks/<task_name>
The generated test cases will be saved in:
saved/tasks/<task_name>/
The `standalone_tools/collect_eval_folder.py` script renames the folders under `saved/tasks/<task_name>` into a zero-padded numeric format (`000`, `001`, `002`, etc.).
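To illustrate what such a renaming step involves, here is a minimal sketch. It is not the actual implementation of `collect_eval_folder.py`: the function name `zero_pad_folders`, the lexicographic ordering of the existing folders, and the temporary `.renaming_<i>` names are all assumptions made for the example.

```python
import os


def zero_pad_folders(root, width=3):
    """Rename the subfolders of `root` to 000, 001, 002, ... in sorted order."""
    names = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    # Pass 1: move every folder to a temporary name, so that a target such as
    # "001" can never collide with a folder that has not been renamed yet.
    staged = []
    for index, name in enumerate(names):
        tmp = os.path.join(root, f".renaming_{index}")
        os.rename(os.path.join(root, name), tmp)
        staged.append(tmp)
    # Pass 2: assign the final zero-padded names.
    new_names = []
    for index, tmp in enumerate(staged):
        new_name = f"{index:0{width}d}"
        os.rename(tmp, os.path.join(root, new_name))
        new_names.append(new_name)
    return new_names
```

The two-pass approach matters when some folders already carry numeric names: renaming directly could try to overwrite an existing `001` before it has been moved.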
Parallel Execution Support
One important feature of GenManip is its ability to run at scale in parallel:
- Whether you are running `demogen_V4.py`, `render_V3.py`, or `eval_V3.py`, you can launch multiple instances across different servers simultaneously.
- These programs use filesystem-based file locks and `listdir` synchronization to avoid conflicts and maintain consistent progress.
You can launch any number of processes across different servers, as long as they all share the same `saved` directory.
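The general idea behind file locks combined with directory listing can be sketched as follows. This is an illustration of the technique under stated assumptions, not GenManip's actual code: the `.lock` file naming, the zero-padded episode IDs, and the functions `try_claim` and `next_unclaimed` are hypothetical.

```python
import os


def try_claim(saved_dir, episode_id):
    """Return True if this process claimed the episode, False if another did."""
    lock_path = os.path.join(saved_dir, f"{episode_id}.lock")
    try:
        # O_CREAT | O_EXCL makes creation fail if the file already exists,
        # so only one process can succeed for a given lock_path.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def next_unclaimed(saved_dir, total):
    """Scan the shared directory and claim the first episode without a lock."""
    taken = set(os.listdir(saved_dir))
    for index in range(total):
        episode_id = f"{index:03d}"
        if f"{episode_id}.lock" in taken:
            continue  # another worker already holds this one
        if try_claim(saved_dir, episode_id):
            return episode_id
    return None  # everything is claimed
```

The `listdir` scan is only a fast filter; the exclusive create is what actually resolves races between workers that see the same free slot. On a network filesystem, whether exclusive creation is truly atomic depends on the filesystem and its configuration, so this should be verified for the shared storage in use.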