This repo is designed for General Robotic Operation System

kevinPOI/RSSM_Navigation_World_Model

 
 

CYBER: A General Robotic Operation System for Embodied AI

The development of world models in robotics has long been a cornerstone of advanced research, with most approaches relying heavily on vast, platform-specific datasets. These datasets, while valuable, often limit scalability and generalization to different robotic platforms, restricting their broader applicability.

In contrast, CYBER approaches world modeling from a "first principles" perspective, drawing inspiration from how humans naturally acquire skills through experience and interaction with their environment. CYBER is the first general Robotic Operation System designed to adapt to both teleoperated manipulation and human operation data, enabling robots to learn and predict across a wide range of tasks and environments. It is built from a Physical World Model, a cross-embodied Visual-Language Action Model (VLA), a Perception Model, a Memory Model, and a Control Model, which together help robots learn, predict, and remember across tasks and embodiments.

CYBER also provides millions of human operation datasets and baseline models on Hugging Face 🤗 to enhance embodied learning, along with an experimental evaluation toolbox that helps researchers test and evaluate their models in both simulation and the real world.


🌟 Key Features

  • 🛠️ Modular: Built with a modular architecture, allowing flexibility in various environments.
  • 📊 Data-Driven: Leverages millions of human operation datasets to enhance embodied learning.
  • 📈 Scalable: Scales across different robotic platforms, adapting to new environments and tasks.
  • 🔧 Customizable: Allows for customization and fine-tuning to meet specific requirements.
  • 📚 Extensible: Supports the addition of new modules and functionalities, enhancing capabilities.
  • 📦 Open Source: Open-source and freely available, fostering collaboration and innovation.
  • 🔬 Experimental: Supports experimentation and testing, enabling continuous improvement.

🛠️ Modular Components

CYBER is built with a modular architecture, allowing for flexibility and customization. Here are the key components:

🌍 World Model is now available. Additional models will be released soon.

⚙️ Setup

Pre-requisites

You will need Anaconda installed on your machine. If you don't have it installed, you can follow the installation instructions here.

Installation

Run the following command from the repository root to install CYBER:

bash scripts/build.sh

Alternatively, you can install it manually by following the steps below:

  1. Create a clean conda environment:

     conda create -n cyber python=3.10 && conda activate cyber
    
  2. Install PyTorch and torchvision:

     conda install pytorch==2.3.0 torchvision==0.18.0 pytorch-cuda=11.8 -c pytorch -c nvidia
    
  3. Install the CYBER package:

     pip install -e .
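
After the manual steps, a quick sanity check of the environment can save a failed training run later. The following is a minimal sketch, not part of CYBER itself: the function name `check_environment` is hypothetical, and the only facts it relies on are the Python version (3.10) and the two packages (`torch`, `torchvision`) named in the steps above.

```python
import importlib.util
import sys

# Values taken from the manual installation steps above.
EXPECTED_PYTHON = (3, 10)
REQUIRED_PACKAGES = ("torch", "torchvision")


def check_environment(packages=REQUIRED_PACKAGES, expected_python=EXPECTED_PYTHON):
    """Return a list of problems found; an empty list means none were found.

    Hypothetical helper for illustration; it is not shipped with CYBER.
    """
    problems = []
    found = tuple(sys.version_info[:2])
    if found != tuple(expected_python):
        problems.append(
            "Python %d.%d expected, found %d.%d" % (tuple(expected_python) + found)
        )
    for name in packages:
        # find_spec locates a top-level package without importing it.
        if importlib.util.find_spec(name) is None:
            problems.append("package not importable: " + name)
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("environment looks fine" if not issues else "\n".join(issues))
```

Run it inside the activated `cyber` environment. It only checks that the packages can be located, not that a GPU is visible.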
    

🤗 Hugging Face Integration

CYBER leverages the power of Hugging Face for model sharing and collaboration. You can easily access and use our models through the Hugging Face platform.

Available Data

Currently, four tasks are available for download:

  • 🤗 Pipette: Bimanual human demonstration dataset of precision pipetting tasks for laboratory manipulation.
  • 🤗 Take Item: Single-arm manipulation demonstrations of object pick-and-place tasks.
  • 🤗 Twist Tube: Bimanual demonstration dataset of coordinated tube manipulation sequences.
  • 🤗 Fold Towels: Bimanual manipulation demonstrations of deformable object folding procedures.
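
One way to fetch a task's data is through the `huggingface_hub` client. The sketch below is an assumption about how one might do this, not the project's documented workflow: the helper names `local_dir_for` and `download_task` and the `data/` folder are hypothetical, and the Hugging Face repository IDs are not given in this README, so the caller must supply them.

```python
from pathlib import Path

# Task names and arm setup as listed above. Repository IDs are NOT listed in
# this README, so they are passed in by the caller rather than guessed here.
TASKS = {
    "Pipette": "bimanual",
    "Take Item": "single-arm",
    "Twist Tube": "bimanual",
    "Fold Towels": "bimanual",
}


def local_dir_for(task, root="data"):
    """Map a task name to a local folder, e.g. 'Take Item' -> data/take_item."""
    if task not in TASKS:
        raise KeyError("unknown task: " + task)
    return Path(root) / task.lower().replace(" ", "_")


def download_task(task, repo_id, root="data"):
    """Download one task's dataset repository into its local folder."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = local_dir_for(task, root)
    snapshot_download(repo_id=repo_id, repo_type="dataset", local_dir=str(target))
    return target
```

For example, `download_task("Pipette", repo_id=...)` with the repository ID copied from the dataset's Hugging Face page would place the files under `data/pipette`.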

Available Models

Our pretrained models are being released on Hugging Face. Entries marked "Coming Soon" are not yet available:

  • Cyber-World-Large (Coming Soon)

  • Cyber-World-Base

  • Cyber-World-Small (Coming Soon)

Using the Models (Coming Soon)

For more details, please refer to the Hugging Face documentation.

🕹️ Usage

Please refer to the experiments for more details on data downloading and model training.


💾 File Structure

├── ...
├── docs                   # documentation files and figures 
├── docker                 # docker files for containerization
├── examples               # example code snippets
├── tests                  # test cases and scripts
├── scripts                # scripts for setup and utilities
├── experiments            # model implementation and details
│   ├── configs            # model configurations
│   ├── models             # model training and evaluation scripts
│   ├── notebooks          # sample notebooks
│   └── ...
├── cyber                  # compression, model training, and dataset source code
│   ├── dataset            # dataset processing and loading
│   ├── utils              # utility functions
│   └── models             # model definitions and architectures
│       ├── action         # visual language action model
│       ├── control        # robot platform control model
│       ├── memory         # lifelong memory model
│       ├── perception     # perception and scene understanding model
│       ├── world          # physical world model
│       └── ...
└── ...
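
The layout above can be checked against a local checkout with a few lines of Python. This is a minimal sketch under the assumption that the tree matches the listing. The helper `missing_dirs` is hypothetical, and because only the World Model is currently released, some `cyber/models` subfolders may legitimately be reported as missing.

```python
from pathlib import Path

# Directories named in the file structure listing above.
EXPECTED_DIRS = [
    "docs", "docker", "examples", "tests", "scripts",
    "experiments/configs", "experiments/models", "experiments/notebooks",
    "cyber/dataset", "cyber/utils",
    "cyber/models/action", "cyber/models/control", "cyber/models/memory",
    "cyber/models/perception", "cyber/models/world",
]


def missing_dirs(repo_root="."):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    for d in missing_dirs():
        print("missing:", d)
```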

📕 References

Magvit2 and GENIE are adapted from the 1xGPT Challenge:

1X Technologies. (2024). 1X World Model Challenge (Version 1.1) [Data set]

@inproceedings{wang2024hpt,
  title={Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers},
  author={Wang, Lirui and Chen, Xinlei and Zhao, Jialiang and He, Kaiming},
  booktitle={NeurIPS},
  year={2024}
}

@article{luo2024open,
  title={Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation},
  author={Luo, Zhuoyan and Shi, Fengyuan and Ge, Yixiao and Yang, Yujiu and Wang, Limin and Shan, Ying},
  journal={arXiv preprint arXiv:2409.04410},
  year={2024}
}

📄 Dataset Metadata

property      value
name          CyberOrigin Dataset
url
description   Cyber represents a model implementation that seamlessly integrates state-of-the-art (SOTA) world models with the proposed CyberOrigin Dataset, pushing the boundaries of artificial intelligence and machine learning.
provider      name: CyberOrigin
license       name: Apache 2.0

📫 Contact

If you have technical questions, please open a GitHub issue. For business development or other collaboration inquiries, feel free to contact us through email 📧 ([email protected]). Enjoy! 🎉
