A scalable multimodal pipeline for processing, indexing, and querying multimodal documents
Ever needed to take 8000 PDFs, 2000 videos, and 500 spreadsheets and feed them to an LLM as a knowledge base? Well, MMORE is here to help you!
Note: See the Manual Installation section below for how to install without Docker.
- Install docker
- Open a terminal and build the image with the following command
docker build . --tag mmore
To build for CPU-only platforms (results in smaller image size), you can use
docker build --build-arg PLATFORM=cpu -t mmore .
Start an interactive session with
docker run -it -v ./test_data:/app/test_data mmore
Note: we are mapping the folder test_data to the location /app/test_data inside the container. The default location given in examples/process_config.yaml maps to this folder, which we use in the next step.
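The mount above uses a relative host path, which older Docker releases may reject for bind mounts. As a minimal sketch, the helper below resolves the folder to an absolute path and prints the resulting command so it can be inspected before running. The function name `mmore_docker_cmd` is hypothetical and not part of MMORE; the image name and container path are the ones used above.

```shell
# Hypothetical helper (not part of MMORE): print a `docker run` command that
# mounts a host data directory at /app/test_data using an absolute path.
mmore_docker_cmd() {
    # Default to ./test_data, the folder used in the example above.
    data_dir="${1:-./test_data}"
    # Resolve to an absolute path; fail if the directory does not exist.
    abs_dir="$(cd "$data_dir" 2>/dev/null && pwd)" || {
        echo "directory not found: $data_dir" >&2
        return 1
    }
    printf 'docker run -it -v "%s:/app/test_data" mmore\n' "$abs_dir"
}
```

Calling `mmore_docker_cmd ./test_data` prints the command with the resolved path; pass another folder to mount your own data at the same container location.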
Inside the docker session you can run
# run processing
mmore process --config_file examples/process_config.yaml
# run indexer
mmore index --config-file ./examples/index/indexer_config.yaml
# run rag
mmore rag --config-file ./examples/rag/rag_config_local.yaml
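The three commands build on each other: indexing works on what processing extracted, and RAG queries the index. A minimal sketch of a wrapper that runs them in order and stops at the first failing stage is shown below. The function name `run_mmore_pipeline` and the `MMORE_CMD` override variable are hypothetical conveniences, not part of MMORE; the config paths are the ones used above.

```shell
# Hypothetical wrapper (not part of MMORE): run process -> index -> rag in
# order, returning non-zero as soon as one stage fails.
# MMORE_CMD is an assumed override for how the CLI is invoked.
run_mmore_pipeline() {
    cmd="${MMORE_CMD:-mmore}"
    # $cmd is intentionally unquoted so a multi-word override splits into words.
    $cmd process --config_file examples/process_config.yaml || return 1
    $cmd index --config-file ./examples/index/indexer_config.yaml || return 1
    $cmd rag --config-file ./examples/rag/rag_config_local.yaml || return 1
}
```

Run `run_mmore_pipeline` from the repository root so the relative config paths resolve.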
Manual installation is currently supported only on Linux systems.
- Install system dependencies
sudo apt update
sudo apt install -y ffmpeg libsm6 libxext6 chromium-browser libnss3 libgconf-2-4 libxi6 libxrandr2 libxcomposite1 libxcursor1 libxdamage1 libxfixes3 libxrender1 libasound2 libatk1.0-0 libgtk-3-0 libreoffice
- Install uv: https://docs.astral.sh/uv/getting-started/installation/
- Clone this repository
git clone https://github.com/swiss-ai/mmore
cd mmore
- Install project and dependencies
uv sync
If you want to install a CPU-only version you can run
uv sync --extra cpu
- Run a test command
To run the following commands either prepend every command with
uv run
or run once:
source .venv/bin/activate
# run processing
mmore process --config_file examples/process_config.yaml
# run indexer
mmore index --config-file ./examples/index/indexer_config.yaml
# run rag
mmore rag --config-file ./examples/rag/rag_config_local.yaml
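If a test command fails right away, a missing system tool is a common cause. The sketch below is a small pre-flight check that lists which of the given commands are not on the PATH. The function name `missing_commands` is hypothetical, and the command names in the usage note are assumptions about what the packages above install; adjust them for your distribution.

```shell
# Hypothetical pre-flight check (not part of MMORE): print each command from
# the argument list that is not found on PATH; return non-zero if any is missing.
missing_commands() {
    rc=0
    for name in "$@"; do
        if ! command -v "$name" >/dev/null 2>&1; then
            echo "$name"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `missing_commands uv ffmpeg libreoffice` prints nothing and returns 0 when all three are available.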
To launch the MMORE pipeline, follow the specialised instructions in the docs.
- 📄 Input Documents: Upload your multimodal documents (PDFs, videos, spreadsheets, and more) into the pipeline.
- 🔍 Process: Extracts and standardizes text, metadata, and multimedia content from diverse file formats. Easily extensible: add your own processors to handle new file types. Supports fast processing for specific types.
- 📁 Index: Organizes extracted data into a hybrid retrieval-ready vector store DB, combining dense and sparse indexing through Milvus. Your vector DB can also be hosted remotely; it only needs to provide a standard API.
- 🤖 RAG: Use the indexed documents inside a Retrieval-Augmented Generation (RAG) system that provides a LangChain interface. Plug in any LLM with a compatible interface or add new ones through an easy-to-use interface. Supports API hosting or local inference.
- 🎉 Evaluation (coming soon): An easy way to evaluate the performance of your RAG system using Ragas.
See the /docs directory for additional details on each module and hands-on tutorials on parts of the pipeline.
Category | File Types | Supported Device | Fast Mode |
---|---|---|---|
Text Documents | DOCX, MD, PPTX, XLSX, TXT, EML | CPU | ❌ |
PDFs | PDF | GPU/CPU | ✅ |
Media Files | MP4, MOV, AVI, MKV, MP3, WAV, AAC | GPU/CPU | ✅ |
Web Content (TBD) | Webpages | GPU/CPU | ✅ |
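Before processing a large folder, it can help to see how its files spread across the categories in the table above. The sketch below maps a file name to a category using only its extension. The function name `mmore_category` is hypothetical and not part of MMORE, and MMORE's own file-type detection may work differently.

```shell
# Hypothetical helper (not part of MMORE): map a file name to a category from
# the table above, based only on its extension.
mmore_category() {
    # Names without a dot have no extension to inspect.
    case "$1" in
        *.*) ;;
        *) echo "Not listed"; return 0 ;;
    esac
    # Take the text after the last dot and lower-case it.
    ext="$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')"
    case "$ext" in
        docx|md|pptx|xlsx|txt|eml) echo "Text Documents" ;;
        pdf) echo "PDFs" ;;
        mp4|mov|avi|mkv|mp3|wav|aac) echo "Media Files" ;;
        *) echo "Not listed" ;;
    esac
}
```

A per-category count for a folder can then be obtained with `for f in test_data/*; do mmore_category "$f"; done | sort | uniq -c`.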
We welcome contributions to improve the current state of the pipeline. Feel free to:
- Open an issue to report a bug or ask for a new feature
- Open a pull request to fix a bug or add a new feature
- Browse ongoing new features and known bugs in the Issues
Don't hesitate to star the project ⭐ if you find it interesting! (you would be our star)
This project is licensed under the Apache 2.0 License; see the LICENSE 🎓 file for details.
This project is part of the OpenMeditron initiative developed in LiGHT lab at EPFL/Yale/CMU Africa in collaboration with the SwissAI initiative. Thank you Scott Mahoney, Mary-Anne Hartley