🤖 MMORE Index

💡 TL;DR

The Index module indexes and post-processes the data extracted from multimodal documents. It builds an indexed vector store database on top of Milvus and supports hybrid retrieval, which combines dense and sparse retrieval.

You can customize various parts of the pipeline by defining an indexing config file.
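
To give an intuition for hybrid retrieval, here is a minimal sketch of how a dense ranking and a sparse ranking could be merged into one result list. It uses reciprocal rank fusion (RRF), a common fusion technique chosen here purely for illustration; this document does not say which fusion strategy MMORE or Milvus applies, and the function name, the document IDs, and the constant `k=60` are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    so documents ranked highly by both retrievers end up on top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense (embedding) and a sparse (term-based) search.
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Documents found by both retrievers (`doc_a`, `doc_b`) rank above those found by only one, which is the main benefit of combining the two signals.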

💻 Minimal Example:

Here is a minimal example to index processed documents.

  1. Create a config file:

    indexer:
        dense_model_name: sentence-transformers/all-MiniLM-L6-v2
        sparse_model_name: splade
        db:
            uri: ./proc_demo.db
            name: my_db
    collection_name: my_docs
    documents_path: './output'

  2. Index your documents by running the indexing script:

    python run_index.py --config_file /path/to/config.yaml

See examples/index for other examples.
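
If you prefer to script the two steps above, the following sketch writes the example config to a temporary directory and assembles the matching command line. It only reuses the keys and the `--config_file` flag shown above; the helper names `write_config` and `build_index_command` are hypothetical, and the sketch assumes `run_index.py` is in the current working directory. The actual call is left commented out so the snippet runs without an MMORE checkout.

```python
import shlex
import sys
import tempfile
from pathlib import Path

# Same content as the example config above.
CONFIG_TEXT = """\
indexer:
    dense_model_name: sentence-transformers/all-MiniLM-L6-v2
    sparse_model_name: splade
    db:
        uri: ./proc_demo.db
        name: my_db
collection_name: my_docs
documents_path: './output'
"""


def write_config(directory):
    """Write the example config into `directory` and return its path."""
    path = Path(directory) / "config.yaml"
    path.write_text(CONFIG_TEXT, encoding="utf-8")
    return path


def build_index_command(config_path):
    """Build the argument list for the indexing script (hypothetical helper)."""
    return [sys.executable, "run_index.py", "--config_file", str(config_path)]


with tempfile.TemporaryDirectory() as tmp:
    config_path = write_config(tmp)
    command = build_index_command(config_path)
    print(shlex.join(command))
    # Uncomment inside an MMORE checkout to actually run the indexing step:
    # import subprocess
    # subprocess.run(command, check=True)
```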