multimodal-pre-trained-model

Here are 3 public repositories matching this topic...

clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

nlp ocr computer-vision document-ai multimodal-pre-trained-model eccv-2022

Updated Jul 11, 2024
Python

jpWang / LiLT

Star

Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)

nlp information-extraction document-analysis document-understanding multilingual-models document-ai multimodal-pre-trained-model

Updated Oct 31, 2022
Python

marslanm / Multimodality-Representation-Learning

Star

This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have been cited and discussed in the survey just accepted https://dl.acm.org/doi/abs/10.1145/3617833 .

cross-modal multimodal-deep-learning multimodal-datasets transformer-models multimodal-pre-trained-model vision-language-pretraining multimodal-applications multimodal-pretext

Updated Oct 19, 2023

Improve this page

Add a description, image, and links to the multimodal-pre-trained-model topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the multimodal-pre-trained-model topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly