This repository has been archived by the owner on Mar 19, 2023. It is now read-only.

Video-Description-CV-NLP

This is a deep learning project that generates natural-language descriptions of videos and images. It combines computer vision (CV) and natural language processing (NLP).

Dataset

The project uses the MSR-VTT dataset. Dataset link: msr-vtt

The dataset contains 10,000 videos in total, split into training, validation, and test sets. Each video has 20 captions, and the videos are grouped into 20 categories.
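
Since each video comes with several captions, a common first step is to group the captions by video. The snippet below is a minimal sketch of that step, not code from this repository. It assumes the annotations are available as a list of records with `video_id` and `caption` fields; those field names, the `group_captions` helper, and the sample records are assumptions made for illustration.

```python
from collections import defaultdict


def group_captions(sentences):
    """Map each video id to the list of its captions, in input order.

    `sentences` is assumed to be an iterable of dicts with the keys
    "video_id" and "caption" (hypothetical field names).
    """
    captions = defaultdict(list)
    for record in sentences:
        captions[record["video_id"]].append(record["caption"])
    return dict(captions)


# Hand-made sample records, not real dataset entries.
sample = [
    {"video_id": "video0", "caption": "a man is driving a car"},
    {"video_id": "video1", "caption": "a woman is cooking"},
    {"video_id": "video0", "caption": "a person drives down a road"},
]

by_video = group_captions(sample)
for video_id, caps in by_video.items():
    print(video_id, len(caps))
```

With the full dataset, every video id would be expected to map to 20 captions.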

Architecture

The architecture follows the S2VT (Sequence to Sequence - Video to Text) paper.

(Figure: network architecture)
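
To make the S2VT input layout concrete, the sketch below builds the per-timestep inputs for a stacked two-LSTM model of that kind: during encoding the first LSTM reads one frame per step while the word input is padding, and during decoding the frame input is padding while the second LSTM receives the previous word. This only illustrates the schedule as described in the S2VT paper; it is not this repository's implementation, and the function name, the token strings, and the use of placeholder strings instead of feature vectors are assumptions.

```python
PAD = "<pad>"  # stands in for the zero padding input
BOS = "<bos>"  # begin-of-sentence token
EOS = "<eos>"  # end-of-sentence token


def s2vt_schedule(frames, words):
    """Return a list of (frame_input, word_input, target) triples.

    Encoding steps: a real frame, a padded word, and no target.
    Decoding steps: a padded frame, the previous word, and the next
    word as the training target (teacher forcing).
    """
    steps = []
    # Encoding stage: one step per video frame.
    for frame in frames:
        steps.append((frame, PAD, None))
    # Decoding stage: one step per caption word, plus one for EOS.
    previous = BOS
    for word in list(words) + [EOS]:
        steps.append((PAD, previous, word))
        previous = word
    return steps


schedule = s2vt_schedule(["f1", "f2", "f3"], ["a", "dog", "runs"])
for step in schedule:
    print(step)
```

In a real model the frame entries would be CNN feature vectors and the word entries embedding vectors, with the same LSTM weights shared across the encoding and decoding steps.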