A real-time smart webcam application using the Charades dataset, TensorFlow, and OpenCV.
The key to building intelligent AI systems is data: data with the right insight into our lives. Since 2016 we have been using our Charades dataset to train models that understand videos of 157 different boring daily activities, such as watching TV, sitting on a couch, and looking outside a window.
This repository takes a tiny, fast SqueezeNet 1.1 frozen TensorFlow model, trained on the Charades dataset using the Charades Algorithms codebase, and runs it on a real-time webcam feed. Note that this is a simple frame-classification model that achieves 13.5% mean average precision on the Charades benchmark, whereas more sophisticated models now obtain 34.4% (Google DeepMind) and 39.5% (Carnegie Mellon University). This is therefore a simple (not very smart) real-time model, which we hope will allow anyone to build on it in various applications. First one to use this in a Pi/Phone wins a beer (restrictions apply).
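To make "frame classification" concrete: each frame is scored independently against the 157 activity classes, and the highest-scoring classes are reported. The snippet below is only a minimal sketch of that last step, not the code in this repository. The function name `top_k_activities`, the placeholder label names, and the class indices are all hypothetical, and the scores are made up rather than produced by the SqueezeNet model.

```python
import heapq

NUM_CLASSES = 157  # number of Charades activity classes, as stated above


def top_k_activities(scores, labels, k=3):
    """Return the k highest-scoring (label, score) pairs for one frame."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    # Rank class indices by score without sorting the whole list.
    best = heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)
    return [(labels[i], scores[i]) for i in best]


# Placeholder labels: the real class names and their order are not shown here.
labels = ["activity_%d" % i for i in range(NUM_CLASSES)]

# Made-up scores for a single frame (hypothetical values).
frame_scores = [0.0] * NUM_CLASSES
frame_scores[5] = 0.9
frame_scores[10] = 0.4

top = top_k_activities(frame_scores, labels, k=2)
```

Because every frame is scored on its own, a model like this uses no temporal context, which is one plausible reason it trails the stronger video models mentioned above.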
- Install the packages matplotlib, numpy, and Pillow, or alternatively create the conda environment:
conda env create -f environment.yml
- Run the application:
python charades_webcam.py
Optional arguments (default value):
- Device index of the camera
--source=0
- Width of the frames in the video stream
--width=480
- Height of the frames in the video stream
--height=360
- Number of workers
--num-workers=2
- Size of the queue
--queue-size=5
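The options above could be declared and used roughly as follows. This is a sketch under assumptions, not the contents of `charades_webcam.py`: the flag names and defaults come from the list above, but `build_parser`, `run_pipeline`, and `fake_classify` are hypothetical names, the classifier is a stub, and threads are used for brevity where the real script may use processes.

```python
import argparse
import queue
import threading


def build_parser():
    """Command-line options mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="Charades webcam demo (sketch)")
    parser.add_argument("--source", type=int, default=0,
                        help="Device index of the camera")
    parser.add_argument("--width", type=int, default=480,
                        help="Width of the frames in the video stream")
    parser.add_argument("--height", type=int, default=360,
                        help="Height of the frames in the video stream")
    parser.add_argument("--num-workers", type=int, default=2,
                        help="Number of workers")
    parser.add_argument("--queue-size", type=int, default=5,
                        help="Size of the queue")
    return parser


def run_pipeline(frames, classify, num_workers, queue_size):
    """Push frames through a bounded queue served by worker threads."""
    input_q = queue.Queue(maxsize=queue_size)  # producer blocks when full
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            item = input_q.get()
            if item is None:  # sentinel: no more frames
                break
            index, frame = item
            prediction = classify(frame)
            with lock:
                results.append((index, prediction))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for index, frame in enumerate(frames):
        input_q.put((index, frame))
    for _ in threads:
        input_q.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    # Workers finish out of order, so restore the original frame order.
    return [pred for _, pred in sorted(results, key=lambda r: r[0])]


def fake_classify(frame):
    """Stub standing in for the model (hypothetical)."""
    return "label-for-frame-%d" % frame


args = build_parser().parse_args(["--num-workers=3", "--queue-size=4"])
predictions = run_pipeline(range(10), fake_classify,
                           args.num_workers, args.queue_size)
```

In a design like this, a small queue keeps latency low because the producer cannot run far ahead of the workers, while more workers help only as long as classification, not frame capture, is the bottleneck.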
- Anaconda / Python 3.5
- TensorFlow 1.2
- OpenCV 3.0
- (All requirements should be available through pip/conda)
- OpenCV 3.1 might crash on OSX after a while. See open issue and solution here.
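To see how an environment compares with the versions listed above, a small check like the following can help. It is a sketch only: `version_tuple`, `same_release`, and `report` are hypothetical helper names, and a "differs" result does not by itself mean the application will fail.

```python
import importlib
import re
import sys

# Versions listed in this README, keyed by the usual import names.
DOCUMENTED = {"tensorflow": "1.2", "cv2": "3.0"}


def version_tuple(text):
    """Turn '1.2.0-rc1' into (1, 2, 0) by reading the leading numeric fields."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if not match:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def same_release(installed, documented):
    """True when the installed version starts with the documented fields."""
    want = version_tuple(documented)
    return version_tuple(installed)[:len(want)] == want


def report():
    """Describe the installed versions next to the documented ones."""
    lines = ["python %d.%d (README lists 3.5)" % sys.version_info[:2]]
    for module_name, documented in DOCUMENTED.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            lines.append("%s: not installed (README lists %s)"
                         % (module_name, documented))
            continue
        installed = getattr(module, "__version__", "unknown")
        status = "matches" if same_release(installed, documented) else "differs from"
        lines.append("%s %s %s README version %s"
                     % (module_name, installed, status, documented))
    return lines
```

Calling `print("\n".join(report()))` inside the conda environment lists each package next to the version this README was written against.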
Shoutout to Dat Tran for a great real-time object detector that was the basis for this code. https://github.com/datitran/object_detector_app
See LICENSE for details. Copyright (c) 2018 Gunnar Sigurdsson.