A dataset and Jupyter notebook for exploring Lesson 1 of the Fast.ai Deep Learning 1 course using barbies vs. women instead of the original Kaggle Dogs vs. Cats dataset. This is a very small dataset, so it is difficult to get stable results. However, I can usually get above 90% accuracy and sometimes find weights that give 96% on the validation set.
@semih suggested classifying photos of barbies vs. women: http://forums.fast.ai/t/wiki-lesson-1/9398/
`barbieswomen.zip` contains training and validation data set up using the folder structure required for fast.ai.

To use this with the version of fast.ai used in the courses, it is best to clone this repo, move the notebooks into one of the course folders, then unzip the data file and move the `train` and `valid` directories into the course's `data` folder.
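After unpacking, the `data` folder should contain `train` and `valid`, each with one subfolder per class (e.g. `barbies` and `women`), mirroring the Dogs vs. Cats layout from Lesson 1. Below is a minimal sketch of loading and training on the data, assuming the course (fastai 0.7) library; `PATH` and `sz` are assumptions you may need to adjust.

```python
# Minimal Lesson 1-style sketch, assuming the course (fastai 0.7) library.
# Imports mirror the Lesson 1 notebook; PATH and sz are assumptions to adjust.
from fastai.imports import *
from fastai.transforms import *
from fastai.conv_learner import *
from fastai.model import *
from fastai.dataset import *
from fastai.sgdr import *

PATH = "data/"   # folder containing the train/ and valid/ directories (assumption)
sz = 224         # image size used in Lesson 1
arch = resnet34  # pretrained architecture from the lesson

data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.fit(0.01, 3)  # learning rate 0.01 for 3 epochs
```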
The `Barbie and Women Import` notebook contains sample code for creating the dataset.
I created the dataset using two Python scripts:

- `googleimagesdownload` (https://github.com/hardikvasa/google-images-download), which you can install with `pip install google-images-download`
- `make_train_valid.py` from https://github.com/prairie-guy/ai_utilities
`googleimagesdownload` requires a machine with a Chrome browser and the appropriate chromedriver (see the googleimagesdownload GitHub repo for instructions); otherwise, you are limited to 100 images.
Download the images using these commands.
googleimagesdownload -k "woman" -o "barbieswomen" --format jpg --usage_rights labeled-for-reuse -l 150 --chromedriver ./chromedriver
googleimagesdownload -k "barbie" -o "barbieswomen" --format jpg --usage_rights labeled-for-reuse -l 150 --chromedriver ./chromedriver
Examine the images and remove incorrect ones. I removed all paintings and any images that were not clearly women or barbies. I also removed images that contained both women and barbies, since the model is forced to choose one classification or the other.
Use ImageMagick to resize the images for easier uploading and processing:
cd women
convert -resize '640' *.jpg woman.jpg
You will now see your original files and new files titled `woman-n.jpg` in the same directory. If you are happy with the resizing, delete the originals and convert the other directory of images.
If you are sure the resize is exactly what you need, you can also use mogrify instead of convert to resize your originals in place:
mogrify -resize '640' *.jpg
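If ImageMagick is not available, a small Pillow script can do the same downscaling. Here is a minimal sketch, assuming the images live in a `women/` directory and that a 640-pixel width (height scaled to preserve aspect ratio) is what you want; it writes copies like the convert example above.

```python
# Sketch of an ImageMagick-free resize using Pillow (pip install Pillow).
# Writes resized copies (woman-0.jpg, woman-1.jpg, ...) alongside the originals.
from pathlib import Path
from PIL import Image

src_dir = Path("women")   # directory of downloaded images (assumption)
target_width = 640        # matches the '640' used with convert/mogrify

for i, path in enumerate(sorted(src_dir.glob("*.jpg"))):
    with Image.open(path) as img:
        # scale height to keep the aspect ratio, as '-resize 640' does
        ratio = target_width / img.width
        new_size = (target_width, max(1, round(img.height * ratio)))
        img.convert("RGB").resize(new_size).save(src_dir / f"woman-{i}.jpg")
```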
Make the train and valid datasets/directory structure:
make_train_valid.py barbieswomen --train .80 --valid .20
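For reference, the split amounts to moving a random 80/20 fraction of each class's images into `train/<class>` and `valid/<class>` folders. Below is a rough, hypothetical equivalent in Python; it is an illustration, not the actual `make_train_valid.py` from ai_utilities.

```python
# Hypothetical sketch of an 80/20 train/valid split over the class folders
# under barbieswomen/. Not the ai_utilities script itself.
import random
import shutil
from pathlib import Path

root = Path("barbieswomen")
valid_frac = 0.20

for class_dir in [d for d in root.iterdir() if d.is_dir() and d.name not in ("train", "valid")]:
    images = sorted(class_dir.glob("*.jpg"))
    random.shuffle(images)
    n_valid = int(len(images) * valid_frac)
    for split, files in (("valid", images[:n_valid]), ("train", images[n_valid:])):
        dest = root / split / class_dir.name
        dest.mkdir(parents=True, exist_ok=True)
        for f in files:
            shutil.move(str(f), dest / f.name)
```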
Now compress the directory and upload it to your VM. If you are using Paperspace through SSH, execute:
scp barbieswomen.zip paperspace@<your machine's public IP address>:./barbieswomen.zip
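If you would rather create the archive from Python (for example, from the import notebook) than with a zip utility, here is a minimal sketch of the compression step:

```python
# Sketch: create barbieswomen.zip from the barbieswomen/ directory before uploading.
# shutil.make_archive appends the .zip extension itself.
import shutil

shutil.make_archive("barbieswomen", "zip", root_dir=".", base_dir="barbieswomen")
```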