Subreddits we are crawling and their creation date

The creation dates matter because the crawl should start before the earliest of them. The example query below starts on 2010-05-01, ahead of keto's creation on 2010-05-27.

Subreddit	Creation date
loseit	2010-07-29
keto	2010-05-27
brogress	2013-10-14
progresspics	2011-06-23
btfc	2011-01-15
gainit	2011-01-05
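
One way to choose a `--start` value that precedes every subreddit is to take the earliest creation date in the table and back off by a margin. This is a minimal sketch: the `CREATION_DATES` dict and the `crawl_start` helper are hypothetical names that do not exist in the repository, and the 30-day margin is an arbitrary assumption.

```python
from datetime import date, timedelta

# Creation dates copied from the table above.
CREATION_DATES = {
    "loseit": date(2010, 7, 29),
    "keto": date(2010, 5, 27),
    "brogress": date(2013, 10, 14),
    "progresspics": date(2011, 6, 23),
    "btfc": date(2011, 1, 15),
    "gainit": date(2011, 1, 5),
}


def crawl_start(creation_dates, margin_days=30):
    """Return a date `margin_days` before the earliest creation date."""
    return min(creation_dates.values()) - timedelta(days=margin_days)


# Prints an ISO date suitable for the --start flag.
print(crawl_start(CREATION_DATES).isoformat())
```

Any start date earlier than the minimum works; the margin only guards against off-by-one or timezone effects.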

Here's an example query:

python subreddit_fetcher.py --start="2010-05-01" --end="2015-08-17" --output_prefix="all_subreddits" --subreddits="loseit,keto,brogress,progresspics,btfc,gainit" --step_days="15"
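
The `--step_days` flag suggests that the date range is fetched in fixed-size windows. The sketch below shows one way such windows could be generated. It is an assumption about the behaviour, not the code in `subreddit_fetcher.py`, and `date_windows` is a hypothetical helper.

```python
from datetime import date, timedelta


def date_windows(start, end, step_days):
    """Yield contiguous (window_start, window_end) pairs covering [start, end)."""
    step = timedelta(days=step_days)
    current = start
    while current < end:
        # Clamp the final window so it never runs past `end`.
        window_end = min(current + step, end)
        yield current, window_end
        current = window_end


# Same range and step as the example query above.
windows = list(date_windows(date(2010, 5, 1), date(2015, 8, 17), 15))
print(len(windows), windows[0], windows[-1])
```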

Counting rows with q:

q --delimiter=, "select count(*) from features_extracted.csv"

Feature Extractor

python feature_extractor.py --input= --output=

Reasons why we aren't catching more:

heights given in cm
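
Heights written in centimetres could be caught by converting them to inches, since the output uses a `height_in` column. This is a rough sketch that assumes a simple pattern such as `180cm` or `180 cm`. The regex and the `height_cm_to_in` helper are illustrative and are not taken from `rules.yaml` or `feature_extractor.py`.

```python
import re

# Matches e.g. "180cm", "180 cm", "172.5 CM" (illustrative pattern only).
CM_HEIGHT = re.compile(r"\b(\d{2,3}(?:\.\d+)?)\s*cm\b", re.IGNORECASE)


def height_cm_to_in(text):
    """Return the first centimetre height found in `text` as inches, or None."""
    match = CM_HEIGHT.search(text)
    if match is None:
        return None
    # 1 inch is exactly 2.54 cm.
    return round(float(match.group(1)) / 2.54, 1)


print(height_cm_to_in("M/25/180cm [95kg > 80kg]"))
```

A real rule would also need to avoid false positives, for example waist measurements given in cm.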

python feature_extractor.py --input=all_subreddits-2010-05-01-to-2015-08-17.csv --output=all_subreddits_extracted_features.csv

python generate_output.py --input=all_subreddits_extracted_features.csv --output=csv_output.csv

Optional: remove titles

q -O -H --delimiter=, "select id,previous_weight_lbs,current_weight_lbs,height_in,gender,score,photos,first_image_aspect_ratio from csv_output.csv" > csv_output_no_title.csv
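
If `q` is not installed, the same column selection can be approximated with Python's standard `csv` module. This sketch assumes that `csv_output.csv` has a header row containing the columns named in the query above. `select_columns` is a hypothetical helper, not part of the repository.

```python
import csv

# Columns kept by the q query above; the title column is left out.
COLUMNS = [
    "id",
    "previous_weight_lbs",
    "current_weight_lbs",
    "height_in",
    "gender",
    "score",
    "photos",
    "first_image_aspect_ratio",
]


def select_columns(in_path, out_path, columns=COLUMNS):
    """Copy only `columns` from in_path to out_path, keeping a header row."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        # extrasaction="ignore" drops any column not listed in `columns`.
        writer = csv.DictWriter(dst, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            writer.writerow(row)
```

Calling `select_columns("csv_output.csv", "csv_output_no_title.csv")` would then produce a file with the same columns as the `q` command.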

kapily/reddit-wwill
