
Reddit Data Scraping Project

Overview

This project demonstrates how to scrape data from Reddit using Python's PRAW (Python Reddit API Wrapper) library. The script allows you to extract valuable information from subreddit posts, making it easy to analyze Reddit content programmatically.

Features

  • Scrape top posts from any subreddit
  • Extract key post information:
    • Post title
    • Score
    • URL
    • Number of comments
    • Post body text
    • Creation date
    • Author information
    • Post ID

Prerequisites

Requirements

  • Python 3.7+
  • PRAW library
  • pandas
  • python-dotenv

Reddit API Credentials

To use this script, you'll need to:

  1. Create a Reddit Account
  2. Set up a Reddit Developer Application
    • Go to https://www.reddit.com/prefs/apps
    • Click "Create App" or "Create Another App"
    • Choose "script" as the application type
    • Fill in the necessary details
    • Note down the following credentials:
      • Client ID
      • Client Secret
      • User Agent

Installation

  1. Clone the repository:
git clone https://github.com/koolgax99/reddit-scrapping-praw.git
cd reddit-scrapping-praw
  2. Create a virtual environment (optional but recommended):
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
  3. Install the required packages:
pip install praw pandas python-dotenv

Configuration

Create a .env file in the project root with your Reddit API credentials:

REDDIT_CLIENT_ID=your_client_id
REDDIT_CLIENT_SECRET=your_client_secret
REDDIT_USER_AGENT=your_user_agent
REDDIT_USERNAME=your_reddit_username
REDDIT_PASSWORD=your_reddit_password

⚠️ Important Security Note:

  • Never share your .env file publicly
  • Add .env to your .gitignore
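A sketch of turning these variables into a PRAW client, assuming python-dotenv has already populated `os.environ` (the helper name `load_reddit_credentials` is illustrative); the dict keys match the keyword arguments `praw.Reddit` accepts:

```python
import os

def load_reddit_credentials():
    """Read Reddit API credentials from the environment into praw.Reddit kwargs."""
    return {
        "client_id": os.environ["REDDIT_CLIENT_ID"],
        "client_secret": os.environ["REDDIT_CLIENT_SECRET"],
        "user_agent": os.environ["REDDIT_USER_AGENT"],
        "username": os.environ["REDDIT_USERNAME"],
        "password": os.environ["REDDIT_PASSWORD"],
    }

# Typical use, after load_dotenv() has run:
#   reddit = praw.Reddit(**load_reddit_credentials())
```

Using `os.environ[...]` rather than `.get(...)` makes a missing variable fail immediately with a `KeyError` naming the variable, instead of surfacing later as an opaque authentication error.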

Usage

Basic Scraping

from reddit_scraper import RedditScraper

# Initialize scraper
scraper = RedditScraper()

# Scrape top posts from a subreddit
datascience_posts = scraper.scrape_subreddit(
    subreddit_name='datascience',
    sort_by='top',
    time_filter='all',
    limit=20
)

# Save scraped data
scraper.save_to_file(datascience_posts)

Advanced Usage

# Scrape multiple subreddits
multi_subreddit_data = scraper.scrape_multiple_subreddits(
    ['datascience', 'MachineLearning', 'learnpython'],
    limit=30
)

Customization

  • Change sort_by: 'top', 'hot', 'new'
  • Modify time_filter: 'all', 'year', 'month', 'week', 'day'
  • Adjust limit to control number of posts

Ethical Considerations

  • Respect Reddit's API Terms of Service
  • Be mindful of rate limits
  • Use scraping responsibly

Troubleshooting

  • Ensure all environment variables are correctly set
  • Check your internet connection
  • Verify Reddit API credentials
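For the first point, a quick standard-library check before touching the network can save a debugging session (`missing_credentials` is an illustrative helper, not part of the project):

```python
import os

REQUIRED_VARS = (
    "REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT",
    "REDDIT_USERNAME", "REDDIT_PASSWORD",
)

def missing_credentials():
    """Return the names of required environment variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not os.environ.get(name)]

# If this returns any names, fix your .env before investigating API errors.
```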

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Disclaimer

This project is for educational purposes. Always respect Reddit's terms of service and API usage guidelines.

Contact

Your Name - [Your Email or LinkedIn]

Project Link: https://github.com/koolgax99/reddit-scrapping-praw
