A simple tool that scrapes Reddit posts matching a list of keywords and saves them to a neatly formatted .xlsx file.
Use the package manager pip to install the required libraries.
pip install -r requirements.txt
Or, using pipenv:
pipenv install
If you do not have a Reddit account, you must first sign up for one.
- Go to Reddit Apps.
- Select “script” as the type of app.
- Name your app and give it a description.
- Set the redirect URI to http://localhost:8080. The redirect URI will be used to get your refresh token.
- Once you click “create app”, you will see a box showing your client_id and client_secret.
- In the folder containing this README file (the main folder for this project), open the .env file, enter the client ID, client secret, and user agent as shown below, and save the file.
client_id = "YourClientIDHere"
client_secret = "YourClientSecretHere"
user_agent = "YourAppNameHere"
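For reference, the script builds its Reddit client from these values. Below is a minimal sketch of how that might look, assuming python-dotenv is used to load the .env file; the variable names mirror the keys above, and allsubs is the search target used in the snippet further down.

```python
import os

import praw
from dotenv import load_dotenv

load_dotenv()  # read client_id, client_secret and user_agent from .env

reddit = praw.Reddit(
    client_id=os.getenv("client_id"),
    client_secret=os.getenv("client_secret"),
    user_agent=os.getenv("user_agent"),
)

# Search across all of Reddit rather than a single subreddit.
allsubs = reddit.subreddit("all")
```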
To change the list of phrases and keywords, open the keywords.txt file under the Keywords&Lists directory.
List one keyword or phrase per line, like this:
This is an example phrase
KeywordExample
YouGetTheIdea
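For reference, reading this file into the script can be as simple as the sketch below; the path Keywords&Lists/keywords.txt and the loop are assumptions based on the layout described above, and keyword_search is the function shown in the next section.

```python
# Read one keyword or phrase per line, skipping blank lines.
with open("Keywords&Lists/keywords.txt", encoding="utf-8") as f:
    keywords = [line.strip() for line in f if line.strip()]

for keyword in keywords:
    keyword_search(keyword)
```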
By default, this script scrapes the top 100 posts for each keyword or phrase within the chosen time period. To change this, adjust the limit argument in the keyword_search function.
def keyword_search(keyword):
    for submission in allsubs.search(
        keyword, sort="top", syntax="lucene", time_filter=data_time, limit=100):
Raising the limit may result in hitting the API request limit. The maximum Reddit returns per search is about 1000 results, which you can request by setting limit to None.
More information on this can be found in the PRAW API docs.
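For example, the search call above might be changed like this to request the maximum number of results (a sketch; allsubs, keyword, and data_time come from the surrounding script):

```python
# limit=None tells PRAW to fetch as many results as the API allows (~1000).
for submission in allsubs.search(
    keyword, sort="top", syntax="lucene", time_filter=data_time, limit=None
):
    ...
```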
You can filter out results from specific subreddits by editing the filtered_subreddits.txt file under the Keywords&Lists directory.
List the unwanted subreddits one per line, like this:
exampleSubreddit
UGetTheIdea
IhOpe
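For reference, applying this filter could look like the sketch below; the file path, the keep helper, and the case-insensitive comparison are illustrative assumptions, not the script's exact code.

```python
# Load the subreddits to exclude, one name per line, compared case-insensitively.
with open("Keywords&Lists/filtered_subreddits.txt", encoding="utf-8") as f:
    filtered = {line.strip().lower() for line in f if line.strip()}

def keep(submission):
    """Return True if the submission's subreddit is not in the filtered list."""
    return submission.subreddit.display_name.lower() not in filtered
```

Inside keyword_search, submissions for which keep(submission) returns False would simply be skipped before being written to the spreadsheet.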