Code and data to generate the antiwork dataset
This repository contains the code and data used to generate the antiwork dataset. To begin, copy the settings template:
cp orig_settings.py settings.py
Next, download the basic metadata of posts using the Pushshift API:
SubredditDownloader % subredditdownloader -db ../anti-work.db -r antiwork -ds pushshift --reddit-client-id FILL --reddit-client-secret FILL --reddit-username elibtronic --start-utc 1630468800 --end-utc 1641013200
Export the data in anti-work.db and save it as the CSV file antiwork.csv.
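The export step can be done with any SQLite tool. As one possible approach, the sketch below uses only the Python standard library. The function names `list_tables` and `export_table_to_csv` are illustrative and not part of this repository, and the table name inside anti-work.db is not documented here, so the sketch lists the tables first rather than assuming one.

```python
import csv
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite database, sorted."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def export_table_to_csv(db_path, table, csv_path):
    """Write every row of `table` to `csv_path` with a header row.

    Returns the number of data rows written.
    """
    # Table names cannot be bound as SQL parameters, so validate the
    # name against the tables that actually exist before querying.
    if table not in list_tables(db_path):
        raise ValueError(f"no table named {table!r} in {db_path}")

    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(f'SELECT * FROM "{table}"')
        header = [column[0] for column in cursor.description]
        count = 0
        with open(csv_path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow(header)
            for row in cursor:
                writer.writerow(row)
                count += 1
    finally:
        conn.close()
    return count
```

A typical call would be `list_tables("anti-work.db")` to find the table that holds the posts, followed by `export_table_to_csv("anti-work.db", <table name>, "antiwork.csv")`.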
The next step uses the post_id of each harvested post to get the full API data using PRAW and saves it as a Pickle file. Inital AP interations.ipynb will grab the PRAW objects and dump them into the raw_data directory.
The work is done in Master_Builder.ipynb.
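For orientation, the fetch-and-pickle step might look roughly like the sketch below. It is not the notebook's actual code: the helper names `make_praw_fetcher` and `dump_posts`, the one-file-per-post layout, and the `.pkl` naming are all assumptions made for illustration. Only `praw.Reddit(...)` and `reddit.submission(id=...)` are real PRAW calls.

```python
import pickle
from pathlib import Path


def make_praw_fetcher(client_id, client_secret, user_agent):
    """Build a function that loads one submission through PRAW."""
    # Third-party dependency, imported here so the rest of the sketch
    # can be used (and tested) without PRAW installed.
    import praw

    reddit = praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent,
    )

    def fetch(post_id):
        submission = reddit.submission(id=post_id)
        # PRAW objects are lazy: reading an attribute triggers the
        # API request that fills in the remaining fields.
        _ = submission.title
        return submission

    return fetch


def dump_posts(post_ids, fetch, out_dir="raw_data"):
    """Fetch each post and pickle it to <out_dir>/<post_id>.pkl.

    Posts that already have a file are skipped, so an interrupted
    run can be resumed. Returns the list of paths written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for post_id in post_ids:
        target = out / f"{post_id}.pkl"
        if target.exists():
            continue
        obj = fetch(post_id)  # fetch first so a failure leaves no empty file
        with target.open("wb") as fh:
            pickle.dump(obj, fh)
        written.append(target)
    return written
```

With credentials filled in, `dump_posts(post_ids, make_praw_fetcher(client_id, client_secret, user_agent))` would write one file per post under raw_data, where `post_ids` is the post_id column read from antiwork.csv. Passing the fetcher in as an argument keeps the pickling logic independent of network access.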
The #CONFIGS section has all the values you can set, such as the number of records.
The #CONFIGS #PICKLE section lists the fields from PRAW that you want in the final CSV file.
Data is built locally. Once that is done, analysis is completed in Google Colab using the following notebooks: