Gul M. Kurtoglu Eskisar gmk2131@columbia.edu gmkurtoglu@gmail.com gul.kurtoglu@deu.edu.tr
Emotions, Populism and Social Media: Is there a link?
This project is related to my original project proposal to the Fulbright Commission. For this class, I aim to scrape the web (political party websites, parliamentary websites, and social media websites including Reddit and Eksisozluk) using Beautiful Soup and possibly Selenium to obtain data on populism. Once I have collected the data, I aim to conduct a stance detection analysis (and ideally a sentiment analysis) using NLP methods.
To track progress on the project, we will use the following intermediate milestones for your overall project. Each milestone will be marked with a tag in the git repository, and we will check progress and provide feedback at key milestones.
Date | Milestone | Deliverables | Git tag |
---|---|---|---|
March 29 | Submit project description | README.md | proposal |
April 5 | Update project scope/direction based on instructor/TA feedback | README.md | approved |
April 12 | Basic project structure with empty functions/classes (incomplete implementation), architecture diagram | Source code, comments, docs | milestone1 |
April 19 | Progress on implementation (define your own goals) | Source code, unit tests | milestone2 |
April 26 | Completely (or partially) finished implementation | Source code, documentation | milestone3 |
May 10 | Final touches (conclusion, documentation, testing, etc.) | Conclusion (README.md) | conclusion |
The column Deliverables lists deliverable suggestions, but you can choose your own, depending on the type of your project.
My project would scrape the web to collect data on populism and later analyze this data through sentiment analysis. Given time and other constraints, I will probably not be able to complete the sentiment analysis portion of the project by the instructor's deadline.
For the scraping part of the project, I will probably be able to use my own laptop, but if the scraped data is too large, I may have to find a way to use cloud services to run my scripts for the NLP part of the project.
I will most likely use Beautiful Soup and/or Selenium to scrape the websites I am interested in. For the NLP part, I have heard of a library named spaCy, and I intend to check whether I can use it.
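As a rough illustration of the Beautiful Soup step, the sketch below parses an HTML snippet and pulls the text out of selected elements. The sample markup, the `entry` class name, and the `extract_entries` helper are all hypothetical placeholders; the real sites will have their own structure, and a real run would first download the page (or render it with Selenium).

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page. A real run would
# fetch the page first, and the tag and class names would differ per site.
SAMPLE_HTML = """
<html><body>
  <div class="entry"><p>The elites have ignored ordinary people.</p></div>
  <div class="entry"><p>Parliament debated the budget today.</p></div>
  <div class="ad"><p>Buy now!</p></div>
</body></html>
"""


def extract_entries(html, css_class="entry"):
    """Return the stripped text of every <div> carrying the given class."""
    # "html.parser" is the parser bundled with Python, so no extra
    # parser library is needed.
    soup = BeautifulSoup(html, "html.parser")
    return [
        div.get_text(strip=True)
        for div in soup.find_all("div", class_=css_class)
    ]


entries = extract_entries(SAMPLE_HTML)
for text in entries:
    print(text)
```

Keeping the parsing in a small function like this would make it possible to unit test it against saved HTML files without touching the network.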
I am still not sure what is required here, so I will have to skip this part (again) for the time being.
Success is straightforward to measure for the web scraping part of my project: if the scraper does not work, I will not be able to collect any data at all.
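One possible way to put a number on this is the share of attempted pages that actually returned text. The sketch below is only an assumption about how I might track it: the `scrape_yield` helper, the dictionary layout, and the example URLs are hypothetical and not part of any library.

```python
def scrape_yield(results):
    """Return the share of attempted URLs that produced non-empty text.

    `results` maps each attempted URL to the text scraped from it,
    or to None / an empty string when nothing was collected.
    """
    if not results:
        return 0.0
    non_empty = sum(1 for text in results.values() if text and text.strip())
    return non_empty / len(results)


# Hypothetical outcome of four scraping attempts.
results = {
    "https://example.org/a": "Some scraped text",
    "https://example.org/b": "",
    "https://example.org/c": None,
    "https://example.org/d": "More text",
}
rate = scrape_yield(results)
print(f"Pages with data: {rate:.0%}")
```

A yield of zero would correspond to the failure case described above, while values in between would show which sites need a different approach.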
My potential challenges for this project remain, but I hope to be able to remediate at least some of them before the deadline.
I still intend to use the materials (homework scripts, relevant materials) from Prof. Daniel Bauer's NLP class, if I can.
I remain convinced that if I can succeed with it, my project can eventually help me contribute significantly to my original field of expertise.