NODEP-UA 9982.SY1 Experiential Learning Seminar | Final Project
This is a social science project with a tiny Data Science (NLP) element.
All data was collected from the public domain: mostly transcripts from YouTube, along with some blog posts and Quora Q&A.
I mainly conducted qualitative research using interviews and digital ethnography.
I also did a small amount of quantitative research by examining word frequencies across the whole text corpus. The results support the theories from my qualitative research.
- transcriptConverter.py is used to clean up the formatting of YouTube transcripts after copy-pasting.
- wordFrequency.py is used to count word frequencies after some basic data cleaning. Packages used: nltk, matplotlib.
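
As a rough illustration of the two steps above, here is a minimal sketch, not the actual code of either script. It assumes a pasted YouTube transcript alternates timestamp lines (such as "0:04") with lines of spoken text, and it uses only the Python standard library with a tiny hand-written stopword list in place of nltk. The names clean_transcript, word_frequencies, TIMESTAMP and STOPWORDS are hypothetical.

```python
import re
from collections import Counter

# Assumption: timestamps appear alone on a line, e.g. "0:15" or "1:02:03".
TIMESTAMP = re.compile(r"^\d{1,2}:\d{2}(?::\d{2})?$")

# Tiny illustrative stopword list (assumption); a fuller list such as
# nltk's English stopwords could be swapped in here.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "in", "is", "it", "i", "so", "that"}


def clean_transcript(raw: str) -> str:
    """Drop timestamp-only and blank lines, then join the rest with spaces."""
    kept = []
    for line in raw.splitlines():
        line = line.strip()
        if line and not TIMESTAMP.match(line):
            kept.append(line)
    return " ".join(kept)


def word_frequencies(text: str) -> Counter:
    """Lowercase the text, split it into word tokens, and count non-stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


# Made-up sample input, not taken from the project's data.
sample = """0:00
so today I want to talk about learning
0:04
learning a language is hard
"""

cleaned = clean_transcript(sample)
freqs = word_frequencies(cleaned)
print(freqs.most_common(3))
```

The resulting Counter could then be passed to matplotlib to draw a bar chart of the most common words.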
Here I showcase the code and most of the data I used for the analysis.
*** For more details, please read the report named quantitative.docx.