Comedic Dynamics - Analyzing Segmentation, Offense and Thematic Patterns in Stand Up Comedy Transcripts 🎭

Abstract 📄

This project aims to make stand-up comedy more transparent and safer to watch by analyzing thematic patterns and sentiment in comedy specials. Using NLP techniques such as topic modeling and sentiment analysis, we identify dominant themes and polarity scores that can help creators write more accurate content disclaimers.

Introduction 🎤

The project examines stand-up comedy specials, focusing on controversial topics such as race, ethnicity, and politics, which often spark public debate. By understanding the thematic patterns and sentiment around these topics, we aim to propose improved content disclaimers that support a safer viewing experience.

Background 📚

Our approach draws on two areas: topic modeling, which uncovers hidden thematic structure, and offense analysis in comedy, which examines the potentially harmful effects of humor. We use Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF) to identify topics within comedy transcripts; sentiment is scored separately with VADER and TextBlob.
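To make the NMF idea concrete, here is a minimal, dependency-free sketch of factorizing a document-term matrix V into non-negative factors W (documents by topics) and H (topics by terms) using multiplicative updates. The toy vocabulary, the counts, and the function names are assumptions made for illustration; the project notebooks may well rely on a library implementation instead.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Approximate V by W @ H with non-negative factors (multiplicative updates)."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    # Strictly positive random initialization keeps every update well defined.
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_docs)]
    return w, h


# Hypothetical toy corpus: rows are transcripts, columns are term counts.
vocab = ["airport", "flight", "luggage", "election", "senator", "vote"]
doc_term = [
    [3, 2, 1, 0, 0, 0],
    [2, 3, 2, 0, 0, 0],
    [0, 0, 0, 3, 2, 2],
    [0, 0, 0, 2, 3, 1],
]

w, h = nmf(doc_term, n_topics=2)
error = frobenius_error(doc_term, w, h)

# The highest-weighted terms in each row of H describe one topic.
for topic_id, row in enumerate(h):
    ranked = sorted(range(len(vocab)), key=lambda j: -row[j])
    print(topic_id, [vocab[j] for j in ranked[:3]])
```

Each row of H can be read as a topic (a weighting over terms) and each row of W as a transcript's mix of topics. LDA arrives at a similar documents-by-topics and topics-by-terms picture, but through a probabilistic generative model rather than a matrix factorization.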

Approach 🛠️

Corpus Extraction: Over 450 English-language transcripts were scraped from "Scraps from the Loft" using BeautifulSoup.
Text Preprocessing: Text cleaning, language identification, tokenization, and part-of-speech tagging using Python libraries.
Offense and Subjectivity Analysis: Sentiment polarity and subjectivity scoring with the VADER and TextBlob packages.
Named Entity Recognition: Identifying and analyzing entities in the transcripts with spaCy's NLP model.
Topic Modeling: Fitting LDA and NMF models to identify prevalent topics, with a focus on optimizing topic coherence.
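The cleaning and tokenization steps listed above can be sketched roughly as follows. This is a simplified stand-in that uses only the standard library; it does not perform language identification or part-of-speech tagging, and the stopword list, the assumption that stage directions appear in brackets or parentheses, and all function names are hypothetical rather than taken from the project notebooks.

```python
import re

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "and", "i", "to", "of", "is", "it", "in", "my", "you", "that"}


def clean_transcript(text):
    """Remove bracketed or parenthesized stage directions and collapse whitespace."""
    text = re.sub(r"\[[^\]]*\]|\([^)]*\)", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Lowercase the text and split it into word tokens, keeping internal apostrophes."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def preprocess(text):
    """Clean, tokenize, and drop stopwords."""
    return [tok for tok in tokenize(clean_transcript(text)) if tok not in STOPWORDS]


sample = "So I'm at the airport [audience laughs] and the TSA guy looks at my bag. (applause)"
tokens = preprocess(sample)
print(tokens)
```

Removing stage directions before scoring matters because words like "laughs" or "applause" describe the audience, not the comedian's material, and would otherwise leak into both the sentiment scores and the topics.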

Results 📈

The analysis identifies relationships between sentiment polarity and subjectivity, with trends that differ among comedians. Topic modeling yields distinct thematic categories such as observational, cultural, and political comedy. NMF achieved higher topic coherence than LDA.
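For readers unfamiliar with topic coherence, the sketch below shows one common way to score it, the UMass measure, which rewards topics whose top words tend to appear in the same documents. The README does not say which coherence measure was used, so the choice of UMass, the toy documents, and the function name are all assumptions made for illustration.

```python
import math
from itertools import combinations


def umass_coherence(top_words, documents):
    """UMass coherence of a topic's ranked top words over tokenized documents.

    Sums log((D(w_i, w_j) + 1) / D(w_i)) over pairs where w_i is ranked above w_j,
    with D counting the documents that contain the given word(s).
    Assumes every top word occurs in at least one document.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    # combinations() preserves ranking order, so `earlier` is always ranked higher.
    for earlier, later in combinations(top_words, 2):
        score += math.log((doc_freq(earlier, later) + 1) / doc_freq(earlier))
    return score


# Hypothetical tokenized transcripts.
docs = [
    ["airport", "flight", "luggage"],
    ["flight", "luggage", "delay"],
    ["election", "senator", "vote"],
    ["senator", "vote", "airport"],
]

coherent_topic = umass_coherence(["flight", "luggage"], docs)
mixed_topic = umass_coherence(["flight", "senator"], docs)
print(coherent_topic, mixed_topic)
```

Scores closer to zero indicate more coherent topics under this measure. Comparing two models then amounts to averaging the score over each model's topics on the same corpus, with the same number of top words per topic.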

Conclusions 🏁

The project maps the thematic and emotional landscape of stand-up comedy and offers insights that can help tailor content disclaimers.

.venv
data
.DS_Store
EDA.ipynb
LDA_Gensim.ipynb
NMF.ipynb
README.md
bertopic.ipynb
data_parsed.csv
polarity_subjectivity_data_without_LemmStemm.csv
preprocess_final.csv
standup_transcripts.csv
text_preprocessing.ipynb
urls.csv

saurabh-112000/Humor-Analysis---NLP
