This Python code analyzes a dataset of blog posts, focusing on the polarity and subjectivity of the text. It cleans the text data, visualizes word frequency using word clouds, and explores the sentiment of the text based on age groups and blog topics. The results show differences in sentiment, subjectivity, and word usage among age groups and blog

vaddhiparthy/BlogSentimentAnalysis

Sentiment Analysis of Blog Authorship Corpus Data

This project applies text analytics techniques to analyze and summarize a large corpus of blog posts.

The first step was to read the dataset into a data frame, remove null and duplicate values, and inspect randomly sampled rows to understand the structure and content of the data.
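The loading step can be sketched with pandas as follows. This is a minimal illustration, not the repository's actual code: the column names (`age`, `topic`, `text`) and the inline sample rows are assumptions, used here in place of the real dataset file.

```python
import pandas as pd


def drop_nulls_and_duplicates(df):
    """Remove rows with missing values, then exact duplicate rows."""
    df = df.dropna().drop_duplicates()
    return df.reset_index(drop=True)


# Hypothetical stand-in for the blog dataset (column names are assumed).
raw = pd.DataFrame({
    "age": [17, 25, 25, 34],
    "topic": ["Student", "Technology", "Technology", "Arts"],
    "text": ["Had a great day!", "New phone is awful.", "New phone is awful.", None],
})

posts = drop_nulls_and_duplicates(raw)

# Inspect a few random rows to get a feel for the data.
print(posts.sample(n=2, random_state=0))
```

Fixing `random_state` makes the sampled rows reproducible between runs.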

Next, the text was cleaned: unimportant columns were dropped; numbers, symbols, and extra spaces were removed so that only upper- and lowercase letters remained; and stop words, non-English words, misspelled words, and chat acronyms were filtered out to reduce noise.
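A rough sketch of this kind of cleaning, using only the standard library, is shown below. The stop-word and chat-acronym sets are tiny hypothetical samples, and lowercasing the text is an assumption made to simplify matching. Filtering non-English and misspelled words would additionally need a dictionary or word list and is left out.

```python
import re

# Hypothetical, deliberately tiny word lists for illustration only.
STOP_WORDS = {"the", "a", "is", "are", "and", "to", "of", "i"}
CHAT_ACRONYMS = {"lol", "omg", "brb"}


def clean_text(text):
    """Keep letters only, collapse whitespace, drop stop words and chat acronyms."""
    text = text.lower()                    # assumption: normalize case before matching
    text = re.sub(r"[^a-z\s]", " ", text)  # replace numbers and symbols with spaces
    tokens = text.split()                  # split() also collapses extra spaces
    kept = [t for t in tokens if t not in STOP_WORDS and t not in CHAT_ACRONYMS]
    return " ".join(kept)


cleaned = clean_text("OMG!!  The 2 new phones are   GREAT lol")
print(cleaned)
```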

To characterize the content, polarity and subjectivity values were computed for the posts, compared across demographic groups, and their distributions were visualized.
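The idea of scoring posts and comparing groups can be illustrated with a toy lexicon scorer. This is only a sketch: the lexicon, its scores, and the group labels are invented for the example, and the repository may well rely on a sentiment library instead. The value ranges used here (polarity in [-1, 1], subjectivity in [0, 1]) follow the convention used by libraries such as TextBlob.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy lexicon: word -> (polarity, subjectivity).
LEXICON = {
    "great": (0.8, 0.75),
    "awful": (-1.0, 1.0),
    "new": (0.1, 0.5),
}


def score(text):
    """Average the lexicon scores of known words; (0.0, 0.0) if none match."""
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    if not hits:
        return 0.0, 0.0
    return mean(p for p, _ in hits), mean(s for _, s in hits)


# Hypothetical (age group, cleaned text) pairs.
posts = [
    ("teens", "great day"),
    ("teens", "awful homework"),
    ("twenties", "new phone great"),
]

scores_by_group = defaultdict(list)
for group, text in posts:
    scores_by_group[group].append(score(text))

# Mean polarity and subjectivity per group.
summary = {
    group: (mean(p for p, _ in scores), mean(s for _, s in scores))
    for group, scores in scores_by_group.items()
}
print(summary)
```

The per-post scores collected in `scores_by_group` are also what a histogram or box plot per group would be drawn from.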

Finally, word clouds were used to identify trends in word usage across demographic groups and blog categories, giving an overview of the corpus and its main themes.
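A word cloud is essentially a drawing of word frequencies, so the underlying comparison can be sketched by counting words per group. The sample posts and category names below are hypothetical. The resulting counts could then be passed to a plotting tool, for example the `wordcloud` package's `WordCloud.generate_from_frequencies`, though whether the repository does this is an assumption.

```python
from collections import Counter, defaultdict


def word_counts_by_group(posts):
    """Count word occurrences separately for each group."""
    counts = defaultdict(Counter)
    for group, text in posts:
        counts[group].update(text.split())
    return counts


# Hypothetical (blog category, cleaned text) pairs.
posts = [
    ("Technology", "phone battery phone screen"),
    ("Technology", "phone update"),
    ("Arts", "paint canvas paint"),
]

counts = word_counts_by_group(posts)

# Most frequent word per category.
top = {group: counter.most_common(1) for group, counter in counts.items()}
print(top)
```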
