This is my final project for the AllWomen Data Science and Data Analytics Bootcamp, which was presented in April 2021.
Every day we are exposed to ads and brand campaigns, so it's natural that some of those catchy phrases stick in our minds.
I wanted to investigate what hides behind brand slogans: whether they share any patterns and whether the gender of the CEO has any impact on them. I also analyse the colors of the brand logos.
-
The original slogan dataset comes from Kaggle and can be downloaded here:
https://www.kaggle.com/rihim421/brands-slogan-and-trademark-analysis
I manually added the missing slogans and the category each brand belongs to.
-
The list of CEOs was scraped from Wikipedia. I inferred each CEO's gender from their first name with the genderize API (https://genderize.io) and filled in the missing values by manual search.
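The gender lookup can be sketched roughly as below. This is a minimal illustration, not the project's actual code: the helper names `fetch_json` and `guess_gender` and the 0.8 probability threshold are assumptions made for the example. It relies only on the public genderize endpoint, which takes a `name` query parameter and returns JSON with `gender` and `probability` fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GENDERIZE_URL = "https://api.genderize.io"


def fetch_json(url):
    """Download a URL and decode its JSON body."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def guess_gender(first_name, min_probability=0.8, fetch=fetch_json):
    """Return 'male' or 'female', or None when the API is unsure.

    min_probability is a hypothetical cut-off chosen for this sketch;
    names below it are left for manual lookup.
    """
    url = f"{GENDERIZE_URL}?{urlencode({'name': first_name})}"
    data = fetch(url)
    gender = data.get("gender")
    probability = data.get("probability") or 0.0
    if gender is None or probability < min_probability:
        return None
    return gender
```

Calling `guess_gender("Anna")` sends one request to the API. Names that come back as `None` are the ones that would still need a manual search. The `fetch` parameter is there only so the function can be tried out without network access.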
-
I also analysed the brand logos, which I downloaded manually.
Here are some of the resources and inspirations I used for the project:
https://towardsdatascience.com/topic-modelling-in-python-with-nltk-and-gensim-4ef03213cd21
https://towardsdatascience.com/a-practitioners-guide-to-natural-language-processing-part-i-processing-understanding-text-9f4abfd13e72
https://towardsdatascience.com/nlp-part-3-exploratory-data-analysis-of-text-data-1caa8ab3f79d
https://towardsdatascience.com/color-identification-in-images-machine-learning-application-b26e770c4c71
https://www.kaggle.com/yening2000/nlp-with-tensorflow-spacy
https://www.sloganlist.com
The project is divided into several parts:
- Data cleaning and EDA
- NLP part 1: Sentiment Analysis, data preparation for NER
- NLP part 2: LDA, Unsupervised Model, NER, spaCy, N-grams
- Analysis of the gender of the CEOs of the companies
- Data preparation for map creation
- Logo analysis: color extraction from the brand logo images
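To give an idea of what the color extraction step involves, here is a minimal sketch of finding dominant colors by clustering pixels with a basic k-means. It is an illustration under assumptions, not the project's implementation: the function names are hypothetical, the input is assumed to be a plain list of `(R, G, B)` tuples already read from a logo image with an imaging library, and in practice a library clustering routine would likely replace the hand-written loop.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(pixels, centers):
    """Index of the nearest center for every pixel."""
    return [
        min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
        for p in pixels
    ]


def dominant_colors(pixels, k=3, iterations=10):
    """Return (rgb, share) pairs for up to k clusters, largest share first."""
    if not pixels:
        return []
    # Deterministic start: the k most frequent exact colors.
    centers = [color for color, _ in Counter(pixels).most_common(k)]
    for _ in range(iterations):
        labels = assign(pixels, centers)
        for i in range(len(centers)):
            members = [p for p, label in zip(pixels, labels) if label == i]
            if members:
                # Move the center to the mean color of its members.
                centers[i] = tuple(sum(ch) / len(members) for ch in zip(*members))
    counts = Counter(assign(pixels, centers))
    result = [
        (tuple(round(c) for c in centers[i]), n / len(pixels))
        for i, n in counts.items()
    ]
    return sorted(result, key=lambda item: item[1], reverse=True)


def to_hex(rgb):
    """Format an RGB triple as a hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)
```

For a logo that is mostly red with some blue, `dominant_colors(pixels, k=2)` would return the averaged red first, followed by the blue, each with its share of the pixels. `to_hex` turns those into values that can be used directly in a chart.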
Here's a link to the final presentation: https://readymag.com/u393092791/2746732/
If you have any feedback or suggestions on how to improve the project and the code, you can reach out to me through LinkedIn: https://www.linkedin.com/in/agnieszka-plech-81746b185/
I hope you enjoy this project as much as I did :)