Twitter sentiment analysis #495

Open
Yuvika-14 opened this issue Jan 12, 2024 · 22 comments
Labels
Up-for-Grabs ✋ Issues are open to the contributors to be assigned

Comments

@Yuvika-14

ML-Crate Repository (Proposing new issue)

Project Title : Twitter sentiment analysis
Aim : To predict whether a comment on Twitter is positive or negative.
Dataset : https://www.kaggle.com/datasets/kazanova/sentiment140
Approach : Ensemble methods, gradient boosting, neural networks.
Handle missing data, if any; if there are categorical values, use a one-hot encoder or label encoder, depending on the case.
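
A minimal sketch of the missing-data and label-encoding steps mentioned in the approach. It assumes the commonly described Sentiment140 CSV layout (no header row; columns target, id, date, flag, user, text; target 0 = negative, 4 = positive). The two sample rows are invented for illustration, and `encode_target` and `load_rows` are hypothetical helper names.

```python
import csv
import io

# Invented sample rows in the assumed Sentiment140 layout:
# target, id, date, flag, user, text (no header row).
SAMPLE = '''"0","1","Mon Apr 06 22:19:45 PDT 2009","NO_QUERY","user_a","my flight got cancelled again"
"4","2","Mon Apr 06 22:20:01 PDT 2009","NO_QUERY","user_b","loving the sunshine today"
'''

COLUMNS = ["target", "id", "date", "flag", "user", "text"]


def encode_target(raw):
    """Map the assumed Sentiment140 labels (0 = negative, 4 = positive) to 0/1."""
    mapping = {"0": 0, "4": 1}
    return mapping[raw]


def load_rows(handle):
    """Yield (text, label) pairs, skipping rows with a missing tweet text."""
    reader = csv.DictReader(handle, fieldnames=COLUMNS)
    for row in reader:
        text = (row["text"] or "").strip()
        if not text:
            continue  # treat empty text as missing data
        yield text, encode_target(row["target"])


rows = list(load_rows(io.StringIO(SAMPLE)))
print(rows)
```

With the real file, the same function could be fed an open file handle instead of the in-memory sample; the Kaggle file is commonly read with `encoding="latin-1"`, which is worth verifying against the dataset page.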


📍 Follow the Guidelines to Contribute in the Project :

  • You need to create a separate folder named as the Project Title.
  • Inside that folder, there will be four main components.
    • Images - To store the required images.
    • Dataset - To store the dataset or, information/source about the dataset.
    • Model - To store the machine learning model you've created using the dataset.
    • requirements.txt - This file will contain the required packages/libraries to run the project in other machines.
  • Inside the Model folder, the README.md file must be filled up properly, with proper visualizations and conclusions.

🔴🟡 Points to Note :

  • The issues will be assigned on a first come first serve basis, 1 Issue == 1 PR.
  • "Issue Title" and "PR Title" should be the same. Include the issue number along with it.
  • Follow the Contributing Guidelines & Code of Conduct before you start contributing.

To be Mentioned while taking the issue :

  • Full name :
  • GitHub Profile Link :
  • Participant ID (If not, then put NA) :
  • Approach for this Project :
  • What is your participant role? (Mention the Open Source Program name. Eg. HRSoC, GSSoC, GSOC etc.)

Happy Contributing 🚀

All the best. Enjoy your open source journey ahead. 😎

@Yuvika-14 Yuvika-14 added the Up-for-Grabs ✋ Issues are open to the contributors to be assigned label Jan 12, 2024
@hemant933
Contributor

Full name :Hemant chaudhary
GitHub Profile Link : github.com/hemant933
Participant ID (If not, then put NA) :
Approach for this Project : First we need to process the data to handle any missing values, then apply the specified models using vectorization and pipelining.
What is your participant role? (Mention the Open Source Program name. Eg. HRSoC, GSSoC, GSOC etc.) IWOC

So, please assign it to me.
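
One way the "vectorization and pipelining" idea could look, sketched with scikit-learn. The toy tweets and the choice of `CountVectorizer` with `LogisticRegression` are assumptions for illustration, not the models this comment commits to.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Invented toy tweets; 1 = positive, 0 = negative.
texts = [
    "i love this phone",
    "what a great day",
    "best concert ever",
    "i hate waiting in line",
    "this is the worst service",
    "so sad and tired today",
]
labels = [1, 1, 1, 0, 0, 0]

# Vectorization and the classifier are chained so that the same
# preprocessing is applied at fit time and at predict time.
model = Pipeline([
    ("vectorizer", CountVectorizer(lowercase=True)),
    ("classifier", LogisticRegression(max_iter=1000)),
])
model.fit(texts, labels)

predictions = model.predict(["i love this great day", "worst day ever so sad"])
print(predictions)
```

Keeping the vectorizer inside the pipeline also means it is refit on each training fold during cross-validation, which avoids leaking test vocabulary into training.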

@abhisheks008
Owner

Hi @Yuvika-14, this project is already present in this repo: https://github.com/abhisheks008/ML-Crate/tree/main/Sentimental%20Analysis%20of%20tweets.

If you want to enhance this project, you can share your approach.

@JagritiGautam793

JagritiGautam793 commented Jan 21, 2024

Full name :Jagriti Gautam
GitHub Profile Link : https://github.com/JagritiGautam793
Participant ID (If not, then put NA) :
Approach for this Project : After processing the data, I will attempt real-time analysis of tweets using the Twitter API (Tweepy), then apply NLP with a pretrained transformer model, and finally visualize the results with donut charts and analyze them with a word cloud.

What is your participant role? (Mention the Open Source Program name. Eg. HRSoC, GSSoC, GSOC etc.) IWOC

So, please assign it to me.

@JagritiGautam793

Can you assign this to me, @abhisheks008?

@abhisheks008
Owner

Please check the previous comments.

@JagritiGautam793

JagritiGautam793 commented Jan 21, 2024

Sir, I will improve this model by using a RoBERTa model (from Hugging Face) rather than VADER. This transformer model accounts not only for the words but also for the context around them, and human language depends heavily on context. VADER is not very accurate at analyzing context: a comment that looks negative may actually be sarcastic, and RoBERTa is better at picking that up.
Sir, please assign this to me and guide me through it.
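
To illustrate the context argument, here is a toy word-level scorer. It is not VADER (which uses a much larger lexicon plus extra heuristics), and the lexicon and example sentences are invented; it only shows why scoring words in isolation cannot separate a sincere sentence from a sarcastic one that reuses the same positive words.

```python
import re

# Tiny invented lexicon; real tools such as VADER ship a much larger one
# plus extra heuristics, so this is only a stand-in for the general idea.
LEXICON = {"great": 1.0, "love": 1.0, "good": 0.5, "bad": -0.5, "terrible": -1.0}


def lexicon_score(text):
    """Sum per-word scores, ignoring word order and context."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0.0) for word in words)


sincere = "What a great phone, I love it"
sarcastic = "Oh great, my phone died again. Love that."

print(lexicon_score(sincere))
print(lexicon_score(sarcastic))
```

Both sentences receive the same score because only "great" and "love" match the lexicon. Whether a given pretrained transformer actually handles such cases better would still need to be checked on labeled examples.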

@abhisheks008
Owner

Cool, assigned to you @JagritiGautam793

@abhisheks008 abhisheks008 added enhancement New feature or request Assigned 💻 Issue has been assigned to a contributor Intermediate Points 30 - SSOC 2024 IWOC2024 IWOC 2.0 Open Source Event and removed Up-for-Grabs ✋ Issues are open to the contributors to be assigned labels Jan 21, 2024
@abhisheks008 abhisheks008 added Up-for-Grabs ✋ Issues are open to the contributors to be assigned and removed enhancement New feature or request Assigned 💻 Issue has been assigned to a contributor Intermediate Points 30 - SSOC 2024 IWOC2024 IWOC 2.0 Open Source Event labels Feb 12, 2024
@abhisheks008
Owner

Unassigned, as the open source event has ended.

@JagritiGautam793

Sir, I am almost done. Wasn't the event running until the 15th?

@abhisheks008
Owner

No, @JagritiGautam793, the IWOC 2024 deadline was Feb 11th, 2024, 23:59 hours.

@abhisheks008
Owner

This issue is no longer assigned to you because the program has already ended; hence the assignment has been removed.

@abhisheks008
Owner

@JagritiGautam793

@shivansh-2003
Contributor

shivansh-2003 commented May 29, 2024

Can you please assign this issue under SSOC 2024 Season 3?
Shivansh Mahajan
GitHub: https://github.com/shivansh-2003
Participation ID: NA
I will first convert the file into a processed text file and then tokenize each text. After that, I will try different encoding methods, such as CountVectorizer from sklearn, text preprocessing from TensorFlow, and Word2Vec from the gensim library, and then feed the encoded data to an LSTM or RNN neural network to perform sentiment analysis.
I have recently been doing a few NLP projects on NER, sentiment analysis, and text classification.
I am well versed in the fundamentals of NLP; check out my LinkedIn: https://www.linkedin.com/in/shivansh-mahajan-13227824a/ and Git repository.
Some of my recent NLP projects: https://www.linkedin.com/feed/update/urn:li:activity:7199784822737682432/ (spam classifier)
https://www.linkedin.com/feed/update/urn:li:activity:7201206409605091328/ (NLP app)
Can you assign this issue to me, @abhisheks008?
Participation Role: SSOC Season 3
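
A rough sketch of the tokenize-then-encode step described above, written with the standard library only. It shows integer encoding with padding, the kind of fixed-length input an LSTM or RNN embedding layer typically expects; the helper names, the reserved ids, and the sample tweets are assumptions for illustration, not the CountVectorizer, TensorFlow, or gensim APIs.

```python
import re
from collections import Counter

PAD, UNK = 0, 1  # reserved ids for padding and unknown words


def tokenize(text):
    """Lowercase and split into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocab(token_lists, min_count=1):
    """Assign ids starting at 2, most frequent words first."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    vocab = {}
    for word, count in counts.most_common():
        if count >= min_count:
            vocab[word] = len(vocab) + 2
    return vocab


def encode(tokens, vocab, max_len):
    """Map tokens to ids, then truncate or right-pad to max_len."""
    ids = [vocab.get(tok, UNK) for tok in tokens][:max_len]
    return ids + [PAD] * (max_len - len(ids))


tweets = ["Good morning world", "good night"]
tokenized = [tokenize(t) for t in tweets]
vocab = build_vocab(tokenized)
encoded = [encode(tokens, vocab, max_len=4) for tokens in tokenized]
print(vocab)
print(encoded)
```

Words not seen when the vocabulary was built map to the `UNK` id, so the vocabulary should be built from the training split only.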

@abhisheks008
Owner

Contributions will start from June 1, 2024. Till then please have some patience.

@Tanishka023

Full name : Tanishka Bhalla
GitHub Profile Link: https://github.com/Tanishka023
Participant ID (If not, then put NA) :NA
Approach for this Project : Extract data from Twitter API -> Processing the data -> Feature Extraction -> NB/CNN/LR -> Model Training -> Evaluation
What is your participant role? SSoC Season 3
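
The later stages of this flow (feature extraction, training, evaluation) might be sketched as follows, using Naive Bayes as the "NB" option. The Twitter API step is left out, the tweets are invented, and the specific scikit-learn classes are an assumed choice rather than something this comment specifies.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Invented toy tweets; 1 = positive, 0 = negative.
texts = [
    "love the new album",
    "great game last night",
    "such a happy morning",
    "best pizza in town",
    "hate this traffic",
    "worst movie of the year",
    "so angry about the delay",
    "terrible customer support",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Hold out a stratified test split so both classes appear in it.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels
)

# Feature extraction: fit the vocabulary on the training split only.
vectorizer = CountVectorizer()
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

# Model training and evaluation.
clf = MultinomialNB()
clf.fit(X_train_vec, y_train)
accuracy = accuracy_score(y_test, clf.predict(X_test_vec))
print(accuracy)
```

On a sample this small the accuracy figure is not meaningful; the point is only the order of the steps.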

@abhisheks008
Owner

Can you implement 3-4 models for this project?

@Tanishka023

Yes! @abhisheks008

@abhisheks008
Owner

One issue at a time.

@shivamkrishna1000

shivamkrishna1000 commented Jun 2, 2024

Full Name : Shivam Krishna
GitHub Profile Link : https://github.com/shivamkrishna1000
Participant ID : NA
Approach for this Project : First I will handle the missing values if any -> use label encoding if needed -> use TfidfVectorizer for feature extraction -> Use various models such as XGB/Random Forest Classification/LR/SVM/Gradient Boosting Classifier for model training -> Model evaluation based on accuracy score.
What is your participant role? SSoC Season 3

Please assign me this issue.
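
A compact sketch of the comparison described above: one `TfidfVectorizer` pipeline per candidate model, each scored by accuracy. The tweets are invented, XGBoost is left out to avoid an extra dependency, and the hyperparameters are placeholder assumptions.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented toy data; 1 = positive, 0 = negative.
train_texts = [
    "love this song", "great weather today", "happy with the result",
    "awesome new phone", "hate the long wait", "worst service ever",
    "sad about the news", "terrible traffic again",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]
test_texts = ["love the great weather", "worst wait ever"]
test_labels = [1, 0]

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "linear_svm": LinearSVC(),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

scores = {}
for name, estimator in candidates.items():
    # Each model gets its own TF-IDF step so the pipelines stay independent.
    pipeline = make_pipeline(TfidfVectorizer(), estimator)
    pipeline.fit(train_texts, train_labels)
    scores[name] = accuracy_score(test_labels, pipeline.predict(test_texts))

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Accuracy is a reasonable first metric if the classes are balanced; precision, recall, or F1 could be added in the same loop if they are not.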

@why-aditi
Contributor

Aditi Kala
Github:- https://github.com/why-aditi
Participation ID:- NA
Approach:
Text cleaning: Remove unnecessary elements from the tweets, such as URLs, hashtags, punctuation, numbers, and special characters, and remove common words (stop words) that do not contribute much to the sentiment.
Tokenization: Split the text into individual words or tokens.
Sentiment analysis: VADER.
Extract sentiment scores or labels (e.g., positive, negative, neutral).
Participation Role:- SSOC Season 3
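
The cleaning and tokenization steps listed above could be sketched like this with the standard library. The stop-word set is a tiny invented sample, `clean_tweet` is a hypothetical helper name, and hashtags are dropped whole, which is one possible reading of "remove hashtags".

```python
import re

# Small invented stop-word sample; a real list would be much longer.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "and", "of", "in", "it", "this", "at"}


def clean_tweet(text):
    """Lowercase, strip noise, tokenize, and drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"#\w+", " ", text)                   # hashtags (whole token)
    text = re.sub(r"[^a-z\s]", " ", text)               # numbers, punctuation, special characters
    tokens = text.split()                               # whitespace tokenization
    return [tok for tok in tokens if tok not in STOP_WORDS]


tokens = clean_tweet("Loving the new update!!! 10/10 #happy https://example.com")
print(tokens)
```

VADER is documented to use punctuation and capitalization as intensity cues, so this much cleaning may suit the machine learning models better than VADER itself; it may be worth scoring both raw and cleaned text to compare.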

@abhisheks008 abhisheks008 added Assigned 💻 Issue has been assigned to a contributor Intermediate Points 30 - SSOC 2024 SSOC and removed Up-for-Grabs ✋ Issues are open to the contributors to be assigned labels Jun 2, 2024
@abhisheks008
Owner

Implement 5-6 models for this dataset.

Assigned @shivamkrishna1000

@shivamkrishna1000

Sure Sir

@abhisheks008 abhisheks008 added Up-for-Grabs ✋ Issues are open to the contributors to be assigned and removed Assigned 💻 Issue has been assigned to a contributor Intermediate Points 30 - SSOC 2024 SSOC labels Aug 3, 2024
Projects
None yet
Development

No branches or pull requests

8 participants