YonghaoZhao722/distilbert-base-uncased-finetuning


DistilBERT Sentiment Analysis

This repository contains a DistilBERT model fine-tuned using the Hugging Face Transformers library on the IMDb movie review dataset. The model is designed for sentiment analysis, enabling the determination of sentiment polarity (positive or negative) within text reviews.

The model is based on the paper "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter" (arXiv).

📝 Contents

  • dataset/: Scripts for dataset handling and processing.
  • pretrained/: Holds the model weights. Download pytorch_model.bin manually from the link in the Pretrained Model section and place it here.
  • predict.ipynb: Notebook demonstrating prediction with the fine-tuned DistilBERT model.

🤗 Pretrained Model

Download the fine-tuned model weights (pytorch_model.bin) from the following link and move the file into the pretrained/ folder: Download Model
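
Before opening the notebooks, it can help to confirm that the weights file landed in the right place. The snippet below is a minimal sketch, not part of the repository: the helper name `missing_model_files` is hypothetical, and it checks only for pytorch_model.bin because that is the only file this README names. Add other file names to `required` if your setup needs them.

```python
from pathlib import Path


def missing_model_files(model_dir="pretrained", required=("pytorch_model.bin",)):
    """Return the required file names that are not present in model_dir.

    Hypothetical helper: `required` lists only the file this README mentions.
    """
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing from pretrained/:", ", ".join(missing))
    else:
        print("All expected model files are in place.")
```

Run it from the project root so the relative path pretrained/ resolves correctly.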

🔨 Preparation

To get started, clone the repository and navigate to the project directory:

git clone https://github.com/zyh040521/distilbert-base-uncased-finetuning
cd distilbert-base-uncased-finetuning

💡 Building the Environment

To set up the required environment, install the dependencies listed in the requirements.txt file using pip:

pip install -r requirements.txt
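
If a notebook later fails on an import, a quick way to see which dependencies did not install is to compare requirements.txt against the installed packages. This is a rough sketch using only the standard library. The helper names are hypothetical, and the parser is deliberately simple: it handles plain `name==version` style lines and skips comments and option lines such as `-r other.txt`, but it does not handle URL or VCS requirements.

```python
import re
from importlib import metadata
from pathlib import Path


def parse_requirement_names(text):
    """Extract bare package names from requirements-style text (simplified)."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names for which no installed distribution is found."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    req_file = Path("requirements.txt")
    if req_file.is_file():
        names = parse_requirement_names(req_file.read_text())
        print("Not installed:", missing_packages(names) or "none")
    else:
        print("requirements.txt not found; run this from the project root.")
```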

🌟 Usage

Run main.ipynb

😊 Predict

Run predict.ipynb to classify review text with the fine-tuned model.
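
For readers who prefer a script to a notebook, the prediction step might look roughly like the sketch below. It is not taken from predict.ipynb and rests on several assumptions: that pretrained/ also contains a config.json compatible with `AutoModelForSequenceClassification`, that the tokenizer is the stock `distilbert-base-uncased` one, and that label index 0 means negative and 1 means positive (the convention of the Hugging Face `imdb` dataset). The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

# Assumption: index 0 = negative, index 1 = positive (Hugging Face `imdb` convention).
LABELS = ("negative", "positive")


def logits_to_sentiment(logits: Sequence[float]) -> Tuple[str, float]:
    """Apply a numerically stable softmax and return (label, probability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


def predict(text: str, model_dir: str = "pretrained") -> Tuple[str, float]:
    """Classify one review. Requires torch and transformers to be installed."""
    # Imported lazily so logits_to_sentiment works without the heavy dependencies.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Assumption: base tokenizer; model_dir holds config.json next to the weights.
    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return logits_to_sentiment(logits)
```

Calling `predict("A moving, beautifully shot film.")` would return a label and its probability. If the checkpoint was saved with a different label order, adjust `LABELS` accordingly.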

🔥 TODO

  • Use the evaluate library to assess model accuracy
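
One possible shape for this TODO item is sketched below. It is not implemented in the repository. `accuracy_with_evaluate` uses the `accuracy` metric from the `evaluate` library, which needs that package installed and may download the metric script on first use. `accuracy_plain` is a hypothetical dependency-free cross-check for the same number.

```python
from typing import Sequence


def accuracy_plain(predictions: Sequence[int], references: Sequence[int]) -> float:
    """Fraction of predictions that match the reference labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on empty inputs")
    correct = sum(int(p == r) for p, r in zip(predictions, references))
    return correct / len(references)


def accuracy_with_evaluate(predictions: Sequence[int], references: Sequence[int]) -> float:
    """Same metric via the `evaluate` library (imported lazily)."""
    import evaluate

    metric = evaluate.load("accuracy")
    result = metric.compute(predictions=list(predictions), references=list(references))
    return result["accuracy"]
```

Both functions expect integer class labels, for example the argmax of the model logits for each test review paired with the dataset labels.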
