News Headline Classification with Statistical Machine Learning

This project focuses on categorizing news headlines into six categories using a statistical machine learning approach. The dataset consists of 3000 headlines evenly distributed across six categories: Politics, Economy, Sports, Current Affairs, Health, and Technology.

Project Structure

Data: The dataset contains 500 headlines for each of the six categories. The data has been cleaned, with stop words and numbers removed.
Model: A Random Forest classifier is used.
Evaluation: Cross-validation is employed to test the model, achieving a precision of 0.70 and an accuracy of 0.61.
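
The cleaning step mentioned above (removing stop words and numbers) could look roughly like the sketch below. This is only an illustration: the `clean_headline` helper and the short `STOP_WORDS` set are hypothetical, and the project's actual stop-word list and tokenization are not documented here.

```python
# Hypothetical stop-word list for illustration only; the list used by
# the project is not documented in this README.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "for", "and", "is"}


def clean_headline(text: str) -> str:
    """Drop stop words and any whitespace-separated token containing a digit."""
    kept = []
    for token in text.split():
        if token.lower() in STOP_WORDS:
            continue  # stop word (matched case-insensitively)
        if any(ch.isdigit() for ch in token):
            continue  # number-like token such as "2024" or "5%"
        kept.append(token)
    return " ".join(kept)


print(clean_headline("The Central Bank Raises Rates to 5% in 2024"))
# prints "Central Bank Raises Rates"
```

Dropping whole tokens that contain a digit is a simplifying assumption; it also removes mixed tokens such as "Covid-19", which may or may not match what the original preprocessing did.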

Requirements

To set up the environment, install the packages listed in requirements.txt, plus the following:

chardet
gensim
openpyxl

Installation

Clone the repository:

git clone https://github.com/eraybuyukkanat/nlp_news_classification.git
cd nlp_news_classification

Install the necessary packages:

pip install -r requirements.txt
pip install chardet gensim openpyxl

Usage

Ensure your dataset is properly formatted and placed in the appropriate directory.
Open and run the Jupyter Notebook to train and evaluate the model:
```
jupyter notebook classification.ipynb
```
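
The notebook is the reference for the actual training code. As a rough sketch of what training and cross-validating a Random Forest on headlines might involve, the example below uses scikit-learn with TF-IDF features. Both scikit-learn and TF-IDF are assumptions made for illustration (the README only names the classifier and the evaluation method), and the toy headlines are made up, not taken from data.csv.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline

# Made-up sample standing in for the real 3000-headline dataset.
headlines = [
    "team wins championship final", "striker scores late goal",
    "coach praises young players", "club signs new goalkeeper",
    "hospital opens new clinic", "doctors warn about flu season",
    "study links sleep and heart health", "vaccine trial shows promise",
    "company unveils new smartphone", "startup builds faster chip",
    "software update fixes security bug", "researchers test quantum computer",
]
labels = ["Sports"] * 4 + ["Health"] * 4 + ["Technology"] * 4

# Assumed feature extraction: TF-IDF over words, fed to a Random Forest.
model = make_pipeline(
    TfidfVectorizer(),
    RandomForestClassifier(n_estimators=50, random_state=0),
)

# Stratified folds keep the class balance the same in every split.
cv = StratifiedKFold(n_splits=2, shuffle=True, random_state=0)
scores = cross_validate(
    model, headlines, labels, cv=cv,
    scoring=["accuracy", "precision_macro"],
)

print("accuracy:", scores["test_accuracy"].mean())
print("macro precision:", scores["test_precision_macro"].mean())
```

With so few samples the scores are not meaningful; the point is only the shape of the pipeline and the cross-validation call.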

Results

The model achieves the following performance metrics:

Precision: 0.70
Accuracy: 0.61
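
Precision can be higher than accuracy in a multi-class setting. The sketch below shows one way this happens, assuming precision is macro-averaged over the classes (the README does not state which averaging was used). The labels and predictions are invented for illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_precision(y_true, y_pred):
    """Unweighted mean of per-class precision (correct / predicted)."""
    classes = sorted(set(y_true) | set(y_pred))
    predicted = Counter(y_pred)
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    per_class = [
        correct[c] / predicted[c] if predicted[c] else 0.0 for c in classes
    ]
    return sum(per_class) / len(per_class)


# Invented example: the classifier over-predicts "Health".
y_true = ["Sports", "Sports", "Health", "Health", "Economy", "Economy"]
y_pred = ["Sports", "Health", "Health", "Health", "Health", "Economy"]

print(f"accuracy: {accuracy(y_true, y_pred):.2f}")                 # accuracy: 0.67
print(f"macro precision: {macro_precision(y_true, y_pred):.2f}")   # macro precision: 0.83
```

Here two classes are rarely predicted but always correct when they are, which pulls the macro average above the overall accuracy.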

License

This project is licensed under the MIT License - see the LICENSE file for details.
