Welcome to the Hezartech Datasets repository. This repository contains various datasets curated and maintained by Hezartech for machine learning, data science, and AI research purposes.
Hezartech is committed to advancing machine learning and AI by providing high-quality datasets. These datasets can be used for various applications, including sentiment analysis, named entity recognition, and more.
- Amazon: Amazon product reviews rated 1, 4, or 5 stars, stored as text data.
- X_Twitter: X posts addressed to a company (posts are filtered via search queries).
- SikayetVar: Article data from SikayetVar.
- GenerativeAI_Datasets: Datasets generated with generative AI.
- Mixin_Datasets: Mixed datasets for general-purpose training and fine-tuning of our model.
P.S.: 15,000 posts were collected from X (formerly Twitter). However, on the jury's recommendation, that dataset was not uploaded to GitHub.
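The X_Twitter entry mentions that posts are filtered via queries. The exact queries are not given in this README, so the following is only a minimal sketch of what a keyword-based filter could look like. The function names, the sample posts, and the query term `examplefirm` are all hypothetical and are not part of this repository.

```python
def matches_query(text, query_terms):
    """Return True if the text contains any query term (case-insensitive)."""
    lowered = text.casefold()
    return any(term.casefold() in lowered for term in query_terms)


def filter_posts(posts, query_terms):
    """Keep only the posts that match at least one query term."""
    return [post for post in posts if matches_query(post, query_terms)]


# Hypothetical posts and query terms, for illustration only.
posts = [
    "@examplefirm my order arrived late",
    "Great weather today",
    "ExampleFirm support solved my problem quickly",
]
filtered = filter_posts(posts, ["examplefirm"])
print(filtered)
```

A plain substring match like this is easy to reason about, but it will also match terms embedded in longer words. A stricter filter would need tokenization or regular expressions.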
To use these datasets, you can clone the repository and load the datasets using your preferred programming language or tool.
$ git clone https://github.com/hezartech/datasets.git
# Loading a Dataset (Python Example)
import pandas as pd
# Load the sentiment analysis dataset
df = pd.read_csv('datasets/sentiment_analysis.csv')
- CSV: Commonly used for tabular data.
- JSON: Used for structured data with nested fields.
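Since the datasets come in both CSV and JSON, one loader that handles either format can be convenient. The sketch below uses only the Python standard library and is an illustration, not the repository's own tooling: `load_records` and `flatten` are hypothetical helper names, and the JSON branch assumes each file holds a list of objects (one per record), which this README does not confirm.

```python
import csv
import json
from pathlib import Path


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts, e.g. {"a": {"b": 1}} becomes {"a.b": 1}."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def load_records(path):
    """Load a CSV or JSON file into a list of flat dictionaries."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        # csv.DictReader yields one dict per row; all values are strings.
        with path.open(newline="", encoding="utf-8") as f:
            return list(csv.DictReader(f))
    if suffix == ".json":
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        # Assumption: the file contains a list of objects.
        return [flatten(item) for item in data]
    raise ValueError(f"Unsupported file format: {suffix}")
```

With pandas, `pd.read_csv` and `pd.json_normalize` cover the same ground and return DataFrames instead of lists of dictionaries.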
We welcome contributions to enhance and expand the dataset collection. To contribute:
- Fork the repository.
- Create a new branch (git checkout -b feature-branch).
- Make your changes.
- Submit a pull request.

Please make sure to adhere to the contribution guidelines.
This repository is licensed under the MIT License. See the LICENSE file for more details.
For any questions, suggestions, or collaborations, please contact us at:
Email: hezartech@gmail.com