This repository is a collection of Sindhi text datasets, ranging from tweets to articles to books. Its purpose is to provide an authentic data source for development and research in the field of Sindhi NLP (Natural Language Processing).
The datasets in this repository are arranged so that each dataset has its own folder. Each folder contains one or more CSV files and a README file that describes the dataset.
root/
    dataset_1/
        dataset.csv
        README.md
    dataset_2/
        dataset2.csv
        README.md
    ...
    dataset_n/
        dataset_n.csv
        README.md
    README.md
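The layout above can be checked mechanically before a new dataset is submitted. The following is a minimal sketch, assuming a POSIX-style shell; the function name `check_layout` and the messages it prints are hypothetical and not part of this repository. It reports every dataset folder that lacks a CSV file or a README.md.

```shell
#!/usr/bin/env bash
# Sketch: verify that every dataset folder under a root directory holds
# at least one .csv file and a README.md, as the layout above describes.
# check_layout and has_csv are hypothetical helper names.

# Succeeds if the given folder (path ending in "/") contains a .csv file.
has_csv() {
    for f in "$1"*.csv; do
        # An unmatched glob stays literal, so test that the file exists.
        [ -f "$f" ] && return 0
    done
    return 1
}

check_layout() {
    root="${1:-.}"
    status=0
    for dir in "$root"/*/; do
        # Skip when the glob matched nothing (no subfolders at all).
        [ -d "$dir" ] || continue
        name="$(basename "$dir")"
        if ! has_csv "$dir"; then
            echo "MISSING CSV: $name"
            status=1
        fi
        if [ ! -f "${dir}README.md" ]; then
            echo "MISSING README: $name"
            status=1
        fi
    done
    # Returns 0 when every folder is complete, 1 otherwise.
    return "$status"
}
```

Once these functions are loaded into your shell, running `check_layout .` from the repository root prints nothing when every dataset folder is complete.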
This repository contains files in CSV and Markdown (.md) formats.
Dataset files are in CSV format.
We believe that nothing is too small to contribute, even correcting a typo. We encourage everyone to contribute, whether you are a newcomer or an experienced GitHub contributor. The following are some ideas for where you can contribute:
- Publish a new Sindhi dataset that is not yet available in this repository
- Label the dataset
- Documentation
- Translate documentation to Sindhi
- Clean the existing dataset
- Create Sample Notebooks on the existing datasets
Here you will learn how to contribute to this project and make an impact.
Fork this repository by clicking the Fork button in the top right corner. This creates a copy of the repository in your account.
To clone the repository, go to your account, open your fork, click the Clone button and copy the URL, then run the command below to get the repository on your local machine:
git clone "URL you just copied"
e.g. git clone https://github.com/yourgithubusername/sindhi-NLP-dataset.git
On your local machine, go to the project folder that you cloned and run the following git commands inside that folder.
Create a new branch using the command below:
git checkout -b name-of-your-branch
e.g. git checkout -b owais431
Make whatever contribution you want to make. Nothing is too small to contribute, even correcting a typo. We encourage everyone, experienced GitHub contributors and newcomers alike, to contribute as much as possible.
Now add the changes you made to the branch by running the following command:
git add .
Now commit the changes. A commit message should always be clear. To commit, use the command below:
git commit -m "clear-commit-message-to-show-what-you-did"
Now push the changes you made to the remote repository on the specified branch. To do so, use the command below:
git push origin name-of-your-branch
Use the same branch name that you created earlier with git checkout -b.
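The branch, stage, commit and push steps above can also be collected into one helper. This is only a sketch: the function name `contribute` and its two arguments are hypothetical, and it assumes you run it from inside your cloned project folder with `origin` pointing at your fork.

```shell
#!/usr/bin/env bash
# Sketch: run the branch, stage, commit and push steps in one go.
# "contribute" is a hypothetical helper name, not part of this repository.

contribute() {
    branch="${1:-}"    # name of the new branch, e.g. owais431
    message="${2:-}"   # clear commit message describing the change

    if [ -z "$branch" ] || [ -z "$message" ]; then
        echo "usage: contribute <branch-name> <commit-message>" >&2
        return 2
    fi

    # Each step runs only if the previous one succeeded.
    git checkout -b "$branch" &&
        git add . &&
        git commit -m "$message" &&
        git push origin "$branch"
}
```

For example, `contribute owais431 "Fix typo in README"` would create the branch owais431, commit all changes in the folder and push them to origin. Running the commands one by one, as described above, is still the better choice when you want to review what is staged before committing.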
Once you have pushed your code to GitHub, it is time to create a pull request. Go to the repository, click on "Compare & pull request", and submit the pull request.
Soon, we will merge your pull request into the main branch of the project, and you will get a notification once it has been merged into the existing code base.
This project is licensed under the MIT License. See the LICENSE.md file for details.