Description

This project aims to create a Python package for cleaning Hindi datasets. It will be a command-line tool for pre-processing Hindi text. Its planned capabilities include:

  • Pre-processing a given file so that it contains only Hindi characters.
  • Splitting paragraphs into sentences.
  • Removing punctuation from the dataset, if required.
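
The three capabilities above could be sketched roughly as follows. This is only an illustration, not the package's actual implementation: all function names and command-line flags here are hypothetical, and it assumes that "Hindi characters" means the Unicode Devanagari block (U+0900 to U+097F) and that sentences end with a danda (।), double danda (॥), `?` or `!`.

```python
import argparse
import re
import string

# Assumption: "Hindi characters" = Unicode Devanagari block (U+0900 to U+097F).
_NON_DEVANAGARI = re.compile(r"[^\u0900-\u097F\s]")
# Assumption: a sentence ends with danda, double danda, '?' or '!'.
_SENTENCE = re.compile(r"[^\u0964\u0965?!]+[\u0964\u0965?!]*")
# ASCII punctuation plus danda, double danda and the Devanagari abbreviation sign.
_PUNCT = re.compile("[" + re.escape(string.punctuation) + "\u0964\u0965\u0970]")


def keep_hindi(text):
    """Replace every non-Devanagari, non-whitespace character with a space."""
    cleaned = _NON_DEVANAGARI.sub(" ", text)
    return re.sub(r"\s+", " ", cleaned).strip()


def split_sentences(paragraph):
    """Split a paragraph into sentences, keeping the closing punctuation."""
    parts = (p.strip() for p in _SENTENCE.findall(paragraph))
    return [p for p in parts if p]


def remove_punctuation(text):
    """Delete ASCII and Devanagari punctuation marks."""
    return _PUNCT.sub("", text).strip()


def clean(text, strip_punct=False):
    """Split first (so '?' and '!' still mark boundaries), then filter each sentence."""
    result = []
    for sentence in split_sentences(text):
        sentence = keep_hindi(sentence)
        if strip_punct:
            sentence = remove_punctuation(sentence)
        if sentence:  # drop sentences that contained no Hindi text at all
            result.append(sentence)
    return result


def build_parser():
    """Hypothetical command-line interface; the flag names are illustrative only."""
    parser = argparse.ArgumentParser(description="Clean a Hindi text dataset.")
    parser.add_argument("input_file", help="path to a UTF-8 text file")
    parser.add_argument("--remove-punctuation", action="store_true",
                        help="also strip punctuation from each sentence")
    return parser
```

Splitting before filtering is a deliberate choice in this sketch: `?` and `!` lie outside the Devanagari block, so filtering first would erase some sentence boundaries.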

Technologies In Use

  • Python
  • Data Science

Number of members required: 2

Start Date: 11-08-2020

Expected Deadline: 20-09-2020

Contributors