GitHub - ankushbhatia2/keyword-extract: This is a simple library for extracting keywords from data with/without using a corpus.

KeyWord Extractor Python

This is a simple library to extract keywords from a document, optionally using a corpus to prioritize them. Built with Python 3.5.

DEPENDENCIES:

NLTK

INSTALLATION:

Installing dependencies:
    pip install --upgrade nltk
    In the Python interpreter, run:
        import nltk
        nltk.download()
    In the downloader, select the required packages: pos_tagger and, optionally, stop_words.
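
The interactive downloader can also be skipped by requesting resources by name. The following is a minimal sketch, not part of this library. It assumes that the pos_tagger and stop_words packages named above correspond to the NLTK resource IDs `averaged_perceptron_tagger` and `stopwords`. That mapping is an assumption: newer NLTK releases may name the tagger differently, and the library may need further resources. The helper name `download_resources` is hypothetical.

```python
# Assumed NLTK resource IDs; the README only names "pos_tagger" and "stop_words".
ASSUMED_RESOURCES = ("averaged_perceptron_tagger", "stopwords")


def download_resources(resources=ASSUMED_RESOURCES):
    """Download each named NLTK resource and report whether it succeeded."""
    import nltk  # imported lazily so this snippet loads even without NLTK

    # nltk.download returns True on success and False otherwise.
    return {name: nltk.download(name, quiet=True) for name in resources}
```

Calling `download_resources()` once before using the library returns a dict that maps each resource name to its download status.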

Installing the package:
    Download or clone the package.
    On the command line, run: python setup.py install

USAGE:

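from keywords import KeyWords
from nltk.corpus import stopwords

# Document to extract keywords from.
with open('script.txt', 'r', encoding="utf8") as f:
    data = f.read()

# Optional reference corpus used to prioritize keywords.
with open('transcript_1.txt', 'r', encoding="utf8") as f1:
    corpus_1 = f1.read()

stopWords = stopwords.words('english')
# alpha=0.8 gives the corpus more weight than the default of 0.5.
keyword = KeyWords(corpus=corpus_1, stop_words=stopWords, alpha=0.8)
# Each result is indexable: item 0 is the keyword, item 1 is its score.
results = keyword.get_keywords(data, n=20)
for result in results:
    print("Keyword : %s \n Score : %f" % (result[0], result[1]))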

Functionalities:

__init__ :

Arguments:
    corpus (optional) : Your own corpus, used to prioritize keywords.
    stop_words : Your own list of stop words.
    alpha (default 0.5) : Weighting factor that decides how much weight the corpus is given.
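
To illustrate what a weighting factor like alpha can do, here is a minimal sketch that assumes a simple linear blend of a document score and a corpus score. This formula is an assumption made for illustration. The README does not state how the library combines the two, so `blend_scores` and the sample scores below are hypothetical.

```python
def blend_scores(doc_score, corpus_score, alpha=0.5):
    """Linear interpolation: alpha weights the corpus, (1 - alpha) the document."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * corpus_score + (1.0 - alpha) * doc_score


# Hypothetical per-word scores, made up for illustration.
doc_scores = {"engine": 0.9, "meeting": 0.4}
corpus_scores = {"engine": 0.2, "meeting": 0.8}

for alpha in (0.0, 0.5, 0.8):
    blended = {
        word: blend_scores(doc_scores[word], corpus_scores[word], alpha)
        for word in doc_scores
    }
    print(alpha, blended)
```

Under this assumed blend, alpha = 0 ignores the corpus entirely, and raising alpha moves the ranking toward words that score well in the corpus.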

get_keywords :

Arguments:
    text : Your data
    n (default 20) : Number of keywords you want to extract
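
The usage example reads each result as a keyword at index 0 and a score at index 1. The sketch below shows one way to pick the n best pairs from a dict of scores. It is a hypothetical illustration of top-n selection, not the library's implementation, and the function name `top_n` and the sample scores are made up.

```python
import heapq


def top_n(scores, n=20):
    """Return the n highest-scoring (keyword, score) pairs, best first."""
    return heapq.nlargest(n, scores.items(), key=lambda pair: pair[1])


# Hypothetical scores, made up for illustration.
scores = {"engine": 0.76, "meeting": 0.72, "the": 0.01}
for word, score in top_n(scores, n=2):
    print("Keyword : %s \n Score : %f" % (word, score))
```

If n is larger than the number of candidates, `heapq.nlargest` simply returns all of them in descending order of score.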

