Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 7 changed files with 129 additions and 126 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,18 +1,68 @@ | ||
<h1 align="center">wikionary - Module for finding synonyms, antonyms, etc.</h1> | ||
# RusLingua 📚 | ||
|
||
<br> | ||
RusLingua is a Python library for retrieving various linguistic information about Russian words. It provides a simple API to get synonyms, antonyms, word associations, cognate words, and definitions. | ||
|
||
<h1 align="center"> -How does it work?- </h1> | ||
## Features | ||
|
||
- Get synonyms of a word 👥 | ||
- Get antonyms of a word 👎 | ||
- Get word associations 💭 | ||
- Get cognate words (words with a common root) 🌳 | ||
- Get definitions from dictionaries 📖 | ||
|
||
## Quickstart | ||
|
||
```python | ||
from ruslingua import RusLingua | ||
|
||
ruslingua = RusLingua() | ||
|
||
synonyms = ruslingua.get_synonyms('дом') | ||
antonyms = ruslingua.get_antonyms('дом') | ||
associations = ruslingua.get_associations('дом') | ||
cognates = ruslingua.get_cognate_words('дом') | ||
definition = ruslingua.get_definition('дом') | ||
|
||
print(synonyms) | ||
print(antonyms) | ||
print(associations) | ||
print(cognates) | ||
print(definition) | ||
``` | ||
|
||
## Installation | ||
|
||
``` | ||
pip install ruslingua | ||
``` | ||
|
||
## Usage | ||
|
||
Import the RusLingua class and instantiate it: | ||
|
||
```python | ||
import wikionary as w | ||
from ruslingua import RusLingua | ||
|
||
ruslingua = RusLingua() | ||
``` | ||
|
||
Then call the methods with a word to get the linguistic information: | ||
|
||
```python | ||
synonyms = ruslingua.get_synonyms('дом') | ||
antonyms = ruslingua.get_antonyms('дом') | ||
associations = ruslingua.get_associations('дом') | ||
cognates = ruslingua.get_cognate_words('дом') | ||
definition = ruslingua.get_definition('дом') | ||
``` | ||
|
||
All methods return lists of strings, except `get_definition`, which returns a single string. | ||
|
||
## Credits | ||
|
||
print(w.getSynonims('привет')) # Find synonyms for the word "привет" | ||
print(w.getAntonyms('привет')) # Find antonyms for the word "привет" | ||
print(w.getPhraseologs('привет')) # Find idioms containing the word "привет" | ||
print(w.getHyperonims('привет')) # Find hypernyms of the word "привет" | ||
print(w.getAssociations('привет')) # Find associations for the word "привет" | ||
print(w.getRandomWord()) # Get a random word | ||
RusLingua retrieves data from various sources: | ||
|
||
print(w.inflectWord('привет')) # Inflect the word across grammatical cases | ||
``` | ||
- [jeck.ru](https://jeck.ru) - synonyms 👥 | ||
- [razbiraem-slovo.ru](https://razbiraem-slovo.ru) - antonyms 👎 and cognate words 🌳 | ||
- [wordassociations.net](https://wordassociations.net) - word associations 💭 | ||
- [gramota.ru](https://gramota.ru) - definitions 📖 |
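
Every lookup listed in the README fetches a page from one of the external sites above, so a call can fail when a site is unreachable. A minimal sketch of guarding such calls follows. `safe_lookup` and the two stand-in functions are hypothetical names for illustration and are not part of RusLingua; the sketch relies on the fact that exceptions raised by `requests` derive from `OSError`.

```python
from typing import Callable, List


def safe_lookup(lookup: Callable[[str], List[str]], word: str) -> List[str]:
    """Call a lookup function and return [] if the request fails (hypothetical helper)."""
    try:
        return lookup(word)
    except OSError:
        # requests.exceptions.RequestException subclasses IOError, an alias of OSError,
        # so connection errors and timeouts from requests are caught here.
        return []


def failing_lookup(word: str) -> List[str]:
    """Stand-in for a method such as ruslingua.get_synonyms when the site is down."""
    raise ConnectionError("network unreachable")


def working_lookup(word: str) -> List[str]:
    """Stand-in that returns a placeholder result instead of real data."""
    return [f"stub:{word}"]


print(safe_lookup(failing_lookup, "дом"))
print(safe_lookup(working_lookup, "дом"))
```

With the real library, the same helper could presumably be called as `safe_lookup(ruslingua.get_synonyms, 'дом')`. It does not fit `get_definition`, which returns a string rather than a list.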
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
from .ruslingua import RusLingua |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
import requests | ||
from lxml import html | ||
from urllib.parse import unquote | ||
|
||
class RusLingua: | ||
def __init__(self): | ||
self.user_agent = ('Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 ' | ||
'(KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36') | ||
self.headers = {'User-Agent': self.user_agent} | ||
|
||
def get_html_tree(self, url): | ||
response = requests.get(url, headers=self.headers) | ||
return html.fromstring(response.content) | ||
|
||
def get_synonyms(self, word): | ||
synonyms = [] | ||
tree = self.get_html_tree( | ||
f'https://jeck.ru/tools/SynonymsDictionary/{word}') | ||
|
||
for url in tree.xpath('//td/a/@href'): | ||
if '+' not in url: | ||
parts = url.split('Dictionary/') | ||
if len(parts) > 1: | ||
synonyms.append(unquote(parts[1])) | ||
|
||
return synonyms | ||
|
||
def get_antonyms(self, word): | ||
tree = self.get_html_tree(f'https://razbiraem-slovo.ru/antonyms/{word}') | ||
elements = tree.xpath("//*[contains(@href, 'antonyms')][count(@class)=0]") | ||
|
||
        antonyms = [element.text.strip() for element in elements if element.text and element.text.strip()] | ||
return antonyms | ||
|
||
def get_cognate_words(self, word): | ||
tree = self.get_html_tree(f'https://razbiraem-slovo.ru/odnokorennye/{word}') | ||
elements = tree.xpath("//*[contains(@href, 'po-sostavu')][count(@class)=1]") | ||
|
||
        cognates = [element.attrib['href'].split('/')[-1] for element in elements if element.text and 'разбор' in element.text] | ||
return cognates | ||
|
||
def get_associations(self, word): | ||
associations = [] | ||
tree = self.get_html_tree( | ||
f'https://wordassociations.net/ru/ассоциации-к-слову/{word}') | ||
|
||
for url in tree.xpath('//li/a/@href'): | ||
if 'D1%83/' in url: | ||
associations.append(unquote(url.split('D1%83/')[1]).lower()) | ||
|
||
return associations | ||
|
||
def get_definition(self, word): | ||
tree = self.get_html_tree( | ||
f'https://gramota.ru/poisk?query={word}&mode=slovari&l=1&dicts[]=42') | ||
|
||
full_text = ''.join(tree.xpath("//div[contains(@class, 'description')]//text()")) | ||
|
||
return full_text.strip() |
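
The methods above interpolate `word` directly into each URL. `requests` generally percent-encodes non-ASCII characters on its own, but input containing `/`, `?` or `&` could still change the path or query string. A minimal sketch of encoding the word explicitly with the standard library follows. `build_lookup_url` is a hypothetical helper that is not part of this commit.

```python
from urllib.parse import quote


def build_lookup_url(base_url: str, word: str) -> str:
    """Append a percent-encoded word to a base URL (hypothetical helper)."""
    # safe='' makes quote() encode '/' as well, so the word stays a single path segment.
    return f"{base_url.rstrip('/')}/{quote(word, safe='')}"


url = build_lookup_url('https://jeck.ru/tools/SynonymsDictionary', 'дом')
print(url)  # https://jeck.ru/tools/SynonymsDictionary/%D0%B4%D0%BE%D0%BC
```

Separately, `requests.get` accepts a `timeout` argument and the response object has `raise_for_status()`; using both in `get_html_tree` would be one way to avoid hanging on a slow site or parsing an error page.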
This file was deleted.
This file was deleted.