Synonym Comparison Tool

This program uses text processing to find the closest synonym for a given word from a list of word choices. It builds a frequency vector for each word and uses the cosine similarity between those vectors to determine the synonym.
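
For reference, the standard cosine similarity formula is given below. The symbols u and v are chosen here for illustration and stand for the frequency vectors of the two words being compared.

```latex
\[
\operatorname{sim}(u, v)
  = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}
  = \frac{\sum_i u_i v_i}{\sqrt{\sum_i u_i^2}\,\sqrt{\sum_i v_i^2}}
\]
```

Because frequency counts are never negative, the value falls between 0 (no shared context) and 1 (identical direction).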

How It Works

The program consists of two classes: 'Main' and 'Synonyms'.

The 'Main' class:

Builds a corpus of classic literature from a predefined array of URLs pointing to texts on www.gutenberg.org.
Enters a loop which:
- Prompts the user to enter a primary word and a list of word choices.
- Uses the 'Synonyms' class to calculate the cosine similarity between the primary word and the word choices.
- Prints the result.
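
The selection step of this loop can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the class 'ChoiceSketch', the methods 'closest' and 'report', and the stand-in scoring function are all hypothetical, and the only assumption taken from this README is that a score of -1.0 marks a word missing from the corpus.

```java
import java.util.List;
import java.util.function.ToDoubleBiFunction;

// Hypothetical sketch; class and method names are illustrative only.
class ChoiceSketch {

    // Returns the choice with the highest similarity to the primary word,
    // or null when every choice scores -1.0 (word not found in the corpus).
    static String closest(String primary, List<String> choices,
                          ToDoubleBiFunction<String, String> similarity) {
        String best = null;
        double bestScore = -1.0;
        for (String choice : choices) {
            double score = similarity.applyAsDouble(primary, choice);
            if (score > bestScore) {
                best = choice;
                bestScore = score;
            }
        }
        return best;
    }

    // Formats the outcome for printing; the success wording is made up for this sketch.
    static String report(String primary, String best) {
        return best == null
                ? "There are no synonyms"
                : "Closest synonym for \"" + primary + "\": " + best;
    }

    public static void main(String[] args) {
        // Stand-in scores for illustration; a real run would score against the corpus.
        ToDoubleBiFunction<String, String> fake =
                (p, c) -> c.equals("huge") ? 0.9 : c.equals("tiny") ? 0.2 : -1.0;
        String best = closest("big", List.of("tiny", "huge", "qwxz"), fake);
        System.out.println(report("big", best));
    }
}
```

Passing the similarity function as a parameter keeps the selection logic separate from the corpus analysis, so it can be tried out with made-up scores as in 'main'.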

The 'Synonyms' class:

Contains the logic for parsing a corpus of text files and analyzing the frequency of occurrences of each word in the text.
Constructs a HashMap of descriptors for each word in the corpus, along with the resulting descriptor vector for each word in the descriptors map.
Calculates cosine similarity between frequency vectors for the words.
- Returns -1.0 if either the primary word or the word choice currently being compared was not found in the corpus.
Contains a 'toString' method to return the result.
- Returns the message "There are no synonyms" if the primary word, or every one of the word choices, was not found in the corpus.
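
The descriptor map and the similarity calculation might look something like the sketch below. It is not the actual 'Synonyms' class: the name 'DescriptorSketch', both method names, and the choice to define a word's descriptor as counts of the other words sharing a sentence with it are assumptions made for illustration. The -1.0 return for a missing word follows the behavior described above; returning 0.0 for an empty descriptor is a further assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the repository's actual Synonyms class.
class DescriptorSketch {

    // Assumed definition: a word's descriptor counts how often each other
    // word appears in the same sentence as that word.
    static Map<String, Map<String, Integer>> buildDescriptors(String text) {
        Map<String, Map<String, Integer>> descriptors = new HashMap<>();
        for (String sentence : text.toLowerCase().split("[.!?]+")) {
            String[] words = sentence.trim().split("[^a-z']+");
            for (String word : words) {
                if (word.isEmpty()) continue;
                Map<String, Integer> vector =
                        descriptors.computeIfAbsent(word, k -> new HashMap<>());
                for (String other : words) {
                    if (other.isEmpty() || other.equals(word)) continue;
                    vector.merge(other, 1, Integer::sum);
                }
            }
        }
        return descriptors;
    }

    // Cosine similarity of two descriptor vectors; -1.0 if either word is unknown.
    static double cosineSimilarity(Map<String, Map<String, Integer>> descriptors,
                                   String a, String b) {
        Map<String, Integer> u = descriptors.get(a);
        Map<String, Integer> v = descriptors.get(b);
        if (u == null || v == null) return -1.0;

        double dot = 0.0, sumSqU = 0.0, sumSqV = 0.0;
        for (Map.Entry<String, Integer> e : u.entrySet()) {
            double x = e.getValue();
            sumSqU += x * x;
            Integer y = v.get(e.getKey());
            if (y != null) dot += x * y;
        }
        for (int y : v.values()) sumSqV += (double) y * y;

        // An empty descriptor has no direction; treat it as no similarity (assumption).
        if (sumSqU == 0.0 || sumSqV == 0.0) return 0.0;
        return dot / Math.sqrt(sumSqU * sumSqV);
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> d =
                buildDescriptors("The cat sat. The dog sat. A bird flew.");
        System.out.println(cosineSimilarity(d, "cat", "dog"));
        System.out.println(cosineSimilarity(d, "cat", "bird"));
        System.out.println(cosineSimilarity(d, "cat", "unicorn"));
    }
}
```

Storing each vector as a sparse map means only words that actually co-occur take up space, which matters once the corpus grows to several full-length books.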

Limitations

Because the corpus is small, the program may pick an incorrect synonym. Either of the following solutions can remedy this:

Increase the size of the corpus to give the program more data to work with.
Tailor the corpus to match the category of the primary word.
- If the primary word is a historical word, fill the corpus with historical text.
- If the primary word is a sports word, fill the corpus with sports text.

Acknowledgments

The program uses the following eight classic works from Project Gutenberg as the corpus for generating synonyms:

Pride and Prejudice, by Jane Austen
The Adventures of Sherlock Holmes, by A. Conan Doyle
A Tale of Two Cities, by Charles Dickens
Alice's Adventures in Wonderland, by Lewis Carroll
Moby Dick; or The Whale, by Herman Melville
War and Peace, by Leo Tolstoy
The Importance of Being Earnest, by Oscar Wilde
The Wisdom of Father Brown, by G.K. Chesterton
