- Implementation of the Brown clustering algorithm
- The algorithm trains on data and performs hierarchical clustering
- Based on this clustering, it generates a unique vector for each word
-
Dependencies: numpy and scipy
-
When run, the code trains on a small subset of the data, "subset_data.txt"
-
This dataset contains dummy POS tags
-
It clusters similar words, then prints and saves the clusters and the word vectors. To run:
python3 brown_clustering.py
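
For readers unfamiliar with how hierarchical clustering can yield one vector per word, here is a minimal, naive sketch of the idea. It is not the code in brown_clustering.py: the function names `avg_mutual_info` and `brown_bitstrings` are hypothetical, it uses only the standard library, and it assumes the "vector" is the word's bit-string path in the merge tree. Each word starts in its own cluster, the pair of clusters whose merge keeps the average mutual information of adjacent clusters highest is merged, and every merge prepends a 0 or 1 to the words involved.

```python
import math
from collections import Counter
from itertools import combinations


def avg_mutual_info(bigrams, assign):
    """Average mutual information between the clusters of adjacent words."""
    total = sum(bigrams.values())
    pair, left, right = Counter(), Counter(), Counter()
    for (w1, w2), n in bigrams.items():
        c1, c2 = assign[w1], assign[w2]
        pair[(c1, c2)] += n
        left[c1] += n
        right[c2] += n
    return sum(
        (n / total) * math.log2(n * total / (left[c1] * right[c2]))
        for (c1, c2), n in pair.items()
    )


def brown_bitstrings(tokens):
    """Greedy bottom-up merging; returns {word: bit-string path}.

    Naive toy version (hypothetical helper): it re-scores every candidate
    merge from scratch, so it is only suitable for tiny vocabularies.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    vocab = sorted(set(tokens))
    assign = {w: i for i, w in enumerate(vocab)}     # word -> cluster id
    members = {i: [w] for i, w in enumerate(vocab)}  # cluster id -> words
    bits = {w: "" for w in vocab}
    next_id = len(vocab)

    while len(members) > 1:
        best = None
        for a, b in combinations(sorted(members), 2):
            # Tentatively merge clusters a and b and score the result.
            trial = {w: (next_id if c in (a, b) else c) for w, c in assign.items()}
            score = avg_mutual_info(bigrams, trial)
            if best is None or score > best[0]:
                best = (score, a, b)
        _, a, b = best
        # Prepend one bit per merge so the final string reads root -> leaf.
        for w in members[a]:
            bits[w] = "0" + bits[w]
            assign[w] = next_id
        for w in members[b]:
            bits[w] = "1" + bits[w]
            assign[w] = next_id
        members[next_id] = members.pop(a) + members.pop(b)
        next_id += 1
    return bits


tokens = "the cat sat on the mat the dog sat on the rug".split()
for word, code in sorted(brown_bitstrings(tokens).items()):
    print(word, code)
```

Words that share a long prefix in their bit strings were merged early, so truncating the strings to a fixed length gives coarser clusters. Practical implementations avoid the full re-scoring done here by updating the mutual information incrementally and by limiting merges to a window of frequent words.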