Create a conda virtual environment from the environment.yml file as discussed here. Make the virtual environment available in Jupyter Notebooks as discussed here. Start Jupyter Notebooks, select the environment, and run the main.ipynb notebook.
The PYTHONHASHSEED environment variable is fixed so that the built-in Python hash() function yields consistent results across runs; without it, Python randomizes the hashes of strings and bytes for each new process. Pass the seed variable to the MinHashing class constructor, because MinHashing draws its random numbers with the numpy.random.randint() function.
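The two sources of randomness described above can be illustrated with a minimal sketch. The helper names hash_in_fresh_interpreter and random_hash_coefficients are hypothetical and are not part of this repository; in particular, the second function only assumes that the seed is applied with numpy.random.seed() before numpy.random.randint() is called, which may differ from what the MinHashing class actually does.

```python
import os
import subprocess
import sys

import numpy as np


def hash_in_fresh_interpreter(text: str, hash_seed: str) -> int:
    """Return hash(text) as computed by a new Python process.

    PYTHONHASHSEED is read when the interpreter starts, so the value is
    passed through the environment of the child process.
    """
    env = dict(os.environ, PYTHONHASHSEED=hash_seed)
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


def random_hash_coefficients(seed: int, n_hashes: int = 5) -> np.ndarray:
    """Hypothetical stand-in for the random draws a MinHashing class might make.

    Assumption: the seed is applied to NumPy's global generator before
    numpy.random.randint() is called.
    """
    np.random.seed(seed)
    return np.random.randint(1, 2**31 - 1, size=n_hashes)


# Same PYTHONHASHSEED in two separate processes: the string hashes match.
first = hash_in_fresh_interpreter("shingle", "0")
second = hash_in_fresh_interpreter("shingle", "0")
print(first == second)

# Same NumPy seed: the drawn integers match.
print(np.array_equal(random_hash_coefficients(42), random_hash_coefficients(42)))
```

Because PYTHONHASHSEED only takes effect at interpreter startup, the sketch launches child processes instead of changing the variable inside the running one.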