Home
This project implements MapReduce, one of the most frequently used algorithms, to count word occurrences in a given speech or text in the simplest possible manner, using Open MPI. Open MPI is used because MapReduce mainly runs on distributed systems.
The project is implemented in C.
The input file is a tokenized version of a text file (no punctuation marks, all letters lowercase). The output file lists each word with its number of occurrences, one word per line, in lexicographical order.
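The map and reduce steps behind this word count can be sketched in plain C, without MPI, as follows. This is a minimal serial illustration, not the project's code.c: the names Pair, map_words, reduce_pairs and the MAX_WORD limit are assumptions made for the example, and the real program distributes this work across processes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORD 64 /* assumed upper bound on word length, including '\0' */

/* One (word, count) pair, as emitted by the map step. */
typedef struct {
    char word[MAX_WORD];
    int count;
} Pair;

/* qsort comparator: lexicographical order on the word. */
static int compare_pairs(const void *a, const void *b) {
    return strcmp(((const Pair *)a)->word, ((const Pair *)b)->word);
}

/* Map: split whitespace-separated text into (word, 1) pairs.
 * Words longer than MAX_WORD - 1 characters are truncated.
 * Returns the number of pairs written, at most max_pairs. */
size_t map_words(const char *text, Pair *pairs, size_t max_pairs) {
    assert(text != NULL && pairs != NULL);
    size_t n = 0;
    const char *p = text;
    while (n < max_pairs) {
        p += strspn(p, " \t\r\n");          /* skip whitespace */
        size_t len = strcspn(p, " \t\r\n"); /* length of the next word */
        if (len == 0) break;                /* end of input */
        size_t copy = len < MAX_WORD - 1 ? len : MAX_WORD - 1;
        memcpy(pairs[n].word, p, copy);
        pairs[n].word[copy] = '\0';
        pairs[n].count = 1;
        n++;
        p += len;
    }
    return n;
}

/* Reduce: sort the pairs by word, then merge equal neighbours in place,
 * summing their counts. Returns the number of distinct words, which
 * occupy pairs[0..result-1] in lexicographical order. */
size_t reduce_pairs(Pair *pairs, size_t n) {
    assert(pairs != NULL);
    if (n == 0) return 0;
    qsort(pairs, n, sizeof(Pair), compare_pairs);
    size_t out = 0;
    for (size_t i = 1; i < n; i++) {
        if (strcmp(pairs[out].word, pairs[i].word) == 0) {
            pairs[out].count += pairs[i].count; /* same word: add counts */
        } else {
            pairs[++out] = pairs[i];            /* new word: keep it */
        }
    }
    return out + 1;
}
```

Calling map_words on the tokenized text and then reduce_pairs on the result leaves one entry per distinct word, already sorted, so writing the output file is a single loop over the array. The exact separator between word and count in the output file is not specified here, so the sketch leaves printing out.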
Example commands (compile and run):
Compile:
mpicc code.c -o executable_name
Run:
mpirun -np 3 ./executable_name tokenized_speech output_file
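If the number of processes exceeds the number of available processor slots (for example, more than 4 on a machine with 4 slots), add --oversubscribe:
mpirun --oversubscribe -np 7 ./executable_name tokenized_speech output_file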
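The -np value decides how many processes share the input. One common way to split the words among them is a block partition, sketched below. This is an illustration under assumptions, not the scheme code.c necessarily uses: block_range is a hypothetical helper, and the sketch assumes every process, including rank 0, receives a share of the words.

```c
#include <assert.h>
#include <stddef.h>

/* Compute the half-open range [start, end) of items owned by `rank`
 * when `total` items are split among `nprocs` processes.
 * The first (total % nprocs) ranks receive one extra item, so the
 * range sizes differ by at most one. */
void block_range(size_t total, int nprocs, int rank,
                 size_t *start, size_t *end) {
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    assert(start != NULL && end != NULL);
    size_t base = total / (size_t)nprocs;  /* items every rank gets */
    size_t extra = total % (size_t)nprocs; /* leftover items */
    size_t r = (size_t)rank;
    *start = r * base + (r < extra ? r : extra);
    *end = *start + base + (r < extra ? 1 : 0);
}
```

In an MPI program, rank and nprocs would typically come from MPI_Comm_rank and MPI_Comm_size. With 10 words and -np 3, this partition gives the ranges [0, 4), [4, 7) and [7, 10).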
To test the results, run MPITest.py with three arguments: the input file, an example (expected) output file, and the output file generated by your program. It prints to the console whether the result is true or false.
Example commands:
./MPITest.py input_file example_output_file generated_output_file_by_user
python MPITest.py input_file example_output_file generated_output_file_by_user