1. Set up Hadoop on Ubuntu.
2. Create a Java project in Eclipse and add the required Hadoop jars to its build path.
3. Import the DocWordCount.java class into the project.
4. Download the Canterbury input files.
5. Edit the Run Configuration and pass the input and output paths as program arguments, separated by a space.
   ex: /home/user/Downloads/input /home/user/Downloads/DocWordCount.out
6. Run the class by clicking Run As -> Java Application.
7. For the TermFrequency.java class, pass the arguments:
   /home/user/Downloads/input /home/user/Downloads/TermFrequency.out
8. For the TFIDF.java class, pass the arguments:
   /home/user/Downloads/input /home/user/Downloads/Term /home/user/Downloads/TFIDF.out
9. For the Search.java class, pass the arguments:
   1st query: /home/user/Downloads/TFIDF.out /home/user/Downloads/query1.out "computer science"
   2nd query: /home/user/Downloads/TFIDF.out /home/user/Downloads/query2.out "data analysis"
10. For the Rank.java class, pass the arguments:
   1st query: /home/user/Downloads/query1.out /home/user/Downloads/query1-rank.out
   2nd query: /home/user/Downloads/query2.out /home/user/Downloads/query2-rank.out
NOTE: Replace the input and output paths with ones that suit your setup.
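
The argument lists in the steps above chain together: the output path of one class is the input path of a later one (TFIDF.out feeds Search, and each query output feeds Rank). The sketch below only illustrates that layout. It is not part of the project: `PipelineArgs`, `build`, `base`, and `tag` are hypothetical names, and the code prints the argument strings you would paste into each Run Configuration rather than launching any Hadoop job.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not in the repository): derives the program
// arguments for each class from one base directory.
public class PipelineArgs {

    // base:  directory holding the input folder and all outputs
    // query: search phrase passed to Search
    // tag:   output name for this query, e.g. "query1"
    static Map<String, String[]> build(String base, String query, String tag) {
        Map<String, String[]> steps = new LinkedHashMap<>();
        steps.put("DocWordCount", new String[] {base + "/input", base + "/DocWordCount.out"});
        steps.put("TermFrequency", new String[] {base + "/input", base + "/TermFrequency.out"});
        steps.put("TFIDF", new String[] {base + "/input", base + "/Term", base + "/TFIDF.out"});
        // Search reads the TFIDF output; Rank reads the Search output.
        steps.put("Search", new String[] {base + "/TFIDF.out", base + "/" + tag + ".out", query});
        steps.put("Rank", new String[] {base + "/" + tag + ".out", base + "/" + tag + "-rank.out"});
        return steps;
    }

    public static void main(String[] args) {
        Map<String, String[]> steps = build("/home/user/Downloads", "computer science", "query1");
        for (Map.Entry<String, String[]> step : steps.entrySet()) {
            StringBuilder line = new StringBuilder(step.getKey()).append(":");
            for (String arg : step.getValue()) {
                // Quote arguments containing spaces so they stay a single argument.
                line.append(' ').append(arg.contains(" ") ? "\"" + arg + "\"" : arg);
            }
            System.out.println(line);
        }
    }
}
```

Calling `build` with a different `base` is one way to apply the note above about replacing paths. Changing `query` and `tag` (for example "data analysis" and "query2") gives the arguments for the second query.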