This is the coursework for CSE210 at XJTLU, used to practise object-oriented programming skills. It covers the following tasks:
- T3-1: calculate the number of distinct researchers in the dataset.
- T3-2: calculate the number of distinct interests in the dataset.
- T3-3: given a researcher’s name, show detailed information about that researcher (e.g., university, department, interests).
- T3-4: given an interest, calculate the number of researchers who have that interest.
- T3-5: given two interests, show the number of times they co-occur.
- T3-6: given a researcher, find similar researchers based on their interests. This could be used in expert recommendation applications. It can be done using different methods; here are some examples:
  - (1) cosine similarity (https://en.wikipedia.org/wiki/Cosine_similarity);
  - (2) clustering: you need to self-study some clustering algorithms and choose one to use. The book [1], especially Chapter 6.6, is a good place to get started; any other fundamental textbook on data mining and machine learning is also fine. The Weka library is recommended for your implementation (https://www.cs.waikato.ac.nz/ml/weka/documentation.html, http://weka.sourceforge.net/doc.dev/);
  - (3) probabilistic topic models: see reference [2] for details. The Mallet API from the University of Massachusetts is recommended (http://mallet.cs.umass.edu/api).

  Marking for this task will be based on the quality of the research, techniques and algorithms you use, as well as the produced results.
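
As a rough illustration of tasks T3-1, T3-2, T3-4, T3-5 and the cosine-similarity option of T3-6, here is a minimal Java sketch. It is not the coursework's actual code: the class name `InterestStats`, its method names, the sample data, and the in-memory layout (a map from researcher name to a set of interests) are all assumptions made for the example. It also assumes that researcher names are unique and that two interests "co-occur" once for each researcher who lists both.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class; names and data layout are assumptions for illustration.
final class InterestStats {

    // T3-2: number of distinct interests across all researchers.
    static int countDistinctInterests(Map<String, Set<String>> data) {
        Set<String> all = new HashSet<>();
        for (Set<String> interests : data.values()) {
            all.addAll(interests);
        }
        return all.size();
    }

    // T3-4: number of researchers who list the given interest.
    static int countResearchersWith(Map<String, Set<String>> data, String interest) {
        int count = 0;
        for (Set<String> interests : data.values()) {
            if (interests.contains(interest)) count++;
        }
        return count;
    }

    // T3-5: number of researchers who list both interests (assumed meaning of "co-occur").
    static int coOccurrence(Map<String, Set<String>> data, String a, String b) {
        int count = 0;
        for (Set<String> interests : data.values()) {
            if (interests.contains(a) && interests.contains(b)) count++;
        }
        return count;
    }

    // T3-6, option (1): cosine similarity of two binary interest vectors,
    // i.e. (number of shared interests) / sqrt(|a| * |b|).
    static double cosine(Set<String> a, Set<String> b) {
        if (a.isEmpty() || b.isEmpty()) return 0.0;
        Set<String> shared = new HashSet<>(a);
        shared.retainAll(b);
        return shared.size() / Math.sqrt((double) a.size() * b.size());
    }

    // Made-up sample data; the real dataset would be loaded from file instead.
    static Map<String, Set<String>> sampleData() {
        Map<String, Set<String>> data = new LinkedHashMap<>();
        data.put("Alice", new HashSet<>(Arrays.asList("data mining", "machine learning")));
        data.put("Bob", new HashSet<>(Arrays.asList("machine learning", "databases")));
        data.put("Carol", new HashSet<>(Arrays.asList("databases")));
        return data;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> data = sampleData();
        // T3-1: map keys are unique, so the map size is the number of distinct researchers.
        System.out.println("Researchers: " + data.size());
        System.out.println("Interests: " + countDistinctInterests(data));
        System.out.println("With 'databases': " + countResearchersWith(data, "databases"));
        System.out.println("Co-occurrence: " + coOccurrence(data, "machine learning", "databases"));
        System.out.println("cos(Alice, Bob): " + cosine(data.get("Alice"), data.get("Bob")));
    }
}
```

To recommend similar researchers with this approach, one could compute `cosine` between the given researcher and every other researcher and sort the results in descending order. The clustering and topic-model options would rely on Weka or Mallet instead and are not shown here.
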

To understand the programs, please read the instruction file first. To use the code, add all the JAR files in the jar folder to the build path.