Implementation of the HITS and SALSA ranking algorithms on sparse matrices for highly parallel systems.
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j
-
First, download metadata.json.gz from http://jmcauley.ucsd.edu/data/amazon/links.html
-
Extract the file:
gunzip metadata.json.gz
-
Convert it to valid JSON (clean up the single quotes and wrap the entries in an array):
scripts/clean.py metadata.json clean_metadata.json
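To illustrate what this cleaning step involves, here is a minimal sketch. It is not the code of scripts/clean.py; it assumes each line of metadata.json is a Python-style dict literal (single-quoted strings), and the function name to_json_array is hypothetical. It holds all records in memory, so it is only suitable for a sample, not the full file.

```python
import ast
import json


def to_json_array(lines):
    """Parse one Python-style dict literal per line and return a JSON array string.

    Assumption: each non-empty line is a literal such as {'asin': '0001', ...}.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # literal_eval only accepts literals, so it never executes code from the file
        records.append(ast.literal_eval(line))
    # json.dumps emits double-quoted strings, i.e. valid JSON
    return json.dumps(records)


if __name__ == "__main__":
    sample = [
        "{'asin': '0001', 'title': \"It's a test\"}",
        "{'asin': '0002', 'related': {'also_bought': ['0001']}}",
    ]
    print(to_json_array(sample))
```

Parsing each line as a literal, rather than replacing quotes textually, keeps apostrophes inside titles intact.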
Note: the metadata.json file contains 9.4 million entries, so you may want to work on a sample. For example, to extract the first 10000 entries:
head -n 10000 metadata.json > sample.json
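A sample can also be read straight from the compressed archive, without extracting the whole file first. The sketch below uses only the Python standard library; the helper name head_gz is hypothetical and the UTF-8 encoding is an assumption.

```python
import gzip
from itertools import islice


def head_gz(path, n):
    """Return the first n lines of a gzip-compressed text file as a list of strings."""
    # "rt" opens the archive in text mode; decompression happens lazily,
    # so only the beginning of the file is actually read.
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return list(islice(f, n))


def write_sample(src_gz, dst, n):
    """Write the first n lines of src_gz to the plain-text file dst."""
    with open(dst, "w", encoding="utf-8") as out:
        out.writelines(head_gz(src_gz, n))
```

For example, write_sample("metadata.json.gz", "sample.json", 10000) would produce the same sample.json as the head command.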