Releases: andrewtavis/wikirec

wikirec 1.0.0

28 Dec 16:21

wikirec 0.2.2

20 May 09:59

Changes include:

  • The WikilinkNN model has been added, allowing users to derive recommendations based on which articles link to the same other Wikipedia articles
  • Examples have been updated to reflect this new model
  • books_embedding_model.h5 is provided for quick experimentation
  • enwiki_books.ndjson has been updated with a more recent dump
  • Function docstring grammar fixes
  • Baseline testing for the new model has been added to the CI

wikirec 0.2.1

29 Apr 11:51

Changes include:

  • Support has been added for gensim 3.8.x and 4.x
  • Wikipedia links are now an output of data_utils.parse_to_ndjson
  • Dependencies in requirement and environment files are now condensed

wikirec 0.2.0

16 Apr 15:21

Changes include:

  • Users can now input ratings to weigh recommendations
  • Fixes for how recommendations from multiple inputs were calculated
  • The package has been switched over to a src structure
  • Code quality is now checked with Codacy
  • Extensive code formatting to improve quality and style
  • Bug fixes and a more explicit use of exceptions
  • More extensive contributing guidelines
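
As a rough illustration of what rating-weighted recommendations can look like, the sketch below averages the similarity rows of the input items, weighted by the user's ratings. This is an assumption for illustration only, not wikirec's actual formula or API: the function name `recommend_weighted`, its parameters, and the toy similarity matrix are all hypothetical.

```python
def recommend_weighted(sim_matrix, titles, inputs, ratings=None, n=2):
    """Hypothetical helper: rank titles by rating-weighted mean similarity to the inputs."""
    if ratings is None:
        # Without ratings, every input counts equally.
        ratings = [1.0] * len(inputs)
    total = sum(ratings)
    input_idxs = [titles.index(t) for t in inputs]

    # Weighted average of each candidate's similarity to every input.
    scores = [
        sum(r * sim_matrix[i][j] for i, r in zip(input_idxs, ratings)) / total
        for j in range(len(titles))
    ]

    # Rank all titles except the inputs themselves, best first.
    candidates = [j for j in range(len(titles)) if j not in input_idxs]
    candidates.sort(key=lambda j: scores[j], reverse=True)
    return [(titles[j], scores[j]) for j in candidates[:n]]


# Toy data (made up): A is close to C, B is close to D.
toy_titles = ["A", "B", "C", "D"]
toy_sim = [
    [1.0, 0.2, 0.9, 0.1],
    [0.2, 1.0, 0.1, 0.8],
    [0.9, 0.1, 1.0, 0.3],
    [0.1, 0.8, 0.3, 1.0],
]

prefers_a = recommend_weighted(toy_sim, toy_titles, ["A", "B"], ratings=[10, 1])
prefers_b = recommend_weighted(toy_sim, toy_titles, ["A", "B"], ratings=[1, 10])
```

With these toy values, rating A highly puts C first and rating B highly puts D first, which is the behavior the ratings input is meant to provide.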

wikirec 0.1.1.7

14 Mar 20:37

Changes include:

  • Multiple infobox topics can be subset at the same time

  • Users have greater control of the cleaning process

  • The cleaning process is verbose and uses multiprocessing

  • The workflow for all models has been improved and explained

  • Methods have been developed to combine modeling techniques for better results

wikirec 0.1.0

08 Mar 19:13

wikirec 0.1.0 (March 8, 2021)

First stable release of wikirec

  • Functions to subset Wikipedia in any language by infobox topics have been provided

  • A multilingual cleaning process that can clean texts of any language to varying degrees of efficacy is included

  • Similarity matrices can be generated from embeddings using the following models:

    • BERT
    • Doc2vec
    • LDA
    • TFIDF
  • Similarity matrices can be created using either cosine or euclidean relations

  • Usage examples have been provided for multiple input types

  • Optimal LDA topic numbers can be inferred graphically

  • The package is fully documented

  • Virtual environment files are provided

  • All modules are extensively tested with GitHub Actions and Codecov

  • A code of conduct and contribution guidelines are included
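
To make the cosine and euclidean options above concrete, here is a minimal standard-library sketch of both kinds of similarity matrix built from a list of embedding vectors. It is not wikirec's implementation: the function names are hypothetical, and the `1 / (1 + distance)` conversion from euclidean distance to similarity is an assumed convention chosen for illustration.

```python
import math


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity for a list of equal-length vectors."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in embeddings]
    return [
        [
            # Zero vectors have no direction, so score them 0.0.
            sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0
            for v, nv in zip(embeddings, norms)
        ]
        for u, nu in zip(embeddings, norms)
    ]


def euclidean_similarity_matrix(embeddings):
    """Pairwise similarity from euclidean distance (assumed form: 1 / (1 + d))."""
    return [
        [1.0 / (1.0 + math.dist(u, v)) for v in embeddings]
        for u in embeddings
    ]


# Toy embeddings (made up): two orthogonal vectors and their sum.
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cos_sim = cosine_similarity_matrix(toy_embeddings)
euc_sim = euclidean_similarity_matrix(toy_embeddings)
```

Cosine similarity depends only on the angle between vectors, while the euclidean variant is also sensitive to their magnitudes, so the two can rank the same candidates differently.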