Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation maintenance and small fixes #82

Closed
14 tasks done
Sauro98 opened this issue Feb 8, 2021 · 14 comments
Closed
14 tasks done

Documentation maintenance and small fixes #82

Sauro98 opened this issue Feb 8, 2021 · 14 comments
Labels
good first issue Good for newcomers help wanted Extra attention is needed

Comments

@Sauro98
Copy link
Member

Sauro98 commented Feb 8, 2021

Here is a collection of work that needs to be done regarding documentation and minor issues with the code itself. Picking an item from this list can be a good way to start getting acquainted with the codebase and provide a useful contribution. 😄 👍🏻

In general, I would say that the pages for the individual algorithms in the linfa-clustering and linfa-elasticnet sub-crates can be good references for the structure of a documentation page.

It's suggested to look at the existing documentation in your local build rather than on doc.rs since some pages may have already been updated.

  • Update linfa-bayes's sub-crate documentation to include a brief description of the algorithm used and add an example of the usage of the provided model along with maybe some hints regarding when to choose a naive Bayes predictor. There are already some examples in the dedicated page for the params structure but maybe it's better to add at least one to the main page for the sub-crate
  • linfa-hierarchical: The documentation for this sub-crate needs improvements like the ones listed for linfa-bayes
  • linfa-ica: Also needs updates similar to linfa-bayes
  • linfa-kernel: Add a description of what kernel methods are useful for, similarly to what's written in the linfa-svm crate, and add descriptions to the Kernel and KernelView subtypes. There are some examples in the KernelBase struct Documentation but maybe it would be a good idea to have an example of kernel transformation directly on the crate's main page.
  • linfa-linear: The crate's documentation is inside the ols module instead of being in the crate's root and it is outdated since it says it only provides an implementation of the least squares algorithm
  • linfa-linear: glm::TweedieRegressor provides its own fit method instead of implementing the Fit trait, and the same for predict. Moving the methods inside the trait would be a good way to align it with the rest of the algorithms provided and ensure compatibility
  • linfa-logistic could use some more documentation to explain the algorithm and some examples in its main page
  • linfa-logistic implements its own fit and predict methods instead of implementing the Fit and Predict traits. Like in the case of linear regression it would be a good thing to align the interface with the other sub-crates
  • linfa-reduction completely lacks any documentation. An explanation of why dimensionality reduction is useful in ML and descriptions of the single algorithms would really help with understanding the usefulness of the crate
  • linfa-svm: this is another reference for what other crates' documentation should look like. Right now the predict method returns an Array1 for classification and a Vec for regression. It would be less confusing if the regression case was modified to return an Array1 too.
  • linfa-trees: it needs changes similar to linfa-bayes, with the addition of documentation regarding the methods for the params structure
  • linfa: The main page of the rustdocs still says that linfa only provides the K-Means algorithm which may be very confusing to someone that only sees the crate in doc.rs for the first time. Here a completely revised page compete with project goals and links to the various sub-crates would be useful, so that one can find the algorithm they need without manually searching for the sub-crates.
  • linfa: The dataset module page could use a bit of explanation about the main differences between the four dataset types and a brief recollection of what utility methods a dataset provides
  • linfa: The metrics module page could use a brief list of the provided metrics so that one does not have to go looking for them in the sup-pages if they only want to know whether a metric is provided or not
@Sauro98 Sauro98 added help wanted Extra attention is needed good first issue Good for newcomers labels Feb 8, 2021
@Sauro98 Sauro98 changed the title Documentation manteinance and small fixes Documentation maintenance and small fixes Feb 8, 2021
@quietlychris
Copy link
Member

I'll be starting work on some of these tonight!

@quietlychris
Copy link
Member

quietlychris commented Feb 10, 2021

Apologies for the delay--I've been having some issues with building the linfa-bayes sub-crate, due to an issue with the rand::rngs::SmallRng import. Not exactly sure why it's happening yet, but I've experienced the same thing on both aarch64 and x86-64 architectures, using compilers rustc 1.149.0-stable and rustc 1.51.0-nightly respectively, after cloning a fresh repository from master. Can anyone else check if they're seeing the same behaviour?

Edit: Never mind, it's only happening when I try running cargo build directly in the linfa-bayes sub-directory, not when I run cargo build --all in the main one. I thought I might be able to build the documentation in each sub-crate separately, but not a huge deal to work from the main library.

@Sauro98
Copy link
Member Author

Sauro98 commented Feb 10, 2021

Can confirm, the same thing is happening to me

@bytesnake
Copy link
Member

I pushed the missing feature flag in a4362fa, do you have any idea why it compiles even without the proper feature flag. The error actually slipped through our CI system :/

@quietlychris
Copy link
Member

I'm not exactly sure. When I run the cargo build --all, linfa_bayes is listed as being built alongside the others. Maybe it has something to do with the dependency not being available from the higher-level crate where it's being called from (rand is not an explicit dependency in linfa_bayes's Cargo.toml file)?

I'm also seeing the same behaviour in linfa-kernel, where the rand::rngs::SmallRng is causing a compilation error, but not in any of the other sub-crates.

@relf
Copy link
Member

relf commented Feb 10, 2021

Hi. For what it's worth, I have no problem with cargo build --all on CentOS7, stable, rustc 1.49, but on Windows 7, stable-msvc, rustc 1.49 I got the following error when compiling linfa-reduction:

error[E0432]: unresolved imports `ndarray_linalg::lobpcg`, `ndarray_linalg::lobpcg`, `ndarray_linalg::TruncatedOrder`
 --> linfa-reduction\src\diffusion_map\algorithms.rs:3:21
  |
3 |     eigh::EighInto, lobpcg, lobpcg::LobpcgResult, Lapack, Scalar, TruncatedOrder, UPLO,
  |                     ^^^^^^  ^^^^^^                                ^^^^^^^^^^^^^^ no `TruncatedOrder` in the root
  |                     |       |
  |                     |       could not find `lobpcg` in `ndarray_linalg`
  |                     no `lobpcg` in the root

error[E0432]: unresolved imports `ndarray_linalg::TruncatedOrder`, `ndarray_linalg::TruncatedSvd`
 --> linfa-reduction\src\pca\algorithms.rs:6:22
  |
6 | use ndarray_linalg::{TruncatedOrder, TruncatedSvd};
  |                      ^^^^^^^^^^^^^^  ^^^^^^^^^^^^ no `TruncatedSvd` in the root
  |                      |
  |                      no `TruncatedOrder` in the root

Not sure it is on linfa. It seems I miss some ndarray-linalg parts. Any idea?

@bytesnake
Copy link
Member

mh strange are you using the current version of ndarray-linalg 0.12.1? Try to perform a cargo update and see whether the dependency is updated

@Sauro98
Copy link
Member Author

Sauro98 commented Feb 10, 2021

It works for me on windows 10, rustc 1.49 stable-x86_64-pc-windows-msvc, both when building the whole workspace and when building linfa-reduction by itself.
Edit: well, the build itself works but then actually running it is a different story since it gives me a bunch of problems with BLAS (AMD processor if that is of any help)

@relf
Copy link
Member

relf commented Feb 10, 2021

cargo update did the job! It works now. cargo build --all should have handled that or did I miss something?

@bytesnake
Copy link
Member

bytesnake commented Feb 10, 2021

@relf no we should fix the ndarray-linalg version to 0.12.1 on the linfa side (done in cf3c611)

@quietlychris I found the reason for your error: cargo build --all seems to take the development dependencies into consideration as well and ndarray-rand uses the small_rng feature flag which got enabled for all sub-crates. linfa-bayes on the other hand does not use ndarray-rand and consequently misses the feature flag.

@relf
Copy link
Member

relf commented Feb 10, 2021

@relf no we should fix the ndarray-linalg version to 0.12.1 on the linfa side

Yep you nailed it, from cargo update output, indeed:

ndarray-linalg v0.12.0 -> v0.12.1

@Sauro98
Copy link
Member Author

Sauro98 commented Mar 10, 2021

Posting here under the "small fixes" label: on mobile the code snippets in the website exit the code box on the right side

@bytesnake
Copy link
Member

with #96 all issues are resolved, but the line break in the website's code box is still smoking

@bytesnake
Copy link
Member

fixed the issue with an overflowing code box in 402668d 🦀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

4 participants