Add PageRank notebook #111

Open
wants to merge 6 commits into
base: main


navyagarwal

Here's the notebook for the PageRank algorithm.

I would love to get an initial review. (I am concerned that some portions might have become too complex.)

(The function my_draw_networkx_edge_labels at the end of the notebook will be added to the main NetworkX repo.)

Member

@dschult dschult left a comment

I've got a partial review here... but I thought I would submit it rather than wait for the time to get through everything in the notebook. I hope it doesn't ramble too much. :)

This looks quite good. These comments and suggestions are meant to discuss areas that may be hard to describe. There is a choice to be made between more detail, and thus maybe too much information, and less detail, and thus maybe too little. So, if anything here is too detailed or not detailed enough, adjust as you see fit.

content/algorithms/pagerank/pagerank.md

To represent the transition probabilities, a transition matrix is constructed as shown above. It is a square matrix whose rows and columns correspond to the states (web pages). Entry $P_{ij}$ of the matrix is the probability of transitioning from state $i$ to state $j$.

The fundamental components of a Markov chain are the set of states, transition probabilities, and an initial state distribution.
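
As a minimal sketch of these three components, assuming a hypothetical three-page web (the pages `A`, `B`, `C` and their links are made up for illustration) and a surfer who follows outgoing links uniformly at random, the states, transition matrix, and initial distribution could be written with NumPy roughly like this:

```python
import numpy as np

# Hypothetical 3-page web: each key links to the pages in its list.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}

# The set of states (web pages), with a fixed row/column index for each.
states = sorted(links)
index = {s: k for k, s in enumerate(states)}

# P[i, j] = probability of moving from state i to state j,
# assuming one outgoing link is chosen uniformly at random.
P = np.zeros((len(states), len(states)))
for src, targets in links.items():
    for dst in targets:
        P[index[src], index[dst]] = 1 / len(targets)

# Initial state distribution: start on any page with equal probability.
x0 = np.full(len(states), 1 / len(states))

# Distribution after one step (row-vector convention: x_next = x @ P).
x1 = x0 @ P
```

Each row of `P` sums to 1 because every page in this toy example has at least one outgoing link; pages with no outgoing links would need separate handling.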
Member

These components are the "inputs" to the Markov chain. Another important component is the output: the chance of being in each state in the long run.

Author

I believe the long-run probability of being in each state is the output of computing the stationary distribution of a Markov chain, not inherently a component of it. Also, there exist periodic and reducible Markov chains that do not have a stable long-run probability distribution.
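
To illustrate the distinction, here is a small sketch under assumed inputs: the two 2-state matrices and the helper `iterate` below are hypothetical examples, not taken from the notebook. Repeatedly applying the transition matrix settles down for an irreducible, aperiodic chain, while for a periodic chain the distribution keeps oscillating when started from a single state.

```python
import numpy as np

def iterate(P, x0, steps):
    """Apply x <- x @ P `steps` times and return the final distribution."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x @ P
    return x

# Irreducible, aperiodic chain: iterates approach a stationary distribution.
P_ergodic = np.array([[0.5, 0.5],
                      [0.2, 0.8]])

# Periodic chain (period 2): the walker switches state at every step.
P_periodic = np.array([[0.0, 1.0],
                       [1.0, 0.0]])

start = [1.0, 0.0]  # begin in the first state with certainty

ergodic_50 = iterate(P_ergodic, start, 50)
ergodic_51 = iterate(P_ergodic, start, 51)    # essentially unchanged
periodic_50 = iterate(P_periodic, start, 50)
periodic_51 = iterate(P_periodic, start, 51)  # flipped relative to step 50
```

Note that the periodic chain still has a vector satisfying $\pi P = \pi$ (here, equal weight on both states); what fails is convergence of the iterates from this starting point.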

Member

Yes, I was using the word "component" as you do in the text: not in the sense of a graph component, but in the sense of a part of the Markov chain process. Maybe the wording could be "The fundamental parts of a Markov chain...", or, with the wording I used, "The fundamental inputs to a Markov chain...".

@MridulS
Member

MridulS commented Jul 10, 2023

@navyagarwal I have fixed up the notebooks (heading levels and linting) so they pass our CI. You will need to pull the changes down to your local machine with `git pull origin pagerank` before pushing any new changes. Are you using Google Colab to make the notebooks, or writing them locally? The linting needs to be set up with pre-commit before pushing :)

@navyagarwal
Author

@MridulS I am writing the notebooks locally. I didn't know about the linting before, but I think I have found the instructions now. This is the one, right?

@MridulS
Member

MridulS commented Jul 10, 2023

Yeah, those are the ones :) but they miss one step (a `git add content/`) after the conversion to ipynb. I would suggest looking at the CI workflow step for linting: https://github.com/networkx/nx-guides/blob/main/.github/workflows/notebooks.yml

As you can see, this is not very intuitive at the moment, and we are stuck with this process for now 😅, but we really should do a better job of documenting everything here.

Member

@dschult dschult left a comment

Here are some minor wording suggestions for the PageRank notebook. They basically finish the suggestions I started a few months ago. You can adapt, change, or ignore any and all of them. Thanks for the nice PageRank description!

content/algorithms/pagerank/pagerank.md
Co-authored-by: Dan Schult <dschult@colgate.edu>
3 participants