Pavel's changes #70


Closed · wants to merge 2 commits

Conversation

pavelkomarov
Contributor

@pavelkomarov pavelkomarov commented Nov 3, 2020

Like I say in my own readme: "I've simplified and updated the code, instituted rate limiting so exporting large or all playlists actually works, gotten rid of the outdated tests, set up automatic deployment to github pages, fixed a parsing bug, enhanced the set of features, added logout functionality, and provided an ipython notebook to do something interesting with the data."

@pavelkomarov
Contributor Author

This PR is just me riding roughshod over your repo. You'll probably want to modify it some, but at least this way you have access to all my changes as a starting point. I'd prefer you work from here so some lines of code get attributed to me.

@pavelkomarov
Contributor Author

pavelkomarov commented Nov 3, 2020

There are some oddities about this PR that I think you will need to change before you merge. For instance, I've left you a CNAME which is identical to mine. If two repos try to hook into the same website, I'm not sure how GitHub handles that. We may need to get you your own URL. It looks like you were going to use your .io/exportify location, which works but isn't as snazzy.

@pavelkomarov
Contributor Author

Another thing is I completely removed your tests. You speak elsewhere about wanting to update those anyway. Honestly not having them has been fine for me. I had some great help from @shape55 early on to smoke out major bugs, and since then people seem to be pretty good about finding the repo and opening issues if they encounter a problem. Best practice is to have tests, but I don't have a whole lot of experience testing web apps, so I'm curious about what the most convenient and useful way to do this would be.

@watsonbox
Owner

watsonbox commented Nov 3, 2020

🙌 Thanks so much for taking the time to backport some of this stuff.

I'll do everything I can to make sure as much gets attributed to you as possible, as I've been doing today with the other PRs 👍 .

@watsonbox
Owner

@pavel-aicradle I'm replying to your comment here as it's more relevant on this PR.

I did start trying to merge in this PR, and indeed got through a few commits which I was building up by hand, but it became increasingly difficult because:

  1. Having merged in older PRs from other contributors, I needed to be extra careful to avoid overwriting their changes. Without a passing test case for each of those historical updates, this was far from trivial.
  2. Much of your React refactoring is based on this observation:
    // There used to be JSX syntax in here, but JSX was abandoned by the React community because Babel does it better.
    // Around the web there seems to be a movement to not use this syntax if possible, because it means you literally
    // have to pass this .js file through a transformer to get pure JavaScript, which slows down page loading significantly.
    
    However, that doesn't seem to be the case (see this, for example). I initially decided to go with a dynamic transform to avoid a compile step, and for such a small library I figured the overhead would be negligible. Also, with RawGit there wasn't any explicit deploy step.
  3. I'm not sure of the approach to the problem of rate limiting (more to come on that).
  4. I wanted to completely upgrade the React dev stack (which you see in Update React development stack #72). This should allow for much better code structuring, testing, and dependency management, more robust dev tooling, leveraging many more libraries, etc. One of the most important parts for me is the testing library, which is much improved over what I put in place 5 years ago.

My plan is therefore to work on that, get it all in place, then merge in your Jupyter notebook and credit this work in the README.

@pavelkomarov
Contributor Author

pavelkomarov commented Nov 10, 2020

You should also consider keeping the place where I say "This is where the magic happens" and wherever I solve the get-rid-of-jquery problem. In general this structure with all the promises and async/awaits is way more straightforward than the telescoping callbacks of old, and the fact your code was missing these new language features was what frustrated and confused me enough to rewrite rather than amend. If it overwrites someone else's changes, they're still labeled a contributor. That's 4 lyfe.

My rate limiting is based on simply waiting, accounting for how many queries Spotify's servers are willing to take per unit time per client ID. I see no good reason to use more than one client ID: a single ID lets you see your service's usage in the dev dashboard, which is pretty neat, and as far as I know you can't dynamically generate IDs to spread load anyway. If one user at a time makes queries, it will always work. If two or more happen to click the button at the exact same time, there is no logic to slow the querying down or queue them, so it's possible the server would get overwhelmed and return those concurrent users 429 errors. But this is exceedingly unlikely unless your service gets an enormous number of users. If you truly want to scale, you'll have to consider a better solution. You could hack it and just lower everyone's speed limit, since a collision of n users at once is even more unlikely. But that's gross.
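The wait-based approach described above can be sketched roughly like this (hypothetical helper names and an assumed per-second quota, not the actual code in the fork): enforce a minimum gap between successive requests so a single client ID stays under the limit.

```javascript
// Hypothetical wait-based throttle illustrating the idea above (not
// Exportify's actual code). `maxPerSecond` is an assumed quota for one
// client ID; requests from a single user are spaced out accordingly.
function makeThrottledFetcher(fetchFn, maxPerSecond) {
  const minGapMs = 1000 / maxPerSecond;
  let nextSlot = 0; // earliest epoch-ms time the next request may fire
  return async function throttledFetch(url) {
    const wait = Math.max(0, nextSlot - Date.now());
    nextSlot = Date.now() + wait + minGapMs; // reserve the following slot
    if (wait > 0) await new Promise(resolve => setTimeout(resolve, wait));
    return fetchFn(url);
  };
}
```

As the comment notes, this paces one browser tab only; two concurrent users each run their own timer, which is exactly the collision scenario described above.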

I'm not as attached to my React rework. Again, I didn't and still don't have much experience in it. I'm glad they've updated JSX in the time since, because it seemed like a way less verbose solution than all those React calls.
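For context on the JSX point: JSX is syntactic sugar that a transformer like Babel compiles into plain React.createElement calls, which is the verbosity being traded off above. The snippet below uses a minimal stand-in for createElement purely to illustrate the element shape; it is not React itself.

```javascript
// Minimal stand-in for React.createElement, for illustration only; real
// React produces a similar { type, props } element object.
const React = {
  createElement: (type, props, ...children) => ({
    type,
    props: { ...props, children },
  }),
};

// Roughly what JSX like `<a href="https://example.com">Exportify</a>`
// compiles down to, with no transform step required:
const el = React.createElement('a', { href: 'https://example.com' }, 'Exportify');
```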

@watsonbox
Owner

watsonbox commented Nov 12, 2020

I'm not sure of the approach to the problem of rate limiting (more to come on that)

@pavel-aicradle Hopefully #75 should show more clearly what I meant by that.

To summarize:

  • Any waiting logic / throttling we use isn't guaranteed to be future-proof for a single user
  • Per-user throttling can still fail due to the cumulative throughput of multiple users, as you point out
  • The better solution you mention is IMO to use the Retry-After header, and it's the one Spotify recommend
  • I'm very keen to avoid duplicating the rate limiting strategy in the code, or even having it visible in the main application logic at all

That's the reason for working on a new implementation instead of merging what you have here. Happy to have your thoughts.
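The Retry-After strategy from the bullets above can be sketched as follows (assumed fetch-like shapes, not the code in #75): on a 429 response, sleep for the number of seconds the server specifies, then retry the same request.

```javascript
// Sketch of the Retry-After handling Spotify recommends (assumed API
// shapes, not the actual #75 implementation). `fetchFn` is any function
// returning a Response-like object with `status` and `headers.get`.
async function fetchWithRetryAfter(fetchFn, url, maxRetries = 5) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await fetchFn(url);
    if (response.status !== 429) return response;
    // Spotify sends Retry-After in whole seconds; fall back to 1s if the
    // header is missing or unparseable.
    const seconds = Number.parseInt(response.headers.get('Retry-After'), 10);
    const waitMs = (Number.isNaN(seconds) ? 1 : seconds) * 1000;
    await new Promise(resolve => setTimeout(resolve, waitMs));
  }
  throw new Error(`Still rate limited after ${maxRetries} retries: ${url}`);
}
```

The appeal of this shape is exactly the point in the last bullet: the retry policy wraps the transport, so application code never sees a 429.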

@pavelkomarov
Contributor Author

Where do you intend to sequester it? A separate utils.js file or something?

@watsonbox
Owner

Where do you intend to sequester it? A separate utils.js file or something?

You can see that all the rate limiting code is tucked away in the helpers module. It could in fact be removed or completely re-implemented without any change to the main application logic.

https://github.com/watsonbox/exportify/pull/75/files#diff-82be894eff44fefa4b9947fab234f0ac2966b9871d71b9beaed6848ea711536cR27-R52

As usual I'm trying to move in increments here to avoid releasing many different moving parts, but it's true that the helpers module itself is not a great place for the transport layer, which should really have its own module. I figured I'd deal with that at the same time as replacing $.ajax for the requests themselves, but the rate limiting approach should work with anything promise-based.
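One way to read "tucked away in the helpers module ... without any change to the main application logic" is a serializing queue (an assumed design sketch, not the actual helpers module): callers hand over promise-returning tasks, and the pacing policy lives in one swappable place.

```javascript
// Assumed sketch: a queue that serializes promise-returning tasks with a
// pause between them. The delay policy could be removed or re-implemented
// without touching any caller, which is the property described above.
function makeRequestQueue(delayMs) {
  let tail = Promise.resolve();
  return function enqueue(task) {
    const result = tail.then(task);
    // Advance the tail whether the task succeeded or failed, then pause
    // before the next task is allowed to start.
    tail = result
      .catch(() => {})
      .then(() => new Promise(resolve => setTimeout(resolve, delayMs)));
    return result;
  };
}
```

Because it works with anything promise-based, the same wrapper would apply whether the underlying requests use $.ajax-style promises or fetch.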

@watsonbox watsonbox mentioned this pull request Nov 20, 2020
@watsonbox
Owner

Hey @pavel-aicradle 👋

Thanks again for all the work in this PR. You should have seen by now that I merged in what I could and credited you on those commits. This includes adding the necessary data for compatibility with your Jupyter notebook statistical analysis.

As for that analysis work itself, I figure there are three possibilities:

  1. Merge it into this repo
  2. Link to it from this repo
  3. Leave it completely separate

I thought about it a bit, and my view is that (1) means having Python code in this repo which I almost certainly wouldn't maintain or enhance, and it kind of goes against my philosophy for this project, which is that it should be focused more on exporting than presentation. (2) would have been my preference, but since your repo is primarily a hard fork of the whole app, I expect this would be very confusing for users, especially given the current contents of the README.

For that reason I've gone for option (3), but ultimately if this work were converted into a repo (exportify-statistical-analysis?) with a more technical focus on how to use the Jupyter Notebook in collaboration with Exportify, we could certainly revisit it.
