This repository has been archived by the owner on Nov 1, 2021. It is now read-only.

All endpoints are not available since saturday #88

Open
ssaurel opened this issue Apr 10, 2017 · 30 comments

Comments

@ssaurel

ssaurel commented Apr 10, 2017

Hello,

It seems that all endpoints have been unavailable since Saturday. I now get an "Access Denied" error, whereas everything worked fine until Friday, 04/07/2017.

Are other users having the same problem?

Sylvain

@jorgegil96

Noticed this on a separate app of mine that uses the same endpoints.

Testing with http://stats.nba.com/stats/playoffpicture?LeagueID=00&SeasonID=22015 it seems like the endpoint is still public but they're detecting non-browser requests from programs like nba_py.
Adding a fake browser user agent to the headers is not working anymore.

Would be nice to get some help from someone with more networking experience.

@ssaurel
Author

ssaurel commented Apr 11, 2017

I tested the scoreboardV2 endpoint in my browser : http://stats.nba.com/stats/scoreboardV2?DayOffset=0&LeagueID=00&gameDate=04/09/2017

I get the Access Denied message in response.

@jorgegil96

Ah I hadn't noticed, that really sucks.

@ssaurel
Author

ssaurel commented Apr 11, 2017

Yes, I also used the Referer header previously to make the calls work. But it seems they have changed their API to accept only same-domain requests.

@bttmly

bttmly commented Apr 11, 2017

However, you can still make these requests from their website. For instance open up http://stats.nba.com/ and then pop open the console and enter:

fetch("http://stats.nba.com/stats/scoreboardV2?DayOffset=0&LeagueID=00&gameDate=04/09/2017")
  .then(resp => resp.json())
  .then(data => console.log(data))

it works fine! (assuming you have a modern browser). So the question is, how do their API servers tell apart these requests from ones we send with a programmatic HTTP client? I don't know much about the minutiae of HTTP. It seems like it should be possible to spoof whatever they are doing. In fact if you use Chrome's "copy as cURL" feature, that works too

curl 'http://stats.nba.com/stats/scoreboardV2?DayOffset=0&LeagueID=00&gameDate=04/09/2017' -H 'DNT: 1' -H 'Accept-Encoding: gzip, deflate, sdch' -H 'Accept-Language: en' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36' -H 'Accept: */*' -H 'Referer: http://stats.nba.com/' -H 'Connection: keep-alive' --compressed

More to the point, I'm curious where these failed requests are originating from. There is another issue suggesting that they may be blocking requests from AWS or other cloud providers. FWIW the test suite for this repo passes for me on my local machine, as do the tests for a similar Node.js client.
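
For anyone who would rather reproduce the "copy as cURL" request from Python, here is a minimal sketch using only the standard library. The header values are the ones from the curl command above; the names `BROWSER_HEADERS`, `SCOREBOARD_URL`, and `build_request` are made up for illustration. It only builds the request object, so sending it (and finding out whether the server accepts it) is left to the caller.

```python
from urllib.request import Request

# Header values copied from the "copy as cURL" command in this comment.
BROWSER_HEADERS = {
    "DNT": "1",
    "Accept-Encoding": "gzip, deflate, sdch",
    "Accept-Language": "en",
    "User-Agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/57.0.2987.133 Safari/537.36"
    ),
    "Accept": "*/*",
    "Referer": "http://stats.nba.com/",
    "Connection": "keep-alive",
}

SCOREBOARD_URL = (
    "http://stats.nba.com/stats/scoreboardV2"
    "?DayOffset=0&LeagueID=00&gameDate=04/09/2017"
)


def build_request(url, headers=BROWSER_HEADERS):
    """Return an unsent Request that carries the browser-like headers."""
    return Request(url, headers=dict(headers))


request = build_request(SCOREBOARD_URL)
# urllib normalizes header names with str.capitalize(), e.g. "User-agent".
print(sorted(name for name, _ in request.header_items()))
```

To actually send it, something like `urllib.request.urlopen(request, timeout=10)` should do. Note that urllib does not decompress gzip responses on its own, so either drop the `Accept-Encoding` header or decompress the body yourself.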

@jakejones

For those of you who only need it to work on a single server: my solution was to visit stats.nba.com/scores using a browser on my server (assuming you have a GUI), then replicate the request headers the browser sent in the curl request that my app uses. I don't know yet if this will work long term, but it seems to be working for now.
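
If you copy the request headers out of the browser's dev tools as plain `Name: value` lines, a small helper can turn them into a dict for your HTTP client. This is only a sketch: `parse_header_lines` is a hypothetical name, and the sample lines are taken from headers mentioned elsewhere in this thread.

```python
def parse_header_lines(raw):
    """Turn 'Name: value' lines (as shown in browser dev tools) into a dict."""
    headers = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines, HTTP/2 pseudo-headers (":authority" etc.),
        # and anything that does not look like a header.
        if not line or line.startswith(":") or ":" not in line:
            continue
        name, value = line.split(":", 1)  # split once; values may contain ":"
        headers[name.strip()] = value.strip()
    return headers


copied = """
Accept: */*
Accept-Language: en
Referer: http://stats.nba.com/
"""
print(parse_header_lines(copied))
```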

@ccagrawal

I have a similar package in R, and I ran into the same issue.

I was able to fix it by adding the following Request Header:
'Accept-Language' = 'en-US,en;q=0.8,af;q=0.6'

@imjcham

imjcham commented Apr 20, 2017

Looks like stats.nba.com is restricting based on User-Agent, from what I can tell. I found that Chrome/48 and higher work, while anything below Chrome/48 gets blocked. Wondering what kind of weird firewall or rule this is...
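
To sanity-check a User-Agent string against that observation before sending it, something like the following might help. It is a rough sketch: the threshold of 48 is just what was observed here and is not documented anywhere, and both function names are made up.

```python
import re


def chrome_major_version(user_agent):
    """Return the Chrome major version in a User-Agent string, or None."""
    match = re.search(r"Chrome/(\d+)", user_agent)
    return int(match.group(1)) if match else None


def looks_accepted(user_agent, minimum=48):
    """True if the UA advertises Chrome >= minimum (an observed cutoff only)."""
    version = chrome_major_version(user_agent)
    return version is not None and version >= minimum


ua = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36")
print(chrome_major_version(ua), looks_accepted(ua))
```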

@ccagrawal

ccagrawal commented Apr 21, 2017

@USJake Here are all my headers:

add_headers(
      'Accept-Language' = 'en-US,en;q=0.8,af;q=0.6',
      'Referer' = 'http://stats.nba.com/player/',
      'User-Agent' = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36'
    )

@ssaurel
Author

ssaurel commented Apr 21, 2017

@ccagrawal Does it work for all the endpoints for you?

@ssaurel
Author

ssaurel commented Apr 21, 2017

I tried the same solution as @bttmly and @UnfixedSNIPERJ in PHP with a cURL session, but it doesn't work. This is the code I tried:

<?php

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, "http://stats.nba.com/stats/scoreboardV2?DayOffset=0&LeagueID=00&gameDate=04/09/2017");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "GET");

curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate');

$headers = array();
$headers[] = "Dnt: 1";
$headers[] = "Accept-Encoding: gzip, deflate, sdch";
$headers[] = "Accept-Language: en";
$headers[] = "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36";
$headers[] = "Accept: */*";
$headers[] = "Referer: http://stats.nba.com/";
$headers[] = "Connection: keep-alive";

curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

$result = curl_exec($ch);

if (curl_errno($ch)) {
    echo 'Error:' . curl_error($ch);
} else {
    echo $result;
}

curl_close ($ch);

?>

@ssaurel
Author

ssaurel commented Apr 23, 2017

@UnfixedSNIPERJ I have curl version 7.21.0 installed. Can you tell me which version you have?

@jakejones

@ssaurel That code worked on my local machine (curl version: 7.51.0), but interestingly it didn't work on my server (Digital Ocean, curl version: 7.47.0).

However adding the following header made it work on my server also:
$headers[] = "origin: http://stats.nba.com";

@ssaurel
Author

ssaurel commented Apr 23, 2017

@jakejones With this last change, it works now! My PHP scripts work great :). If any of you are interested in a PHP version, tell me and I will create a repository on GitHub. For example: http://www.ssaurel.com/baskethoops/index.php?date=04/22/2017

@samody

samody commented May 6, 2017

I have an application that pulls a lot of data once a day from multiple endpoints. I started threading it a couple of weeks ago to speed the process up, which admittedly wasn't very kind to stats.nba.com. Shortly after that, it started failing. Today, using Wireshark, I could see that the app would get through 100 requests or so, and then the endpoint would send me only an ACK and no data.

Also today, I updated my headers to include a current User-Agent, but it made no real difference. However, I did find that slowing my request rate down to 1/second let me get a little further, with ultimately the same result. I'm going to add all the headers mentioned above and try again.

Same result when running the app from a different location / WAN IP address.

@bttmly I too noticed that they're not accepting requests from AWS.

How many requests are you guys making in total? and how quickly?

EDIT
Added these headers, combined with a request rate of 1/second, and every request is now being fulfilled. YAY!

    'user-agent': ('Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36'),
    'Dnt': ('1'),
    'Accept-Encoding': ('gzip, deflate, sdch'),
    'Accept-Language': ('en'),
    'origin': ('http://stats.nba.com')
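
In case it helps, the pacing part could be sketched roughly like this. It assumes nothing about the real limit beyond the 1 request/second that worked here; `Throttle` and `fetch_all` are made-up names, and `fetch` stands in for whatever function actually performs the HTTP call with the headers above.

```python
import time


class Throttle:
    """Ensure at least min_interval seconds pass between consecutive calls."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_all(urls, fetch, throttle):
    """Call fetch(url) for each URL sequentially, pausing between calls."""
    results = []
    for url in urls:
        throttle.wait()
        results.append(fetch(url))
    return results
```

Usage would be along the lines of `fetch_all(urls, my_fetch, Throttle(1.0))`, where `my_fetch` is your own function. The clock and sleep parameters are injectable only so the pacing can be checked without really waiting.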

@johngriebel

Are people still having this issue? I have an application that hits stats.nba.com with some code I've written as well as nba_py. It's been a month or so since I last worked on my application, and I was planning on picking it up again soon, but this may cause significant problems obviously.

Playing around in the shell, nba_py seems to be working fine. Can anybody else confirm? Perhaps the terminal seems fine because I am not exceeding a rate limit, as @samody suggested?

@gyerli

gyerli commented May 29, 2017

Well, folks, I think this is the end of public programmatic use of stats.nba.com.

My code was running on AWS and it was OK until around 4/6/2017. I migrated to my local machine, and it's still not working. I tried all the header variations, VPN connections, etc. No luck... Most of the HTTP requests are blocked.

I'm still trying, but I have little to no hope. I will let you guys know if I figure something out.

@johngriebel

@gyerli Things seem to be working fine for me this morning. Perhaps we could compare configurations or something?

@gyerli

gyerli commented May 31, 2017

@johngriebel What is working for you? Even test_nba_py.py is stuck on HTTP request. Please try this and let me know if it works.

import nba_py

def test():
    a = nba_py.Scoreboard(month=2, day=21, year=2015)
    print(a)
test()

@johngriebel

@gyerli Works fine for me.
nba_py test

@bttmly

bttmly commented May 31, 2017

@johngriebel where did you run that? It seems to work for people from their local machines but not from cloud instances (particularly AWS but also a few others)

@johngriebel

@bttmly Well that makes sense, this was on a server I host myself. It's strange that @gyerli can't get things working even locally.

@ssaurel
Author

ssaurel commented Jun 2, 2017 via email

@TrevorMcCormick
Contributor

TrevorMcCormick commented Jun 28, 2017

@samody I can confirm that the headers added to the request work for me, using my own IPython notebook on a local machine. I'm making one request at a time, but I will look into rate limits.

import json
import requests

# Gets LeBron's common player info
url = 'http://stats.nba.com/stats/commonplayerinfo/?playerid=2544'
headers = {'user-agent': ('Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36'),
    'Dnt': ('1'),
    'Accept-Encoding': ('gzip, deflate, sdch'),
    'Accept-Language': ('en'),
    'origin': ('http://stats.nba.com')}
# Send the request with the browser-like headers, then parse the JSON body
response = requests.get(url, headers=headers)
data = json.loads(response.text)
# Pair each column name with the value from the first (and only) row
keys = data['resultSets'][0]['headers']
values = data['resultSets'][0]['rowSet'][0]
dict(zip(keys, values))
#Returns...
{u'BIRTHDATE': u'1984-12-30T00:00:00',
 u'COUNTRY': u'USA',
 u'DISPLAY_FIRST_LAST': u'LeBron James',
 u'DISPLAY_FI_LAST': u'L. James',
 u'DISPLAY_LAST_COMMA_FIRST': u'James, LeBron',
 u'DLEAGUE_FLAG': u'N',
...
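
The same headers/rowSet pairing generalizes to result sets with more than one row. A small sketch, assuming the payload has the `resultSets` / `headers` / `rowSet` shape used in the snippet above; the function name and the trimmed sample payload are illustrative only.

```python
def result_set_to_dicts(payload, index=0):
    """Convert one entry of payload['resultSets'] into a list of row dicts."""
    result_set = payload["resultSets"][index]
    keys = result_set["headers"]
    return [dict(zip(keys, row)) for row in result_set["rowSet"]]


# Trimmed, hand-written sample in the same shape as the response above.
sample = {
    "resultSets": [
        {
            "headers": ["DISPLAY_FIRST_LAST", "COUNTRY"],
            "rowSet": [["LeBron James", "USA"]],
        }
    ]
}
print(result_set_to_dicts(sample))
```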

@ejbrennan99

It's also possible that the exact same setup might work from one location (IP address) but not from another. It's not impossible that a particular IP address that is making lots of requests is getting blacklisted permanently, much like they seem to have done with the AWS IP ranges.
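
Nobody outside the NBA can see what is actually blocked, but you can at least check whether your own server's address falls inside a cloud provider's published ranges. A sketch with the standard `ipaddress` module follows. The ranges below are documentation placeholders, not real AWS ranges, and `in_any_range` is a made-up name.

```python
import ipaddress


def in_any_range(address, cidr_ranges):
    """True if address falls inside any of the given CIDR blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in ipaddress.ip_network(cidr) for cidr in cidr_ranges)


# Placeholder blocks reserved for documentation; substitute the ranges
# your cloud provider publishes.
PLACEHOLDER_RANGES = ["198.51.100.0/24", "203.0.113.0/24"]

print(in_any_range("198.51.100.7", PLACEHOLDER_RANGES))
print(in_any_range("192.0.2.1", PLACEHOLDER_RANGES))
```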

Does anyone know if the NBA is actually trying to prevent people from using this data at all, or just trying to limit the number of requests? It seems odd that they have a public API that requires no authentication, and yet seem to work very hard to prevent people from using it, especially when the correct solution (if they were in fact trying to deny access) would be to implement some security on the endpoints.

@johngriebel

@ejbrennan99 I think they are trying to limit requests at the very least. I've been testing load limits, and it seems I can get about 15 requests in almost instantaneously. After that point, queries hang indefinitely. I haven't tried throttling yet to see what the limits might be, nor have I figured out how long the lockout period is.
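
Since blocked queries hang rather than fail, it is probably worth always setting a timeout and backing off when it fires. A rough sketch, assuming `fetch` is your own function that performs the request with a timeout and raises `TimeoutError` (or `socket.timeout`) when it expires; the delays are guesses, since the lockout period is unknown.

```python
import socket
import time


def fetch_with_backoff(fetch, url, attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fetch(url); after a timeout, wait base_delay * 2**attempt and retry.

    Re-raises the timeout once the final attempt has failed.
    """
    for attempt in range(attempts):
        try:
            return fetch(url)
        except (TimeoutError, socket.timeout):
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

With the defaults this waits 2, 4, and then 8 seconds between the four attempts. If the lockout turns out to be longer than that, the delays would need to grow accordingly.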

@ngxiaoyi

@johngriebel I see the same rate limit in my own tests.

@samody

samody commented Aug 4, 2017

Best bet to obtain reliability is to slow your request rate. If you fire off another request as soon as the previous is fulfilled, the endpoint stops responding with data and only sends you an ACK. Delaying by 1 second in between requests yields success. It does take some time to get all the data though.

Threading the requests was fun while it lasted ( hope this isn't my fault 🐙 )

@ryomayes

ryomayes commented Aug 4, 2017

After trying many different R and Python NBA endpoint frameworks, this one is the only one that seems to work as of today. @ccagrawal your R package is fantastic, but it looks like the headers are outdated. I forked your repo and copy/pasted the headers from nba_py and it works on my local machine now.

Noted on the rate limit. Hopefully this is as strict as the NBA gets.

@huangzhenyu

It has stopped working again!
