Players not being considered in prediction #383

Closed
Tdarnell opened this issue Aug 5, 2021 · 5 comments
Labels
bug Something isn't working

Comments

@Tdarnell
Contributor

Tdarnell commented Aug 5, 2021

It seems that players (including big scorers such as Mo Salah) are not having their points predicted for the 2122 season when running airsenal_run_prediction.

For example the only mention of Salah in my terminal output for the usual initialise database, update and run prediction pipeline is:
Filling history dataframe for Mohamed Salah: 0/206 done
Filling history dataframe for Mohamed Salah: 0/206 done

Since I'd consider him a nailed-on player, it seems odd that the algorithm doesn't even consider him as a points scorer.

@nbarlowATI
Member

nbarlowATI commented Aug 5, 2021

Hi @Tdarnell, it does seem that something went wrong somewhere along the chain...
I just ran the init, update, predict pipeline, and it predicted Salah to be the top-scoring midfielder over the next 3 matches (19.06 pts).
To try and debug, could you try:

python
>>> from airsenal.framework.utils import *
>>> p = get_player("Mohamed Salah")
>>> tag = get_latest_prediction_tag()
>>> get_predicted_points_for_player(p, tag)

This should return a dict keyed by gameweek, with the points predictions for each week. If it's all zeros, then something went wrong (maybe the model fitting failed 3 times, which is all the retries it normally gets).
If you don't get a dict at all, then something more fundamental probably went wrong at some earlier stage of database filling...
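The three outcomes described above (real per-gameweek predictions, all zeros, or no dict at all) can be told apart with a small helper. This is only a sketch: `classify_predictions` is a hypothetical name and not part of AIrsenal, and it assumes that the value returned by `get_predicted_points_for_player` is passed in unchanged. Treating an empty dict the same as "no dict" is also an assumption.

```python
def classify_predictions(preds):
    """Classify the result of a predicted-points lookup.

    `preds` is assumed to be whatever get_predicted_points_for_player
    returned. This helper is hypothetical and not part of AIrsenal.
    """
    if not isinstance(preds, dict) or not preds:
        # No dict (or an empty one): points to a problem at an
        # earlier stage, such as filling the database.
        return "missing"
    if all(points == 0 for points in preds.values()):
        # Every gameweek is zero: the model fitting probably failed.
        return "all_zero"
    return "ok"


print(classify_predictions({1: 0.0, 2: 0.0}))  # all_zero
print(classify_predictions({1: 6.2, 2: 5.9}))  # ok
print(classify_predictions(None))              # missing
```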

@nbarlowATI
Member

(note that the unreliability of fitting the team model is part of why we are hoping to be able to switch from Stan to numpyro before the start of the season (though this is now getting a bit tight! :) ))

@Tdarnell
Contributor Author

Tdarnell commented Aug 5, 2021

They are indeed all 0's:

{1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0, 10: 0.0, 11: 0.0, 12: 0.0, 13: 0.0, 14: 0.0, 15: 0.0, 16: 0.0, 17: 0.0, 18: 0.0, 19: 0.0, 20: 0.0, 21: 0.0, 22: 0.0, 23: 0.0, 24: 0.0, 25: 0.0, 26: 0.0, 27: 0.0, 28: 0.0, 29: 0.0, 30: 0.0, 31: 0.0, 32: 0.0, 33: 0.0, 34: 0.0, 35: 0.0, 36: 0.0, 37: 0.0, 38: 0.0}

I should note that I have tried running this on two separate computers and had the same result both times, so there seems to be a systematic problem with my setup that causes this.

Both are running Ubuntu 20.04 in Windows Subsystem for Linux (WSL), and both times I was using a miniconda3 environment. I will test with system Python now and see if I get a different result.

@nbarlowATI
Member

Hmm... ok, my main suspect would be the Stan model - I guess if the team model failed 3 times it wouldn't have got as far as filling the player dataframes, but you can check that you have something like

Fitting team model...
attempt 1 of 3... SUCCESS!

in your output from airsenal_run_prediction (I think searching for "SUCCESS!" should be sufficient).

For the player-level model, it fits separately for "DEF", "MID", "FWD" - if it works, there should be something in your airsenal_run_prediction output a bit like:

Fitting player model for MID ...
Initial log joint probability = -38755.1
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes 
      14      -18847.1    0.00339835     0.0196027           1           1       16   
Optimization terminated normally: 

Or, to see if it might be some problem with filling the historic scores in the database, a quick check is

python
>>> from airsenal.framework.utils import *
>>> p = get_player("Mohamed Salah")
>>> len(p.scores)

It should be 114 (corresponding to 38 matches x 3 seasons).
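If the output of airsenal_run_prediction has been saved to a file or captured as text, the log checks described above can be done in one pass. The following is a rough sketch rather than anything provided by AIrsenal: `summarise_prediction_log` is a hypothetical helper, and the marker strings are copied from the example output in this comment, so they may differ between versions.

```python
import re


def summarise_prediction_log(log_text):
    """Report which fitting stages appear in captured output.

    Hypothetical helper. The marker strings come from the example
    output quoted in this thread and are assumptions about the format.
    """
    return {
        # The team model prints "SUCCESS!" when one of its attempts works.
        "team_model_ok": "SUCCESS!" in log_text,
        # Positions (e.g. DEF, MID, FWD) whose player model started fitting.
        "player_models_started": re.findall(
            r"Fitting player model for (\w+)", log_text
        ),
        # Number of optimizer runs that reported a normal finish.
        "optimizations_completed": log_text.count(
            "Optimization terminated normally"
        ),
    }


sample = """Fitting team model...
attempt 1 of 3... SUCCESS!
Fitting player model for MID ...
Optimization terminated normally:
"""
print(summarise_prediction_log(sample))
```

A position that shows up under "player_models_started" without a matching normal finish is a good place to start reading the raw output.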

@Tdarnell
Contributor Author

Tdarnell commented Aug 5, 2021

System Python runs correctly in WSL, so I assume it is a problem with conda environments within a Linux container.

> but you can check that you have something like
>
> Fitting team model...
> attempt 1 of 3... SUCCESS!

I did have this.

> For the player-level model, it fits separately for "DEF", "MID", "FWD" - if it works, there should be something in your airsenal_run_prediction output a bit like:

I had not noticed this until looking for the "MID" model, but I am getting these errors during the MID prediction:
sqlalchemy.exc.DatabaseError: (sqlite3.DatabaseError) database disk image is malformed

My guess is that it's a WSL conda issue returning, since I can't reproduce it using plain Python 3.9 (see #81).
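One way to confirm that the database file itself is damaged, independently of AIrsenal, is SQLite's built-in `PRAGMA integrity_check`. The sketch below uses only the standard-library `sqlite3` module. `sqlite_integrity_report` is a hypothetical helper, and the location of the AIrsenal database file is not stated in this thread, so the path is left as a parameter.

```python
import os
import sqlite3
import tempfile


def sqlite_integrity_report(db_path):
    """Return the rows produced by PRAGMA integrity_check.

    A healthy database yields ["ok"]. A damaged file either yields
    problem descriptions or raises sqlite3.DatabaseError, which is
    reported here as a single "error: ..." entry.
    """
    try:
        conn = sqlite3.connect(db_path)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError as exc:
        return [f"error: {exc}"]
    return [row[0] for row in rows]


# Demonstration on a throwaway database, not the real AIrsenal file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.db")
    conn = sqlite3.connect(demo_path)
    conn.execute("CREATE TABLE scores (points INTEGER)")
    conn.commit()
    conn.close()
    print(sqlite_integrity_report(demo_path))
```

If the report is anything other than "ok", the file is damaged and would probably need to be recreated by rerunning the database setup in an environment that does not show the problem.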

@jack89roberts jack89roberts added the bug Something isn't working label Aug 6, 2021