
Small change to derive_predictors_and_scores for speed + normalization #119

Open
wants to merge 138 commits into
base: staging-collab-2


PalkaPuri
Contributor

@PalkaPuri PalkaPuri commented Sep 11, 2024

I noticed that generating a combined DataFrame of predictors and scores was taking very long for large datasets (my kernel kept dying). The operation was compute-expensive because we were using df.merge, which searches through column values to find matching rows. We can get away with something simpler, pd.concat, because all predictor/score DataFrames inherit the grid from local_windows, so their rows match by design. This should clear the speed bottleneck.
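To illustrate the idea, here is a minimal sketch of the two approaches on toy data. The column names (x, y, time, proximity, next_step) and the frames themselves are hypothetical stand-ins, not the actual contents of derive_predictors_and_scores; the only assumption carried over from the description is that every frame shares the same grid rows in the same order.

```python
import pandas as pd

# Hypothetical grid shared by every predictor/score frame (assumed, not from the PR).
key_cols = ["x", "y", "time"]
grid = pd.DataFrame({"x": [0, 0, 1, 1], "y": [0, 1, 0, 1], "time": [0, 0, 0, 0]})

# Toy predictor and score frames built on the same grid, in the same row order.
predictor = grid.assign(proximity=[0.1, 0.4, 0.3, 0.9])
score = grid.assign(next_step=[1.0, 0.0, 0.5, 0.2])

# Key-based join: pandas has to match rows by comparing the key columns.
merged = predictor.merge(score, on=key_cols, how="inner")

# Column-wise concatenation: no key matching, rows are paired by index label.
# Guard the "rows match by design" assumption before relying on it.
assert predictor[key_cols].equals(score[key_cols])
combined = pd.concat(
    [
        predictor.reset_index(drop=True),
        score.drop(columns=key_cols).reset_index(drop=True),
    ],
    axis=1,
)
```

Note that pd.concat with axis=1 aligns on the index rather than on raw position, so resetting the index (or otherwise ensuring identical indexes) matters if any frame was filtered or reordered along the way.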

I also updated the function to optionally return eCDF-normalized values.
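For readers unfamiliar with the term, a minimal sketch of empirical-CDF normalization follows. The helper name ecdf_normalize and its signature are hypothetical and are not taken from this PR; the sketch only shows the general technique of replacing each value with the fraction of observations less than or equal to it.

```python
import pandas as pd


def ecdf_normalize(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Return a copy of df with the given columns mapped to their empirical CDF.

    Hypothetical helper for illustration only. Each value v becomes
    (number of non-missing observations <= v) / (number of non-missing observations).
    """
    out = df.copy()
    for col in columns:
        # method="max" gives tied values the highest rank in the tie,
        # which equals the count of observations <= that value.
        out[col] = df[col].rank(method="max", pct=True)
    return out


example = pd.DataFrame({"proximity": [3.0, 1.0, 2.0, 2.0]})
normalized = ecdf_normalize(example, ["proximity"])
# normalized["proximity"] is [1.0, 0.25, 0.75, 0.75]
```

The result lies in (0, 1] for every column, which puts predictors with very different raw scales on a common footing.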

@PalkaPuri
Contributor Author

Added an option to scale values based on the empirical CDF.

dimkab and others added 3 commits October 2, 2024 16:01
…uff (collab2 ). (#126)

* Some work on the random_foragers notebook and fixing stuff.

* Linting + completing the random_foragers notebook.

* Finished random_foragers

* Interactive plots now should be displayed in HTML

* make format

* small fixes to random foragers

* Some more tweaks + zero-index fixes.

* Hungry birds simulation updated.

* Minor.

* Completed the follower NB.

* Saves the samples from each one of the R,H,F to disk, for later plotting in a single figure.

* Comparative fig.

* Minor.

* Make lint and format

* Typos

* Improved explanations of the predictors and the scores.

* Updated the model description in the random notebook.

* Minor

* Added option for initial positions. Updated RHF.

* reviewed random

* added toc to followers

* fixed followers

* Small formulas + model updates

* small modification

* small fixes, re-ran

* fixing save and display in follower

* format lint, dilling in hungry

---------

Co-authored-by: rfl-urbaniak <rfl.urbaniak@gmail.com>
Collaborator

@rfl-urbaniak rfl-urbaniak left a comment


Please pull the current version of staging from origin, resolve all conflicts, and make sure all the tests pass.

@rfl-urbaniak rfl-urbaniak added status:awaiting response Awaiting response from creator and removed status:WIP Work-in-progress not yet ready for review labels Oct 3, 2024
@PalkaPuri PalkaPuri changed the title Small change to derive_predictors_and_scores for speed Small change to derive_predictors_and_scores for speed + normalization Oct 6, 2024
@PalkaPuri PalkaPuri added status:awaiting review Awaiting response from reviewer and removed status:awaiting response Awaiting response from creator labels Oct 7, 2024
Labels
enhancement New feature or request status:awaiting review Awaiting response from reviewer

3 participants