- 
                Notifications
    
You must be signed in to change notification settings  - Fork 331
 
fix: serialization for arbitrary HSGP instance to time_varying_intercept/media in MMM #2016
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: serialization for arbitrary HSGP instance to time_varying_intercept/media in MMM #2016
Conversation
          Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@             Coverage Diff             @@
##             main    #2016       +/-   ##
===========================================
+ Coverage   36.79%   92.51%   +55.72%     
===========================================
  Files          68       68               
  Lines        9373     9396       +23     
===========================================
+ Hits         3449     8693     +5244     
+ Misses       5924      703     -5221     ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
  | 
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @jamespooley
Can you take a look to see if the model will load (adding a test would be appreciated)
The loading logic is here:
pymc-marketing/pymc_marketing/mmm/multidimensional.py
Lines 589 to 591 in 9c429ed
| "time_varying_intercept": json.loads( | |
| attrs.get("time_varying_intercept", "false") | |
| ), | 
| 
           @williambdean Drafted some tests to show that the loading logic itself... pymc-marketing/pymc_marketing/mmm/multidimensional.py Lines 589 to 591 in 9c429ed 
 
 ...works fine (i.e., returns the same information in the user-specific HSGP instance) with the updated code. But I also added a temporary test to clarify the issue that when calling  As mentioned, that error will go away by adding a  pymc-marketing/pymc_marketing/mmm/multidimensional.py Lines 278 to 288 in 9c429ed 
 But it seems like the proper solution would be something along these lines, just for the  pymc-marketing/pymc_marketing/mmm/multidimensional.py Lines 585 to 586 in 9c429ed 
 Any thoughts here?  | 
    
          
 Good point. Let's just start with the most obvious cases at the moment. We can make a function like that   | 
    
| 
           @williambdean Sounds good to me. The following wouldn’t be pretty and I wouldn’t be super proud of it, but because there isn’t currently an equivalent of  attrs["time_varying_media"] = json.dumps(
    self.time_varying_media
    if not isinstance(self.time_varying_media, HSGPBase)
    else {
        **self.time_varying_media.to_dict(),
        **{"hsgp_class": self.time_varying_media.__class__.__name__},
    }
)…and then a dead simple implementation of  from pymc_marketing.mmm.hsgp import HSGP, HSGPPeriodic, SoftPlusHSGP
def hsgp_from_dict(data: dict | bool):
    if isinstance(data, bool):
        return data
    HSGP_CLASSES= {
        "HSGP": HSGP,
        "SoftPlusHSGP": SoftPlusHSGP,
        "HSGPPeriodic": HSGPPeriodic,
    }
    data = data.copy()
    cls = HSGP_CLASSES[data.pop("hsgp_class")]
    return cls.from_dict(data)It's certainly hacky, but don’t know if you think an approach along these lines is too hacky even for the first pass at this. But it could work until a proper implementation is tackled in future PRs. If you're aligned on something like ☝️ as a "good enough for now" approach, I can clean up and test the implementation, update the code/tests, and then finalize this PR.  | 
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Thanks for these alterations!
| 
           I see this is still a draft PR, @jamespooley. I will merge and we can continue working in a separate PR if needed.  | 
    
.to_dict() logic to fix serialization errors when passing HSGP instances| 
           Sounds good, @williambdean. Glad I could make a tiny contribution here, and I'd happily pitch in on future PRs related to this.  | 
    
| 
           This is no tiny PR! Thanks @jamespooley 🚀  | 
    
Description
This is a draft PR being opened to let you know the bug is being worked on.
The code changes here get rid of post-fitting serialization errors like
TypeError: Object of type SoftPlusHSGP is not JSON serializablewhen passing HSGP instances to thetime_varying_interceptortime_varying_mediaparameters of themultidimensional.MMMclass.There’s not an open issue for this yet, but there’s also a related deserialization issue when passing arbitrary HSGP instances. The tl;dr is that after fitting and
.save()-ing a model with them, there’s an error when.load()-ing the model.One (admittedly not great) hotfix to get rid of that error is to replace
in this block of code (and similarly with
time_verying_media) withBut even though this gets rid of the
.load()-ing error, the time-varying structure for the saved model isn’t recovered after the.load(), and the time-varying components have the default configuration.So I’m wondering if the logic there needs to be updated to use
HSGPKwargslogic like that here or maybe something new likehsgp_from_dict()like how saturation_from_dict() is used?If so, that’s likely another PR, but I wanted to raise it here since they’re both related to the “pass arbitrary HSGP instances” functionality.
Related Issue
Checklist
pre-commit.ci autofixto auto-fix.📚 Documentation preview 📚: https://pymc-marketing--2016.org.readthedocs.build/en/2016/