Clarify meaning of ingest! versus update! #10
Codecov Report
@@ Coverage Diff @@
## master #10 +/- ##
=======================================
Coverage 27.50% 27.50%
=======================================
Files 5 5
Lines 80 80
=======================================
Hits 22 22
Misses 58 58
I think I remain a little confused about the extent to which these terms can translate unambiguously to the variety of algos and their implementations. For a GBT / EvoTree:
Is the intent of
Is it assumed that for
For neural nets, where a model is fed data through a
I think one reason I find the update / ingest distinction unclear is that the difference in implications between the two verbs may have more to do with algorithm implementations, and whether they involve preprocessing / caching, than with two genuinely distinct verbs that are generally applicable.
For example, if using a GBT with the exact method (one which does not require data preprocessing), then the tree construction algo could be implemented using a stream / online approach. Each iteration could be fed either entirely new data (having the same features) or just another subsample of the original data. The situation is similar for neural nets, where I don't see a fundamental distinction between a batch from a fixed dataset and a batch coming from an entirely new one. And in all cases, I think there are some params that can be changed through both update / ingest, like learning rates and regularization, and others that can't, like the number of features or the size of hidden layers.
Perhaps this has already been done, but I'm wondering whether the scope of the algos / use cases supported by the framework could be clarified. By that I mean making explicit what the implications are (is there any overhead, and in what circumstances) for a variety of algo families, notably:
Given the broadly different crowds that may feel concerned by the framework, there are also very different perspectives on what the "natural" way of doing things is and on what looks like a reasonable compromise (for instance, performance overhead is a big deal in my prod-oriented usage, but isn't for many research / educational ones).
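To make the streaming point above a bit more concrete, here is a minimal sketch in plain Python of an online learner for which "more iterations on the original data" and "entirely new data" go through the same code path. Everything in it is hypothetical and for illustration only: `TinySGDRegressor`, `partial_fit` and `make_data` are made-up names, not the API of this framework, of EvoTrees, or of any neural net library, and a toy linear model trained by SGD stands in for the GBT and neural net cases discussed above.

```python
import random


class TinySGDRegressor:
    """Hypothetical online linear model; not the API of any real framework."""

    def __init__(self, n_features, learning_rate=0.1):
        self.n_features = n_features        # structural: fixed once the model exists
        self.learning_rate = learning_rate  # tunable: may change between calls
        self.weights = [0.0] * n_features

    def predict(self, row):
        return sum(w * x for w, x in zip(self.weights, row))

    def partial_fit(self, rows, targets, learning_rate=None):
        """Run one SGD pass over a batch, whatever the batch's origin."""
        if learning_rate is not None:
            self.learning_rate = learning_rate
        for row, target in zip(rows, targets):
            if len(row) != self.n_features:
                raise ValueError("number of features cannot change")
            error = self.predict(row) - target
            for j, x in enumerate(row):
                self.weights[j] -= self.learning_rate * error * x
        return self


def make_data(n, rng):
    """Noise-free samples of y = 2*x0 - 3*x1 (assumed toy target)."""
    rows = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    targets = [2 * r[0] - 3 * r[1] for r in rows]
    return rows, targets


rng = random.Random(0)
original_rows, original_targets = make_data(200, rng)
model = TinySGDRegressor(n_features=2)

# "update"-like usage: extra iterations, each on a subsample of the original data
for _ in range(20):
    idx = rng.sample(range(200), 50)
    model.partial_fit([original_rows[i] for i in idx],
                      [original_targets[i] for i in idx])

# "ingest"-like usage: the same call, fed entirely new observations,
# with a tunable parameter (the learning rate) changed along the way
new_rows, new_targets = make_data(200, rng)
model.partial_fit(new_rows, new_targets, learning_rate=0.05)
```

In this toy case the two usages differ only in where the batch comes from. Any real difference between the verbs would have to come from implementation details such as preprocessing or caching of the original data, which this sketch deliberately leaves out.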
@jeremiedb