Description
In our estimators and other similar components often have advanced settings, because, sometimes people have unusual circumstances. At the same time, there is a 95% or 99% scenario for "simple" usage that most people will be happy with. For this reason we have often made a distinction between common and advanced settings, as we see here.
machinelearning/src/Microsoft.ML.FastTree/TreeTrainersCatalog.cs
Lines 29 to 38 in cb37c7e
There are some possible things that excited feedback:
-
Echoing feedback seen in Rename mlContext.Data.TextReader() to mlContext.Data.CreateTextLoader() #1690, these things where we're making something should have the prefix
Create
, even in situations where this a catalog where we are always creating. Note:Create
preferred toMake
. -
The worth of ASP.NET style configuration was questioned (seen above as
Action<FastTreeRegressionTrainer.Arguments> advancedSettings
), e.g., there may not be much purpose in having a delegate. The older style where it just takes theArguments
period was preferred. -
Having this
Arguments
object as a nested class the component being created was viewed as positive, but this would be more idiomatically calledOptions
--Arguments
was a holdover name from when these were exclusively for command line arguments, but for the API this is not a great name. So while keeping the general structure of how they are placed currently, they should probably be renamed toOptions
. -
It is good to have the convenience for the simple arguments, however, if we have both simple and advanced settings, we should not mix them but have instead two distinct constructors/extension methods. (E.g., in the above, we would have two methods, one that took the advanced options.) To do otherwise is to invite confusion about which "wins" if we have the setting set in both.
- Note that phase setting "set in both," which suggests that these settings object should retain the "simpler" settings in them. This reinforces feedback elsewhere as seen here.
-
If the simple arguments are totally sufficient, then there is no need to expose this
Arguments
class in hte public API. (For practical reasons relating to command line and entry-point usage, we still need to always have theseArguments
objects, but if they serve no purpose for the API the class can simply be made internal.)
/cc @KrzysztofCwalina, @terrajobst , on whose feedback this list is primarily based, and who can correct me and provide clarification in case I misspoke.