-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Tiny] Use string[] instead of IEnumerable<string> in column names #2815
Conversation
I understand that we wrote our samples this way, but writing it this way means we could not, for instance, take things that are the results of linq queries. It seems like we're just giving up capabilities for no particular benefit. Why not just standardize elsewhere on I think it is important in some cases that we continue to support |
@@ -45,7 +45,7 @@ public static class TextCatalog | |||
/// </example> | |||
public static TextFeaturizingEstimator FeaturizeText(this TransformsCatalog.TextTransforms catalog, | |||
string outputColumnName, | |||
IEnumerable<string> inputColumnNames, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IEnumerable inputColumnNames, [](start = 10, length = 39)
my vote, and the original issues is to have this be the last parameter of the method, and be a params, to optimize for both cases where the users has one or more columns and wants to provied options.
Having the IEumerable, or with the string[] will require building an Enumerable/Array to hold just one element, which is unnecessary since params exists.
Why would the users have to do either:
var customized_pipeline = ml.Transforms.Text.FeaturizeText(customizedColumnName, options, new List<string> { "SentimentText" }, var customized_pipeline = ml.Transforms.Text.FeaturizeText(customizedColumnName, options, new string[] { "SentimentText" },
when they can just go:
var customized_pipeline = ml.Transforms.Text.FeaturizeText(customizedColumnName, options, "SentimentText"
I do think the signature should be:
public static TextFeaturizingEstimator FeaturizeText(this TransformsCatalog.TextTransforms catalog,
string outputColumnName,
TextFeaturizingEstimator.Options options
params string[] inputColumnNames,
)
and we are doing that everywhere else...
#Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assumption here is that output column is always single, but why can't it be multiple columns? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Codecov Report
@@ Coverage Diff @@
## master #2815 +/- ##
==========================================
+ Coverage 71.67% 71.68% +<.01%
==========================================
Files 809 808 -1
Lines 142416 142392 -24
Branches 16120 16112 -8
==========================================
- Hits 102076 102073 -3
+ Misses 35910 35886 -24
- Partials 4430 4433 +3
|
I switch to In reply to: 468742828 [](ancestors = 468742828) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wschin This commit appears to have broken the repo build |
There are three possible ways to denote input column names,
string[]
,IEnumerable<string>
, andparams
. I personally likestring[]
becauseIEnumerable<string>
is not used elsewhere for the same purpose.params
makes signature less typed and it forces input column names to be the last argument group.Fix #2801, Fix #2460.