Description
The given example uses only one feature (Sentiment Text, of type string) to predict whether the sentiment is positive or negative.
I am trying binary classification with multiple features. My question is: how can I create a pipeline with multiple features that have different data types?
public class ProposalData
{
    [LoadColumn(0)]
    public string ProposalID { get; set; }
    [LoadColumn(1)]
    public string Status { get; set; }
    [LoadColumn(2)]
    public float TotalContractValue { get; set; }
    [LoadColumn(3)]
    public float DevicesCount { get; set; } // no int/double in ML.net
    [LoadColumn(4), ColumnName("Label")]
    public bool WinSentiment { get; set; }
}
Below is the pipeline I created.
Since Status is of type string, it is featurized with FeaturizeText(). My understanding is that DevicesCount and TotalContractValue are of type float and therefore do not require FeaturizeText().
var pipeline = mlContext.Transforms.Text.FeaturizeText(inputColumnName: "Status", outputColumnName: "StatusFeatured")
    .Append(mlContext.Transforms.Concatenate("Features", "StatusFeatured", "TotalContractValue", "DevicesCount"))
    .Append(mlContext.BinaryClassification.Trainers.FastTree(numLeaves: 50, numTrees: 50, minDatapointsInLeaves: 20));
With the code above I am getting results that look incorrect:
Accuracy: 100%
Auc: 100%
F1Score: 100%
Can anyone please help me create the right pipeline, using Status, DevicesCount, and TotalContractValue as features?
I also tried changing the types of DevicesCount and TotalContractValue to string and passing them through FeaturizeText() as well.
My predictions are still wrong.
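For comparison, here is a minimal sketch of one way the three columns might be combined. It is only an illustration under several assumptions that are not part of the code above: it uses the ML.NET 1.x API with the Microsoft.ML.FastTree package installed, it assumes Status holds a small set of category values (so OneHotEncoding is swapped in for FeaturizeText, which is aimed at free text), it leaves the FastTree hyperparameters at their defaults, and the file name proposals.csv, its comma-separated layout with a header row, and the column name StatusEncoded are hypothetical. It evaluates on a held-out split so that the metrics are not computed on the training rows.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

public class ProposalData
{
    [LoadColumn(0)] public string ProposalID { get; set; }
    [LoadColumn(1)] public string Status { get; set; }
    [LoadColumn(2)] public float TotalContractValue { get; set; }
    [LoadColumn(3)] public float DevicesCount { get; set; }
    [LoadColumn(4), ColumnName("Label")] public bool WinSentiment { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Assumption: hypothetical file name; comma-separated with a header row.
        IDataView data = mlContext.Data.LoadFromTextFile<ProposalData>(
            "proposals.csv", hasHeader: true, separatorChar: ',');

        // Hold out 20% of the rows so evaluation uses data the trainer never saw.
        var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

        // Assumption: Status is categorical, so it is one-hot encoded
        // instead of being passed through FeaturizeText.
        var pipeline = mlContext.Transforms.Categorical
            .OneHotEncoding(outputColumnName: "StatusEncoded", inputColumnName: "Status")
            // The float columns can be concatenated directly with the encoded vector.
            .Append(mlContext.Transforms.Concatenate(
                "Features", "StatusEncoded", "TotalContractValue", "DevicesCount"))
            // Default hyperparameters; reads the "Label" and "Features" columns.
            .Append(mlContext.BinaryClassification.Trainers.FastTree());

        var model = pipeline.Fit(split.TrainSet);
        var metrics = mlContext.BinaryClassification.Evaluate(
            model.Transform(split.TestSet));

        Console.WriteLine($"Accuracy: {metrics.Accuracy:P2}");
        Console.WriteLine($"Auc: {metrics.AreaUnderRocCurve:P2}");
        Console.WriteLine($"F1Score: {metrics.F1Score:P2}");
    }
}
```

ProposalID is deliberately left out of the Features column, since it is an identifier and not a predictor. Whether this changes the scores reported above depends on the data itself, which is not shown here.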
Document Details
⚠ Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.
- ID: c8e22825-c4ee-5a36-f911-8ad456970ecc
- Version Independent ID: ad0bf640-7dfc-1184-077c-358bb14277a7
- Content: Use ML.NET in a sentiment analysis binary classification scenario
- Content Source: docs/machine-learning/tutorials/sentiment-analysis.md
- Product: dotnet-ml
- GitHub Login: @JRAlexander
- Microsoft Alias: johalex