Merged
13 changes: 7 additions & 6 deletions docs/machine-learning/tutorials/sentiment-analysis.md
@@ -101,8 +101,8 @@ You need to create two global variables to hold the path to the recently downloa
Add the following code to the line right above the `Main` method:

```csharp
- const string _dataPath = @"..\..\data\sentiment labelled sentences\imdb_labelled.txt";
- const string _testDataPath = @"..\..\data\sentiment labelled sentences\yelp_labelled.txt";
+ const string _dataPath = @"..\..\..\data\imdb_labelled.txt";
+ const string _testDataPath = @"..\..\..\data\yelp_labelled.txt";
```
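
A relative path such as `..\..\..\data\imdb_labelled.txt` is resolved against the process's current working directory, so whether it finds the file depends on where the program is launched from. The diff does not say why the extra `..` segment was added. The sketch below is only a debugging aid for printing the absolute location a relative path resolves to; `PathSketch` and `Resolve` are hypothetical names, not part of the tutorial or of ML.NET.

```csharp
using System;
using System.IO;

public static class PathSketch
{
    // Resolves a Windows-style relative path against a base directory.
    // Backslashes are normalized so the sketch also behaves on Linux and macOS,
    // where '\' is not a directory separator.
    public static string Resolve(string baseDirectory, string relativePath)
    {
        string normalized = relativePath.Replace('\\', Path.DirectorySeparatorChar);
        return Path.GetFullPath(Path.Combine(baseDirectory, normalized));
    }

    public static void Main()
    {
        // Relative paths are resolved against the current working directory.
        string workingDirectory = Directory.GetCurrentDirectory();
        string resolved = Resolve(workingDirectory, @"..\..\..\data\imdb_labelled.txt");

        Console.WriteLine($"Working directory: {workingDirectory}");
        Console.WriteLine($"Resolved path:     {resolved}");
        Console.WriteLine($"File exists:       {File.Exists(resolved)}");
    }
}
```

If the last line prints `False`, the number of `..` segments does not match the folder the program actually runs from.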

You need to create some classes for your input data and predictions. Add a new class to your project:
@@ -122,8 +122,9 @@ Add the following code, which has two classes `SentimentData` and `SentimentPred
```csharp
public class SentimentData
{
+ [Column(ordinal: "0")]
public string SentimentText;
- [ColumnName("Label")]
+ [Column(ordinal: "1", name: "Label")]
public float Sentiment;
}

@@ -134,7 +135,7 @@ public class SentimentPrediction
}
```

- `SentimentData` is the input dataset class and has a string for the comment (`SentimentText`), a boolean (`Sentiment`) that has a value for sentiment of either positive or negative, and a `Label` `ColumnName` attribute. `SentimentPrediction` is the class used for prediction after the model has been trained. It has a single boolean (`Sentiment`) and a `PredictedLabel` `ColumnName` attribute. The `Label` is used to create and train the model, and it's also used with a second dataset to evaluate the model. The `PredictedLabel` is used during prediction and evaluation. For evaluation, an input with training data, the predicted values, and the model are used.
+ `SentimentData` is the input dataset class. It has a string for the comment (`SentimentText`) and a `float` (`Sentiment`) whose value marks the sentiment as either positive or negative. Both fields carry `Column` attributes, which specify each field's position in the data file and identify which field is the `Label`. `SentimentPrediction` is the class used for prediction after the model has been trained. It has a single boolean (`Sentiment`) and a `PredictedLabel` `ColumnName` attribute. The `Label` is used to create and train the model, and it's also used with a second dataset to evaluate the model. The `PredictedLabel` is used during prediction and evaluation. For evaluation, an input with training data, the predicted values, and the model are used.
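
To make the role of the ordinals concrete, here is a minimal sketch of position-based column mapping for a tab-separated line, written without ML.NET so it needs no NuGet package. It is not how `TextLoader` is implemented; `SentimentRow`, `OrdinalSketch`, and `ParseLine` are hypothetical names, and the sample line is made up.

```csharp
using System;
using System.Globalization;

// Hypothetical stand-in for the tutorial's SentimentData class. The ML.NET
// Column attributes are shown as comments so the sketch compiles on its own.
public class SentimentRow
{
    public string SentimentText;   // ordinal "0": first tab-separated column
    public float Sentiment;        // ordinal "1": second column, treated as the Label
}

public static class OrdinalSketch
{
    // Splits one tab-separated line and assigns each column by its position.
    public static SentimentRow ParseLine(string line)
    {
        string[] columns = line.Split('\t');
        if (columns.Length < 2)
        {
            throw new FormatException("Expected at least two tab-separated columns.");
        }

        return new SentimentRow
        {
            SentimentText = columns[0],
            Sentiment = float.Parse(columns[1], CultureInfo.InvariantCulture)
        };
    }

    public static void Main()
    {
        // Made-up sample in the "text<TAB>label" layout.
        SentimentRow row = ParseLine("A wonderful little film.\t1");
        Console.WriteLine($"{row.SentimentText} -> {row.Sentiment}");
    }
}
```

Swapping the two ordinals in this sketch would put the label text into `SentimentText` and fail to parse the sentence as a `float`, which is the kind of mismatch the explicit ordinals are meant to prevent.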

In the *Program.cs* file, replace the `Console.WriteLine("Hello World!")` line with the following code in the `Main` method:

@@ -169,7 +170,7 @@ var pipeline = new LearningPipeline();
The <xref:Microsoft.ML.TextLoader%601> object is the first part of the pipeline, and loads the training file data.

```csharp
- pipeline.Add(new TextLoader<SentimentData>(_dataPath, header: false, sep: "tab"));
+ pipeline.Add(new TextLoader<SentimentData>(_dataPath, useHeader: false, separator: "tab"));
```

## Data preprocess and feature engineering
@@ -299,7 +300,7 @@ Evaluate(model);
The <xref:Microsoft.ML.TextLoader%601> class loads the new test dataset with the same schema. You can evaluate the model using this dataset as a quality check. Add that next to the `Evaluate` method call, using the following code:

```csharp
- var testData = new TextLoader<SentimentData>(_testDataPath, header: false, sep: "tab");
+ var testData = new TextLoader<SentimentData>(_testDataPath, useHeader: false, separator: "tab");
```

The <xref:Microsoft.ML.Models.BinaryClassificationEvaluator> object computes the quality metrics for the `PredictionModel` using the specified dataset. To see those metrics, add the evaluator as the next line in the `Evaluate` method, with the following code:
12 changes: 8 additions & 4 deletions docs/machine-learning/tutorials/taxi-fare.md
@@ -105,21 +105,25 @@ Next, create classes for the input data and the predictions:
using Microsoft.ML.Runtime.Api;
```

- Add two classes into this file. `TaxiTrip`, the input data set class, has definitions for each of the columns discovered above and a `Label` `ColumnName` attribute. Add the following code to the file:
+ Add two classes into this file. `TaxiTrip`, the input data set class, has definitions for each of the columns discovered above and a `Label` attribute for the `fare_amount` column that you are predicting. Add the following code to the file:

```csharp
public class TaxiTrip
{
+ [Column(ordinal: "0")]
public string vendor_id;
+ [Column(ordinal: "1")]
public string rate_code;
+ [Column(ordinal: "2")]
public float passenger_count;
+ [Column(ordinal: "3")]
public float trip_time_in_secs;
+ [Column(ordinal: "4")]
public float trip_distance;
+ [Column(ordinal: "5")]
public string payment_type;
+ [Column(ordinal: "6", name: "Label")]
public float fare_amount;
-
- [ColumnName("Label")]
- public float Label;
}
```
