terminology
The following terms will come up repeatedly in other documents about Vita.
Instance
The thing about which you want to make a prediction. For example, the instance might be an image that you want to classify as either hotdog or not hotdog (Silicon Valley: Season 4 Episode 4).
Label
An answer for a prediction task: either the answer produced by the machine learning system or the right answer supplied in the training data.
For example, the label for an image might be hotdog.
Consider that:
- symbolic regression tasks have a number as label, which can be accessed via the label_as function;
- although classification tasks might encode classes with different, problem-specific data types, Vita always uses its own scheme, encoding classes with an integer (class_t). This allows simpler, uniform manipulation. The actual label can be accessed via the label(example) function and the original label via the dataframe::class_name method.
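The integer encoding of classes described above can be sketched as follows. This is a minimal illustration, not Vita's implementation: the names class_id and class_encoder and their member functions are hypothetical and only mirror the idea of mapping problem-specific class names to integers and back.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an integer class identifier (assumption,
// not Vita's class_t definition).
using class_id = std::size_t;

// Illustrative encoder: maps class names to consecutive integers and back.
class class_encoder
{
public:
  // Returns the integer assigned to `name`, registering it on first sight.
  class_id encode(const std::string &name)
  {
    const auto it = ids_.find(name);
    if (it != ids_.end())
      return it->second;

    const class_id id = names_.size();
    ids_.emplace(name, id);
    names_.push_back(name);
    return id;
  }

  // Recovers the original label from its integer encoding.
  const std::string &class_name(class_id id) const
  {
    assert(id < names_.size());
    return names_[id];
  }

  // Number of distinct classes seen so far.
  std::size_t classes() const { return names_.size(); }

private:
  std::unordered_map<std::string, class_id> ids_;
  std::vector<std::string> names_;
};
```

With such a scheme, code that works on labels only ever compares integers, while the original name stays available for reporting.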
Feature
A property of an instance used in a prediction task. For example, a car might have a feature Mileage.
NOTE: in machine learning the term feature has several meanings depending on the context. It can be a data type (e.g. Mileage) or a data type plus its value. Many people use the words attribute and feature interchangeably.
Example
An instance (with its features) and a label.
In Vita the class representing an example is dataframe::example (see dataframe.h and dataframe.cc).
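To make the relation between instance, features and label concrete, here is a minimal sketch. The type toy_example and the function feature_count are hypothetical and are not the layout of dataframe::example; the label type is a template parameter so the same shape covers a numeric label (symbolic regression) and an integer class index (classification).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example type (assumption, not Vita's dataframe::example):
// an instance described by its features, plus a label.
template<class Label>
struct toy_example
{
  std::vector<double> features;  // the instance, one value per feature
  Label label;                   // the answer for this instance
};

// Number of features describing the instance.
template<class Label>
std::size_t feature_count(const toy_example<Label> &e)
{
  return e.features.size();
}
```

For instance, `toy_example<double> car{{120000.0, 7.0}, 8500.0};` would describe a car by two features (say Mileage and age) with a numeric label, while `toy_example<std::size_t>` would carry an encoded class index.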
Model
A statistical representation of a prediction task. You train a model on examples, then use the model to make predictions.
Metric
A number that you care about. It may or may not be directly optimized.
Objective
A metric that your algorithm is trying to optimize.
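The difference between a number you care about and the number being optimized can be illustrated with two common measures. This is only a sketch under assumptions: the function names are hypothetical and nothing here states which measures Vita itself uses. Accuracy is a typical number to report, while mean squared error is a typical quantity for a learning algorithm to minimize.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A number you may care about: fraction of predictions matching the labels.
double accuracy(const std::vector<int> &predicted,
                const std::vector<int> &actual)
{
  assert(predicted.size() == actual.size() && !actual.empty());

  std::size_t hits = 0;
  for (std::size_t i = 0; i < actual.size(); ++i)
    if (predicted[i] == actual[i])
      ++hits;

  return static_cast<double>(hits) / static_cast<double>(actual.size());
}

// A number an algorithm might try to minimize: mean squared error.
double mean_squared_error(const std::vector<double> &predicted,
                          const std::vector<double> &actual)
{
  assert(predicted.size() == actual.size() && !actual.empty());

  double sum = 0.0;
  for (std::size_t i = 0; i < actual.size(); ++i)
  {
    const double err = predicted[i] - actual[i];
    sum += err * err;
  }

  return sum / static_cast<double>(actual.size());
}
```

Both functions compute a single number from predictions and labels; which one is merely reported and which one drives training is a choice made per problem.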
Pipeline
The infrastructure surrounding a machine learning algorithm. It includes gathering the data from the front end, putting it into training data files, training one or more models, and exporting the models to production.