ConvertToOnnx should also accept DataViewSchema #6448

Open

Description

Is your feature request related to a problem? Please describe.
Currently, saving a model to a zip file requires only a DataViewSchema, but saving a model to ONNX requires an IDataView.

Inside ConvertToOnnxProtobufCore, a prediction is performed (transform.Transform(inputData)), which may be expensive if the training data set is large.

Describe the solution you'd like
ConvertToOnnx should have overloads that accept a DataViewSchema, convert it to an empty IDataView, and pass that empty IDataView to the existing methods that accept an IDataView.

The performance of the existing methods that accept an IDataView may also improve if they create an EmptyDataView from the schema of the given IDataView and pass that to ConvertToOnnxProtobuf instead of the full data.

Describe alternatives you've considered
None.

Additional context
I have implemented the proposed solution and it seems to work well. Unfortunately, EmptyDataView is an internal class, so I had to implement my own EmptyDataView:

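using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

// Zero-row IDataView that exposes only the given schema.
sealed record EmptyDataView(DataViewSchema Schema) : IDataView {
	public bool CanShuffle => true;

	public long? GetRowCount() => 0L;

	public DataViewRowCursor GetRowCursor(IEnumerable<DataViewSchema.Column> columnsNeeded, Random? rand = null)
		=> new EmptyDataViewRowCursor(Schema);

	// Return one (empty) cursor rather than an empty array, so callers that expect at least one cursor in the set still work.
	public DataViewRowCursor[] GetRowCursorSet(IEnumerable<DataViewSchema.Column> columnsNeeded, int n, Random? rand = null)
		=> new[] { GetRowCursor(columnsNeeded, rand) };
}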

sealed class EmptyDataViewRowCursor : DataViewRowCursor {
	private readonly DataViewSchema schema;

	public EmptyDataViewRowCursor(DataViewSchema Schema) {
		schema = Schema;
	}

	public override DataViewSchema Schema => schema;

	public override long Position => -1L;

	public override bool IsColumnActive(DataViewSchema.Column column) => false;

	public override bool MoveNext() => false;

	public override long Batch => 0L;

	public override ValueGetter<TValue> GetGetter<TValue>(DataViewSchema.Column column)
		=> throw new InvalidOperationException();

	public override ValueGetter<DataViewRowId> GetIdGetter()
		=> throw new InvalidOperationException();
}
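
For illustration, the proposed overload could be sketched as a user-side extension method. This is only a sketch under assumptions: the class name OnnxSchemaExportExtensions and the parameter name inputSchema are hypothetical and not part of ML.NET, the EmptyDataView record is the one defined above, and the existing stream-based ConvertToOnnx overload from the Microsoft.ML.OnnxConverter package is assumed to be referenced.

```csharp
using System.IO;
using Microsoft.ML;

// Hypothetical helper (not part of ML.NET): a schema-only overload
// along the lines proposed in this issue.
public static class OnnxSchemaExportExtensions
{
	public static void ConvertToOnnx(
		this ModelOperationsCatalog catalog,
		ITransformer transform,
		DataViewSchema inputSchema,
		Stream stream)
	{
		// Wrap the schema in the zero-row EmptyDataView defined above,
		// so the Transform call made during export has no rows to process.
		IDataView empty = new EmptyDataView(inputSchema);

		// Delegate to the existing overload that takes an IDataView.
		catalog.ConvertToOnnx(transform, empty, stream);
	}
}
```

With this in place, a call such as mlContext.Model.ConvertToOnnx(model, trainingData.Schema, stream) would only need the schema, mirroring how saving to a zip file works.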