Description
opened on Oct 28, 2019
System information
- OS version/distro: Windows 10 PRO 10.0.18362
- .NET Version (eg., dotnet --info): 3.1.100-preview1-014459
Issue
I am trying to cluster a group of documents. For this sample, I used the short descriptions of news articles. If I run the sample with FeaturizeText, it builds a model. If I try to apply TextCatalog.ApplyWordEmbedding, I get a System.IndexOutOfRangeException.
- What did you do? Applied word embedding to the KMeans trainer
- What happened? IndexOutOfRangeException
- What did you expect? For ML.NET to build my model
Source code / logs
Sample code to reproduce the problem can be found here.
StackTrace:
System.AggregateException: One or more errors occurred. (Index was outside the bounds of the array.) (Index was outside the bounds of the array.) (Index was outside the bounds of the array.)
---> System.IndexOutOfRangeException: Index was outside the bounds of the array.
at Microsoft.ML.Trainers.KMeansBarBarInitialization.<>c__DisplayClass3_1.b__2(VBuffer`1& point, Int32 pointRowIndex, Single[] weights, Random rand)
at Microsoft.ML.Trainers.KMeansUtils.<>c__DisplayClass8_1`2.b__0()
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task.WaitAllCore(Task[] tasks, Int32 millisecondsTimeout, CancellationToken cancellationToken)
at System.Threading.Tasks.Task.WaitAll(Task[] tasks)
at System.Threading.Tasks.Parallel.Invoke(ParallelOptions parallelOptions, Action[] actions)
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw(Exception source)
at System.Threading.Tasks.Parallel.ThrowSingleCancellationExceptionOrOtherException(ICollection exceptions, CancellationToken cancelToken, Exception otherException)
at System.Threading.Tasks.Parallel.Invoke(ParallelOptions parallelOptions, Action[] actions)
at Microsoft.ML.Trainers.KMeansUtils.ParallelMapReduce[TPartitionState,TGlobalState](Int32 numThreads, IHost baseHost, Factory factory, RowIndexGetter rowIndexGetter, InitAction`1 initChunk, MapAction`1 mapper, ReduceAction`2 reducer, TPartitionState[]& buffer, TGlobalState& result)
at Microsoft.ML.Trainers.KMeansBarBarInitialization.Initialize(IHost host, Int32 numThreads, IChannel ch, Factory cursorFactory, Int32 k, Int32 dimensionality, VBuffer`1[] centroids, Int64 accelMemBudgetMb, Int64& missingFeatureCount, Int64& totalTrainingInstances)
at Microsoft.ML.Trainers.KMeansTrainer.TrainCore(IChannel ch, RoleMappedData data, Int32 dimensionality)
at Microsoft.ML.Trainers.KMeansTrainer.TrainModelCore(TrainContext context)
at Microsoft.ML.Trainers.TrainerEstimatorBase`2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor)
at Microsoft.ML.Trainers.TrainerEstimatorBase`2.Fit(IDataView input)
at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
at ClusteringNewsArticles.Train.Program.Main(String[] args) in C:\Users\maxim\Source\Repos\machinelearning-samples\samples\csharp\getting-started\Clustering_NewsArticles\ClusteringNewsArticles.Train\Program.cs:line 54
---> (Inner Exception #1) System.IndexOutOfRangeException: Index was outside the bounds of the array.
at Microsoft.ML.Trainers.KMeansBarBarInitialization.<>c__DisplayClass3_1.b__2(VBuffer`1& point, Int32 pointRowIndex, Single[] weights, Random rand)
at Microsoft.ML.Trainers.KMeansUtils.<>c__DisplayClass8_1`2.b__0()
at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)<---
---> (Inner Exception #2) System.IndexOutOfRangeException: Index was outside the bounds of the array.
at Microsoft.ML.Trainers.KMeansBarBarInitialization.<>c__DisplayClass3_1.b__2(VBuffer`1& point, Int32 pointRowIndex, Single[] weights, Random rand)
at Microsoft.ML.Trainers.KMeansUtils.<>c__DisplayClass8_1`2.b__0()
at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)<---