Can't load tokenizer from the web-client #669

@JulienVig

Uploading the training file wiki.train.tokens in the web-client and clicking "Train alone" triggers the following error:

SyntaxError: JSON.parse: unexpected character at line 1 column 1 of the JSON data
    getModelJSON hub.js:584
    loadTokenizer tokenizers.js:62
    from_pretrained tokenizers.js:4406
    getTaskTokenizer tokenizer.js:18
    apply text_preprocessing.js:65
    async*get preprocessing/preprocessingChain</< data.js:62
    get preprocessing/preprocessingChain</< data.js:62
    get preprocessing/< data.js:63
    next lazy_iterator.ts:803
    serialNext lazy_iterator.ts:650
    lastRead lazy_iterator.ts:643
    promise callback*next lazy_iterator.ts:643
    next lazy_iterator.ts:711
    fitDataset model.js:56
    train index.js:31
    fitModel trainer.js:36

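It seems the web-client fails to load the Transformers.js tokenizer: the trace shows JSON.parse throwing inside getModelJSON (hub.js), which is called from loadTokenizer via from_pretrained.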
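
A possible way to narrow this down: "unexpected character at line 1 column 1" generally means the text handed to JSON.parse was not JSON at all. The sketch below is only a debugging aid under that assumption; `parseJsonOrExplain` is a hypothetical helper, not part of Disco or Transformers.js, and the guess that the body might be an HTML page (for example a dev-server fallback) is unverified for this issue.

```typescript
// Hypothetical debugging helper (not a Disco or Transformers.js API).
// It wraps JSON.parse and, on failure, reports what the body actually
// started with, so a non-JSON response is easier to recognise.
type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; reason: string };

function parseJsonOrExplain(body: string): ParseResult {
  try {
    return { ok: true, value: JSON.parse(body) };
  } catch {
    // Keep only the first few characters of the body for the report.
    const head = body.trimStart().slice(0, 40);
    // Assumption: a body starting with "<" is probably HTML or XML.
    const reason = head.startsWith("<")
      ? `response looks like HTML, not JSON: "${head}"`
      : `response is not valid JSON: "${head}"`;
    return { ok: false, reason };
  }
}
```

In the browser console, this could be applied to the text of whatever response the tokenizer request received (for instance the result of `await response.text()` on a manually issued `fetch`), to check whether the server is returning the expected tokenizer JSON.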

Labels

bug (Something isn't working), discojs (Related to Disco.js), web client (Related to the browser environment)