@cpcdoy (Owner) commented Nov 6, 2020

No description provided.

cpcdoy added 13 commits November 6, 2020 14:01
* Add: sentence_piece tokenizer implementation for RoBERTa

* Add: temporarily using a fix branch on my fork of rust-bert while we wait
for the PR to be merged

* Add: distilroberta for sequence classification implementation

* Add: conversion script for distilroberta from PyTorch to Rust

* Add: test for distilroberta and sentence piece tokenizer

* Fix: padding token id is now extracted from the vocabulary +
Fix: attention mask after padding +
Add: new example test

* Fix: formatting

* Add: using rust-bert master branch with my fix PR instead of my fork until a new
release comes out

* Add: using the PR merge commit of rust-bert

* Clean: encode is now forward +
Add: all models in models/ directory +
Update: tests with new naming

* Clean: pad_sort is not in models/mod.rs

* Clean: encode_with_attention is now forward_with_attention

* Add: 0.4.0 version in Cargo.toml
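
The padding fix in the commits above (pad id taken from the vocabulary, attention mask corrected after padding) can be illustrated with a minimal sketch. This is not the code from this PR: `pad_with_attention`, its signature, and the toy vocabulary used with it are hypothetical and use only the Rust standard library. The mask is derived from each sequence's original length, so it stays correct regardless of which id the vocabulary assigns to the padding token.

```rust
use std::collections::HashMap;

/// Pads every sequence to the length of the longest one and builds the
/// matching attention mask (1 = real token, 0 = padding).
/// Hypothetical helper for illustration; not the function from this PR.
fn pad_with_attention(
    sequences: &[Vec<i64>],
    vocab: &HashMap<String, i64>,
    pad_token: &str,
) -> Option<(Vec<Vec<i64>>, Vec<Vec<i64>>)> {
    // Look the padding id up in the vocabulary instead of hard-coding it.
    // Returns None if the vocabulary has no such token.
    let pad_id = *vocab.get(pad_token)?;
    let max_len = sequences.iter().map(|s| s.len()).max().unwrap_or(0);

    let mut padded = Vec::with_capacity(sequences.len());
    let mut masks = Vec::with_capacity(sequences.len());
    for seq in sequences {
        let mut ids = seq.clone();
        // The mask starts as all ones over the real tokens only.
        let mut mask = vec![1i64; seq.len()];
        // Extend ids with the pad id and the mask with zeros.
        ids.resize(max_len, pad_id);
        mask.resize(max_len, 0);
        padded.push(ids);
        masks.push(mask);
    }
    Some((padded, masks))
}
```

Called with a vocabulary that maps an assumed `"<pad>"` token to some id, the function returns the padded id matrix together with a mask of the same shape. Asking for a token that is missing from the vocabulary yields `None` rather than silently falling back to a default id.
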
@cpcdoy marked this pull request as ready for review November 9, 2020 16:05
@cpcdoy merged commit 992ebb6 into master Nov 9, 2020