Create example notebook with English Web Treebank data set

Create an example notebook that shows how to analyze the [EWT data set](https://github.com/UniversalDependencies/UD_English-EWT).

Major steps:
* Download the data set from https://github.com/UniversalDependencies/UD_English-EWT
* Read the data set into DataFrames
* Write the entire data set to a Feather file and read it back in
* Display a parse tree
* Retokenize with a BERT subword tokenizer
* Show how to reconstruct each sentence's span using group-by and aggregation
* Run the document text through the Stanza EWT dependency parser (https://stanfordnlp.github.io/stanza/available_models.html) and compare the output against the gold standard. Alternatively, use spaCy's parser, with the caveat that it is trained on OntoNotes, which has a slightly different schema.
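
As a starting point for the "read into DataFrames" and "group by and aggregation" steps, here is a minimal sketch using plain pandas. Everything in it is an assumption for illustration: the two-sentence CoNLL-U sample is hand-written (not taken from the EWT files), the helper names `read_conllu` and `document_text` are hypothetical, and the character offsets are derived only from the `SpaceAfter=No` flag. Multiword-token ranges and empty nodes are skipped for simplicity, which the real notebook may want to handle differently.

```python
import pandas as pd

# The ten standard CoNLL-U columns, lowercased.
CONLLU_COLUMNS = ["id", "form", "lemma", "upos", "xpos",
                  "feats", "head", "deprel", "deps", "misc"]

# Hand-written sample in CoNLL-U format (illustrative only, not EWT data).
SAMPLE = """\
# sent_id = demo-1
# text = Dogs bark.
1\tDogs\tdog\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_
2\tbark\tbark\tVERB\tVBP\t_\t0\troot\t_\tSpaceAfter=No
3\t.\t.\tPUNCT\t.\t_\t2\tpunct\t_\t_

# sent_id = demo-2
# text = Cats sleep.
1\tCats\tcat\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_
2\tsleep\tsleep\tVERB\tVBP\t_\t0\troot\t_\tSpaceAfter=No
3\t.\t.\tPUNCT\t.\t_\t2\tpunct\t_\t_
"""


def read_conllu(text):
    """Parse CoNLL-U text into one DataFrame row per token (hypothetical helper)."""
    rows = []
    sent_id = None
    offset = 0  # running character offset into the reconstructed document
    for line in text.splitlines():
        if line.startswith("# sent_id = "):
            sent_id = line[len("# sent_id = "):]
        elif line.startswith("#") or not line.strip():
            continue  # other comments and sentence separators
        else:
            fields = line.split("\t")
            if not fields[0].isdigit():
                continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
            row = dict(zip(CONLLU_COLUMNS, fields))
            row["sent_id"] = sent_id
            row["begin"] = offset
            row["end"] = offset + len(row["form"])
            # Assume a single space after the token unless SpaceAfter=No is set.
            offset = row["end"] + (0 if "SpaceAfter=No" in row["misc"] else 1)
            rows.append(row)
    return pd.DataFrame(rows).astype({"id": int, "head": int})


def document_text(tokens):
    """Rebuild the document string from token forms and begin offsets."""
    parts = []
    pos = 0
    for form, begin in zip(tokens["form"], tokens["begin"]):
        parts.append(" " * (begin - pos) + form)
        pos = begin + len(form)
    return "".join(parts)


tokens = read_conllu(SAMPLE)
doc_text = document_text(tokens)

# Reconstruct each sentence's span: group tokens by sentence, then take the
# smallest begin offset and the largest end offset within each group.
sentences = (
    tokens.groupby("sent_id", sort=False)
    .agg(begin=("begin", "min"), end=("end", "max"), num_tokens=("id", "count"))
    .reset_index()
)
sentences["text"] = [doc_text[b:e] for b, e in zip(sentences["begin"], sentences["end"])]
print(sentences)
```

The Feather round trip could then be done with `tokens.to_feather(path)` and `pd.read_feather(path)`, both of which require the `pyarrow` package to be installed.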



Create example notebook with English Web Treebank data set #193
