Add parsing model #10
795d2b5 to d11d633
If --output|-o is set then the output will be written to a json file
nsorros
left a comment
Code looks good. Just waiting to run the examples. Takes some time to download the weights for some reason.
Running the code takes a while since it seems to download the weights every time. Is there any way to cache them in a location that is made clear to the user? Also, the output of split seems good for a demo, but I wonder whether the output should just be one reference per line. For parse, maybe a JSON, or at least the tokens grouped into their categories?
Weights are cached. It will only download them if you don't have them in the right place, which is specified in the config you are using. If you didn't specify a config, it uses the default ones, which ship with the package. It could certainly be clearer about where it is looking for these files though.
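To illustrate the caching behaviour described above, here is a minimal sketch of a check-before-download helper that also tells the user where it looked. It is not the package's actual code: `ensure_artefact`, `fake_download` and the `models/weights.h5` path are hypothetical names used only for this example.

```python
import tempfile
from pathlib import Path


def ensure_artefact(path, download):
    """Return (path, downloaded); call download(path) only if the file is missing."""
    path = Path(path)
    if path.exists():
        # Cached copy found: report the location and skip the download.
        print(f"Found {path}")
        return path, False
    # Not cached yet: make sure the target folder exists, then fetch the file.
    path.parent.mkdir(parents=True, exist_ok=True)
    print(f"Could not find {path}, downloading")
    download(path)
    return path, True


def fake_download(path):
    # Hypothetical stand-in for a real download; writes placeholder bytes.
    path.write_bytes(b"weights")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "models" / "weights.h5"  # hypothetical location
    _, first_call_downloaded = ensure_artefact(target, fake_download)
    _, second_call_downloaded = ensure_artefact(target, fake_download)
```

Only the first call downloads; the second finds the cached file. Printing the path in both branches is one way to make the lookup location clear to the user.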
This is the default verbose output. For tokens if you specify
More generally, I'm strongly considering moving away from the TSV format used by Rodrigues and presenting labels and tokens in separate files, with one token/label per line. This would remove a lot of duplicated code.
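As a rough sketch of what moving from a combined TSV to separate token and label files might involve, the snippet below splits tab-separated `token<TAB>label` lines into two newline-joined strings. The input layout (one pair per line, blank lines between references), the function name `split_tsv` and the labels `author` and `year` are assumptions made for illustration, not taken from the package.

```python
def split_tsv(tsv_text):
    """Split 'token<TAB>label' lines into separate token and label texts."""
    tokens, labels = [], []
    for line in tsv_text.splitlines():
        if not line.strip():
            # Blank lines are assumed to separate references; they are dropped here.
            continue
        token, label = line.split("\t", 1)
        tokens.append(token)
        labels.append(label)
    # One token per line and one label per line, ready to write to two files.
    return "\n".join(tokens), "\n".join(labels)


# Hypothetical input: two references separated by a blank line.
tsv_text = "Smith\tauthor\n2019\tyear\n\nJones\tauthor\n"
tokens_text, labels_text = split_tsv(tsv_text)
```

Note that this sketch discards the blank-line boundaries between references, so a real implementation would need another way to keep them if they matter.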
Yes, agreed that the output is sub-optimal. I'd been putting this off until the multi-task model was ready, so that I could combine the splitter output with the parser output and we would have an idea of where a reference starts and finishes.
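The suggestion earlier in the thread, of grouping parsed tokens into their categories and emitting JSON, could look roughly like the sketch below. The function `group_tokens_by_label` and the example tokens and labels are hypothetical; the real parser's output and label set may differ.

```python
import json
from collections import defaultdict


def group_tokens_by_label(tokens, labels):
    """Group tokens under their predicted label, keeping token order."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    grouped = defaultdict(list)
    for token, label in zip(tokens, labels):
        grouped[label].append(token)
    return dict(grouped)


# Hypothetical parser output for a single reference.
tokens = ["Smith", "J.", "2019", "Deep", "parsing"]
labels = ["author", "author", "year", "title", "title"]

grouped = group_tokens_by_label(tokens, labels)
print(json.dumps(grouped, indent=2))
```

A structure like this is easy to serialise to a file, which would fit with writing the output to JSON when `--output|-o` is set.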
👍
Ah ok, yes, I remember now. So the output when
I've added some additional feedback on where it will look for the artefacts:
What does this PR contain
python -m deep_reference_parser parse command
python -m deep_reference_parser split
DeprecationWarnings caused by using keras and CRF from keras_contrib. It is currently non-trivial to fix these warnings.
How can you test it