Description
I have followed the installation instructions successfully, and the ant build succeeded. I am running macOS Big Sur, version 11.6.
However, when I try to run any of the demos, such as `cat input-english.txt | sh run-english.sh`, I get errors like:

- `Error while loading a tagger model (probably missing model file)`
- `Unable to open "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger"`

(The full error output is at the bottom of this message.)
Googling the error message, I found the same problem reported in CoreNLP itself: stanfordnlp/CoreNLP#1101
Just like the original author of that issue, I have verified that `english-left3words-distsim.tagger` is present in `lib/stanford-corenlp-3.6.0-models.jar`:

```
$ unzip -l lib/stanford-corenlp-3.6.0-models.jar | grep words
        0  01-19-2016 04:03   edu/stanford/nlp/models/pos-tagger/english-left3words/
     1522  01-19-2016 04:03   edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger.props
 12409329  01-19-2016 04:03   edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger
```
According to CoreNLP issue #1101, the model is not found because of a change in the class loader used by the CoreNLP library. I don't really know Java, so that doesn't mean much to me. :-P
Does the CoreNLP library bundled with UDepLambda need to be updated? Should the Java version requirements be different?
Alternatively, would it be possible to include a script that just reads e.g. CoNLL-U files and produces the logical forms from them? I tried splitting this part out into a separate .sh file and giving it a tab-separated .conllu file as input, but it could not read the CoNLL-U file.
```sh
# split only the semantic parser into its own shellscript
java -Dfile.encoding="UTF-8" -cp bin:lib/* deplambda.others.NlpPipeline `# This pipeline runs semantic parser` \
  annotators tokenize,ssplit \
  tokenize.whitespace true \
  ssplit.eolonly true \
  languageCode en \
  deplambda true \
  deplambda.definedTypesFile lib_data/ud.types.txt \
  deplambda.treeTransformationsFile lib_data/ud-enhancement-rules.proto \
  deplambda.relationPrioritiesFile lib_data/ud-obliqueness-hierarchy.proto \
  deplambda.lambdaAssignmentRulesFile lib_data/ud-substitution-rules.proto \
  deplambda.lexicalizePredicates true \
  deplambda.debugToFile debug.txt \
  nthreads 1
```
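For what it's worth, since this pipeline sets `tokenize.whitespace true` and `ssplit.eolonly true`, I assume it expects plain text with one whitespace-tokenized sentence per line rather than CoNLL-U. If that assumption holds, a small converter that keeps only the word form (column 2 of CoNLL-U) might bridge the gap. This is my own sketch, not part of UDepLambda:

```shell
# conllu-to-text.sh (hypothetical helper): print one whitespace-tokenized
# sentence per line from a CoNLL-U file given as $1 (or stdin).
awk -F'\t' '
  /^#/            { next }                          # skip comment lines
  /^$/            { if (s) print s; s = ""; next }  # blank line ends a sentence
  $1 ~ /^[0-9]+$/ { s = s ? s " " $2 : $2 }         # column 2 is the word FORM;
                                                    # this skips 1-2 ranges and 1.1 empty nodes
  END             { if (s) print s }
' "${1:-/dev/stdin}"
```

The output could then be piped into the split-out script, e.g. `sh conllu-to-text.sh input.conllu | sh parse-only.sh` (script names are hypothetical).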
Finally, here's the full output when I run `run-english.sh`:
```
$ cat input-english.txt | sh run-english.sh
{tokenize.whitespace=true, annotators=tokenize,ssplit, preprocess.addNamedEntities=true, ssplit.eolonly=true, preprocess.addDateEntities=true, nthreads=1}
{annotators=tokenize,ssplit,pos,lemma,ner,depparse, tokenize.language=en, ssplit.eolonly=true, nthreads=1}
{deplambda.lambdaAssignmentRulesFile=lib_data/ud-substitution-rules.proto, tokenize.whitespace=true, deplambda.treeTransformationsFile=lib_data/ud-enhancement-rules.proto, annotators=tokenize,ssplit, deplambda=true, deplambda.lexicalizePredicates=true, deplambda.definedTypesFile=lib_data/ud.types.txt, deplambda.relationPrioritiesFile=lib_data/ud-obliqueness-hierarchy.proto, ssplit.eolonly=true, nthreads=1, languageCode=en, deplambda.debugToFile=debug.txt}
{tokenize.whitespace=true, posTagKey=UD, annotators=tokenize,ssplit,pos, ssplit.eolonly=true, nthreads=1, languageCode=en, pos.model=lib_data/ud-models-v1.2/en/pos-tagger/utb-caseless-en-bidirectional-glove-distsim-lower.full.tagger}
NlpPipeline Specified Options : {preprocess.addDateEntities=true, ssplit.eolonly=true, tokenize.whitespace=true, nthreads=1, preprocess.addNamedEntities=true, annotators=tokenize,ssplit}
NlpPipeline Specified Options : {depparse.extradependencies=ref_only_uncollapsed, ssplit.eolonly=true, tokenize.language=en, nthreads=1, annotators=tokenize,ssplit,pos,lemma,ner,depparse}
NlpPipeline Specified Options : {pos.model=lib_data/ud-models-v1.2/en/pos-tagger/utb-caseless-en-bidirectional-glove-distsim-lower.full.tagger, posTagKey=UD, ssplit.eolonly=true, tokenize.whitespace=true, languageCode=en, nthreads=1, annotators=tokenize,ssplit,pos}
NlpPipeline Specified Options : {nthreads=1, deplambda.lambdaAssignmentRulesFile=lib_data/ud-substitution-rules.proto, ssplit.eolonly=true, deplambda.treeTransformationsFile=lib_data/ud-enhancement-rules.proto, annotators=tokenize,ssplit, deplambda=true, tokenize.whitespace=true, deplambda.relationPrioritiesFile=lib_data/ud-obliqueness-hierarchy.proto, deplambda.definedTypesFile=lib_data/ud.types.txt, deplambda.debugToFile=debug.txt, deplambda.lexicalizePredicates=true, languageCode=en}
Adding annotator tokenize
Adding annotator tokenize
Adding annotator tokenize
Adding annotator tokenize
Adding annotator ssplit
Adding annotator ssplit
Adding annotator ssplit
Adding annotator ssplit
Adding annotator pos
Adding annotator pos
{tokenize.whitespace=true, annotators=tokenize,ssplit, preprocess.addNamedEntities=true, ssplit.eolonly=true, preprocess.addDateEntities=true, nthreads=1}
{deplambda.lambdaAssignmentRulesFile=lib_data/ud-substitution-rules.proto, tokenize.whitespace=true, deplambda.treeTransformationsFile=lib_data/ud-enhancement-rules.proto, annotators=tokenize,ssplit, deplambda=true, deplambda.lexicalizePredicates=true, deplambda.definedTypesFile=lib_data/ud.types.txt, deplambda.relationPrioritiesFile=lib_data/ud-obliqueness-hierarchy.proto, ssplit.eolonly=true, nthreads=1, languageCode=en, deplambda.debugToFile=debug.txt}
Loading DepLambda Model..
Exception in thread "main" edu.stanford.nlp.io.RuntimeIOException: Error while loading a tagger model (probably missing model file)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:799)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:320)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:273)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:85)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:73)
    at edu.stanford.nlp.pipeline.AnnotatorImplementations.posTagger(AnnotatorImplementations.java:53)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getNamedAnnotators$43(StanfordCoreNLP.java:544)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$null$70(StanfordCoreNLP.java:625)
    at edu.stanford.nlp.util.Lazy$3.compute(Lazy.java:126)
    at edu.stanford.nlp.util.Lazy.get(Lazy.java:31)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:149)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:495)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:201)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:194)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:181)
    at in.sivareddy.graphparser.util.NlpPipeline.<init>(NlpPipeline.java:144)
    at deplambda.others.NlpPipeline.<init>(NlpPipeline.java:41)
    at deplambda.others.NlpPipeline.main(NlpPipeline.java:137)
Caused by: java.io.IOException: Unable to open "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as class path, filename or URL
    at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:481)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:797)
    ... 17 more
deplambda.definedTypesFile=lib_data/ud.types.txt
deplambda.treeTransformationsFile=lib_data/ud-enhancement-rules.proto
deplambda.relationPrioritiesFile=lib_data/ud-obliqueness-hierarchy.proto
deplambda.lambdaAssignmentRulesFile=lib_data/ud-substitution-rules.proto
Loaded DepLambda Model..
Loading POS tagger from lib_data/ud-models-v1.2/en/pos-tagger/utb-caseless-en-bidirectional-glove-distsim-lower.full.tagger ... done [0.9 sec].
{tokenize.whitespace=true, posTagKey=UD, annotators=tokenize,ssplit,pos, ssplit.eolonly=true, nthreads=1, languageCode=en, pos.model=lib_data/ud-models-v1.2/en/pos-tagger/utb-caseless-en-bidirectional-glove-distsim-lower.full.tagger}
```