corenlp-golang is a Go client that gives access to the complete data set of Stanford CoreNLP, as defined by CoreNLP.proto.
The current version, v0.4.5, targets CoreNLP version 4.5.4.
$ go get -u github.com/genelet/corenlp-golang
This Go client package must be used together with either the CoreNLP command line program or the CoreNLP web service.
Download Stanford CoreNLP and unzip it:
https://stanfordnlp.github.io/CoreNLP/download.html
Go to the unzipped directory and make sure the following command runs properly:
$ java -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -file input.txt
CoreNLP can also be launched as an HTTP web service.
For example, under an Ubuntu user account, create the following systemd unit file and run it as a user service.
$ vi ~/.config/systemd/user/coreNLP.service
[Unit]
Description=CoreNLP Server at 9000
[Service]
Type=simple
WorkingDirectory=/home/user/stanford-corenlp-4.4.0
Environment=CLASSPATH=/home/user/stanford-corenlp-4.4.0/*:
ExecStart=/usr/bin/java -mx4g -cp "/home/user/stanford-corenlp-4.4.0/*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
[Install]
WantedBy=default.target
$ systemctl --user daemon-reload
$ systemctl --user enable coreNLP.service
$ systemctl --user start coreNLP.service
The data that CoreNLP returns from natural language processing is described by a protocol buffer definition:
https://github.com/stanfordnlp/CoreNLP/blob/main/src/edu/stanford/nlp/pipeline/CoreNLP.proto
The auto-generated Go package is included as github.com/genelet/corenlp-golang/nlp.
Two functions are implemented:
- Run: reads natural language text from a file and returns the NLP data as a protobuf message.
- RunText: the same as Run, but takes the text directly.
package main

import (
	"context"
	"fmt"

	"github.com/genelet/corenlp-golang/client"
	"github.com/genelet/corenlp-golang/nlp"
)

func main() {
	// assuming the Stanford CoreNLP is downloaded into /home/user/stanford-corenlp-4.4.0
	// create a new Cmd instance
	cmd := client.NewCmd([]string{"tokenize", "ssplit", "pos", "lemma", "parse", "depparse"}, "/home/user/stanford-corenlp-4.4.0/*")

	// a reference to the nlp Document
	pb := &nlp.Document{}

	// run NLP and receive data in pb
	err := cmd.RunText(context.Background(), []byte(`Stanford University is located in California. It is a great university, founded in 1891.`), pb)
	if err != nil {
		panic(err)
	}

	// print the word, lemma and part of speech of each token in the first sentence
	fmt.Printf("%12.12s %12.12s %8.8s\n", "Word", "Lemma", "Pos")
	fmt.Printf("%s\n", " --------------------------------")
	for _, token := range pb.Sentence[0].Token {
		fmt.Printf("%12.12s %12.12s %8.8s\n", *token.Word, *token.Lemma, *token.Pos)
	}
}
It outputs:
        Word        Lemma      Pos
 --------------------------------
    Stanford     Stanford      NNP
  University   University      NNP
          is           be      VBZ
     located       locate      VBN
          in           in       IN
  California   California      NNP
           .            .        .
Using the web service is almost identical:
package main

import (
	"context"
	"fmt"

	"github.com/genelet/corenlp-golang/client"
	"github.com/genelet/corenlp-golang/nlp"
)

func main() {
	// assuming the Stanford CoreNLP is running at http://localhost:9000
	// create a new HttpClient instance
	cmd := client.NewHttpClient([]string{"tokenize", "ssplit", "pos", "lemma", "parse", "depparse"}, "http://localhost:9000")

	// a reference to the nlp Document
	pb := &nlp.Document{}

	// run NLP and receive data in pb
	err := cmd.RunText(context.Background(), []byte(`Stanford University is located in California. It is a great university, founded in 1891.`), pb)
	if err != nil {
		panic(err)
	}

	// print the word, lemma and part of speech of each token in the first sentence
	fmt.Printf("%12.12s %12.12s %8.8s\n", "Word", "Lemma", "Pos")
	fmt.Printf("%s\n", " --------------------------------")
	for _, token := range pb.Sentence[0].Token {
		fmt.Printf("%12.12s %12.12s %8.8s\n", *token.Word, *token.Lemma, *token.Pos)
	}
}
It outputs:
        Word        Lemma      Pos
 --------------------------------
    Stanford     Stanford      NNP
  University   University      NNP
          is           be      VBZ
     located       locate      VBN
          in           in       IN
  California   California      NNP
           .            .        .
See https://godoc.org/github.com/genelet/corenlp-golang for the complete documentation.