This repo creates standard images of Stanford's CoreNLP with the default Chinese model.
See https://transcrob.es for information about how Stanford's CoreNLP is used in Transcrobes.
See https://stanfordnlp.github.io/CoreNLP/ for information on Stanford's CoreNLP itself.
Five environment variables control the running CoreNLP process. Pass them to docker run
(or to Kubernetes, etc.):
TIMEOUT
: Default 30000 (milliseconds)
JAVA_XMX
: Default 700m (megabytes)
PORT
: Default 9001, the port the server listens on
CORENLP_CHINESE_SEGMENTER
: Default ctb.small.gz, a memory-reduced version of the Chinese Treebank (CTB) segmenter. Can also be set to pku.gz for the PKU segmenter or ctb.gz for the full CTB version (the upstream default), though those typically require at least 2500 MB of memory
CORENLP_THREADS
: Default 2, the number of Java threads for the process
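As a rough sketch of how the variables above might be passed to docker run, the script below assembles a command line and prints it for inspection. The image name example/corenlp-chinese:latest is a placeholder and not the name of a published image. The value JAVA_XMX=2500m is an assumption based on the memory note for the full segmenters, and publishing the port with -p is an assumption about wanting access from the host.

```shell
#!/bin/sh
# Hypothetical image name: replace with the image built from this repo.
IMAGE="example/corenlp-chinese:latest"
PORT=9001

# Assemble the command first so it can be inspected before running.
# JAVA_XMX=2500m is an assumed value for the full ctb.gz segmenter.
set -- docker run --rm \
  -e TIMEOUT=30000 \
  -e JAVA_XMX=2500m \
  -e PORT="$PORT" \
  -e CORENLP_CHINESE_SEGMENTER=ctb.gz \
  -e CORENLP_THREADS=2 \
  -p "$PORT:$PORT" \
  "$IMAGE"

CMD="$*"
echo "$CMD"
# To actually start the container, execute the assembled command: "$@"
```

Any variable left out falls back to the default listed above, so a minimal invocation only needs the image name and a published port.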
See the Transcrobes website for more information. Please also take a look at our code of conduct (or CODE_OF_CONDUCT.md in this repo).
This repo contains code/resources for building and pushing Docker container images of Stanford's CoreNLP running on eclipse-temurin:17 base images. It does not directly include any of the code from those projects.