A Java library for processing natural language text using spaCy Server or CoreNLP.
Before you begin, ensure you have met the following requirements:
- You have Java 11 installed.
Additionally, to use the spaCy Server adapter, ensure you have met the following requirements:
- You have Docker installed.
- You have a running instance of spaCy Server as described in the official Docker Hub docs.
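Before creating an adapter, it can help to confirm that something is actually listening on the host and port you plan to use. The following is a minimal sketch using only the Java standard library. `SpacyServerCheck` and `isReachable` are hypothetical names that are not part of spaCy4J, and `localhost:8000` is an assumption taken from the usage example in this README. The check only confirms that a TCP connection can be opened; it does not verify that the listener is spaCy Server.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of spaCy4J.
final class SpacyServerCheck {

    private SpacyServerCheck() {}

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException | IllegalArgumentException e) {
            // Connection refused, timeout, unresolved host, or port out of range.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed values; use the host and port your container is published on.
        String host = "localhost";
        int port = 8000;
        if (isReachable(host, port, 2000)) {
            System.out.println("Something is listening on " + host + ":" + port);
        } else {
            System.out.println("Nothing reachable on " + host + ":" + port);
        }
    }
}
```

If the check fails, confirm that the container is running and that its port is published to the host.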
To use the spaCy Server adapter, add this to the dependencies section of your pom.xml:
<dependency>
    <groupId>io.github.manzurola</groupId>
    <artifactId>spacy4j-adapters-spacy-server</artifactId>
    <version>0.4.0</version>
</dependency>
To use the CoreNLP adapter, add this to the dependencies section of your pom.xml:
<dependency>
    <groupId>io.github.manzurola</groupId>
    <artifactId>spacy4j-adapters-corenlp</artifactId>
    <version>0.4.0</version>
</dependency>
To use spaCy4J in code, follow these steps:
// Create a new spacy-server adapter with host and port matching a running instance of spacy-server
SpaCyAdapter adapter = SpaCyServerAdapter.create("localhost", 8000);
// Create a new SpaCy object. It is thread safe and should be reused across your app
SpaCy spacy = SpaCy.create(adapter);
// Parse a doc
Doc doc = spacy.nlp("My head feels like a frisbee, twice its normal size.");
// Inspect tokens
for (Token token : doc.tokens()) {
    System.out.printf("Token: %s, Tag: %s, Pos: %s, Dependency: %s%n",
            token.text(), token.tag(), token.pos(), token.dependency());
}
If you wish to use the CoreNLP adapter, replace the first line above with the following:
SpaCyAdapter adapter = CoreNLPAdapter.create();
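Because the SpaCy object is described above as thread safe and meant to be reused, one way to share a single instance across an application is a small lazy holder. This is a sketch under assumptions, not something spaCy4J provides: `Shared` is a hypothetical class, and it is written generically so it compiles without the library on the classpath. It creates the value on first access and returns the same instance afterwards, even when called from several threads.

```java
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical helper, not part of spaCy4J: creates a value once and reuses it.
final class Shared<T> implements Supplier<T> {

    private final Supplier<T> factory;
    private volatile T instance;

    Shared(Supplier<T> factory) {
        this.factory = Objects.requireNonNull(factory, "factory");
    }

    @Override
    public T get() {
        T result = instance;
        if (result == null) {
            synchronized (this) {
                result = instance;
                if (result == null) {
                    // Runs at most once unless the factory throws.
                    result = Objects.requireNonNull(factory.get(), "factory returned null");
                    instance = result;
                }
            }
        }
        return result;
    }
}

// Possible use with the calls shown in this README (requires spaCy4J on the classpath):
//   Shared<SpaCy> spacy = new Shared<>(
//           () -> SpaCy.create(SpaCyServerAdapter.create("localhost", 8000)));
//   Doc doc = spacy.get().nlp("Some text.");
```

A dependency injection container or a plain static final field would serve the same purpose; the holder is only useful when creation should be deferred until first use.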
To contribute to spaCy4J, follow these steps:
- Fork this repository.
- Create a branch:
  git checkout -b <branch_name>
- Make your changes and commit them:
  git commit -m '<commit_message>'
- Push to the original branch:
  git push origin <project_name>/<location>
- Create the pull request.
Alternatively, see the GitHub documentation on creating a pull request.
Thanks to the following people who have contributed to this project:
If you want to contact me, you can reach me at guy.manzurola@gmail.com.
This project uses the following license: MIT.