This chapter explains how to parse raw text using the OpenNLP API.
OpenNLP provides a predefined model, in a file named en-parser-chunking.bin, for parsing sentences. You can use this predefined model to parse the given raw text.
The Parse class of the opennlp.tools.parser package holds the parse constituents, and the ParserTool class of the opennlp.tools.cmdline.parser package is used to parse the content.
Follow the steps below to write a program that parses the given raw text using the ParserTool class.
The model for parsing text is represented by the class named ParserModel, which belongs to the package opennlp.tools.parser.
To load a parser model, create an InputStream for the model file and pass it to the constructor of the ParserModel class.
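The loading step can be sketched as follows. This is a minimal example that assumes the model file en-parser-chunking.bin has been downloaded into the working directory; the class name LoadParserModel and the path are placeholders chosen for illustration.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import opennlp.tools.parser.ParserModel;

public class LoadParserModel {
    public static void main(String[] args) throws IOException {
        // Assumed location of the downloaded model file; adjust to your setup.
        String modelPath = "en-parser-chunking.bin";

        // try-with-resources closes the stream once the model has been read.
        try (InputStream inputStream = new FileInputStream(modelPath)) {
            ParserModel model = new ParserModel(inputStream);
            System.out.println("Loaded parser model for language: " + model.getLanguage());
        }
    }
}
```

The model is read fully into memory by the constructor, so the stream can be closed as soon as the ParserModel object exists.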
The Parser interface of the package opennlp.tools.parser defines the operations for parsing a sentence. You can obtain an object of this type with the static create() method of the ParserFactory class.
Invoke the create() method of ParserFactory by passing the model object created in the previous step.
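A possible way to wrap this step in a helper is shown below. The class name CreateParser, the method name createParser, and the model path are assumptions made for this sketch; only ParserModel and ParserFactory.create() come from OpenNLP.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import opennlp.tools.parser.Parser;
import opennlp.tools.parser.ParserFactory;
import opennlp.tools.parser.ParserModel;

public class CreateParser {
    // Hypothetical helper: loads the model from modelPath and builds a parser from it.
    static Parser createParser(String modelPath) throws IOException {
        try (InputStream inputStream = new FileInputStream(modelPath)) {
            ParserModel model = new ParserModel(inputStream);
            return ParserFactory.create(model);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed location of the model file; adjust to your setup.
        Parser parser = createParser("en-parser-chunking.bin");
        System.out.println("Created parser: " + parser.getClass().getName());
    }
}
```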
The parseLine() method of the ParserTool class parses raw text in OpenNLP. This method accepts three arguments: a String holding the sentence to be parsed, the Parser object created in the previous step, and an integer giving the number of parses to be returned.
Invoke this method by passing the sentence, the parser object, and the required number of parses. It returns an array of Parse objects.
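The call itself might look like the sketch below. The class name ParseSentence and the helper printTopParse are hypothetical; the helper expects a Parser built as in the previous step and is not a complete program on its own.

```java
import opennlp.tools.cmdline.parser.ParserTool;
import opennlp.tools.parser.Parse;
import opennlp.tools.parser.Parser;

public class ParseSentence {
    // Hypothetical helper: parses one sentence and prints the single best parse.
    static void printTopParse(Parser parser, String sentence) {
        // The third argument is the number of parses to return.
        Parse[] topParses = ParserTool.parseLine(sentence, parser, 1);

        // show() writes each parse to standard output.
        for (Parse parse : topParses) {
            parse.show();
        }
    }
}
```

Requesting more than one parse returns alternative analyses of the same sentence, which can be useful when the best-scoring parse is not the intended one.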
The following program parses the given raw text. Save it in a file named ParserExample.java.
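A minimal version of such a program, combining the three steps, is sketched here. The model path and the sample sentence are placeholders chosen for illustration, not values from the original listing.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import opennlp.tools.cmdline.parser.ParserTool;
import opennlp.tools.parser.Parse;
import opennlp.tools.parser.Parser;
import opennlp.tools.parser.ParserFactory;
import opennlp.tools.parser.ParserModel;

public class ParserExample {
    public static void main(String[] args) throws IOException {
        // Step 1: load the model (assumed to be in the working directory).
        ParserModel model;
        try (InputStream inputStream = new FileInputStream("en-parser-chunking.bin")) {
            model = new ParserModel(inputStream);
        }

        // Step 2: create the parser from the model.
        Parser parser = ParserFactory.create(model);

        // Step 3: parse the raw text, asking for a single parse.
        String sentence = "The quick brown fox jumps over the lazy dog."; // placeholder text
        Parse[] topParses = ParserTool.parseLine(sentence, parser, 1);

        // Print each returned parse to standard output.
        for (Parse parse : topParses) {
            parse.show();
        }
    }
}
```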
Compile and execute the saved Java file from the command prompt with the javac and java commands, making sure the OpenNLP library is on the classpath.
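The commands might look like this. The jar name opennlp-tools.jar and its location in the current directory are assumptions; substitute the actual path and version of your OpenNLP installation, and use `;` instead of `:` as the classpath separator on Windows.

```shell
# Compile against the OpenNLP tools jar (assumed file name and location).
javac -cp opennlp-tools.jar ParserExample.java

# Run with both the current directory and the jar on the classpath.
java -cp .:opennlp-tools.jar ParserExample
```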
On execution, the program reads the given raw text, parses it, and prints the resulting parse tree.