Apache OpenNLP with Spring Boot

Posted By : Amit Mishra | 05-Jan-2021

Apache OpenNLP is a machine-learning-based toolkit for processing natural language text. It supports the most common natural language processing tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are the building blocks for more advanced text processing services. OpenNLP also includes maximum entropy and perceptron-based machine learning.
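
To make two of these tasks concrete, here is a minimal sketch of sentence segmentation and tokenization. It deliberately uses only the JDK (java.text.BreakIterator and a whitespace split) as a stand-in, so it is not OpenNLP's API and does not use a trained model; the class and method names are hypothetical and chosen only for illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class TextTasksSketch {

    // Sentence segmentation: split text at sentence boundaries.
    // Uses the JDK's rule-based BreakIterator, not an OpenNLP model.
    static List<String> sentences(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        List<String> result = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                result.add(sentence);
            }
        }
        return result;
    }

    // Tokenization: split a sentence on whitespace (the simplest possible tokenizer).
    static List<String> tokens(String sentence) {
        List<String> result = new ArrayList<>();
        for (String token : sentence.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "OpenNLP is a toolkit. It processes natural language text.";
        for (String sentence : sentences(text)) {
            System.out.println(sentence + " -> " + tokens(sentence));
        }
    }
}
```

A model-based toolkit such as OpenNLP is expected to handle harder cases (abbreviations, punctuation attached to words) better than these simple rules.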


The goal of the OpenNLP project is to provide a mature toolkit for the tasks mentioned above. It also aims to provide a large number of pre-built models for a variety of languages, along with the annotated text resources from which those models are derived.


Apache OpenNLP requires a model file, which you can download from the official site. The model file contains the language definitions and other data, such as parts of speech, for the task you want to perform. The first step is to load the model file. The following code snippet loads a language detector model and then wraps the model object in the appropriate detector.

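// Load the language detection model from the classpath and wrap it in a detector.
// The stream is closed automatically by try-with-resources.
try (InputStream inputStream = Some.class.getResourceAsStream("langdetect-183.bin")) {
    LanguageDetectorModel model = new LanguageDetectorModel(inputStream);
    detector = new LanguageDetectorME(model);
} catch (Exception e) {
    logger.error("Error while loading model file : ", e);
}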
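
One detail worth guarding against when loading the model: Class.getResourceAsStream returns null, rather than throwing, when the file is not on the classpath. The sketch below is a small JDK-only helper that fails early with a clear message. The class name ModelResources and the method open are hypothetical and not part of OpenNLP; the returned stream is what you would pass to the LanguageDetectorModel constructor.

```java
import java.io.FileNotFoundException;
import java.io.InputStream;

public class ModelResources {

    // Opens a classpath resource relative to the given class.
    // Throws FileNotFoundException instead of returning null when the resource is missing.
    static InputStream open(Class<?> anchor, String name) throws FileNotFoundException {
        InputStream in = anchor.getResourceAsStream(name);
        if (in == null) {
            throw new FileNotFoundException("Model file not found on classpath: " + name);
        }
        return in;
    }

    public static void main(String[] args) {
        // "langdetect-183.bin" is the model file name used in this article.
        try (InputStream in = open(ModelResources.class, "langdetect-183.bin")) {
            System.out.println("Model file found, ready to pass to LanguageDetectorModel");
        } catch (Exception e) {
            System.err.println("Could not load model: " + e.getMessage());
        }
    }
}
```

Failing at this point makes a missing or misnamed model file easier to diagnose than an error raised later inside the model constructor.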


Once the model is loaded, wrap it in a detector such as LanguageDetectorME or SentenceDetectorME; the official documentation lists many more. Let's write a basic test for this.

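@Test
public void testApp() throws IOException {
    // Load the model from the classpath; the stream is closed automatically.
    try (InputStream stream = DetectorFactory.class.getResourceAsStream("langdetect-183.bin")) {
        LanguageDetectorModel model = new LanguageDetectorModel(stream);
        LanguageDetectorME detector = new LanguageDetectorME(model);
        // predictLanguage returns the single most probable language for the text.
        // "Some language " is a placeholder: use a Belarusian sample for the assertion below.
        Language lang = detector.predictLanguage("Some language ");
        // "bel" is the ISO 639-3 code for Belarusian; the test passes when it is detected.
        assertEquals("bel", lang.getLang());
    }
}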

Apache OpenNLP provides many tools for processing natural language and performing common operations on sentences. The full list of tools is available in the official OpenNLP documentation.


