java.lang.NullPointerException: Cannot invoke "java.lang.Integer.intValue()" because the return value of "java.util.Map.get(Object)" is null #8

@schanz007

Hello there,

I am currently working on my master's thesis, in which I compare different text segmentation algorithms
and their benefits for preprocessing documents in Retrieval-Augmented Generation (RAG).
For this, I wanted to run your TopicTiling method because it seems very promising. However, I ran into a problem:
the run fails with a NullPointerException.
I tried several input texts, such as the one from a previous issue and parts of the README and the paper, and I always get the same error.

I ran the project with the following command:
sh topictiling.sh -ri 5 -tmd topicmodel -tmn model-final -fp "Test.txt" -fd files_to_segment -s

I am using Java 17.

The full error output:
INFO: Found [1] resources to be read

The current version uses the Stanford segmenter for tokenization. However, this tokenizer does not play well on languages without any latin characters (e.g. Chinese, Arabic, Hebrew, Japanese, etc.). In order to segment such languages, segment the texts beforehand and use the parameter -s that disables the tokenization and expects all words segmented by white spaces.

Nov. 05, 2024 4:12:05 PM jgibbslda.Model readOthersFile(188)
WARNING: Error while reading other file:topicmodel/model-final.others (No such file or directory)
java.io.FileNotFoundException: topicmodel/model-final.others (No such file or directory)
at java.base/java.io.FileInputStream.open0(Native Method)
at java.base/java.io.FileInputStream.open(FileInputStream.java:216)
at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
at java.base/java.io.FileInputStream.<init>(FileInputStream.java:111)
at java.base/java.io.FileReader.<init>(FileReader.java:60)
at jgibbslda.Model.readOthersFile(Model.java:150)
at jgibbslda.Model.loadModel(Model.java:254)
at jgibbslda.Model.initEstimatedModel(Model.java:658)
at jgibbslda.Inferencer.init(Inferencer.java:62)
at de.tudarmstadt.langtech.semantics.segmentation.segmenter.TopicTiling.<init>(TopicTiling.java:95)
at de.tudarmstadt.langtech.semantics.segmentation.segmenter.annotator.TopicTilingSegmenterAnnotator.process(TopicTilingSegmenterAnnotator.java:119)
at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:375)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:296)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
at org.uimafit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:223)
at org.uimafit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:143)
at de.tudarmstadt.langtech.semantics.segmentation.segmenter.RunTopicTilingOnFile.<init>(RunTopicTilingOnFile.java:133)
at de.tudarmstadt.langtech.semantics.segmentation.segmenter.RunTopicTilingOnFile.main(RunTopicTilingOnFile.java:94)
Nov. 05, 2024 4:12:05 PM jgibbslda.Model initEstimatedModel(659)
WARNING: Fail to load word-topic assignment file of the model!
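
The two warnings above indicate that topicmodel/model-final.others could not be found. A minimal sketch of a pre-flight check that could be run before topictiling.sh, assuming the model files are named by appending a suffix to the model name inside the model directory. The class ModelFileCheck and the method missingFiles are hypothetical helpers, not part of TopicTiling or JGibbLDA, and only the ".others" suffix is taken from the log; any further suffixes would have to be added by hand.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ModelFileCheck {

    // Returns every expected model file (modelDir/modelName + suffix)
    // that does not exist as a regular file.
    static List<Path> missingFiles(Path modelDir, String modelName, List<String> suffixes) {
        List<Path> missing = new ArrayList<>();
        for (String suffix : suffixes) {
            Path candidate = modelDir.resolve(modelName + suffix);
            if (!Files.isRegularFile(candidate)) {
                missing.add(candidate);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Values taken from the command line in this report (-tmd topicmodel -tmn model-final).
        // Assumption: only ".others" is checked here because it is the only
        // file name that appears in the log.
        List<Path> missing = missingFiles(
                Paths.get("topicmodel"), "model-final", List.of(".others"));
        if (missing.isEmpty()) {
            System.out.println("All checked model files are present.");
        } else {
            for (Path p : missing) {
                System.out.println("Missing model file: " + p);
            }
        }
    }
}
```

If this reports a missing file, the model directory or model name passed via -tmd and -tmn probably does not match the files on disk.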

Nov. 05, 2024 4:12:05 PM org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl callAnalysisComponentProcess(407)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:391)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:296)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
at org.uimafit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:223)
at org.uimafit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:143)
at de.tudarmstadt.langtech.semantics.segmentation.segmenter.RunTopicTilingOnFile.<init>(RunTopicTilingOnFile.java:133)
at de.tudarmstadt.langtech.semantics.segmentation.segmenter.RunTopicTilingOnFile.main(RunTopicTilingOnFile.java:94)
Caused by: java.lang.NullPointerException: Cannot invoke "java.lang.Integer.intValue()" because the return value of "java.util.Map.get(Object)" is null
at jgibbslda.Inferencer.infSampling(Inferencer.java:184)
at jgibbslda.Inferencer.inference(Inferencer.java:99)
at jgibbslda.Inferencer.inference(Inferencer.java:126)
at de.tudarmstadt.langtech.semantics.segmentation.segmenter.TopicTiling.inference(TopicTiling.java:508)
at de.tudarmstadt.langtech.semantics.segmentation.segmenter.TopicTiling.getSimilarityScores(TopicTiling.java:366)
at de.tudarmstadt.langtech.semantics.segmentation.segmenter.TopicTiling.segment2(TopicTiling.java:150)
at de.tudarmstadt.langtech.semantics.segmentation.segmenter.TopicTiling.segment(TopicTiling.java:120)
at de.tudarmstadt.langtech.semantics.segmentation.segmenter.TopicTiling.segment(TopicTiling.java:111)
at de.tudarmstadt.langtech.semantics.segmentation.segmenter.annotator.TopicTilingSegmenterAnnotator.process(TopicTilingSegmenterAnnotator.java:125)
at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:375)
... 6 more
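
The exception message says that a value returned by Map.get(Object) was null and was then unboxed via Integer.intValue(). A minimal sketch of that general failure pattern, and of a guarded lookup, follows. It is not the actual code at Inferencer.java:184; the class UnboxingNpeSketch, the map wordToId, and both lookup methods are hypothetical and only illustrate how a key that is missing from a map leads to this exact kind of NullPointerException.

```java
import java.util.HashMap;
import java.util.Map;

public class UnboxingNpeSketch {

    // Mirrors the pattern named in the exception message: the Integer returned
    // by Map.get is auto-unboxed to int, which fails if the key is absent.
    static int unsafeLookup(Map<String, Integer> wordToId, String word) {
        return wordToId.get(word); // throws NullPointerException when get() returns null
    }

    // Guarded variant: check for null before unboxing and return -1 for unknown words.
    static int safeLookup(Map<String, Integer> wordToId, String word) {
        Integer id = wordToId.get(word);
        return id == null ? -1 : id;
    }

    public static void main(String[] args) {
        Map<String, Integer> wordToId = new HashMap<>();
        wordToId.put("topic", 0);

        System.out.println(safeLookup(wordToId, "topic"));   // known word
        System.out.println(safeLookup(wordToId, "unseen"));  // unknown word, no exception

        try {
            unsafeLookup(wordToId, "unseen");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException: " + e.getMessage());
        }
    }
}
```

Whether the missing key in my case comes from the model files that failed to load or from words in the input text is something I could not determine from the trace alone.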
