Background
The same entity may be expressed differently across documents or paragraphs (e.g. "Apple" vs. "Apple Inc."). OntoCast addresses this with a three-level disambiguation mechanism.
Solution
- Ontology anchoring: predefine the entity alias table (aliases.ttl) in the ontology directory
- Contextual analysis: enable the --context-window 5 parameter at runtime to analyze the surrounding vocabulary
- Manual calibration: after processing is complete, review the owl:sameAs relationship chain through the Fuseki interface
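The three levels above can be sketched roughly as follows. This is an illustrative sketch only, not OntoCast's implementation: the function names, the in-memory alias dictionary standing in for aliases.ttl, and the `ex:` IRIs are all assumptions made for the example. It shows an alias lookup, a window of 5 tokens on each side of a mention, and how a chain of owl:sameAs links might be collapsed to one representative per entity.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical stand-in for the contents of aliases.ttl: surface form -> IRI.
ALIASES: Dict[str, str] = {
    "apple": "ex:Apple_Inc",
    "apple inc.": "ex:Apple_Inc",
}


def anchor(mention: str, aliases: Dict[str, str]) -> Optional[str]:
    """Level 1: look up a mention in the predefined alias table."""
    return aliases.get(mention.strip().lower())


def context_window(tokens: List[str], index: int, size: int = 5) -> List[str]:
    """Level 2: return up to `size` tokens on each side of tokens[index]."""
    start = max(0, index - size)
    return tokens[start:index] + tokens[index + 1:index + 1 + size]


def resolve_same_as(pairs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Level 3 aid: map each IRI to one representative of its owl:sameAs chain."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller IRI as the representative
            # so the result does not depend on the order of the pairs.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep
    return {node: find(node) for node in list(parent)}
```

Collapsing the chain with union-find makes it easy to spot, during manual review, which IRIs have ended up being treated as the same entity.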
Typical Configuration Example
Set the following in the .env file:
- DISAMBIGUATION_STRICTNESS=0.7 (the larger the value, the stricter the match)
- CROSS_DOC_LINKING=true (enables cross-document entity linking)
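To make the effect of the strictness value concrete, here is a minimal sketch of a threshold-based match. The source does not say how OntoCast scores candidate matches, so the similarity measure here (Python's `difflib.SequenceMatcher`) is purely an assumption, and `parse_env` and `is_same_entity` are hypothetical helper names.

```python
from difflib import SequenceMatcher
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def is_same_entity(a: str, b: str, strictness: float) -> bool:
    """Assumed matcher: accept when the similarity ratio reaches the threshold."""
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return ratio >= strictness


ENV_TEXT = """\
DISAMBIGUATION_STRICTNESS=0.7
CROSS_DOC_LINKING=true
"""

settings = parse_env(ENV_TEXT)
strictness = float(settings["DISAMBIGUATION_STRICTNESS"])
cross_doc = settings["CROSS_DOC_LINKING"].lower() == "true"
```

Under this assumed measure, "Apple" and "Apple Inc." score about 0.67, so they would be merged at a strictness of 0.5 but kept apart at 0.7. This is the sense in which a larger value gives a stricter match.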
Handling of special cases
- Domain-specific terms: add a glossary under data/dictionaries/
- Dynamically emerging new entities: set AUTO_EXTEND_ONTOLOGY=true to extend the ontology automatically
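The two special cases can be pictured as a fallback order for a mention that the ontology does not yet know. The sketch below is a hypothetical illustration, not OntoCast's code: `resolve_mention` and the plain sets standing in for the glossary files and the ontology are assumptions.

```python
from typing import Optional, Set


def resolve_mention(
    mention: str,
    glossary: Set[str],
    ontology: Set[str],
    auto_extend: bool,
) -> Optional[str]:
    """Return the normalized term the mention resolves to, or None if unresolved."""
    term = mention.strip().lower()
    if term in ontology or term in glossary:
        return term
    if auto_extend:
        # Mirrors the idea of AUTO_EXTEND_ONTOLOGY=true: register the new
        # entity so that later mentions resolve to it.
        ontology.add(term)
        return term
    return None
```

With `auto_extend` off, an unknown mention stays unresolved. With it on, the first occurrence is added to the ontology.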
This answer comes from the article "OntoCast: an intelligent framework for extracting semantic triples from documents".