Why knowledge graphs?
Knowledge graphs are flexible data structures that store data as nodes and edges (ties). As a general data structure, they are not limited to any particular method of data analysis (such as network analysis); rather, they allow textual data to be modelled in ways very close to the original sources, enable reasoning through logical inference, and allow a flexible schema that evolves as data collection and research progress.
CASTEMO (Computer-Assisted Semantic Text Modelling) is a workflow for the collection of data from textual sources. Its goal is to produce queryable, well-structured, research-oriented knowledge graphs for long-term and versatile use, rather than for single use in one publication. It is thus well adapted, for instance, to historical research and to projects which combine qualitative and quantitative methodologies.
The CASTEMO data model combines a number of pre-defined features with a large variety of flexible ones. It is based on 11 entity types (SPECTRABLOG: Statements, Persons, Events, Concepts, Territories / Texts, Resources, Actions, Living Beings, Locations, Physical Objects, and Groups) and various connections between them, offering a flexible and comprehensive multilingual structure.
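As a rough illustration of the entity types listed above, the following Python sketch writes them as an enumeration. The one-letter codes are an assumption inferred from the SPECTRABLOG acronym (only P, C, and L appear explicitly in the Property example in this text), and the class and member names are hypothetical; this is not the actual InkVisitor schema.

```python
from enum import Enum


class EntityType(Enum):
    """Hypothetical one-letter codes, assumed to follow the SPECTRABLOG acronym."""
    STATEMENT = "S"
    PERSON = "P"
    EVENT = "E"
    CONCEPT = "C"
    TERRITORY_OR_TEXT = "T"
    RESOURCE = "R"
    ACTION = "A"
    LIVING_BEING = "B"   # assumed: "B" for (Living) Being
    LOCATION = "L"
    PHYSICAL_OBJECT = "O"  # assumed: "O" for (Physical) Object
    GROUP = "G"


# Enum members iterate in definition order, so joining the codes
# reproduces the acronym.
acronym = "".join(t.value for t in EntityType)
print(acronym, len(EntityType))
```

Keeping the types in a closed enumeration mirrors the idea that the entity types are pre-defined, while the connections between entities remain flexible.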
Statements model the syntactic structure of textual clauses and sentences and capture the semantics by linking entities. The knowledge graph is enhanced by three main kinds of connection between entities: Properties, Relations, and References. All those connections are considered optional in terms of the general data model, but they can be required by the guidelines of specific projects and the requirement can be enforced through the validation features of InkVisitor.
Properties are flexible and extensible structures composed of an origin (the entity to which the Property is attached), a property type, and a property value. They are read with a “has” logic, e.g. “P current U.S. president – PROP – C area of authority – L United States of America”.
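The origin, type, and value structure of a Property can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `Entity` and `Property` classes, their field names, and the `read` method are hypothetical and do not reproduce the InkVisitor data model; only the example entities come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical entity: a one-letter type code plus a label."""
    type_code: str  # e.g. "P" (Person), "C" (Concept), "L" (Location)
    label: str

    def __str__(self) -> str:
        return f"{self.type_code} {self.label}"


@dataclass(frozen=True)
class Property:
    """Hypothetical Property: origin, property type, and property value."""
    origin: Entity
    prop_type: Entity
    value: Entity

    def read(self) -> str:
        # Render the Property with the "has" logic described in the text.
        return f"{self.origin} has {self.prop_type}: {self.value}"


president = Entity("P", "current U.S. president")
authority = Entity("C", "area of authority")
usa = Entity("L", "United States of America")

prop = Property(origin=president, prop_type=authority, value=usa)
print(prop.read())
```

Because the property type and the property value are themselves entities in this sketch, new kinds of Property can be introduced without changing the structure, which is one way to picture the extensibility described above.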
Relations, of which there are seventeen types, serve to model core ontological (e.g. Classification, Identification) and semantic (e.g. Synonymy, Superclass) connections.
References serve to relate knowledge to a specific Resource from which it has been derived.
On the whole, knowledge graphs provide the best way of structuring complex data in ways close to their original expression in the sources, keeping the analytical layers stored alongside, but neatly separated from, that original expression.