Bayesian Knowledge Graph
Bayesian Knowledge Graph explores uncertainty-aware knowledge representation by attaching confidence values to subject-predicate-object triples and updating beliefs with Bayesian-style evidence accumulation.
Abstract
Knowledge graphs are widely used to represent structured information, but they often assume binary truth values for entities and relationships, which limits their applicability in domains where uncertain data is widespread.
This paper presents the Bayesian Knowledge Graph (BKG) framework, a Bayesian-inspired approach that augments the traditional subject-predicate-object triple format with confidence values and applies Bayesian-style belief updating to track relationship strength over time. Using Beta-distributed priors (parameterized by alpha and beta), the system updates and propagates node and edge confidence, edge beliefs, and node reliability as new evidence enters the system.
The framework is compared against the Non-Axiomatic Reasoning System (NARS), which served as an inspiration for this project and is a more mature framework for handling uncertain data in practice.
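The belief updating described in the abstract can be illustrated with a minimal sketch. It assumes the alpha/beta priors are the two parameters of a Beta distribution and that a triple's confidence is the posterior mean; the class name `BetaBelief`, the `weight` parameter, and the example triple are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Belief about one triple, held as a Beta(alpha, beta) distribution (assumed model)."""
    alpha: float = 1.0  # pseudo-count of supporting evidence (1.0 gives a uniform prior)
    beta: float = 1.0   # pseudo-count of contradicting evidence

    def update(self, supports: bool, weight: float = 1.0) -> None:
        """Fold in one observation; `weight` is a hypothetical source-reliability factor."""
        if supports:
            self.alpha += weight
        else:
            self.beta += weight

    @property
    def confidence(self) -> float:
        """Posterior mean, used here as the triple's confidence value."""
        return self.alpha / (self.alpha + self.beta)


# Hypothetical triple: (aspirin, treats, headache)
belief = BetaBelief()
for observation in (True, True, False, True):
    belief.update(observation)

# Three supporting and one contradicting observation give Beta(4, 2).
print(round(belief.confidence, 3))  # 0.667
```

Because the update only adds to two counters, the final belief under this sketch does not depend on the order in which evidence arrives. The project's actual propagation of node reliability may behave differently.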
Report
- Technical report PDF - the full project writeup and evaluation summary.
Core Items
- Bayesian KG model - The core belief-update logic, confidence scoring, and propagation routines.
- CSV runner - Generates node and edge CSV outputs from the model. Good for one-off loads; each new run overwrites the previous output.
- Neo4j runner - Loads the model output into Neo4j for graph storage and inspection. Data persists across loads unless it is manually removed from the graph in Neo4j.
- Neo4j class - Class with custom Cypher scripts for reading data out of the graph and loading new data in.
Data and Experiments
- General data - the main medical knowledge base in NAL format.
- Experiment data - generated inputs and outputs, including ordering and contradiction runs.
- CN15k data - decoded validation data and supporting ID maps.
Resources for Running
Neo4j
The BKG project is designed to run on Neo4j, and the files listed in Core Items work out of the box with a self-hosted Neo4j instance. Other graph databases that support Cypher may work, but compatibility is not guaranteed. The current authentication in the Neo4j runner is set up for a locally running Neo4j instance. Neo4j's free cloud instance (Neo4j Aura) can also be used, but the Neo4jConnection instance in the Neo4j runner will need to be updated accordingly. Formatted CSV files for the CN15k dataset and the mocked medical dataset are available in the data section.
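As a rough sketch of what switching between a local instance and Aura involves, the helper below builds connection settings from environment variables. The function name `neo4j_settings` and the variable names `NEO4J_URI`, `NEO4J_USERNAME`, and `NEO4J_PASSWORD` are assumptions for illustration, and the constructor arguments of the project's Neo4jConnection class are not shown here because they are not documented in this section. The defaults reflect a standard local install (Bolt on port 7687, user `neo4j`); Aura instances use an encrypted `neo4j+s://` URI.

```python
import os


def neo4j_settings(use_aura: bool = False) -> dict:
    """Return connection settings for a local or an Aura instance (hypothetical helper)."""
    if use_aura:
        # Aura has no sensible default: the URI comes from the Aura console and
        # looks like neo4j+s://<instance-id>.databases.neo4j.io
        uri = os.environ["NEO4J_URI"]
    else:
        # Default Bolt address of a locally running Neo4j instance.
        uri = os.environ.get("NEO4J_URI", "bolt://localhost:7687")
    return {
        "uri": uri,
        "user": os.environ.get("NEO4J_USERNAME", "neo4j"),
        "password": os.environ.get("NEO4J_PASSWORD", ""),
    }


# Local settings; pass these values to Neo4jConnection in whatever form it expects.
settings = neo4j_settings()
```

Keeping credentials in environment variables means the same runner can target either instance without editing the source.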
OpenNARS
The version used for comparative analysis in this project was OpenNARS 3.1.2. Build instructions are in the README file. Usable .nal files for preloading are available in the data section.