Finding inconsistent and invalid data
For various reasons, such as data imports or bugs in some versions of the interface, a CASTEMO knowledge graph can contain inconsistent or invalid data. It is therefore important to identify such data and correct it, either manually or with a script.
Get all entities involved in more than one synonym cloud
ReQL
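r.db("inkvisitor")
  .table("relations")
  .filter({ type: "SYN" })
  .getField("entityIds")
  // Merge the entity ids of all synonym relations into one set of ids
  .reduce(function (acc, val) {
    return acc.setUnion(val);
  })
  // reduce fails on an empty sequence, so fall back to an empty list
  .default([])
  // For each id, count the synonym relations that contain it
  .map(function (ide) {
    return {
      id: ide,
      count: r
        .db("inkvisitor")
        .table("relations")
        .filter({ type: "SYN" })
        .getField("entityIds")
        .filter(function (syn) {
          return syn.contains(ide);
        })
        .count(),
    };
  })
  // Keep only ids that appear in more than one synonym cloud
  .filter(function (a) {
    return a("count").gt(1);
  });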
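The same check can also be sketched in plain JavaScript over relation records that have already been exported from the database, which may be convenient when validating a dump offline. This is only a minimal sketch and not part of InkVisitor: the function name `findEntitiesInMultipleClouds` and the sample records are hypothetical, and it assumes each relation carries the `type` and `entityIds` fields used in the query above.

```javascript
// Hypothetical helper: mirrors the ReQL query on an in-memory array of relations.
function findEntitiesInMultipleClouds(relations) {
  const counts = new Map();
  for (const rel of relations) {
    // Only synonym relations form synonym clouds.
    if (rel.type !== "SYN") continue;
    // Count each relation at most once per entity id.
    for (const id of new Set(rel.entityIds || [])) {
      counts.set(id, (counts.get(id) || 0) + 1);
    }
  }
  // Report only the ids found in more than one synonym relation.
  return [...counts.entries()]
    .filter(([, count]) => count > 1)
    .map(([id, count]) => ({ id, count }));
}

// Made-up sample data, for illustration only.
const sampleRelations = [
  { type: "SYN", entityIds: ["e1", "e2"] },
  { type: "SYN", entityIds: ["e2", "e3"] },
  { type: "SUP", entityIds: ["e1", "e3"] },
];

console.log(findEntitiesInMultipleClouds(sampleRelations));
```

With the sample data, only `e2` is reported, with a count of 2, because it is the only id that appears in two `SYN` relations.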
Get labels of all concepts with empty language
ReQL