#5 - Kevin Feeney on TerminusDb and the Semantic Web
Kevin Feeney talks about different kinds of graph databases, the role of logic in the Semantic Web vision, the shortcomings of the technologies created to fulfill that vision, and how TerminusDb, the database project Kevin co-founded, is seeking to finally make good on the promise of the Semantic Web.
I first became aware of Kevin through a series of blog posts that explain the similarities and differences between these different kinds of databases
Then I found out about TerminusDb
"a bunch of Swedish hackers with a bunch of JSON blobs"
Full quote:
[...] there have been many more incoherent standards and initiatives that have come out of the W3C’s standards bodies — almost all of which have launched like lead balloons into a world that cares not a jot. Nevertheless, it is important to recognise that, hidden in all the nonsense, there are some exceptionally good ideas — triples, URL identifiers and OWL itself are all tremendously good ideas in essence and nothing else out there comes close. It is a sad testament to the suffocating nature of design by standards committee which has consumed countless hours of many thousands of smart and genuine researchers, that ultimately the entire community ended up getting it’s ass kicked by a bunch of Swedish hackers with a bunch of json blobs — the Neo4j property graph guys have had a greater impact upon the real world than the whole academic edifice of semantic web research.
The Semantic Web as a movement came out of Tim Berners-Lee
One of the seminal articles:
The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation. ... Adding logic to the Web—the means to use rules to make inferences, choose courses of action and answer questions—is the task before the Semantic Web community at the moment.
— May 17, 2001, The Semantic Web - A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities, Tim Berners-Lee, James Hendler and Ora Lassila
The standardization of RDF, the standardization of OWL
the big, big gap in a lot of the standards of the semantic web was some type of closed world reasoning regime
I was one of the developers of software that ran a thing called Indymedia back in the early two-thousands.
most of the impetus for OWL came out of the description logic community
...there's a number of very well-known and very accomplished description logic people [...] like Peter Patel-Schneider and [Ian] Horrocks in Oxford.
...they were using predicates to point out that two things, two data structures are the same thing. But the standard didn't ... mean for that to be used for things that just happened to be the same real world thing...
Kevin discusses this misuse of owl:sameAs and owl:equivalentClass in the fourth of his blog posts linked to above.
I was talking to some of the guys in Semantic Arts, who are very busy and active consultants in the area.
And then there is this thing called RDF stores...
[RDF stores] are based around this concept of a triple - predicate subject object
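To make the triple idea concrete, here is a minimal sketch in Python. It is not how any particular RDF store is implemented; the `ex:` identifiers and the `match` helper are made up for illustration, and each triple is written in the conventional subject, predicate, object order.

```python
from typing import Iterable, Iterator, Optional, Tuple

# A triple is just (subject, predicate, object).
Triple = Tuple[str, str, str]

# Hypothetical example data; the ex: identifiers are illustrative only.
TRIPLES = {
    ("ex:kevin", "ex:cofounded", "ex:TerminusDB"),
    ("ex:kevin", "rdf:type", "ex:Person"),
    ("ex:TerminusDB", "rdf:type", "ex:Database"),
}


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in triples:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple
```

Calling `sorted(match(TRIPLES, p="rdf:type"))` lists everything that has a type, which is roughly what a single triple pattern in a query language does.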
Just like Google do actually on their front page now for their knowledge graph
The Google Knowledge Graph was introduced in 2012 with the great slogan "things, not strings"
...the other thing that triples have ... is it makes revision controlled databases possible.
...we adopted a delta encoding approach to updates as is used in source control systems such as git. This provides transaction processing and updates using immutable database data structures, recovering standard database management features while also providing the whole suite of revision control features: branch, merge, squash, rollback, blame, and time-travel...
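A rough sketch of why triples make this kind of revision control straightforward: every update can be recorded as a set of triples added and a set of triples deleted, and any past state can be rebuilt by replaying those deltas. This is an illustration of the general idea only, not TerminusDb's actual storage format; the `Commit` class, the `state_at` function and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Commit:
    """One immutable delta: the triples it adds and the triples it deletes."""
    additions: FrozenSet[Triple]
    deletions: FrozenSet[Triple]
    message: str = ""


def state_at(history: Sequence[Commit], n: int) -> FrozenSet[Triple]:
    """Rebuild the database as it was after the first n commits ("time travel")."""
    triples = set()
    for commit in history[:n]:
        triples -= commit.deletions
        triples |= commit.additions
    return frozenset(triples)


# Hypothetical history: create a document, then change its title.
history = (
    Commit(
        additions=frozenset({("ex:doc1", "ex:title", "Draft")}),
        deletions=frozenset(),
        message="create doc1",
    ),
    Commit(
        additions=frozenset({("ex:doc1", "ex:title", "Final")}),
        deletions=frozenset({("ex:doc1", "ex:title", "Draft")}),
        message="rename doc1",
    ),
)
```

Because commits are never modified, `state_at(history, 1)` still returns the "Draft" version after the rename has been committed. Rollback then amounts to reading an earlier prefix of the history.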
I've actually seen that very thing being described as a benefit of property graphs that each relation has its own ID and it can have [a] whole data structure associated with it
See for example Neo4j's blog post RDF Triple Stores vs. Labeled Property Graphs: What’s the Difference?, in the section "Difference #1: RDF Does Not Uniquely Identify Instances of Relationships of the Same Type".
You can do it in SQL these days, but it's sort of a later addition... the WITH syntax, Common Table Expressions they are called... you can actually do recursive queries.
I once showed up at a Neo4j meetup with some examples of doing graphy queries in PostgreSQL using Common Table Expressions. (The presentation would have been more impressive if I had put some indexes on those tables...) A way better introduction is the excellent page on The WITH Clause in the SQLite documentation.
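For a taste of what a recursive query looks like, here is a small sketch using Python's built-in sqlite3 module. The `edge` table, its contents and the `reachable` helper are invented for this example; the query finds every node reachable from a starting node.

```python
import sqlite3


def reachable(conn: sqlite3.Connection, start: str) -> list:
    """Return every node reachable from `start`, including `start` itself."""
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT ?                      -- base case: the start node
            UNION                         -- UNION drops duplicates, so cycles terminate
            SELECT edge.dst
            FROM edge JOIN reach ON edge.src = reach.node
        )
        SELECT node FROM reach ORDER BY node
        """,
        (start,),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical graph: a -> b -> c -> a (a cycle), plus a separate edge x -> y.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edge (src TEXT NOT NULL, dst TEXT NOT NULL)")
conn.execute("CREATE INDEX edge_src ON edge (src)")  # index the join column
conn.executemany(
    "INSERT INTO edge VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "a"), ("x", "y")],
)
```

Using `UNION` rather than `UNION ALL` matters here: with `UNION ALL` the cycle in the sample data would make the recursion run forever.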
People beat up on normal-form modeling and SQL way more than they should.
A good recent blog post on this topic: Normalization is not a process.
Even when I'm modelling graph stuff, I start off with basically ERD.
We started out with the Java Jena library for Semantic Web.