This AMA by Daniel Crowe from Grakn Labs occurred in Slogging's official #amas channel, and has been edited for readability.
Hi all! We’re from Grakn Labs, in London, inventors of Grakn, a database technology that serves as the knowledge base foundation to intelligent systems.
Jump in and thread your questions here!
Welcome Daniel! Thanks for joining us on Slogging's AMAs. Can you tell us a little bit about Grakn Labs and what you do there?
Also, I've briefly studied convolutional neural networks and how they work. What are the main differences between CNNs and Knowledge Graph Convolutional Networks (KGCNs)?
Grakn Labs is a London-based database startup. We are the inventors of the Grakn knowledge base and the Graql query language. Our technology helps organisations in various industries, including Life Sciences, Defence & Security, Financial Services and Robotics, to build intelligent systems that we believe will change the world. From financial analytics to drug discovery, cyber threat detection to robotics disaster recovery, our technology empowers engineers around the world to tackle a higher order of complexity in knowledge, and solve the world’s most complex problems.
My role is the global community and partnerships manager. This means I try to turn conversation into action, accelerating our community in their development efforts, connecting users working on similar or related projects, or simply acting as a first line of support. It’s honestly been one of the most exciting and rewarding roles I’ve been in.
What are the main differences between CNNs and Knowledge Graph Convolutional Networks (KGCNs)?
A Knowledge Graph Convolutional Network is a Convolutional Neural Network.
How is a KGCN convolutional? We use a learned transformation that is applied to all nodes and edges in the graph. Take a CNN being used for image classification as an analogous example: if you imagine an image’s pixel grid as a series of nodes (the pixels) and edges (representing the adjacency of pixels), you can see that applying a convolution across the image at each pixel is the same as applying the convolution for all of the nodes of the graph.
So what’s the difference between a CNN and a KGCN? A KGCN can ingest heterogeneous nodes and edges and learn a convolution to apply over them, and it can do so over an arbitrary shape of nodes and edges (a graph) rather than the classical N-dimensional array that CNNs operate over.
Utsav Jaiswal
Thanks for doing the AMA, Daniel. What are the best ways to keep human biases from creeping into our machines when the data sets are massive?
This is obviously a timely question that anyone in the data/engineering world has top of mind when embarking on ML work. As we have focused on the database infrastructure, type system, native reasoner and query language, we would say that it is really up to the team working on the ML to ensure that they are being mindful and diligent about these biases. The Grakn schema enables an organisation or team to model their domain more closely to its natural representation. This means that we are able to model complexity more simply.
Hello Daniel - how does "automated reasoning at the database level " work? Can I control it or is it natively part of any query or a little of both?
Great question Anthony: “How does automated reasoning at the database level work? Can I control it or is it natively part of any query or a little of both?” Grakn is capable of reasoning over data via pre-defined rules. Graql rules look for a given pattern in the dataset and, when it is found, create the given queryable relation (which exists only for the duration of the given transaction). Automated reasoning provided by rules is performed at query (run) time. Rules not only allow shortening and simplifying commonly-used queries but also enable knowledge discovery and implementation of business logic at the database level.
Thanks to the reasoning facility, common patterns in the knowledge graph can be defined and associated with existing schema elements. The association happens by means of rules. This not only allows us to compress and simplify typical queries, but offers the ability to derive new non-trivial information by combining defined patterns.
Amy Tom
Hi Daniel, RE automated reasoning at the database level: how much data needs to be in a database in order for this to happen? The amount of data is less important than the insights/questions you want to generate/ask of your data. So we’d say there are no prerequisites or minimum amount of data needed to perform reasoning.
We could have a simple rule for inferring a relation between two siblings based on a common parent. The rule might look like:
define
rule siblingship:
when {
  $child1 isa person;
  $child2 isa person;
  $parent isa person;
  (child: $child1, parent: $parent) isa parentship;
  (child: $child2, parent: $parent) isa parentship;
  $child1 != $child2;
} then {
  (sibling: $child1, sibling: $child2) isa siblingship;
};
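With a rule like that in place, a plain match query can surface the inferred siblings at query time. The query below is an illustrative sketch (the variable names are ours):

  # ask for any pair of people connected by the inferred siblingship relation
  match
    (sibling: $x, sibling: $y) isa siblingship;
  get $x, $y;

The siblingship relations returned here are never persisted; the reasoner derives them during the transaction, as described above.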
Daria Kulish
Welcome Daniel! Can you share how Grakn Labs has changed since the release in 2016? Short answer: a lot. More specifically, the team has shifted from a more academic, R&D focus to a product focus. Grakn, our database, has just been re-released as Grakn 2.0.
Grakn 2.0 has brought massive scale and performance improvements as well as a new Grakn type system through Graql, something our community has picked up quickly; they are getting really excited about what they can build with it.
Speaking of our community, we’ve grown across 20 countries and hosted our first user conference - pre-lockdown - in London last February. You can find all the sessions on our YouTube channel.
Linh Smooke
Hello Daniel, thank you for coming on this AMA! How has COVID impacted your work, if at all? COVID-19 has created a unique environment for startups, and especially startups at our stage, where it gave us a hyper-focus on the product. From a community standpoint, gone were the opportunities for us to meet with our community in person, something we cherished.
As I mentioned, Grakn Cosmos, which was our last in-person event last February, gave us a lot of energy that carried us through the isolated times of 2020.
We quickly pivoted to virtual events and have since run over 100 talks with the community, ranging from object recognition with a SPOT robot
(https://towardsdatascience.com/object-recognition-and-spacial-awareness-for-a-spot-robotics-system-2ba33152bf65) to drug discovery and modelling biomedical data
(https://towardsdatascience.com/modelling-biomedical-data-for-a-drug-discovery-knowledge-graph-a709be653168) and everything in between. We’ve really gotten into the “virtual” swing of things.
Hey Daniel! What have been the most helpful software and techniques to grow such a large open source community?
David Smooke this (community management technologies) is something I’ve really spent a good deal of time on over the last year. Our goal was to create an ecosystem for our community to:
1. collaborate and connect on topics, ideas, projects
2. learn and get started with Grakn quickly, providing support via content, live chats, and the talks I previously mentioned.
3. ensure that everyone feels welcome, empowered and supported in realising their visions
Hey there Daniel Crowe! My question is personal - what kinds of content do you spend the most time reading online, and which 5 sites do you visit most often?
Thank you for your question - for me, as a community manager and someone that didn’t come from an engineering background, I spend most of my time reading articles from Towards Data Science, Dzone, and various Medium bloggers.
There are a couple books that I have read over the last 18 months that I would recommend:
The Master Algorithm by Pedro Domingos
Algorithms to Live By
My most often visited sites - oh boy, need to review my history real quick…
Besides the Grakn documentation…
Would probably say:
1. Hackernoon ;)
2. Dzone
3. Towards Data Science
4. This newsletter
5. MIT tech review is up there as well
Are my queries always text-based, or could I give it an object, like the photo of a dog, without telling the system it was a dog, and get something back… pictures of dogs, but perhaps also veterinary products or donation links to the SPCA?
Also, how much do I need to set up the "data"? Meaning, categorize things at a high level in some way to make linkages easier to find?
Grakn is the database, so it would not perform the computer vision work to identify an image. If you were to ingest this data into Grakn, you could make use of the reasoner to infer connections between images based on some fact.
And then this one: "Also, how much do I need to set up the data? Meaning, categorize things at a high level in some way to make linkages easier to find?"
This is a great question and gets to the core of why we use a schema in Grakn. Schema and taxonomy are used interchangeably in our world. The schema provides:
- logical validation on data insert
- a more natural representation of your data, via Grakn’s strong type system
- a home for the rules we write for the reasoner
This is some work upfront, but the value shows when you are able to write simpler, more expressive queries.
For example, if we know that a given bank has risk types (war, cyber crime, etc.), and these are all subtypes of “risk”, then when we want to ask for “risky banks” we can simply ask for banks that have some risk, and the query matches every subtype of risk.
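As a rough sketch of that idea (the exact modelling of bank and risk below is ours, purely for illustration, not a real Grakn Labs schema), the schema and query might look something like:

  # illustrative schema: war and cyber-crime are subtypes of risk
  define
    risk sub attribute, value string;
    war sub risk;
    cyber-crime sub risk;
    bank sub entity, owns risk;

  # "risky banks": any bank that has any subtype of risk matches
  match
    $b isa bank, has risk $r;
  get $b, $r;

Because war and cyber-crime are subtypes of risk, the single "has risk" pattern covers them all; there is no need to enumerate each risk type in the query.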
Yes I can see where that would be a high level interface, so my queries are text-based essentially. I spend a whole lot of my day writing SQL so wondering about learning another text-based query language, when SQL can do a lot....as long as I properly prepare my data, normalize and categorize, you can learn a lot of stuff in the data. I think nothing is more spectacular than to asksome question of your data and get some completely unexpected answer.
The problem is when it disagrees with some particular bias in management and you have to show up to speak truth to power..uh...no sir...the data says this.
if we properly define risk...yes. Rodan might show up and wreck the bank unexpectedly. https://hackernoon.com/the-black-pterodactyl-event-hc4033er
So we feel that Graql is a higher-level language than SQL, which means we are pushing more of the important operations down to the database. Gone is the need for joins, etc.
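To make that concrete with a small, hypothetical example (the account modelling here is ours, not something from the AMA): where SQL would join a customers table to an accounts table to a banks table, a Graql query states the relation directly:

  # customers of a given bank, written as one relation pattern instead of joins
  match
    $bank isa bank, has name "Acme Bank";
    $person isa person, has name $name;
    (provider: $bank, holder: $person) isa account;
  get $name;

The relation pattern plays the role that foreign keys and JOIN clauses would play in SQL.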
I think COVID-19 creates opportunities for those that can deliver on their promises no matter where they are at. They do not need any glad-handing to get the job done. They are bringing value digitally. It strips off some of the politics of landing a client.
I’d agree with you here. We’ve focused on the direct value that a particular engagement provides the community, and tried our best to zero in on how our community learns and explores new technologies, so that we can be there for them when they join.
We certainly miss the energy of live events and having everyone in the same room. There’s something special about a room full of individuals that come from all sorts of backgrounds, perspectives, industries and passions. This is something we are hoping our community still feels even in an all virtual setting.
Hello Daniel Crowe, my startup is in the IoT sector, solving air pollution by providing air purifiers with air monitors. As I collect data, I would really love to know: how can I leverage your work in my sector and, more broadly, in the IoT sector?
Muhammad Bilal thank you for your question :))
There’s so much work being done in the IoT space that is really exciting to us. We hope to support these efforts, and have already seen some of the IOTA community use Grakn for their projects in logistics, manufacturing, etc.
For your industry specifically, it could be really interesting to be able to answer questions like these (a rough Graql sketch of one follows the list):
what are the common factors of cities/buildings/homes with low air quality?
what are the potential impacts of given air quality? - providing guides to those who monitor changes in air quality
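To sketch what the first of those questions might start to look like as a Graql query (every type, role and attribute name below is hypothetical, purely to illustrate the shape):

  # homes whose monitors report a poor air-quality reading, plus one shared property
  match
    $h isa home, has construction-year $year;
    $m isa air-monitor, has aqi $reading;
    (site: $h, monitor: $m) isa monitoring;
    $reading > 150;
  get $h, $year, $reading;

From results like these you could start looking for common factors (age of the building, location, and so on) across low-air-quality sites.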
Tomás Sabat
Do you have any Grakn events planned? I really just want to share one with you - we have a lot going on, so this is for sure not the only event happening, but it’s a great place for any new folks to learn and explore what can be done with Grakn.
This is https://community.grakn.ai/grakn-orbit-2021 - our 2-day virtual conference, “for the Community, by the Community.”
It runs April 21-22, 2021 and features 30+ speakers, panelists and moderators from life sciences, robotics, financial services, cyber security, research, legal and more.
Oh man, did we make it through them all?
Hoping that I didn’t miss anyone, this really has been so much fun. Thank you again to Hackernoon and the Slogging crew: Limarc Ambalina, Linh Smooke, David Smooke, and everyone else in the Hacker Noon Team for having us - lots of love from the Grakn team coming this way 💚