This AMA by Daniel Crowe from Grakn Labs occurred in Slogging's official #amas channel, and has been edited for readability.
Hi all! We’re from Grakn Labs, in London, inventors of Grakn, a database technology that serves as the knowledge base foundation to intelligent systems.
Jump in and thread your questions here!
Welcome Daniel! Thanks for joining us on Slogging's AMAs. Can you tell us a little bit about Grakn Labs and what you do there?
Also, I've briefly studied convolutional neural networks and how they work. What are the main differences between CNNs and Knowledge Graph Convolutional Networks (KGCNs)?
Grakn Labs is a London-based database startup. We are the inventors of the Grakn knowledge base and the Graql query language. Our technology helps organisations in various industries, including Life Sciences, Defence & Security, Financial Services and Robotics, to build intelligent systems that we believe will change the world. From financial analytics to drug discovery, cyber threat detection to robotics disaster recovery, our technology empowers engineers around the world to tackle a higher order of complexity in knowledge, and solve the world’s most complex problems.
My role is the global community and partnerships manager. This means I try to turn conversation into action, accelerating our community in their development efforts, connecting users working on similar or related projects, or simply acting as a first line of support. It’s honestly been one of the most exciting and rewarding roles I’ve been in.
What are the main differences between CNNs and Knowledge Graph Convolutional Networks (KGCNs)?
A Knowledge Graph Convolutional Network is a Convolutional Neural Network.
How is a KGCN convolutional? We use a learned transformation that is applied to all nodes and edges in the graph. Take a CNN being used for image classification as an analogous example: if you imagine an image’s pixel grid as a series of nodes (the pixels) and edges (representing the adjacency of pixels), you can see that applying a convolution across the image at each pixel is the same as applying the convolution for all of the nodes of the graph.
So what’s the difference between a CNN and a KGCN? A KGCN can ingest heterogeneous nodes and edges and learn a convolution to apply over them, and it can do so over an arbitrary shape of nodes and edges (a graph) rather than the classical N-dimensional array that CNNs operate over.
Utsav Jaiswal
Thanks for doing the AMA, Daniel. What are the best ways to keep human biases from creeping into our machines when the data sets are massive?
This is obviously a timely question that anyone in the data/engineering world has top of mind when embarking on ML work. As we have focused on the database infrastructure, type system, native reasoner and query language, we would say that it is really up to the team working on the ML to ensure that they are being mindful and diligent about these biases. The Grakn schema enables an organisation or team to model their domain more closely to its natural representation. This means that we are able to model complexity more simply.
Hello Daniel - how does "automated reasoning at the database level " work? Can I control it or is it natively part of any query or a little of both?
Great question Anthony: “How does automated reasoning at the database level work? Can I control it or is it natively part of any query or a little of both?” Grakn is capable of reasoning over data via pre-defined rules. Graql rules look for a given pattern in the dataset and, when it is found, create the given queryable relation (which exists only for the duration of the given transaction). Automated reasoning provided by rules is performed at query (run) time. Rules not only allow shortening and simplifying commonly-used queries but also enable knowledge discovery and implementation of business logic at the database level.
Thanks to the reasoning facility, common patterns in the knowledge graph can be defined and associated with existing schema elements. The association happens by means of rules. This not only allows us to compress and simplify typical queries, but offers the ability to derive new non-trivial information by combining defined patterns.
Amy Tom
Hi Daniel, RE automated reasoning at the database level: how much data needs to be in a database in order for this to happen? The amount of data is less important than the insights/questions you want to generate/ask of your data. So we’d say there are no prerequisites or minimum amount of data needed to perform reasoning.
We could have a simple rule for inferring a relation between two siblings based on a common parent. The rule might look like:
define
rule siblingship:
when {
  $child1 isa person;
  $child2 isa person;
  $parent isa person;
  (child: $child1, parent: $parent) isa parentship;
  (child: $child2, parent: $parent) isa parentship;
  $child1 != $child2;
} then {
  (sibling: $child1, sibling: $child2) isa siblingship;
};
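With a rule like that in place, a plain match query can surface the inferred siblings at query time. The query below is an illustrative sketch (the variable names are ours):

  # ask for any pair of people connected by the inferred siblingship relation
  match
    (sibling: $x, sibling: $y) isa siblingship;
  get $x, $y;

The siblingship relations returned here are never persisted; the reasoner derives them during the transaction, as described above.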
Daria Kulish
Welcome Daniel! Can you share how Grakn Labs has changed since the release in 2016? Short answer: a lot. More specifically, the team has shifted from a more academic, R&D focus to a product focus. Grakn, our database, has just been re-released as Grakn 2.0.
Grakn 2.0 has brought massive scale and performance improvements as well as a new Grakn type system through Graql, something our community has picked up quickly; they are getting really excited about what they can build with it.
Speaking of our community, we’ve grown across 20 countries and hosted our first user conference - pre-lockdown - in London last February. You can find all the sessions on our YouTube channel.
Linh Smooke
Hello Daniel, thank you for coming on this AMA! How has COVID impacted your work, if at all? COVID-19 has created a unique environment for startups, and especially startups at our stage, where it gave us a hyper-focus on the product. From a community standpoint, gone were the opportunities for us to meet with our community in person, something we cherished.
As I mentioned, Grakn Cosmos, which was our last in-person event last February, gave us a lot of energy that carried us through the isolated times of 2020.
We quickly pivoted to virtual events and have since run over 100 talks with the community, ranging from object recognition with a SPOT robot
(https://towardsdatascience.com/object-recognition-and-spacial-awareness-for-a-spot-robotics-system-2ba33152bf65) to drug discovery and modelling biomedical data
(https://towardsdatascience.com/modelling-biomedical-data-for-a-drug-discovery-knowledge-graph-a709be653168) and everything in between. We’ve really gotten into the “virtual” swing of things.
Hey Daniel! What have been the most helpful software and techniques to grow such a large open source community?
David Smooke this (community management technologies) is something I’ve really spent a good deal of time on over the last year. Our goal was to create an ecosystem for our community to:
1. collaborate and connect on topics, ideas, projects
2. learn and get started with Grakn quickly, providing support via content, live chats, and the talks I previously mentioned.
3. ensure that everyone feels welcome, empowered and supported in realising their visions
Hey there Daniel Crowe! My question is personal - what kinds of content do you spend the most time reading online, and which 5 sites do you visit most often?
Thank you for your question - for me, as a community manager and someone that didn’t come from an engineering background, I spend most of my time reading articles from Towards Data Science, Dzone, and various Medium bloggers.
There are a couple books that I have read over the last 18 months that I would recommend:
The Master Algorithm by Pedro Domingos
Algorithms to Live By
My most often visited sites - oh boy, need to review my history real quick…
Besides the Grakn documentation…
Would probably say:
1. Hackernoon ;)
2. Dzone
3. Towards Data Science
4. This newsletter
5. MIT tech review is up there as well
Are my queries always text-based, or could I give it an object, like the photo of a dog, without telling the system it was a dog, and get something back… pictures of dogs, but perhaps also veterinary products or donation links to the SPCA?
Also, how much do I need to set up the "data"? Meaning, categorize things at a high level in some way to make linkages easier to find?
Grakn is the database, so it would not perform the computer vision work to identify an image. If you were to ingest this data into Grakn, you could make use of the reasoner to infer connections between images based on some fact.
And then this one: "Also, how much do I need to set up the data? Meaning, categorize things at a high level in some way to make linkages easier to find?"
This is a great question and gets to the core of why we use a schema in Grakn. Schema and taxonomy are used interchangeably in our world. The schema provides:
- logical validation on data insert
- a more natural representation of your data, via Grakn’s strong type system
- a home for the rules we write for the reasoner
This is some work upfront, but the value shows when you are able to write simpler, more expressive queries.
For example, if we know that a given bank has risk types (war, cyber crime, etc.), and these are all subtypes of “risk”, then when we want to ask for “risky banks” we can simply ask for banks that have some risk, and the query matches every subtype of risk.
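As a rough sketch of that idea (the exact modelling of bank and risk below is ours, purely for illustration, not a real Grakn Labs schema), the schema and query might look something like:

  # illustrative schema: war and cyber-crime are subtypes of risk
  define
    risk sub attribute, value string;
    war sub risk;
    cyber-crime sub risk;
    bank sub entity, owns risk;

  # "risky banks": any bank that has any subtype of risk matches
  match
    $b isa bank, has risk $r;
  get $b, $r;

Because war and cyber-crime are subtypes of risk, the single "has risk" pattern covers them all; there is no need to enumerate each risk type in the query.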
Yes I can see where that would be a high level interface, so my queries are text-based essentially. I spend a whole lot of my day writing SQL so wondering about learning another text-based query language, when SQL can do a lot....as long as I properly prepare my data, normalize and categorize, you can learn a lot of stuff in the data. I think nothing is more spectacular than to asksome question of your data and get some completely unexpected answer.
The problem is when it disagrees with some particular bias in management and you have to show up to speak truth to power..uh...no sir...the data says this.
if we properly define risk...yes. Rodan might show up and wreck the bank unexpectedly. https://hackernoon.com/the-black-pterodactyl-event-hc4033er
So we feel that Graql is a higher-level language than SQL, which means we are pushing more of the important operations down to the database. Gone is the need for joins, etc.
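To make that concrete with a small, hypothetical example (the account modelling here is ours, not something from the AMA): where SQL would join a customers table to an accounts table to a banks table, a Graql query states the relation directly:

  # customers of a given bank, written as one relation pattern instead of joins
  match
    $bank isa bank, has name "Acme Bank";
    $person isa person, has name $name;
    (provider: $bank, holder: $person) isa account;
  get $name;

The relation pattern plays the role that foreign keys and JOIN clauses would play in SQL.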
I think COVID-19 creates opportunities for those that can deliver on their promises no matter where they are at. They do not need any glad-handing to get the job done. They are bringing value digitally. It strips off some of the politics of landing a client.
I’d agree with you here. We’ve focused on the direct value that a particular engagement provides the community, and tried our best to zero in on how our community learns and explores new technologies, so that we can be there for them when they join.
We certainly miss the energy of live events and having everyone in the same room. There’s something special about a room full of individuals that come from all sorts of backgrounds, perspectives, industries and passions. This is something we are hoping our community still feels even in an all virtual setting.
Hello Daniel Crowe, my startup is in the IoT sector, solving air pollution by providing air purifiers with air monitors. As I collect data, I would really love to know: how can I leverage your work in my sector and, more broadly, in the IoT sector?
Muhammad Bilal thank you for your question :))
There’s so much work being done in the IoT space that is really exciting to us. We hope to support these efforts, and have already seen some of the IOTA community use Grakn for their projects in logistics, manufacturing, etc.
For your industry specifically, it could be really interesting to be able to answer questions like these (a rough Graql sketch of one follows the list):
what are the common factors of cities/buildings/homes with low air quality?
what are the potential impacts of given air quality? - providing guides to those who monitor changes in air quality
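To sketch what the first of those questions might start to look like as a Graql query (every type, role and attribute name below is hypothetical, purely to illustrate the shape):

  # homes whose monitors report a poor air-quality reading, plus one shared property
  match
    $h isa home, has construction-year $year;
    $m isa air-monitor, has aqi $reading;
    (site: $h, monitor: $m) isa monitoring;
    $reading > 150;
  get $h, $year, $reading;

From results like these you could start looking for common factors (age of the building, location, and so on) across low-air-quality sites.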
Tomás Sabat
Do you have any Grakn events planned? I really just want to share one with you - we have a lot going on, so this is for sure not the only event happening, but it’s a great place for any new folks to learn and explore what can be done with Grakn.
This is https://community.grakn.ai/grakn-orbit-2021 - our 2-day virtual conference, “for the Community, by the Community.”
It runs April 21-22, 2021 and features 30+ speakers, panelists and moderators from life sciences, robotics, financial services, cyber security, research, legal and more.
Oh man, did we make it through them all?
Hoping that I didn’t miss anyone, this really has been so much fun. Thank you again to Hackernoon and the Slogging crew: Limarc Ambalina, Linh Smooke, David Smooke, and everyone else in the Hacker Noon Team for having us - lots of love from the Grakn team coming this way 💚