Researchers Build AI Knowledge Graph That Sifts Through Science Papers For You

by Language Models (dot tech) | April 18th, 2025
Too Long; Didn't Read

This paper presents a new AI-powered knowledge graph that organizes real-world materials science research into an accessible, searchable database.

Authors:

(1) Yanpeng Ye, School of Computer Science and Engineering, University of New South Wales, Kensington, NSW, Australia, GreenDynamics Pty. Ltd, Kensington, NSW, Australia, and these authors contributed equally to this work;

(2) Jie Ren, GreenDynamics Pty. Ltd, Kensington, NSW, Australia, Department of Materials Science and Engineering, City University of Hong Kong, Hong Kong, China, and these authors contributed equally to this work;

(3) Shaozhou Wang, GreenDynamics Pty. Ltd, Kensington, NSW, Australia ([email protected]);

(4) Yuwei Wan, GreenDynamics Pty. Ltd, Kensington, NSW, Australia and Department of Linguistics and Translation, City University of Hong Kong, Hong Kong, China;

(5) Imran Razzak, School of Computer Science and Engineering, University of New South Wales, Kensington, NSW, Australia;

(6) Tong Xie, GreenDynamics Pty. Ltd, Kensington, NSW, Australia and School of Photovoltaic and Renewable Energy Engineering, University of New South Wales, Kensington, NSW, Australia ([email protected]);

(7) Wenjie Zhang, School of Computer Science and Engineering, University of New South Wales, Kensington, NSW, Australia ([email protected]).

Editor’s note: This article is part of a broader study. You’re reading Part 2 of 9. Read the rest below.

Methods

Figure 1(a) illustrates the overall workflow of our research. Through NER and RE tasks, we extract structured information about catalysts, batteries, and solar cells. After ER and normalization, we integrate the information from these three fields and construct a knowledge graph. The pipeline shown in Figure 1(b) commences with the manual annotation and normalization of the initial training dataset, which is used to fine-tune LLMs specifically for the NER and RE tasks. The inference dataset is then divided into ten batches, a step crucial for the iterative process that follows. Next, we perform the ER task with NLP tools including ChemDataExtractor[22], mat2vec[23], and our expert dictionary. After ER, high-quality results are carefully selected to augment the training set, thereby improving the model's performance in subsequent iterations. Finally, the knowledge graph is constructed from the triples derived from the normalized results of the last iteration.
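The batched, iterative extract-resolve-augment loop described above can be sketched roughly as follows. This is an illustrative outline, not the paper's actual code: the function names (run_ner_re, entity_resolution, quality_filter), the stubbed LLM call, and the example dictionary are all invented for clarity.

```python
# Hypothetical sketch of the iterative extraction pipeline.
# All names and the toy extraction logic are assumptions, not the
# authors' implementation.

def run_ner_re(doc):
    """Stand-in for the fine-tuned LLM performing NER + RE on one text.

    A real system would prompt the fine-tuned model; here we fabricate
    a single triple so the loop is runnable end to end.
    """
    return [("LiFePO4", "used_for", "battery cathode")] if "LiFePO4" in doc else []

def entity_resolution(triples, dictionary):
    """ER step: normalize entity surface forms via a lookup table."""
    norm = lambda e: dictionary.get(e, e)
    return [(norm(h), r, norm(t)) for h, r, t in triples]

def quality_filter(triples):
    """Keep only high-quality triples to augment the training set."""
    return [t for t in triples if all(t)]

def iterative_pipeline(corpus, dictionary, n_batches=10):
    """Process the corpus in batches, feeding good results back as training data."""
    training_set, kg_triples = [], []
    batch_size = max(1, len(corpus) // n_batches)
    for i in range(0, len(corpus), batch_size):
        batch = corpus[i:i + batch_size]
        extracted = [t for doc in batch for t in run_ner_re(doc)]
        resolved = entity_resolution(extracted, dictionary)
        # High-quality triples augment the training set for later iterations.
        training_set.extend(quality_filter(resolved))
        kg_triples.extend(resolved)
    return training_set, kg_triples
```

In the real pipeline, each batch's augmented training set would trigger another fine-tuning round before the next batch is processed; the sketch collapses that into a single pass for brevity.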


Figure 1. (a) Workflow and (b) pipeline of the fine-tuned LLM for knowledge graph tasks.


This paper is available on arxiv under CC BY 4.0 DEED license.

