Predicting a Protein’s Stability under a Million Mutations: Limitations

Written by mutation | Published 2024/03/12

TL;DR: Protein engineering is the discipline of mutating a natural protein sequence to improve properties for industrial and pharmaceutical applications.

This paper is available on arXiv under a CC 4.0 license.

Authors:

(1) Jeffrey Ouyang-Zhang, UT Austin

(2) Daniel J. Diaz, UT Austin

(3) Adam R. Klivans, UT Austin

(4) Philipp Krähenbühl, UT Austin

6 Limitations

First, our training dataset contains biases that may affect model performance. The training set contains only small proteins, which may limit performance on larger ones. Our model may exhibit biases towards certain types of mutations due to the data imbalance in our training set. Second, the limited availability of experimental stability data poses a challenge for in-silico evaluation. Evaluation on larger and more diverse datasets is necessary to fully assess the generalizability of our model. In the future, we hope that high-throughput experimental assays will enable more rigorous evaluation and further improvements in protein stability prediction.
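One way to make the data-imbalance concern concrete is to count how often each mutation type occurs in a dataset. The following is a minimal sketch, not the authors' analysis: it assumes mutations are written as strings such as "A12G" (wild-type residue, position, mutant residue), and the helper names `mutation_type_counts` and `imbalance_ratio` as well as the toy input are hypothetical.

```python
from collections import Counter


def mutation_type_counts(mutations):
    """Count (wild-type, mutant) residue pairs.

    Assumes each mutation is a string such as "A12G": first character is
    the wild-type residue, last character is the mutant residue.
    """
    counts = Counter()
    for mutation in mutations:
        wild_type, mutant = mutation[0], mutation[-1]
        counts[(wild_type, mutant)] += 1
    return counts


def imbalance_ratio(counts):
    """Ratio of the most frequent to the least frequent observed type."""
    values = list(counts.values())
    return max(values) / min(values)


# Toy, made-up input for illustration only (not data from the paper).
example = ["A12G", "A30G", "A7G", "L5P", "K9E", "K21E"]
counts = mutation_type_counts(example)
ratio = imbalance_ratio(counts)
```

A large ratio would suggest that some substitution types dominate the data, which is the kind of skew that could bias a model towards those types. Note that this only covers observed types; substitutions absent from the data do not appear in the counts at all.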


Published by HackerNoon on 2024/03/12