A New York-based AI research company focused on biotechnology has generated a novel protein using a protein language model (ProLLM) built on the same kind of transformer architecture as ChatGPT.
On 25 June, EvolutionaryScale unveiled a first-of-its-kind AI-generated protein that glows, mimicking the fluorescence of a jellyfish protein called green fluorescent protein. The novel protein's sequence differs significantly from that of the natural protein (less than 60 percent resemblance), a gap the company says would take "over 500 million years of (natural) evolution" to arise.
The company used its frontier AI language model, EvolutionaryScale Model-3 (ESM3), to achieve this feat. It has also secured 142 million dollars in a seed funding round, including investments from industry giants like Nvidia and Amazon.
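To make the "less than 60 percent resemblance" figure concrete, here is a minimal Python sketch of one common way to compare two protein sequences: percent identity over an existing alignment. The article does not say how the company measured resemblance, so the function name, the gap handling, and the toy sequences below are illustrative assumptions, not EvolutionaryScale's method or data.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions holding the same amino acid.

    Assumes the two sequences are already aligned to equal length,
    with '-' marking gaps. Positions with a gap in either sequence
    are skipped (one of several conventions in use).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        return 0.0
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


# Toy ten-residue fragments (hypothetical, not real protein data).
natural = "MSKGEELFTG"
designed = "MSRGEDLFSG"
print(f"{percent_identity(natural, designed):.1f}% identical")
```

On this measure, a value below 60 would mean that more than four in ten compared positions carry a different amino acid.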
We are thrilled to be partnering with AWS and NVIDIA to push the frontier of AI for the life sciences.
— EvolutionaryScale (@EvoscaleAI) June 25, 2024
ESM3 differs from ChatGPT in that it is trained on three fundamental biological properties of proteins: sequence, structure and function. With 98 billion parameters (internal variables), it is the biggest biological AI model to date.
EvolutionaryScale calls this a "model trained across all of evolution." The training set comprised 2.78 billion natural proteins, drawn from "the Amazon rainforest, to the depths of ocean, extreme environments like hydrothermal vents, and microbes in a handful of soil."
ESM3 lets users generate proteins from prompts containing partial information (sequence, structure, and function keywords). The model is then run iteratively, making predictions until the entire sequence is completed. It is primarily meant for scientists and gives them unprecedented control over the process of generating proteins.
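The prompt-and-iterate process described above can be pictured with a toy Python sketch. It does not use ESM3 or any real library: `toy_predict` is a hypothetical stand-in that guesses residues at random, and the `_` mask symbol, the `per_step` parameter, and the example prompt are all assumptions made for illustration. Only the overall loop (start from a partial sequence, predict the missing positions, commit a few, repeat until nothing is missing) follows the article's description.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
MASK = "_"                            # assumed placeholder for an unknown residue


def toy_predict(sequence, position, rng):
    """Hypothetical stand-in for a model: returns (residue, confidence).

    A real model would condition on the rest of the sequence; this one
    ignores it and guesses at random.
    """
    return rng.choice(AMINO_ACIDS), rng.random()


def iterative_fill(prompt, per_step=2, seed=0):
    """Fill masked positions a few at a time until none remain."""
    rng = random.Random(seed)
    seq = list(prompt)
    while MASK in seq:
        masked = [i for i, c in enumerate(seq) if c == MASK]
        # Propose a residue for every position that is still masked.
        proposals = {i: toy_predict(seq, i, rng) for i in masked}
        # Commit only the most confident proposals in this round.
        ranked = sorted(masked, key=lambda i: proposals[i][1], reverse=True)
        for i in ranked[:per_step]:
            seq[i] = proposals[i][0]
    return "".join(seq)


# The known residues in the prompt are kept; only the masked ones are filled.
completed = iterative_fill("MSK__EEL__G")
print(completed)
```

Committing a few positions per round, rather than all at once, lets later predictions take earlier ones into account, which is the point of iterating.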
We have trained ESM3 and we're excited to introduce EvolutionaryScale.
ESM3 is a generative language model for programming biology. In experiments, we found ESM3 can simulate 500M years of evolution to generate new fluorescent proteins.
Read more: https://t.co/iAC3lkj0iV pic.twitter.com/AhWtC4vxlF
EvolutionaryScale says its objective is to make biology programmable. "ESM3 takes a step towards the future where AI is a tool to engineer biology from first principles in the same way we engineer structures, machines and microchips, and write computer programs," the company website states.
The technology could lead to breakthroughs in fields such as drug discovery and development, biomedical research, and sustainability. EvolutionaryScale has already demonstrated an example of the last by showcasing a prototype protein capable of degrading plastic waste.
The possibilities are vast, since every cell in every organism contains ribosomes (complexes of RNA and protein that are responsible for protein synthesis). However, there have also been concerns that AI might be misused to create biological weapons.
Scientists have taken a proactive approach and laid down "Community Values, Guiding Principles, and Commitments for the Responsible Development of AI for Protein Design" in March, seeking to guide the developments in this domain for the betterment of humanity.
We're advancing a new global agreement signed by 100+ leading scientists to ensure AI technologies for protein design are developed responsibly. This field can deliver medicines, vaccines, and other innovations that benefit all. https://t.co/pTlFtBWHNh https://t.co/XYn986dEAt
— Institute for Protein Design (@UWproteindesign) March 8, 2024
EvolutionaryScale has also been praised by experts for releasing a smaller open-source version of the model for others to use freely. The complete large-scale model has not been released, although the company has made its training process public in an attempt to remain transparent and share the technology.