Recently I started a conversation with a few friends and coworkers about the idea of a Copilot-like program for genetic engineering, hoping that someone would have ideas or theories that might test my understanding of what AI tech can do currently and what it might do in the future.
GitHub’s Copilot uses the OpenAI Codex to suggest code and entire functions in real-time, right from your editor.
There have been quite a few conversations about Copilot around Automattic lately. And I’m interested in what it can do. Not being a professional programmer myself, I am at best an amateur exploring ideas and possibilities.
While reading The Code Breaker by Walter Isaacson, I was also watching conversations at work about code wranglers using Copilot and gaining 30% efficiency at times, with some saying they are writing better code with it as well. I wondered if there is, or will be, a program like Copilot for genetic engineering as the field progresses: genetic wrangling assisted by an AI copilot.
Trained on billions of lines of code, GitHub Copilot turns natural language prompts into coding suggestions across dozens of languages.
If we get to a point in genetic engineering where we are writing our own code and creating new recombinant DNA entities, typing sentences that describe the purpose of the desired genetic trait, I could see an AI pair programmer searching billions of genetic data sets, research documents, and experiments to find the best genetic sequence to use.
Write a comment describing the logic you want and GitHub Copilot will immediately suggest code to implement the solution.
I could envision asking the genetic copilot things like…
- Find average lifespan of entities that contain CATGGAGATTACA sequence. (example made-up sequence)
- Return options for replacing the sequence with one that gives longer telomeres.
- Create new sequence to recombine color adaptation with bioluminescence.
- Determine most effective DNA sequence, based on longevity and survival, in mammals, that also allows for rapid muscle growth.
- Build a model to determine the top five bacterial DNA sequences that would thrive, in terms of lifespan, at temperatures over 105°F.
- Build a recombinant virus that eats plastic based on fungi species with a terminal lifespan of 72 hours.
These are just a few random thoughts that passed through my head: sentences that could perhaps be written to a program to suggest or help build new DNA structures.
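To make the first prompt a little more concrete, here is a toy sketch in Python of what "find average lifespan of entities that contain a sequence" might boil down to underneath. Everything in it is an assumption for illustration: the `RECORDS` list, the organism names, the lifespans, and the `average_lifespan_with` function are all made up, and a real tool would be searching actual genomic databases rather than a three-item list.

```python
from statistics import mean

# Hypothetical, made-up records purely for illustration.
# Neither the sequences nor the lifespans describe real organisms.
RECORDS = [
    {"name": "organism_a", "sequence": "TTCATGGAGATTACAGG", "lifespan_years": 12.0},
    {"name": "organism_b", "sequence": "GGGCCCAAATTT", "lifespan_years": 3.0},
    {"name": "organism_c", "sequence": "CATGGAGATTACA", "lifespan_years": 8.0},
]


def average_lifespan_with(motif, records):
    """Return the mean lifespan of records whose sequence contains `motif`.

    Returns None when no record contains the motif.
    """
    lifespans = [r["lifespan_years"] for r in records if motif in r["sequence"]]
    return mean(lifespans) if lifespans else None
```

Calling `average_lifespan_with("CATGGAGATTACA", RECORDS)` averages only the two made-up organisms whose sequence contains that string. The interesting part of a genetic copilot would be turning the typed sentence into a query like this, which this sketch does not attempt.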
Currently Copilot doesn’t have the capability to test the code for effectiveness or accuracy, but I imagine that will be built in the future.
How do we keep this sort of program or data open source, and where will we store it all? If the human genome is 3.2 billion characters, then assuming one byte per character, that equates to 3.2 GB per genome. I want to be able to store and reference billions of genomes, research papers, and essentially anything ever published or input into the program, in order to find, at least in theory, the most effective solution for a requested outcome.
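As a rough back-of-the-envelope check on that storage math, here is a small Python sketch. The 3.2 billion figure and the one-byte-per-character assumption come from the paragraph above; the `genome_bytes` and `human_readable` helpers, the one-billion-genome count, and the two-bits-per-base comparison (four letters, A, C, G, and T, fit in two bits) are my own assumptions for illustration. It also ignores compression and everything that isn't raw sequence.

```python
GENOME_BASES = 3_200_000_000  # characters per human genome, figure from the text


def genome_bytes(bases, bits_per_base=8):
    """Raw storage in bytes for `bases` characters at `bits_per_base` bits each."""
    return bases * bits_per_base // 8


def human_readable(num_bytes):
    """Format a byte count using decimal units (1 KB = 1000 B)."""
    units = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]
    value = float(num_bytes)
    for unit in units:
        if value < 1000 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000


# One byte per character, as assumed in the text.
one_genome = genome_bytes(GENOME_BASES)

# Assumption: one billion genomes, as a stand-in for "billions".
one_billion_genomes = one_genome * 1_000_000_000

# Assumption: packing each base into 2 bits instead of a full byte.
packed_genome = genome_bytes(GENOME_BASES, bits_per_base=2)
```

Under these assumptions, `human_readable(one_genome)` gives 3.2 GB, and a billion genomes at one byte per character lands in the exabyte range, which is why where to store it all is a real question.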
We need to keep science democratized with open source technology and open source genetics. And a genetic copilot needs to be open source.