Most of us have conflicted feelings about artificial intelligence. Is it awesome, is it nefarious, or is it eerie? Regardless, it is increasingly understood to be very powerful. That’s been magnified recently in our everyday conversational lives with online tools like ChatGPT. There really is no going back now.
Just a few days ago, NFL aficionado Peter King posted a terrifically illustrative example. Each week, he writes a 10,000-word-or-so review of recent football happenings that many of us stealthily read on Monday morning when we get to work (shhh, don’t tell our bosses!).
He closes each week with “10 Things I Think I Think” and then finally an “adieu haiku”. Last Monday, Things No. 7 and 8 were these:
7. I think there is perhaps nothing scarier than AI beginning to infiltrate how we think and feel. Why do I write this in a football column? A reader, John Jerrim, sent an email to me Sunday, saying he’d asked ChatGPT to “write a haiku about Peter King.” ChatGPT came back to him with this:
Peter King, writer.
Football his pen, ink his field.
Words score touchdowns too.
8. I think, aside from the fact that’s a better haiku than any I’ve written all season, it’s just eerie. I’m slightly honored, but mostly creeped out, that Artificial Intelligence might one day be able to write a column like this. Maybe even today.
I’m a little astonished that it was able to fit a 5-7-5 grid with something so eerily insightful. I could have spent a few hours composing a haiku tribute to Peter King, and I don’t know whether I could have done better. Do I, for one, welcome our new overlords? I’m really not sure.
But one aspect of AI that I do welcome is its help in making sense of the workings of life. Life by its very nature is obliged to roll with the punches and not waste time on ideological purity. It’s got to find things that empirically work, and let the pundits sort out the principles later. AI is like that, too. If a correlation stands, then it stands, no matter what you or I think of it.
So a bit of that same eeriness now ports over to the biological sciences. An AI approach, for the first time, has compressed millions of years of evolution into minutes (of computing time, that is) and arrived at a completely novel protein that not only does what it was designed to do, but in many ways performs better (from the human perspective, anyway) than any similar example in nature.
The protein is called luciferase, and it enables the production of light, as in a firefly’s rear end. A new luciferase has just been designed from scratch, with no evolutionary parallel. This out-of-nowhere luciferase works just as well as natural ones but is also smaller, more stable, and more specific. You can read about this eyebrow-raising milestone achieved at the University of Washington in David Baker’s lab in the February 22 issue of Nature.
You might remember back in December 2020 when a company called DeepMind announced it had developed a tool called AlphaFold that could predict the 3-D structure of just about any protein based only on its amino-acid sequence — just a string of letters — a challenge that had perplexed scientists for many years. That was a superb new tool for people who study how proteins function. Instead of painstakingly purifying a protein, making crystals out of it, and studying its X-ray diffraction patterns, now you could just copy and paste the sequence of the protein into a user interface.
What we’re talking about today extends this to essentially the reverse. Instead of knowing a protein’s sequence already and thereby predicting its structure, we can decide on what structure we want, do some computation, and arrive at a sequence that will achieve it. Now we have the basis to design proteins to do what we want them to do — even if there is no analog found in nature — from scratch. Enzymes (proteins that catalyze chemical reactions), antibodies, protein-based materials, therapeutics, all kinds of things. The times, they are a-changin’.
People have been trying to do this for some time but just haven’t had all the tools. Baker’s own team had made some rudimentary progress in rationally designing enzymes in the past, but the ones they came up with merely had “detectable” activity and were not nearly as good as those found in nature. Even a few years’ worth of lab evolution of these designed enzymes still couldn’t match the products of the natural world.
Then in late 2021 they reported a method they called “deep network hallucination”, by which they’d used a neural network to help sculpt proteins to have interesting and predictable structures. They started out by feeding random sequences of amino acids into the neural network and allowing it to predict the 3-D protein structures that would result. For random sequences like these, that generally meant featureless blobs. But then they kept randomly and progressively tweaking them until “interesting” predicted structures started to form. They synthesized genes to encode these designed proteins, put these genes into E. coli bacteria to produce the proteins, and they found that sure enough, the actual proteins had the “interesting” structures they’d predicted. These proteins didn’t really have any function yet, but that was a very good start.
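To make the tweak-and-keep loop concrete, here is a minimal sketch in Python. It is not the Baker lab's code. The real method scored each sequence with a structure-prediction neural network, while `toy_score` below is a made-up placeholder that only rewards alternating hydrophobic and polar residues, and the greedy acceptance rule is likewise an assumption chosen for simplicity. The names `hallucinate` and `toy_score` are hypothetical.

```python
import random
from typing import Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one letter each
HYDROPHOBIC = set("AFILMVWY")         # rough grouping, used only by the toy score


def toy_score(seq: str) -> float:
    """Hypothetical stand-in for the network's "interestingness" score.

    Returns the fraction of neighboring residue pairs that switch between
    hydrophobic and polar. It says nothing about real protein structure.
    """
    switches = sum(
        (a in HYDROPHOBIC) != (b in HYDROPHOBIC) for a, b in zip(seq, seq[1:])
    )
    return switches / (len(seq) - 1)


def hallucinate(length: int = 60, steps: int = 2000, seed: int = 0) -> Tuple[str, float]:
    """Start from a random sequence and keep random tweaks that do not lower the score."""
    rng = random.Random(seed)
    seq = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
    score = toy_score(seq)
    for _ in range(steps):
        # Mutate one randomly chosen position to a randomly chosen amino acid.
        pos = rng.randrange(length)
        candidate = seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]
        candidate_score = toy_score(candidate)
        if candidate_score >= score:  # greedy: accept only if no worse
            seq, score = candidate, candidate_score
    return seq, score


if __name__ == "__main__":
    final_seq, final_score = hallucinate()
    print(final_seq, round(final_score, 3))
```

Swap the placeholder score for a real structure predictor and you have the general shape of the idea: the sequence is never designed directly, it just drifts toward whatever the scoring function considers interesting.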
The next step was to decide on a particular “interesting” structure; that is, one that would have a particular function. So they chose luciferase because: 1) well, it’s fun, 2) it’s pretty easy to tell if it works — it lights up, after all, and 3) luciferases are really useful for medical imaging, so people have been trying hard to develop better luciferases by tweaking natural ones.
Luciferases from nature all have the same basic mechanism, but they use different starting materials. They add an oxygen (O2) molecule to a larger molecule in order to make an organic peroxide (a molecule with C-O-O-C in it), which would never occur spontaneously. You need luciferase to hold everything in place to get this to happen. Fireflies use D-luciferin as the starting material, while a lot of bioluminescent marine organisms use coelenterazine. But in both cases, the luciferase in these organisms makes it possible for the organic peroxide to form:
Organic peroxide (C-O-O-C) in each case is indicated with a downward-pointing arrow
Organic peroxides like these have too much stored energy to be stable, so soon after they detach from the enzyme, they spontaneously break apart, shedding a CO2 molecule and leaving an “excited ketone”, which has one electron in a high-energy state. That also isn’t stable, so that electron drops back to its normal energy level. The electron’s loss of energy is manifested as a little packet of light, or a photon (denoted hν):
R1 and R2 can be any atom or group of atoms, so this mechanism holds for D-luciferin and coelenterazine
Depending on what molecule you start with, you get a different color of light from this reaction.
Color of light emitted by luciferase using D-luciferin (firefly) or coelenterazine (marine organisms). The wavelength given, in nm (nanometers), is the peak emission wavelength
People have made other synthetic starting molecules that can still be acted on by natural luciferases but emit different colors, like these derivatives of firefly D-luciferin:
Different luciferin structures lead to different glow colors
Other things can affect the final color, too. Here is the addition of cadmium in small amounts to a reaction of firefly luciferase and D-luciferin:
“mM” = millimolar (millimoles per liter) cadmium
Some organisms can change the color of their bioluminescence by setting a fluorescent protein next to the emission so it absorbs the light from the luciferase reaction and then emits it at a longer wavelength (less energy). This shifts the color toward red. The jellyfish Aequorea victoria uses coelenterazine as the starting material for its luciferase, yet because of its green fluorescent protein, it emits not blue light, but light at around 510 nm, which to the human eye is about as green as you can get.
Aequorea victoria: the green areas are luciferase plus green fluorescent protein
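The "longer wavelength, less energy" point follows from the standard relation E = hν = hc/λ. Here is a small sketch that puts numbers on it. The 510 nm value is the one quoted above. The 480 nm value is only an assumed stand-in for blue light, picked for comparison, and the function name `photon_energy_ev` is mine.

```python
PLANCK_J_S = 6.62607015e-34      # Planck constant h, in joule-seconds (exact SI value)
LIGHT_SPEED_M_S = 299_792_458    # speed of light c, in meters per second (exact SI value)
JOULES_PER_EV = 1.602176634e-19  # one electronvolt in joules (exact SI value)


def photon_energy_ev(wavelength_nm: float) -> float:
    """Energy of a single photon, in electronvolts, from its wavelength in nanometers."""
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / JOULES_PER_EV


green = photon_energy_ev(510)  # the GFP-shifted emission quoted in the text
blue = photon_energy_ev(480)   # assumed blue wavelength, for comparison only
print(f"510 nm: {green:.2f} eV, 480 nm: {blue:.2f} eV")
```

Under that assumption the green photon carries about 2.43 eV and the blue one about 2.58 eV, so the fluorescent protein can only hand the light back at equal or lower energy, never higher.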
Interestingly (to me, at least), a glowstick gives off light by doing pretty much the same thing as luciferase, except it uses a straight-up chemical reaction with hydrogen peroxide (H2O2) to make a simple organic peroxide. A living organism can’t do this sort of thing because it can’t have a pool of corrosive hydrogen peroxide in its rear end:
The reactions that lead to glowstick light
Before this turns into a luciferases-are-fun diary (which would actually be OK, because they totally are), let’s get to the one that was designed from scratch and see the results.
The authors designed a luciferase to accept diphenylterazine (DTZ) as its starting material because DTZ has good characteristics for medical imaging. They had to do two things: 1) make a protein that could fit DTZ specifically, and 2) put the right amino acids in just the right positions to catalyze the light-producing reaction.
They fished around with a lot of natural proteins to find some that could bind DTZ tightly, to get an idea of what kind of overall shape they would need. They found that a family of proteins called NTF2 included some with a general shape that could cling to DTZ pretty well — even though NTF2 proteins don’t have any luciferase activity. So they used that range of forms as a starting point to make a large number of proteins that would have approximately the right shape, all designed by a neural network trained on a large number of the known protein structures out there.
NTF2 proteins tend to have extra loops and other things that make them less stable and larger than they need to be, but the new designs could get rid of all the extras and just make the necessary shape, within a structure geared toward maximum stability.
Now that they had the general shape, they could place the right amino acids within it. It’s a good thing this happened in David Baker’s lab, because they’ve been designing and studying proteins for literally decades. If anyone in the world could figure that part out, it’s them. They decided on the top 50,000 (!) computationally designed enzymes and narrowed that down to about 7,000 with more help from AI.
That sounds like a lot of candidates, but not when you can synthesize all of the genes encoding those candidates and stick them into E. coli bacteria, feed them some DTZ, and look for colonies that light up. That is how the first prototype called “LuxSit” (let there be light) was found.
LuxSit emitted light that a sensitive imager could detect. By assessing all their designs collectively, the authors then identified three substitutions in the LuxSit sequence that produced “LuxSit-i”, which gave off about 100 times as much light, putting it on par with luciferases from nature.
Below are tubes containing DTZ and 100 nanomolar (100 billionths of a mole per liter) of a designed luciferase protein. The top row is from the imager to make it obvious, and the bottom row is from an iPhone 8 camera. Behold bioluminescence designed by a computer, with no relation to natural evolution:
Even a simple phone camera easily detects light emitted by LuxSit-i in a little test tube
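For a sense of scale, 100 nanomolar can be converted into an actual count of protein molecules. The sketch below does that arithmetic. The 100 nM figure comes from the text, but the sample volume is not given, so the 100 microliters used here is purely an assumed, illustrative value, and `molecules_in_sample` is a hypothetical helper name.

```python
AVOGADRO_PER_MOL = 6.02214076e23  # particles per mole (exact SI value)


def molecules_in_sample(concentration_nanomolar: float, volume_microliters: float) -> float:
    """Number of molecules in a sample, given nanomolar concentration and microliter volume."""
    moles_per_liter = concentration_nanomolar * 1e-9
    liters = volume_microliters * 1e-6
    return moles_per_liter * liters * AVOGADRO_PER_MOL


# 100 nM is from the text; the 100 microliter volume is an assumption.
count = molecules_in_sample(100, 100)
print(f"{count:.2e} molecules")
```

With that assumed volume the tube holds roughly six trillion enzyme molecules, which sounds like a lot but amounts to only 10 picomoles of protein.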
LuxSit-i is now the smallest known functioning luciferase. It is also more specific for its starting material than natural luciferases, and it is very stable. It will be great for fusing to other proteins to see where they go in real biological samples, or, as in this example with a conventional luciferase, for showing that a therapeutic vehicle can deliver a gene across the blood-brain barrier:
An imager is used to track the intensity of the luciferase signal in a live animal, so the measurements can be done noninvasively, without harming the animal. “PdXYP” in this case is “polydixylitol-based polymer”, while “PEI25k” is “polyethyleneimine, molecular weight 25,000”. “pGL3” is the DNA containing the luciferase gene. PdXYP was able to deliver a functional gene (luciferase) across the blood-brain barrier.
Millions of years of evolution have been recapitulated from start to finish within the lifetime of an expert lab. It took decades of study and the right computational tools to be in position to do this, but they’ve gotten here.
"We were able to design very efficient enzymes from scratch on the computer, as opposed to relying on enzymes found in nature. This breakthrough means that custom enzymes for almost any chemical reaction could, in principle, be designed," said lead author Andy Hsien-Wei Yeh.
You got that right, Andy. And it goes even further. As more and more synthetic proteins like this are made, this process will get easier and easier. It will lead to proteins that make biofuels, renewable chemicals, therapeutics, structural materials, and more, without arduous searches of natural genes or lengthy rounds of directed evolution. So yes, AI might help students cheat on their homework, but it also has the power to make our lives better, indeed even to save lives.
Genes score touchdowns too.