Every living organism on Earth uses pretty much the same genetic code. CCC means proline, GTA means valine, and so on for every possible triplet (codon) of DNA. All of us Earthly organisms read them the same way. If you put a gene from a tomato, a butterfly, a bacterium, or even a virus into me, I will be able to read it and make a protein out of it.
Genes get transferred between species all the time and function and evolve in their new hosts. In that sense, every living thing on Earth is part of the same uber-organism. We all exchange meaningful DNA with each other because it works. Heck, even we humans have stolen genes from viruses and repurposed them for our own use.
But why do we all use the same code? What if it had turned out differently? Could it have?
Well, for the first time, an entire genome has been made from scratch and redesigned such that it does not use the genetic code the same way the rest of us organisms do. Can you really yank out an organism’s entire genome, replace it with a synthetic one where the rules are different, and expect it to actually live? The answer turns out to be yes! Knowing the entire sequence of the genome of the bacterium Escherichia coli (about 5 million base pairs), the authors were able to reconstruct it in its entirety so it would abandon the traditional genetic code and follow a new one, and the result was indeed a new living organism.
Check it out in a report from the University of Cambridge (UK) and the nearby MRC Laboratory of Molecular Biology that appears in the November 4 issue of Science.
This is the first time in billions of years, if ever, that a living organism on Earth has used a substantially nonstandard genetic code. It can’t exchange functioning genes in either direction with any other organism, not humans, not even other E. coli, and it can’t be infected by viruses. It lives on its own genetic island.
It’s quite astonishing to think that going into the 1940s, we didn’t even know that DNA was the genetic material (!), and yet here in 2022 we have managed to do this. You could say, without being at all hyperbolic, that it’s the first species ever to arise synthetically. But is it even a “species” of bacterium? Is it another kingdom entirely? We don’t have an adequate word to use for it.
One immediate practical application of this organism is that it can be used to make organic chemicals and pharmaceutical proteins in fermenters like regular E. coli, but without any risk whatsoever of phage (virus) infection. One little phage particle that sneaks into a bio-manufacturing plant can close the whole thing down for days’ worth of fumigation. (I’ve actually seen it happen. It’s not fun!) But that’s not going to happen with this organism. Viruses simply don’t work on it anymore.
But this organism is fascinating in its own right, too, because we can use it to study how different genetic codes compete in real life with the “standard” one, which should lead to some insights as to why we all collectively “chose” and locked into the one we did. Did the genetic code only evolve once? Or did we start out with several, only to have the one we use now win out for some reason?
So let’s look at what’s different about this new organism. Below is the standard genetic code we all use. You probably know that DNA has four “letters” (A, C, G, and T) representing its bases, and that a group of three of these is a “codon”, the meaningful unit that codes for something. DNA is used as a template to make messenger RNA (mRNA), which has the same letters, except the place of T (thymine) is taken by U (uracil). It’s the codons of mRNA that get translated directly into protein, and that always follows the rules shown below, with very few and very minor exceptions in nature:
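For readers who would rather poke at the table than stare at it, here is a minimal Python sketch of the standard code. The names `STANDARD_CODE`, `transcribe`, and `translate` are made up for this illustration, and `transcribe` takes a shortcut by treating the DNA string as the coding strand, so it only swaps T for U.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order.
# One-letter amino acid symbols; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(dna):
    """Simplified transcription: treat the DNA as the coding strand, swap T for U."""
    return dna.upper().replace("T", "U")

def translate(mrna, code=STANDARD_CODE):
    """Read an mRNA three letters at a time until a stop codon appears."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = code[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

stop_codons = sorted(c for c, aa in STANDARD_CODE.items() if aa == "*")
print(len(STANDARD_CODE), stop_codons)        # 64 ['UAA', 'UAG', 'UGA']
print(STANDARD_CODE["CCC"], STANDARD_CODE[transcribe("GTA")])  # P V
print(translate(transcribe("ATGCCCGTATAA")))  # MPV
```

P is proline and V is valine, matching the CCC and GTA examples at the top of this post.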
To translate these mRNA codons into amino acids, we need some kind of adapter molecule, and that’s transfer RNA (tRNA). Three out of the 64 codons above are “stop” codons, which mean “end the protein now”, but the other 61 code for amino acids. In the simplest picture, each of these 61 has its own type of tRNA, which would mean 61 separate tRNAs and 61 separate genes that code for them. In practice, “wobble” pairing lets a single tRNA read more than one codon, so real organisms get by with fewer than 61 distinct tRNAs.
When we’re making a protein according to an mRNA’s instructions, we’ve got a ribosome holding our mRNA, and along come tRNAs, each carrying the appropriate amino acid, and each with an “anticodon” at the bottom to match up with a codon of that mRNA. The ribosome machinery links those amino acids into a protein, and we’ve made the translation:
But how does each tRNA “know” what amino acid it’s supposed to carry? Well, there is another set of proteins, the aminoacyl-tRNA synthetases (aaRSs), each of which recognizes specific tRNAs and attaches the right amino acid to them. We need to have at least 20 of these aaRSs, one for each amino acid. You could use as many as 61, if every aaRS recognized only one tRNA, but most organisms, E. coli included, get by with about 20, each one handling all the tRNAs for its amino acid.
Now, tRNAs aren’t really yellow rectangles, of course. They do look different from one another, so that aaRSs can tell them apart. Here are two different tRNAs, and you won’t have much trouble telling their shapes apart, either:
The “anticodon” is at the bottom, and you can see that it’s only a tiny part of the whole tRNA structure. Some aaRSs do actually need to use the anticodon to help them tell tRNAs apart, but some don’t. For the latter kind, we can actually swap out the anticodon for a different one, and the aaRS won’t even know the difference. That means we can change the code on some tRNAs fairly easily.
The authors used some previous knowledge and did their own studies to find out which tRNAs were swappable like this, and they concentrated on those. They were able to design tRNAs that could translate UCG and UCA not to serine anymore, but now to alanine and histidine, respectively. When synthesizing the new genome, they’d need to leave out the two old serine tRNA genes so that UCA and UCG wouldn’t be translated as serine anymore, and replace them with genes encoding the new tRNAs.
But that was actually the easy part. They would also have to resynthesize the entire genome so that every single occurrence of TCA and TCG within all its genes was changed to something else encoding serine out of the other four possible choices (TCT, TCC, AGT, or AGC). If they didn’t do that, every occurrence of TCA or TCG would get translated to alanine or histidine instead of serine, so most of its proteins wouldn’t work anymore, and bye-bye, cell.
Just for good measure, they also got rid of all TAG stop codons in the entire genome and replaced them with TAA. They also took out the gene for RF-1 (prfA) that is needed to stop protein synthesis at TAG. So a “TAG” codon would literally mean nothing at all to the new cells, and they wouldn’t have any TAG codons of their own.
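To make the find-and-replace job concrete, here is a toy Python sketch of recoding a single open reading frame. The function name `recode_orf` and the little example gene are invented for illustration. The particular replacement codons (TCG to AGC, TCA to AGT) are an assumption, since any of the other four serine codons would do for this purpose. A real genome also has overlapping genes and regulatory sequences to worry about, which this ignores.

```python
# Codons to retire and their synonymous replacements (illustrative choices).
RECODE = {
    "TCG": "AGC",  # serine -> a different serine codon
    "TCA": "AGT",  # serine -> a different serine codon
    "TAG": "TAA",  # stop   -> a different stop codon
}

def recode_orf(orf):
    """Swap the retired codons in an in-frame coding sequence, codon by codon."""
    if len(orf) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    return "".join(RECODE.get(codon, codon) for codon in codons)

original = "ATGTCGTCAGCATCTTAG"  # Met Ser Ser Ala Ser stop
recoded = recode_orf(original)
print(recoded)  # ATGAGCAGTGCATCTTAA, still Met Ser Ser Ala Ser stop
```

Working codon by codon matters here: a naive text search would also hit "TCA" or "TCG" strings that straddle two neighboring codons, and changing those would alter the protein.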
OK, whew! So what are we left with after all that? Now we have a genome where some fundamental rules are changed. Instead of the usual UCG = serine, UCA = serine, and UAG = stop, now we have UCG = alanine, UCA = histidine, and UAG = nothing at all. So this whole genome was synthesized and slipped into an E. coli cell that had its old genome removed, giving us the curious new entity called Syn61∆3 (tRNA Ala with anticodon CGA, tRNA His with anticodon UGA).
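The anticodons in that name are what tie each new tRNA to its codon. As a quick sanity check, here is a small Python sketch that works out which codon an anticodon reads, assuming strict Watson-Crick pairing with no wobble. The function name `codon_read_by` is made up for this example. Both sequences are written 5' to 3', which is why the complement has to be reversed.

```python
# RNA base pairing: A-U and G-C.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def codon_read_by(anticodon):
    """Codon (5'->3') that pairs with an anticodon (5'->3'), ignoring wobble."""
    return anticodon.translate(COMPLEMENT)[::-1]

for amino_acid, anticodon in [("alanine", "CGA"), ("histidine", "UGA")]:
    print(f"anticodon {anticodon} reads codon {codon_read_by(anticodon)} as {amino_acid}")
# anticodon CGA reads codon UCG as alanine
# anticodon UGA reads codon UCA as histidine
```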
And it grows! About half as fast as plain old E. coli, but I mean, give the thing a break — we just recoded its entire genome! Maybe it needs a catchier name, though. Sparg? Flizz? Let’s think about that.
Anyway, the authors demonstrated that lots of different things that work well in regular E. coli do not work at all in this thing, and vice versa. For example, a gene that gives regular E. coli resistance to the antibiotic spectinomycin didn’t do anything in this organism. When the same gene was recoded so that it did work in the new organism, it similarly did nothing in regular E. coli. This gene has a few TCGs and TCAs, and whichever direction we move it between regular E. coli and the new organism, those codons get translated as the wrong amino acids, the protein folds all wrong, and it becomes a functionless blob. So you see that meaningful genetic transfer either way between these two kinds of organisms just can’t happen.
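Here is a toy Python sketch of that two-way failure. Everything in it is invented for illustration: the four-residue "protein" (Met-Ser-Ala-His), the two gene sequences, and the names `STANDARD`, `RECODED`, and `translate`. It models only the two reassigned sense codons, not what happens at UAG, and it is in no way the real spectinomycin resistance gene.

```python
from itertools import product

# Standard code in U, C, A, G order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product("UCAG", repeat=3), AMINO_ACIDS)}

# The new organism's code: UCG now means alanine (A), UCA means histidine (H).
RECODED = {**STANDARD, "UCG": "A", "UCA": "H"}

def translate(mrna, code):
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = code[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Both toy genes are meant to encode the same protein: M S A H.
wild_type = "AUGUCGGCUCAUUAA"   # written for the standard code (Ser as UCG)
refactored = "AUGAGCUCGUCAUAA"  # written for the new code (Ala as UCG, His as UCA)

print(translate(wild_type, STANDARD))   # MSAH  correct in regular E. coli
print(translate(wild_type, RECODED))    # MAAH  garbled in the new organism
print(translate(refactored, RECODED))   # MSAH  correct in the new organism
print(translate(refactored, STANDARD))  # MSSS  garbled in regular E. coli
```

Each gene makes the right protein only in the host it was written for, which is the whole point of the genetic island.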
And of course the authors also showed that a virus able to infect E. coli just gets tossed aside like a ragdoll by this new organism.
If you’re thinking that some genes don’t have TCA or TCG or TAG in them, you are correct, but these aren’t the only codons that can be dealt with this way. More can still be added, and the authors plan to do so, making meaningful transfer of any gene to or from these organisms basically impossible.
We often worry about genes from transgenic organisms getting loose and finding their way out into the wild, like antibiotic resistance genes, toxin genes, and genes for pharmaceutical and other products. But this organism, and others like it to follow, can’t do that, because to all other organisms on Earth, me and you included, their genes are pure gobbledygook. And again, if we use them in any sort of industrial process, they’re impervious to viruses that can shut the place down for days.
I’ve talked recently about our accumulated knowledge of biological systems reaching a point where we are entering a sort of golden age of therapy, especially regarding cancer and autoimmune diseases. But this is another manifestation of that. In just 80 years, we’ve gone from knowing nothing at all about how genetics works to arguably introducing a new lifeform by redesigning the genetic code in a way that has never occurred naturally on Earth. Aside from the sheer wonder of having accomplished that, it should benefit us both practically and scientifically.
It’s a perfect illustration of why we must continue to respect and value science and education in this country, and why the search for objective truth must never be hindered by any political agenda. Elections have always been important, but much more so now.
For now we can contemplate inhabiting Earth along with a new kind of organism, on its own island, no longer part of the 4-billion-year trajectory the rest of us have shared.