As the master plan for building humans turns 20, researchers are celebrating the historic achievement and looking for ways to shore up its shortcomings.
The Human Genome Project, which built that master plan, called the human reference genome, has changed the way medical research is conducted, says Ting Wang, a geneticist at Washington University School of Medicine in St. Louis. "It's very valuable."
For example, before the project, drugs were developed largely by serendipity, but having the master plan led to therapies that specifically target certain biological processes. As a result, more than 2,000 drugs targeting specific human genes or proteins have been approved. The reference genome has also allowed researchers to unravel the complicated networks involved in regulating gene activity (SN: 9/5/12) and to learn more about how chemical modifications of DNA adjust that activity (SN: 2/18/15). It has also led to the discovery of thousands of genes that do not make proteins but instead produce many different useful RNAs (SN: 4/7/19). These and other achievements were described February 10 in Nature.
“That said, the human reference genome we use has certain limitations,” Wang says.
For one thing, it’s not really finished; gaps remain in the more than 3-billion-letter master plan, especially in stretches of repetitive DNA. Those gaps are places where the technology that built the reference did not do a good job of reading each letter. Scientists know that there is DNA there, but not how much or how the letters are arranged. And despite being a compilation of DNA from more than 60 people, the reference does not capture the full range of human genetic diversity.
One of the simplest ways to compile a complete catalog of human diversity is to decipher, or sequence, the genomes of 3 million Africans, medical geneticist Ambroise Wonkam of the University of Cape Town in South Africa proposes in a commentary also published February 10 in Nature. Africa is where modern humans originated, and study after study has uncovered thousands to millions of new genetic variants among people of African descent.
For example, the Human Heredity and Health in Africa project, known as H3Africa, discovered more than 3 million never-before-seen single-letter variants (known as SNPs, short for single-nucleotide polymorphisms) by examining DNA from just 426 people from different parts of Africa, researchers reported October 28 in Nature.
Researchers won't find only single-letter, or base, changes when examining African genomes, Wonkam says. They may discover a lot of DNA that no one even expected to be in the human genome. Even healthy people are sometimes missing large chunks of DNA (SN: 10/22/09). And some people may have more DNA than others.
In a 2019 study of 910 people of African descent, researchers found 296.5 million additional DNA bases not listed in the current reference. This suggests that sequencing the genomes of Africans may uncover 10 percent or more of the human genome that has not been cataloged previously. That additional genetic material is not necessarily in the gaps that researchers already knew about. It was missed because the roughly 60 people whose DNA makes up the reference simply did not carry it.
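As a rough check on that "10 percent" figure, the arithmetic can be sketched in a few lines of Python. The 296.5 million bases come from the study cited above; the reference length is rounded here to 3 billion bases as an assumption for illustration (the article says only "more than 3 billion"), and the function name is made up.

```python
# Figure quoted in the article for the 2019 study.
EXTRA_BASES = 296_500_000
# Assumption: reference length rounded to 3 billion bases.
REFERENCE_BASES = 3_000_000_000


def percent_of_reference(extra_bases, reference_bases):
    """Express the newly found DNA as a percentage of the reference length."""
    return 100 * extra_bases / reference_bases


share = percent_of_reference(EXTRA_BASES, REFERENCE_BASES)
print(f"{share:.1f}% of the reference length")  # prints "9.9% of the reference length"
```

Under that rounding, the extra DNA amounts to just under 10 percent of the reference length, in line with the estimate quoted above.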
“We need a reference database that is representative of humanity,” which has its roots in Africa, Wonkam says. “Genomic variation of the African population is the next frontier” in human genetics.
That doesn’t mean researchers should stop studying people from other parts of the world, he says. A project to examine the genetics of Icelanders, for example, may uncover genetic variants that arose among the founders of that island nation and that people still carry today.
But the genetic diversity that was present in modern humans before the ancestors of Eurasians left Africa thousands of years ago is still present in people on that continent, and more variants have arisen since, either as people adapted to specific environments or by chance.
Research on genetic variation in Africa will surely help Africans better understand their health problems. But a reference that encompasses all of human genetic diversity will help everyone in the world, Wonkam says. Already, new drugs to lower cholesterol and other medical advances have come from studying the DNA of people of African descent.
Filling the gaps
Although Wonkam’s proposal may solve the problem of genetic diversity, it does not necessarily address the gaps in the existing reference genome.
The current reference genome was made by fitting together short strings of DNA like thousands of small puzzle pieces. In some parts of the genome, the DNA sequence is repeated over and over again, producing virtually identical puzzle pieces. It’s hard to know exactly where all those pieces go and how many repeats there are. So some repetitive pieces were left out, leaving holes in the finished puzzle.
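The puzzle-piece problem can be illustrated with a toy Python sketch. Everything in it is made up for illustration: the flanking sequences, the three-letter repeat unit, the read lengths and the function name are assumptions, not real genomic data or an actual assembly method. It shows only that two sequences differing in repeat count can yield exactly the same set of distinct short pieces, while longer pieces that span the whole repeat tell them apart.

```python
def distinct_reads(genome, length):
    """Return the set of distinct substrings ("reads") of a given length."""
    return {genome[i:i + length] for i in range(len(genome) - length + 1)}


# Hypothetical sequences: identical flanks, different numbers of repeats.
LEFT_FLANK, REPEAT_UNIT, RIGHT_FLANK = "TTGCA", "ACG", "GGATC"
genome_a = LEFT_FLANK + REPEAT_UNIT * 3 + RIGHT_FLANK
genome_b = LEFT_FLANK + REPEAT_UNIT * 5 + RIGHT_FLANK

# Short pieces never span the repeat, so both genomes give the same pieces.
short_match = distinct_reads(genome_a, 4) == distinct_reads(genome_b, 4)

# Pieces long enough to cross the shorter repeat reveal the difference.
long_match = distinct_reads(genome_a, 12) == distinct_reads(genome_b, 12)

print(short_match)  # True
print(long_match)   # False
```

Real sequencing also records how many times each piece is seen, which this toy ignores, so it understates the information short reads carry. It is meant only to show why pieces shorter than a repeat cannot, on their own, pin down how many copies there are.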
This can create problems, Wang says. For example, doctors can sequence a patient's DNA and find a genetic variant that they suspect may cause a health problem. But if the DNA in question is not listed in the current reference, there is no way to know whether the variant is harmful.
“It’s time to fully address this problem [with] the limitations of the current human genome assembly,” says Wang. To do this, Wang and other scientists in the Human Pangenome Reference Consortium will use a newer DNA-deciphering technology, called long-read sequencing, to read each human chromosome from end to end.
In 2020, researchers reported the first fully complete sequence of a human chromosome, the X chromosome. That effort closed 29 gaps in that chromosome's reference sequence, including 3.1 million bases spanning the centromere, the part of the chromosome important for separating chromosomes during cell division, the researchers reported July 14 in Nature. Learning more about centromeres may help researchers understand why chromosome separation sometimes goes wrong, causing cancer or genetic diseases such as Down syndrome.
That initial success suggests that long-read sequencing technology can fill gaps in the reference genome and help find the missing 10 percent of DNA. The pangenome team hopes to assemble complete genomes for 350 people from around the world.
And when he says complete, Wang means complete. The reference genome contains more than 3 billion bases of DNA, but human cells have more than 6 billion bases. The discrepancy comes from representing only one set of chromosomes instead of the two sets that people actually inherit, one from each parent.
That’s because when DNA was originally sequenced, a person’s DNA was cut into small pieces to be reassembled later, and there was no way to tell which fragments came from the chromosome inherited from the person’s mother and which came from the one inherited from the father. So everything was merged into a single set.
But by sequencing each chromosome in its entirety, researchers will be able to construct a complete picture of a person’s genome, including determining exactly what came from each parent. Such a complete picture may allow researchers to better follow inheritance patterns and trace the genetic source of disease more easily.
Investing in a better reference genome will also pay off in other ways, Wonkam says. The Human Genome Project spent $3.8 billion building the existing reference. That investment has not only advanced genetic medicine, but also led to advances in the study of infectious diseases, friendly microbes and other areas of biomedical research.
Having a truly complete reference genome will be even more of a boon, Wonkam predicts. The 10-year project to sequence the DNA of 3 million Africans is estimated to cost about $450 million a year. But "we're going to see a singular return, globally, far beyond [the cost]."