The Human Genome Project

Alan Riley

This presentation was prepared as a freshman honors project for Chemistry 1046 at Chipola College in the spring of 2004.

The idea of decoding the human genome had its start at Cambridge University in England. There, Fred Sanger decoded the genome of a virus that contained about five thousand letters. He accomplished this by tagging the ends of DNA fragments with four different chemicals, each of which would react with only one of the four bases of DNA. The fragments all had a common starting point, and they were sorted by length. It took Sanger four years to work out the code of the virus; at that rate, it would have taken him roughly 2.4 million years to decode the 3 billion letters of the human genome.
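The extrapolation above is simple arithmetic, and a short Python sketch makes the steps explicit. The figures come straight from the paragraph; the variable and function names are only illustrative, and the calculation assumes Sanger's pace would have stayed constant.

```python
# Figures quoted in the text above.
VIRUS_LETTERS = 5_000                 # letters in the virus genome
YEARS_FOR_VIRUS = 4                   # years Sanger needed to decode it
HUMAN_GENOME_LETTERS = 3_000_000_000  # letters in the human genome


def years_to_sequence(letters, letters_per_year):
    """Years needed to read `letters` at a constant yearly rate."""
    return letters / letters_per_year


# Sanger's implied pace, assuming it never changed.
sanger_rate = VIRUS_LETTERS / YEARS_FOR_VIRUS
years_needed = years_to_sequence(HUMAN_GENOME_LETTERS, sanger_rate)

print(f"{sanger_rate:,.0f} letters per year")
print(f"{years_needed:,.0f} years for the human genome")
```

At 1,250 letters per year the human genome works out to 2,400,000 years, which is the multi-million-year estimate given above.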

Jim Watson, one of the original discoverers of the structure of DNA, heard about Sanger mapping the genome of the virus. He invited four hundred scientists to his labs on Long Island, NY, and they debated the idea of mapping the entire human genome. The group determined that it would take approximately $3 billion to achieve this goal. While this may seem to be an outrageously high amount of money, it really isn’t when compared to the hundreds of billions of dollars that were spent to split the atom or send a man to the moon. Watson went to Congress and asked for the $3 billion, and Congress agreed to finance the project. Then 100 anonymous donors gave blood, and the Human Genome Project got underway in 1990. Sixteen labs all over the world helped in the sequencing. The DNA was divided among the labs by using the repeating sequences that are used in DNA fingerprinting as dividing points. An institute in England was built and named after Fred Sanger, and it was given responsibility for decoding a large share of the genome.

After eight years, only a third of the genome had been sequenced. Craig Venter, who was involved in the project, came up with a new approach to sequencing the DNA that he believed would be faster and cheaper. His idea was a process known as whole-genome shotgun sequencing. In this process the strands of DNA are blasted into millions of smaller sections. The individual sections are then sequenced, and the information is fed into a supercomputer. The supercomputer looks for overlapping sections of the DNA that match each other and reassembles them accordingly. This idea, however, was rejected by the public project. The rejection prompted Venter to seek private funding, and on May 10, 1998, he announced that he was launching a privately funded project.
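The overlap-and-reassemble step described above can be illustrated with a toy Python sketch. This is not Celera's actual software, which had to cope with sequencing errors, repeats, and tens of millions of fragments. It is a minimal greedy merge under the assumption of error-free fragments, and the names `overlap` and `greedy_assemble` are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if the longest such overlap is shorter than `min_len`.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(fragments)
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps; leave the pieces separate
        # Join the two fragments, keeping the shared letters only once.
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Three overlapping, shuffled fragments of a made-up 15-letter sequence.
reads = ["CGTTAGC", "ATGGCGTA", "CGTACGTT"]
contigs = greedy_assemble(reads)
print(contigs)  # ['ATGGCGTACGTTAGC']
```

The sketch compares every fragment with every other fragment, which is fine for three pieces but hopeless for millions. That cost, together with the trouble caused by long repeated stretches of DNA, is part of why the real task needed a supercomputer.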

Venter co-founded Celera Genomics with a company that produces robotic DNA sequencers. Celera’s plan was to sequence the human genome much more quickly than the public project and sell its data to drug companies and universities. This private project met with a great deal of hostility from those working on the public project. They saw it as an attempt to undercut their project and to privatize something that should belong to all human beings. Jim Watson knew that if the public project was going to keep up with the new private threat, it would need the robotic DNA sequencing machines. The problem was that no one knew whether Celera’s parent company, which manufactured the machines, would even sell them to the public project. Nevertheless, Watson went back to Congress and secured $80 million more. The company agreed to sell the machines and even went a step further, saying it would allocate at least half of its production to the public project so that it wouldn’t fall behind. The public project needed hundreds of machines, and each machine cost $300,000, so it was really a win-win situation for Celera’s parent company. With the new machines in use, labs could churn out four million letters a day, something that would have taken Fred Sanger more than three thousand years.

As with other great achievements of the 20th century, such as splitting the atom and sending a man to the moon, a race began to see who could be the first to map the genome. Celera’s entry into the race lit a fire under the public project, and since the public project released its data free on the internet every night, Celera’s plans to sell the information slowly began to fall apart. However, Celera pumped out as much information every day as it could and rushed to take out gene patents. When Celera announced that it was going to try to patent 6,500 genes, DNA became Wall Street’s hottest commodity. Celera stock went from $8.00 a share to over $500.00 a share. Speculation that Celera was going to try to patent the entire genome prompted President Clinton to announce that the human genome belonged to every member of the human race and therefore could not be patented. Although he still left room for individual genes to be patented, the announcement sent the markets into freefall and sparked the Nasdaq’s second-biggest crash in history.

President Clinton called in a mutual friend of both sides to try to get them to work together and stop competing. The two sides agreed, and a deadline of six weeks was set to announce a draft of the human genome. The only problem was that neither side had the sequenced DNA arranged in the order it was supposed to be in. The public project called on a computer programmer named Jim Kent to create a program that could assemble the nearly 400,000 pieces of DNA from all around the globe. Celera had a similar task, but since it used the whole-genome shotgun process, it had forty million pieces instead of 400,000. It would take twenty thousand CPU hours to assemble the information, and to do that Celera used the world’s largest civilian supercomputer. Ironically, Celera had to use the public information that was on the internet as a map to know accurately where to place long repeating patterns in the DNA. The announcement was made as planned on June 26, 2000, and Francis Collins, who was the head of the public project, ended his speech by saying, “I am happy that today the only race we are talking about is the human race.” The draft contained 90% of the complete genome.

In April 2003, a finalized version of the complete human genome was published by the public project and Celera simultaneously. It was determined that there are around 31,000 genes contained in the code. The genome is really like a story, with little bits about how the human body functions scattered throughout it. Some parts are very busy, with 10 or 15 genes right in a row, while other parts are immense blank areas known as deserts. These so-called deserts are thought to contain the genetic debris of primitive life-forms from times before humans became humans. This debris is something like our genetic history. Humans also share many genes with plants and even bacteria, and while humans may have only 300 more genes than a mouse, our genes combine in ways that produce extraordinary results. For example, as few as thirty-three genes can combine in different ways to give each of the 6 billion people in the world their own unique facial appearance. Any two humans also have about 99.9% of the genome in common; the difference is only about one letter in every 1,200.
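To put the last figure in perspective, a short Python calculation shows how many letters a difference of one in every 1,200 amounts to across the 3 billion letters quoted earlier. The variable names are only illustrative, and the result is a rough estimate based on the essay's own numbers.

```python
# Figures quoted in the text.
GENOME_LETTERS = 3_000_000_000   # letters in the human genome
LETTERS_PER_DIFFERENCE = 1_200   # one differing letter in every 1,200

# Number of letters expected to differ between two people.
differing_letters = GENOME_LETTERS / LETTERS_PER_DIFFERENCE

# Fraction of the genome that two people share.
shared_fraction = 1 - 1 / LETTERS_PER_DIFFERENCE

print(f"{differing_letters:,.0f} differing letters")
print(f"{shared_fraction:.1%} of the genome shared")
```

The result is about 2.5 million differing letters, which still rounds to 99.9% of the genome in common.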

The future implications of the Human Genome Project are yet to be seen. Whether it is used to cure cancer or to combat inherited diseases, the human genome has the potential to provide great leaps forward in the medical field. There is no doubt that biological science will be revolutionized by the use of the sequenced genome, and that the human race is better off for knowing the code that makes us who we are.