Human Genetic Variation
National Human Genome Research Institute Home

Activity 2 - The Meaning of Genetic Variation

At a Glance

Focus: Students investigate variation in the beta globin gene by identifying base changes that do and do not alter function, and by using several Internet-based resources to consider the significance in different environments of the base change associated with sickle cell disease.

Major Concepts: The ultimate source of genetic variation is differences in DNA sequences. Most of those genetic differences do not affect how individuals function. Some genetic variation, however, is associated with disease, and some improves the ability of the species to survive changes in the environment. Genetic variation, therefore, is the basis for evolution by natural selection.

Objectives: After completing this activity, students will

Prerequisite Knowledge: Students should understand basic Mendelian patterns of inheritance, especially autosomal-recessive inheritance; the basic structure of DNA; the transcription of DNA to messenger RNA; and the translation of messenger RNA to protein.

Basic Science-Health Connection: Although the idea is made explicit only in annotations to teachers, this activity illustrates how advances in science and technology have allowed us to establish relationships between some genetic variations and particular phenotypes. For example, our understanding of the relationship between DNA and protein has allowed us to establish a relationship between a change in a single base pair and the symptomology of sickle cell disease. Similarly, our understanding of the basic biochemical mechanisms underlying the symptoms associated with sickle cell disease has provided important clues about possible strategies for clinical intervention. You may wish to make some of these points with your students as they complete the activity.


As discussed in Understanding Human Genetic Variation, there is considerable variation between the genomes of any two individuals, but only a small amount of that variation appears to have any significant biological impact, that is, produces differences in function. The Human Genome Project will continue to illuminate the extent of human genetic variation as well as the variations that have biological significance.

This activity uses an examination of variation in a 1,691-base segment of the beta globin gene to help students consider the extent of human genetic variation at the molecular level and the relationships between genetic variation and disease and between genetic variation and evolution.

Materials and Preparation

You will need to prepare the following materials before conducting this activity*:

*Day 1, Step 12 describes an optional laboratory exercise that you may wish to conduct to enrich your students' understanding of molecular variation and the methods by which it can be identified and studied. Information about the materials required is provided at the end of this module.

Note to teachers: If you do not have enough computers equipped with Internet access to conduct this activity, you can use the print-based alternative.



1. Introduce the activity by asking students to identify the ultimate source of the variation they investigated in Activity 1.

Students should recognize that the ultimate source of genetic variation is differences in DNA sequences.

2. Explain that in this activity, students will investigate human genetic variation at a molecular level and examine the impact of that variation on biological function. Distribute one copy of Master 2.1, How Much Variation? Beta Globin Gene—Person A, and Master 2.2, How Much Variation? Beta Globin Gene—Person B, to each student. Explain that the sequences on these pages come from the beta globin gene in two different people.

Hemoglobin, the oxygen carrier in blood, is composed of four polypeptide chains: two alpha polypeptide chains and two beta polypeptide chains. The beta globin gene encodes the amino acid sequence for the beta chain. The masters for Person A and Person B each show 1,691 nucleotides from the "sense" strand of the gene (that is, the strand that does not serve as the template for transcription and thus has the same base sequence as the messenger RNA, with Ts substituted for Us). Both the sense strand of DNA and the messenger RNA are complementary to the DNA strand that serves as the template for transcription. We recommend that you remind students that DNA is double-stranded, even though only one strand is shown here. Explain that geneticists use "shortcuts" like this because, given the sequence of one DNA strand, they can infer the sequence of the complementary strand.
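If it helps to make the "shortcut" concrete, the inference from one strand to the other can be sketched in a few lines of Python. This is only an illustration: the function names and the nine-base fragment are assumptions made for the example and are not taken from the masters.

```python
# Map each DNA base to its pairing partner on the opposite strand.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def template_strand(sense: str) -> str:
    """Return the template strand read 5' to 3' (the reverse complement)."""
    return "".join(COMPLEMENT[base] for base in reversed(sense))


def messenger_rna(sense: str) -> str:
    """Return the mRNA: the sense-strand sequence with U in place of T."""
    return sense.replace("T", "U")


# Hypothetical fragment chosen only for illustration.
sense = "ATGGTGCAC"
print(template_strand(sense))  # GTGCACCAT
print(messenger_rna(sense))    # AUGGUGCAC
```

The template strand is printed in the conventional 5' to 3' direction, which is why the complemented bases appear in reverse order.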

The beta globin gene is one of the smallest human genes that encodes a protein; the entire gene has only about 1,700 nucleotide pairs and includes just two introns. The sequences on Beta Globin Gene—Person A and Beta Globin Gene—Person B do not show the gene's promoter regions, but begin with the first sequences that are translated.

3. Ask the students to read the information at the top of each page and then estimate the total number of bases on each page. Direct students to write their estimate at the top of the page.

The total number of bases on each page is 1,691. Students will need this number to complete their calculations in Step 6.

4. Remind the students that the sequences on the masters come from the beta globin gene in two different people. Ask the students what they notice when they compare the sequence from person A with the sequence from person B.

Students should notice that most of the sequence appears to be exactly the same in both people. They also should notice that the bases that are in bold are different. If necessary, point out that these bases are at the same positions in each gene (that is, be sure that students realize that only these two bases, located in these specific positions, are different in the sequences from person A and person B).
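For classes with some programming experience, the side-by-side comparison students make by eye can also be sketched as a position-by-position check. The sequences below are short hypothetical stand-ins, not the sequences printed on the masters, and the function name is an assumption for this example.

```python
def differing_positions(seq_a: str, seq_b: str) -> list[int]:
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return [
        position
        for position, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


# Hypothetical 21-base sequences used only to illustrate the comparison.
person_a = "ATGGTGCACCTGACTCCTGAG"
person_b = "ATGGTGCACCTGACTCCTGTG"
print(differing_positions(person_a, person_b))  # [20]
```

Because the check compares bases at the same index, it reflects the point made above: a difference only counts when the two bases sit at the same position in each gene.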

5. Point out that this sequence is only 1,691 bases long and the complete human genome is approximately 3 billion bases long. Ask the students how they might use the sequences for person A and person B and the total size of the human genome to estimate the extent of variation (the number of bases that differ) between person A and person B. Ask as well what assumption they would be making as they arrived at their estimate.

Students could estimate the extent of variation across the entire genome by calculating the percentage of difference between the two sequences shown for person A and person B, and then multiplying this percentage by 3 billion (the approximate number of bases in the human genome). This estimate assumes that the sequence shown displays a typical amount of variation.

6. Distribute one copy of Master 2.3, How Much Variation? Doing the Math, to each student and direct the students to use the master as a guide to estimate this value.

If your students need help completing this estimate, suggest that they first try the example at the bottom of the master.

The proportion of sequence difference between person A and person B is 2/1,691 = 0.001 (rounded off). To make this more concrete for your students, note that this means that about 1 base in every 1,000 is different. The percentage difference is 0.001 × 100 = 0.1 percent.

The total number of base differences would be 3,000,000,000 × 0.001 = 3,000,000 or, in scientific notation, $3 \times 10^{9} \times 10^{-3} = 3 \times 10^{6}$. That is, we could expect to find about 3 million base differences in DNA sequence between any two people.

Note that the actual number of base differences between two people is likely somewhat higher than this because this estimate, based as it is on the approximate size of the human genome (one copy of each of the autosomes, plus the X, Y, and mitochondrial chromosome), does not take into consideration the fact that humans are diploid.
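The arithmetic in this step can be checked with a short Python sketch. The function name and the choice to round the proportion to three decimal places (matching the rounding used above) are assumptions made for this example.

```python
def estimate_differences(differences, segment_length, genome_size, digits=3):
    """Extrapolate a difference count from a short segment to a whole genome.

    Assumes the segment shows a typical amount of variation.
    Returns (rounded proportion, estimated number of differing bases).
    """
    proportion = round(differences / segment_length, digits)
    return proportion, proportion * genome_size


proportion, estimate = estimate_differences(
    differences=2,              # bases in bold on the masters
    segment_length=1_691,       # bases shown on each page
    genome_size=3_000_000_000,  # approximate size of one copy of the genome
)
print(proportion)                         # 0.001
print(f"{proportion * 100:.1f} percent")  # 0.1 percent
print(f"{estimate:,.0f}")                 # 3,000,000
```

Without the rounding step, 2/1,691 applied to 3 billion bases gives roughly 3.5 million, so the figure is best treated as an order-of-magnitude estimate.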

7. Ask students what their estimates indicate about the extent of human genetic variation at the molecular level.

Students should recognize that at the molecular level, humans are far more alike (about 99.9 percent of the bases are the same) than they are different (only about 0.1 percent of the bases are different). Students should also realize, however, that even a small percentage difference can represent a very large actual number of differences in something as large as the human genome.

If students have difficulty reaching these conclusions, help them by asking questions such as, "Based on this comparison, do you think that at the molecular level, people are more alike than they are different or vice versa?" and "How can a difference of only 0.1 percent (1 in 1,000) result in such a large number of differences (3 million differences)?"

8. Explain that the rest of the activity focuses on this 0.1 percent difference between people. Ask students questions such as, "Do you think these differences matter? What effect do you think they have? What might affect how much a specific difference matters?"

These questions focus students' attention on the significance of the differences, instead of the number of differences. Remind students of the differences among people that they observed in Activity 1 and point out that most of these differences have their basis in a difference in the DNA sequence of particular genes (although some, such as pierced versus nonpierced body parts, probably do not). To help them understand the magnitude of the number of differences between their DNA and that of another person, ask students if they think there are 3 million differences in appearance and biological functions between themselves and the person sitting next to them.

9. Explain that studying the beta globin gene more closely will help students begin to answer these questions for themselves. Direct students to examine the sequences on Beta Globin Gene—Person A and Beta Globin Gene—Person B again. Explain that the regions that show bases grouped in triplets are from the coding regions ("exons") of the gene, while the other regions are from the noncoding regions ("introns"). Then ask students which of the two base differences in bold is more likely to matter, and why.

Most eukaryotic genes are composed of both coding and noncoding regions, which are transcribed into an initial messenger RNA. The noncoding introns are then spliced out of the RNA; other processing steps ultimately result in the mature messenger RNA that is translated into protein. Students should realize that the second base difference occurs in a noncoding region of the gene and is unlikely to have an impact on individuals. The first difference occurs in a coding region and is more likely to matter.
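The reasoning students apply here, checking whether a difference falls inside a coding region, can be sketched as a simple range lookup. The exon coordinates and the two example positions below are hypothetical values invented for illustration; they are not the real boundaries of the beta globin gene or the positions of the bases in bold on the masters.

```python
# Hypothetical exon boundaries (1-based, inclusive). Illustration only.
EXONS = [(1, 90), (220, 440), (1290, 1420)]


def region_of(position: int, exons=EXONS) -> str:
    """Label a position as falling in an exon or in an intron."""
    for start, end in exons:
        if start <= position <= end:
            return "exon (coding)"
    return "intron (noncoding)"


# Two hypothetical positions of base differences.
for position in (20, 700):
    print(position, region_of(position))
# 20 exon (coding)
# 700 intron (noncoding)
```

A lookup like this only sorts differences into the two categories; it says nothing about whether a particular coding difference actually changes the protein.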

A major concept that students should understand from Day 1 of the activity is that most genetic differences do not affect how individuals function.

10. Explain that although 3 million base differences sounds like a lot, most of these differences have no significant impact on individuals, either because they occur in a noncoding region or for another reason. Point out that most of these 3 million differences can only be detected by examining the DNA sequence.

Students should now understand that while some base differences occur in coding regions and may result in an altered amino acid sequence in the protein coded for by a gene, others occur in noncoding regions where they likely have no impact. Point out that only a small percentage of the DNA sequences in the human genome are coding sequences. Furthermore, only a small percentage of the noncoding DNA sequences are regulatory sequences such as promoters or enhancers that can influence the amount of gene product that results from a given gene. The remaining DNA sequences (the majority of the total DNA sequences in the genome) have no known function. Most of the variations in DNA sequence occur in these latter sequences and have no detectable impact.*

If you wish to offer your students a more sophisticated understanding of why most DNA sequence differences have no impact, extend the discussion to include the following ideas. Even many of the differences that occur in coding regions have no impact. Only those differences that result in a change in amino acid sequence in a critical region of the protein (one that affects the function of the protein), or that result in a premature stop codon in the RNA (and thus a truncated protein), have a significant impact on the individual carrying that variation. As students will see on Day 2, those few differences that do affect individuals often have devastating consequences.
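The three outcomes described above (no amino acid change, a changed amino acid, or a premature stop codon) can be sketched with a small lookup. The table is deliberately partial and contains only the four codons used in the example. The labels "silent," "missense," and "nonsense" are the standard genetics terms for these outcomes, and the function name is an assumption for this sketch.

```python
# Partial codon table for the DNA sense strand; only the codons used below.
CODON_TABLE = {
    "GAG": "Glu",
    "GAA": "Glu",
    "GTG": "Val",
    "TAG": "Stop",
}


def classify_change(original: str, variant: str) -> str:
    """Classify a single-codon change by its effect on the encoded protein."""
    before = CODON_TABLE[original]
    after = CODON_TABLE[variant]
    if before == after:
        return "silent (same amino acid)"
    if after == "Stop":
        return "nonsense (premature stop codon)"
    return "missense (different amino acid)"


print(classify_change("GAG", "GAA"))  # silent (same amino acid)
print(classify_change("GAG", "GTG"))  # missense (different amino acid)
print(classify_change("GAG", "TAG"))  # nonsense (premature stop codon)
```

As the discussion above notes, a changed amino acid matters only when it falls in a region critical to the protein's function, which a codon lookup alone cannot determine.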

*You may wish to clarify for students the reason that most molecular variation occurs in noncoding regions. It is true that there are more noncoding than coding regions. However, the fundamental biological reason for the increased variability of noncoding regions is that there is no selective pressure exerted on changes in these silent/nonfunctional regions. You also may wish to point out that some differences that occur in noncoding regions do have an impact. For example, several mutations within introns in the beta globin gene cause incorrect splicing of the messenger RNA, and as a result, several codons may be inserted into or omitted from the sequence, leading to nonfunctional beta globin polypeptides.
