Reflections. Tales from the Genome. Lessons 9-10

In [2]:
import webbrowser
from IPython.display import Image

Glossary

Activator: transcription factors that bind to an enhancer and increase transcriptional activity

Adenine (A): a type of nitrogenous base that is typically used in both DNA and RNA (A basepairs with U or T)

Allele: Specific variant of a genetic sequence for which more than one variation exists, sometimes associated with a unique phenotype

Allele Frequency: the frequency of an allele is equal to the number of that allele divided by the total number of alleles in a given population

Amino Acid: the basic building blocks of proteins, combined by ribosomes during the process of translation; there are 20 different amino acids

Aneuploidy: a chromosomal aberration in which certain chromosomes are present in extra copies or are deficient in number

Autosome: a nuclear chromosome that is not a sex chromosome (X or Y)

Basepair: the phenomenon of nitrogenous bases in nucleic acid pairing with one another in double-stranded DNA or RNA, following the rules A:T, G:C in DNA and A:U, G:C in RNA

Behavioral Trait: any trait that concerns an organism's action or interaction with or within an environment, for example: aggressiveness

Bioethicist: someone who focuses on ethical issues relating to biological topics

Blending Inheritance: the idea that a particular trait in an offspring is a mix of the parents’ traits

Bottleneck: genetic drift resulting from the reduction of a population, typically by a natural disaster, such that the surviving population is no longer genetically representative of the original population

Chromosome: a super-coiled structure of organized DNA wrapped around histones; contains a single molecule of DNA

Coding DNA: sometimes referred to as "protein-coding DNA"; refers to any sequence in the genome that specifies amino acids and translation signals (initiation and termination codons)

Codon: a three-nucleotide sequence of mRNA that specifies a particular amino acid or termination signal

Combinatorial Regulation: the idea that transcription of most genes is controlled by more than one activator or repressor to achieve a particular level of activity

Computational Biologist: someone who applies their knowledge of computer science or computer coding to biological problems

Concordance: the presence of the same trait in both members of a pair of twins or set of individuals

Consensus Sequence: a single sequence that represents the most prevalent individual unit at each position, derived by comparing variants of the sequence from different sources

Correlation: refers to an observable relationship between any paired values

Cytochrome P450: a large and diverse group of enzymes that catalyze the oxidation (metabolism) of organic substances (drugs)

Cytoplasm: the interior of a cell, excluding the nucleus

Cytosine (C): a type of nitrogenous base that is typically used in both DNA and RNA (C basepairs with G)

Deoxynucleotides: the building blocks of DNA; there are four different bases used in deoxynucleotides: Adenine (A), Thymine (T), Guanine (G), and Cytosine (C)

Direct Selection: (see natural selection)

Director Of Clinical Operations: someone who designs and manages clinical trials

Dizygotic Twins: twins that came from two different fertilized eggs or zygotes; fraternal twins

DNA: deoxyribonucleic acid; the hereditary material of almost all cells that makes up their genomes

DNA Amplification: the in vitro replication of a DNA sequence to make many more copies

DNA Extraction: the isolation of DNA from a biological sample

DNA Sequence: a string of DNA letters (bases) in consecutive order

Dominant Trait: can mask the presence of a recessive allele or trait

Double Helix: the structure of DNA, referring to its two adjacent strands wound into a spiral shape and held together through basepairing

Duplicated Chromosome: the stereotypical structure in the shape of an "X" for a nuclear chromosome that appears only before cell division

Duplication: a replication error that doubles a large segment of DNA

Efficacy: a drug's ability to produce a therapeutic effect

Egg: female reproductive cell

Enhancer: a DNA sequence that binds certain transcription factors, activators, that can stimulate transcription of nearby genes

Enzyme: a class of proteins that enable chemical reactions without being consumed by the reaction

Exon: a sequence from a gene that is transcribed and remains in the mRNA after splicing and includes codes for amino acids

Founder Effect: a cause of genetic drift attributable to colonization by a limited number of individuals from a parent population

Frameshift: any mutation that results in changing the reading frame of translation

Gain-Of-Function Mutation: changes the gene product such that it gains a new and/or abnormal function

Gamete: sperm or egg cells; produced from germ cells as a result of meiosis

Gene: a discrete unit of hereditary information consisting of a specific deoxynucleotide sequence in DNA

Gene Expression: the process by which information from a gene is used in the synthesis of a functional gene product

Genetic Counselor: someone who explains and discusses personal genetic information with individuals and families

Genetic Genealogist: someone who uses genetic information to determine and catalog family relationships and uncover ancestry

Genome: the complete complement of an organism's genetic material

Genome Wide Association Study (GWAS): a study that looks, across a population, for associations between specific alleles and the trait or disorder being studied

Genotype: the genetic makeup of an organism, specifically the composition of alleles

Germ Cell: the type of cell in the body that makes gametes; this is the only cell type where mutations affect the next generation of an organism

Guanine (G): a type of nitrogenous base that is typically used in both DNA and RNA (G basepairs with C)

Hidden Trait: any trait that is not apparent through outward observation

Histone: a protein that is used to organize and fold DNA like a string on a spool

Heterozygous: having two different alleles for a given gene

Heterozygote Advantage: describes the case in which the heterozygous genotype has a higher relative fitness than either the homozygous dominant or homozygous recessive genotype

Heritability: the proportion, between 0 and 1, of observable differences in variation of a trait between individuals within a population that is due to genetic differences

Homozygous: having two identical alleles for a given gene

Human Geneticist: someone who studies human genetics and inheritance

Hybridization: the process of basepairing that can occur between any types of nucleic acids (DNA:DNA, DNA:RNA, or RNA:RNA)

Identity By Descent (IBD): genetic sequence shared through ancestry

Identity By State (IBS): genetic sequence that is identical between two individuals

Inheritance: the passing down of traits from one generation to the next, at the level of the cell or the organism

Innate Trait: any trait that is in-born, for example: your pancreas secreting enzymes that break down the food in your gut

Intron: a sequence from a gene that is transcribed but cut out of the mRNA by splicing and typically does not code for any amino acids

Learned Trait: any trait that is not in-born and instead acquired through environmental (typically cognitive) influence, for example: belief in a particular religion

Loss-Of-Function Mutation: results in the gene product having less or no function

Meiosis: a two-stage type of cell division in germ cells that results in gametes with half the chromosome number of the original cell

Mitochondrial DNA: a circular DNA molecule that can only be found in the mitochondria of all cells in the body and is inherited only from the mother

Mitosis: the normal chromosome doubling and division that all somatic (body) cells do to maintain the same number of chromosomes at the end of each division

Missense/Non-Synonymous Mutation: a mutation that changes an amino acid

Monogenic Trait: traits that are significantly influenced by a single gene

Monozygotic Twins: twins that came from the same fertilized egg or zygote; identical twins

Multifactorial Trait: a trait that is controlled by many genes and is also influenced by the environment

Mutation: a change in the genetic sequence

Natural Selection: the process by which traits become either more or less common in a population because of pressures directly affecting the reproductive fitness of individuals carrying particular alleles; also considered "direct selection" due to pressures that directly affect the fitness of particular alleles

Non-coding DNA: any sequence that does not specify amino acids and translation signals (initiation and termination codons)

Non-Duplicated Chromosome: a non-"X"-shaped nuclear chromosome

Nonsense Mutation: a mutation that changes an amino acid codon to a STOP codon

Nuclear Genome: the complete set of 23 pairs of chromosomes that reside within the nucleus of the cell

Nucleotide: building block of DNA and/or RNA consisting of a base, a ribose or deoxyribose sugar and a phosphate group; there are five different bases used in nucleotides: Adenine (A), Thymine (T) in DNA or Uracil (U) in RNA, Guanine (G), and Cytosine (C)

Nucleus: a separate, membrane-bound compartment of eukaryotic cells that houses the DNA and separates it from the rest of the cell; this is where transcription occurs

Particulate Inheritance: the idea that characteristics can be passed down from generation to generation through discrete particles, i.e. genes

Pedigree: organized way of illustrating (drawing) family relations and traits

Penetrance: the degree to which a particular allele causes a trait

Personal Genome: the entirety of your own individual DNA

Pharmacodynamics: the target effects of a drug; what a drug does to the body

Pharmacogenetics/Pharmacogenomics: the study of how different drugs interact with the body in different ways based on genetic variation

Pharmacokinetics: how a drug is metabolized; what the body does to a drug

Pharmacology: the study of drugs and their origins, as well as how they interact with the body of a living organism

PharmGKB: Pharmacogenomics Knowledge Base is an interactive tool for researchers to investigate how genetic variation affects drug response, both with regards to pharmacodynamics and pharmacokinetics

Phenotype: the physical makeup, or appearance, of an organism or individual trait

Physical Trait: any trait that concerns our material makeup

Polygenic Trait: a trait that is the result of multiple gene interactions with very little environmental impact

Population: a group of individuals of one species that live in a particularly defined area

Promoter: a region of DNA at which transcription of a particular gene is initiated

Protein: a large chain or combination of multiple chains of amino acids

Qualitative Trait: a trait that is described by either its presence or absence

Quantitative Trait: a trait that varies continuously over a range of measurements and displays a normal distribution (bell curve)

Random Selection: the process by which traits become either more or less common in a population due to random chance, not because of pressures directly affecting the reproductive fitness of particular alleles; it is random because the pressures do not directly affect the fitness of particular alleles

Recessive Trait: masked by the presence of a dominant allele or trait

Recombination: a special process during meiosis that can swap pieces of your maternal and paternal chromosome copies

Relative Risk: an individual's risk based on family or genetic background compared to the general population

Repressor: transcription factors that bind to a silencer and inhibit transcriptional activity

Ribosome: a molecular machine that translates, or reads, the genetic code within the mRNA sequence and synthesizes a corresponding chain of amino acids

RNA and mRNA: ribonucleic acid; a type of nucleic acid, usually single-stranded, consisting of nucleotides with the nitrogenous bases A, C, G, and U; mRNA (messenger RNA) is the RNA copy of a gene that ribosomes translate

RSID: reference SNP cluster ID; the identifier (for example, rs12913832) that the dbSNP database assigns to a variant

Sex Chromosome: a nuclear chromosome that distinguishes the sexes: XY - male, XX - female, and can affect sex-specific traits

Silencer: a DNA sequence that binds transcription regulation factors called repressors, which inhibit transcriptional activity

Silent/Synonymous Mutation: a mutation that does not change the amino acid sequence

Single Nucleotide Variation (SNV): single base change in DNA; SNVs (also known as single nucleotide polymorphisms, or SNPs) are one of the smallest kinds of mutations and are responsible for a large number of differences among humans

SNV Genotyping: determining the base at any given position in a genome, not through total genome sequencing

Somatic Cell: any cell of the body that is not a germ cell (not directly responsible for carrying the information passed on to the next generation)

Splicing: the process of removing introns and combining exons in a mRNA sequence after transcription

TATA-Box: the TATAAA sequence that can be found in the promoter of many genes and is essential to initiate transcription

Thymine (T): a type of nitrogenous base that is typically used in DNA (T basepairs with A)

Toxicity: the degree to which a drug causes negative effects

Trait: any distinguishing feature of an individual

Transcription: the step of gene expression in which a particular segment of DNA is copied into RNA

Transcription Factor: any protein that joins the transcription process by binding to DNA or to other proteins that bind DNA to regulate transcription

Translation: the process by which ribosomes read the mRNA sequence and connect the amino acids in the order specified by the sequence

Translocation: rearrangement of a large sequence of genetic information, typically transferring from one chromosome to another through DNA breakage and resealing

Uracil (U): a type of nitrogenous base that is typically used in RNA (U basepairs with A)

Variant: version of a genetic sequence or gene for which more than one version exists

Variation: diversity among members of a population

Visible Trait: any trait that is apparent through outward observation

X Chromosome: one of two mammalian sex chromosomes that can be found in both males (XY) and females (XX)

Y Chromosome: the mammalian sex-determining chromosome that can only be found in males (XY) and is passed from father to son
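
The Allele Frequency entry above is a counting rule, so it can be sketched in a few lines. This is a minimal illustration only: the function name `allele_frequencies` and the five-person sample are hypothetical, and each genotype is assumed to be a two-letter string for a diploid individual.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return {allele: frequency} for diploid genotypes such as 'CT'."""
    # Every genotype contributes two alleles to the population total.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

# Hypothetical sample: 5 individuals, so 10 alleles in total (6 C and 4 T).
sample = ["CC", "CT", "TT", "CT", "CC"]
freqs = allele_frequencies(sample)
print(freqs)
```

The frequencies always sum to 1 because every allele in the sample is counted exactly once.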

Lesson 9

In [3]:
Image(filename='cm09.jpg')
Out[3]:

Most human genetic traits can be classified as either monogenic or complex. Monogenic traits are strongly influenced by variation within a single gene and are recognized by their classic patterns of inheritance within families. While monogenic traits formed the basis for "classic" genetics, it has become clear that conditions whose inheritance strictly conforms to Mendelian principles are relatively rare.

Complex traits are believed to result from variation within multiple genes and their interaction with behavioral and environmental factors. Complex traits do not follow readily predictable patterns of inheritance.

This distinction between monogenic and complex traits, while useful, can be overly simplistic. Traits that appear to be monogenic can be influenced by variation in multiple genes ("modifier genes"); complex traits can be predominantly influenced by variation in a single gene.

Penetrance

Penetrance is a term that describes the proportion of individuals with a particular genotype who display the phenotype expected of that genotype. If only 80% of the individuals with a genotype show the expected phenotype, then the gene is said to be 80% penetrant (most genes are fully penetrant).

Penetrance is distinct from expressivity; the penetrance of a gene refers only to whether it is expressed phenotypically or not, while expressivity refers to the degree of expression (so an allele may have high penetrance in a population, but varying levels of expressivity between individuals). Penetrance is often difficult to determine because the expression of an allele often depends heavily on epigenetic or environmental factors, such as smoking, drinking and nutrition.
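
The 80% figure is a simple ratio, which the following sketch makes explicit. The function name `penetrance` and the counts are illustrative assumptions, not data from the course.

```python
def penetrance(n_with_phenotype, n_with_genotype):
    """Fraction of genotype carriers who show the expected phenotype."""
    if n_with_genotype <= 0:
        raise ValueError("need at least one individual with the genotype")
    return n_with_phenotype / n_with_genotype

# Hypothetical cohort: 100 carriers of a genotype, 80 of whom show the trait.
p = penetrance(80, 100)
print(f"{p:.0%} penetrant")
```

Expressivity would need a different measurement (how strongly each affected carrier shows the trait), so it is deliberately left out of this ratio.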

Qualitative Traits

A qualitative trait is a trait that fits into discrete categories. This means that you can neatly categorize the trait. For example, if a species of plant had either red leaves or yellow leaves, and nothing in between, this would be a discrete trait. "Yes or no" traits, where an organism either has the trait or doesn't, also fit into this category. Usually, a single gene or a small group of genes controls a qualitative trait.

Quantitative Traits

Quantitative traits occur as a continuous range of variation rather than in natural categories. To picture this, imagine the length of a lizard's tail: it can take any value within a range. Generally, a larger group of genes controls a quantitative trait. When multiple genes influence a trait, you can also describe it as a "polygenic trait."
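
One way to see why many genes produce a continuous-looking trait is a toy additive model. The sketch below assumes each gene has two alleles ("plus" and "neutral") at frequency 0.5, that every plus allele adds the same amount to the trait, and that genes are independent with no environmental effect. None of these assumptions comes from the course, and `additive_trait_distribution` is a hypothetical name.

```python
from math import comb

def additive_trait_distribution(n_genes, p=0.5):
    """Probability of carrying k 'plus' alleles, for k = 0 .. 2 * n_genes."""
    n_alleles = 2 * n_genes  # diploid: two alleles per gene
    return [
        comb(n_alleles, k) * p**k * (1 - p) ** (n_alleles - k)
        for k in range(n_alleles + 1)
    ]

one_gene = additive_trait_distribution(1)    # 3 phenotype classes
ten_genes = additive_trait_distribution(10)  # 21 classes, roughly bell-shaped

for k, prob in enumerate(ten_genes):
    print(f"{k:2d} {'#' * round(prob * 200)}")
```

With one gene there are only three classes (a qualitative-looking trait); with ten genes the classes are so numerous that the histogram approximates a bell curve.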

Genetic determination of eye color

It was originally thought that eye color was a simple Mendelian trait, meaning it was determined by a single gene, with brown being dominant and blue recessive. It is now clear that eye color is a polygenic trait, meaning it is determined by multiple genes. Among the genes that affect eye color, OCA2 and HERC2 stand out. Both are located on human chromosome 15. The OCA2 gene produces a cell membrane transporter of tyrosine, a precursor of melanin. Mutations in OCA2 result in oculocutaneous albinism, a condition associated with vision problems such as reduced sharpness and increased sensitivity to light. HERC2 regulates the OCA2 gene's expression. In the European population, a common polymorphism in the HERC2 gene is responsible for the blue eye phenotype. A person who has two copies of the C allele at HERC2 rs12913832 will likely have blue eyes, while the homozygous TT genotype predicts brown eyes.

In [4]:
#### rs12913832
#### Likelihood of eye color for people of European descent
#### Genotype   Brown   Green   Blue
#### TT         85%     14%     1%
#### TC         56%     37%     7%
#### CC         1%      27%     72%
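
The table above can be turned into a small lookup. This is only a sketch: the percentages are copied from the table, while the names `EYE_COLOR_ODDS` and `most_likely_eye_color` are hypothetical and the lookup ignores every other gene that influences eye color.

```python
# Probabilities for rs12913832 genotypes, copied from the table above
# (people of European descent).
EYE_COLOR_ODDS = {
    "TT": {"brown": 0.85, "green": 0.14, "blue": 0.01},
    "TC": {"brown": 0.56, "green": 0.37, "blue": 0.07},
    "CC": {"brown": 0.01, "green": 0.27, "blue": 0.72},
}

def most_likely_eye_color(genotype):
    """Return the most probable eye color for a two-letter genotype."""
    # Allele order carries no meaning, so 'CT' and 'TC' share one key.
    key = "".join(sorted(genotype.upper(), reverse=True))
    odds = EYE_COLOR_ODDS[key]
    return max(odds, key=odds.get)

for g in ("TT", "CT", "CC"):
    print(g, most_likely_eye_color(g))
```

Returning only the top color hides the uncertainty; for a heterozygote, brown is the most likely outcome but still far from certain at 56%.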

If a trait is shown to have some heritability in a population, then it is possible to quantify the degree of heritability.

If genotypes are not distributed randomly across environments, there will be some covariance between genotype and environmental values, and the covariance will be hidden in the genetic and environmental variances.

The degree of heritability can be defined as the part of the total variance that is due to genetic variance:

In [4]:
Image(filename='bio056.jpg')
Out[4]:

H², so defined, is called the broad heritability of the character.
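
As a numerical illustration, the sketch below assumes the usual form of this definition: genetic variance divided by total phenotypic variance, with the total taken as genetic plus environmental variance (that is, no genotype-environment covariance). The function name and the variance values are hypothetical.

```python
def broad_heritability(genetic_variance, environmental_variance):
    """H^2 = genetic variance / (genetic + environmental variance).

    Assumes zero covariance between genotype and environment.
    """
    phenotypic_variance = genetic_variance + environmental_variance
    if phenotypic_variance <= 0:
        raise ValueError("total phenotypic variance must be positive")
    return genetic_variance / phenotypic_variance

# Hypothetical variance components, in squared trait units.
h2 = broad_heritability(genetic_variance=6.0, environmental_variance=4.0)
print(f"H^2 = {h2:.2f}")
```

If genotypes are not spread randomly across environments, the covariance term is non-zero and this simple sum no longer describes the total variance.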

It must be stressed that this measure of “genetic influence” tells us what part of the population’s variation in phenotype can be assigned to variation in genotype. It does not tell us what parts of an individual’s phenotype can be ascribed to its heredity and to its environment. This latter distinction is not a reasonable one. An individual’s phenotype is a consequence of the interaction between its genes and its sequence of environments. It clearly would be silly to say that you owe 60 inches of your height to genes and 10 inches to environment. All measures of the “importance” of genes are framed in terms of the proportion of variance ascribable to their variation. This approach is a special application of the more general technique of the analysis of variance for apportioning relative weight to contributing causes. The method was, in fact, invented originally to deal with experiments in which different environmental and genetic factors were influencing the growth of plants. (For a sophisticated but accessible treatment of the analysis of variance written for biologists, see R. Sokal and J. Rohlf, Biometry, 3d ed. W. H. Freeman and Company, 1995.)

In general, the heritability of a trait is different in each population and in each set of environments; it cannot be extrapolated from one population and set of environments to another.
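
A toy calculation shows why the estimate does not transfer. The same three genotypes are placed in two hypothetical sets of environments, one nearly uniform and one highly variable. All numbers and names are invented for illustration, and every genotype is assumed to meet every environment equally often, so there is no genotype-environment covariance.

```python
from statistics import pvariance

genotypic_values = [8.0, 10.0, 12.0]  # hypothetical mean trait value per genotype
uniform_env = [-1.0, 1.0]             # small environmental deviations
variable_env = [-4.0, 4.0]            # large environmental deviations

def heritability(genotypic_values, env_deviations):
    """Genetic variance as a fraction of total phenotypic variance."""
    # Phenotype = genotypic value + environmental deviation, all combinations.
    phenotypes = [g + e for g in genotypic_values for e in env_deviations]
    return pvariance(genotypic_values) / pvariance(phenotypes)

h_uniform = heritability(genotypic_values, uniform_env)
h_variable = heritability(genotypic_values, variable_env)
print(f"uniform environments:  {h_uniform:.2f}")
print(f"variable environments: {h_variable:.2f}")
```

The genes are identical in both runs, yet the heritability estimate drops sharply when the environments vary more.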

In [5]:
Image(filename='bio057.jpg')
Out[5]:

Problem set 9

#1

Venous thromboembolism (VTE) involves the formation of a blood clot in a vein deep within the body that can then travel through the circulatory system. Both genes and environment affect people's risk for VTE. Two genes known to be involved in VTE are F5 and F2. In Europeans, one VTE risk allele is the T allele of a SNV in F5, and another is the A allele of a SNV in F2. Below are the genotypes of F5 and F2 for a set of twins, John and James Doe:

Twin        Gene F5   Gene F2
John Doe    CC        GG
James Doe   CT        AG

Are these twins likely to be: A. Monozygotic or B. Dizygotic?
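
The reasoning behind this question can be sketched as a genotype comparison: monozygotic twins come from one zygote and are expected to carry the same genotype at every site, so any mismatch points to dizygotic twins. The function below is a hypothetical helper, and it ignores genotyping errors and rare somatic mutations.

```python
def likely_zygosity(twin_a, twin_b):
    """Compare genotypes gene by gene; any difference suggests dizygotic twins."""
    mismatches = [
        gene for gene in twin_a
        if sorted(twin_a[gene]) != sorted(twin_b[gene])  # allele order is irrelevant
    ]
    verdict = "dizygotic" if mismatches else "consistent with monozygotic"
    return verdict, mismatches

# Genotypes from the problem statement.
john = {"F5": "CC", "F2": "GG"}
james = {"F5": "CT", "F2": "AG"}

verdict, mismatches = likely_zygosity(john, james)
print(verdict, mismatches)
```

Note the asymmetry: a mismatch argues for dizygotic twins, but matching genotypes at only two genes would not prove the twins are monozygotic, since siblings often share genotypes by chance.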

#2

#3

#4

#5

In [ ]:
 

#6

In [ ]:
 

#7

In [ ]:
 

#8

In [ ]:
 

#9

#10

In [ ]:
 

#11

#12

In [ ]:
 

#13

#14

#15

#16

#17

In [ ]:
 

#18

In [ ]:
 

#19

#20

Lesson 10

In [4]:
# Image(filename='/Users/olgabelitskaya/Downloads/cm10.jpg')

Concept map

In [2]:
from IPython.display import HTML  # public import path; IPython.core.display is deprecated for this
HTML('')
Out[2]:

Problem set 10

In [ ]:
 

#1

#2

In [ ]:
 

#3

In [ ]:
 

#4

In [ ]:
 

#5

#6

#7

#8

#9

#10

In [ ]:
 

#11

In [ ]:
 

#12

#13

In [ ]:
 

#14

In [ ]:
 

#15

In [ ]:
 

#16

In [ ]:
 

#17

#18

#19

#20

In [ ]: