Plant DNA C Values


Regardless of the species' level of ploidy, the C-value refers to the quantity of nuclear DNA in the unreplicated gametic nucleus.

Plant genome sizes span an enormous range. The carnivorous herb Genlisea margaretae has one of the smallest known plant genomes, at 0.129 pg per 2C nucleus (a 1C value of about 63 Mbp), whereas the monocot Trillium hagae has a 2C value of 264.9 pg; for reference, Arabidopsis thaliana has a 2C value of 0.321 pg (Bennett and Leitch, 2005; Zonneveld et al., 2005; Greilhuber et al., 2006). The Plant DNA C-values database contains C-value information for more than 5000 plant species.

Genome Size

The total amount of DNA in the unreplicated haploid nucleus, or the size of a plant's genome, has attracted increased attention in recent decades as the biological, evolutionary, and ecological significance of this essential biodiversity attribute has come to light.

This interest is undoubtedly driven in part by the astounding diversity of genome sizes found in land plants and the substantial diversity in some algal clades (the green algal clade Chlorophyta is the most variable, with a 274-fold range).

It is now evident that genome size can affect a variety of dimensions, including gene and genome dynamics, whole-plant dynamics, plant growth strategies, plant community makeup, plant-animal interactions, evolutionary trajectories, and ecosystem dynamics.

How to Access the Plant Genome Size Data Online?

Version 1 of the Plant DNA C-values database, which was made available in 2001, greatly aided in revolutionizing the field by enabling extensive comparative phylogenetic analyses among various plant taxa (e.g. Leitch & Bennett, 2002; Soltis et al., 2003).

Six updates have since been made. The most recent went live in April 2019 (Leitch et al., 2019) and compiles information from 1067 original publications and personal communications.

The Plant DNA C-values Database (release 7.1, April 2019)

Angiosperms make up the great majority of the data, with estimates for 10,770 species; the previous release dates from 2012 (Bennett & Leitch, 2012; Garcia et al., 2014). The database does, however, also provide C-values for all other major land plant groups, including data for 421 gymnosperms, 246 ferns (monilophytes), and 334 bryophytes (209 mosses, 102 liverworts and 23 hornworts).

Moreover, data are available for 445 "algae," which include species from several higher-order lineages in evolution (i.e. Rhodophyta, Chlorophyta, and the streptophyte green algae within Kingdom Plantae, and Phaeophyta and Heterokonta within the Stramenopiles).

The database's new user-friendly interface offers a variety of searching and output options, allowing users to extract and show particular information as needed. For instance, queries can be done utilizing the entire database or only for certain taxonomic levels and lineages (e.g. families, genera).

Paradox of C Values

Genome size is the total amount of DNA in the genome and is commonly expressed in base pairs.

According to the so-called C-value paradox, genome size does not increase in step with the perceived complexity of a species, for example when comparing invertebrates with vertebrates, or "lower" vertebrates with "higher" vertebrates.

This mismatch is also referred to as the "C-value conundrum," where the "C-value" is the amount of DNA in a haploid nucleus. It is commonly attributed to non-coding "junk" DNA, that is, DNA with no known function, although how much of this DNA is truly functionless is still debated.

The "amazing stability in the nuclear DNA content of all the cells in all the individuals within a given animal species" observed by Roger and Colette Vendrely in 1948 was interpreted by them as proof that DNA, and not protein, makes up genes.

This observed consistency is reflected by the term C-value. However, it was soon observed that C-values (genome sizes) vary greatly among species and that this bears no relationship to the estimated number of genes (as reflected by the complexity of the organism).

Understanding of C-Statistic

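The c-statistic, also called the concordance statistic, is a measure of how well a predictive model discriminates between outcomes and, despite the similar name, is unrelated to the DNA C-value. It is equal to the area under the ROC curve (AUC) and can be interpreted as follows −

  • A value below 0.5 indicates a poor model, one that classifies outcomes worse than random chance.

  • A value of 0.5 implies that the model is no better at classifying outcomes than random chance.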

  • The more closely the value approaches 1, the more accurately the model can classify the results.

  • A value of 1 indicates that the model can categorize results with absolute accuracy.

The c-statistic therefore indicates how well a model distinguishes between outcomes.

In a clinical setting, the c-statistic can be calculated by taking all possible pairs of individuals in which one person had a positive outcome and the other had a negative outcome.

The c-statistic is then the proportion of these pairs in which the person with the positive outcome had a higher predicted probability of that outcome than the person with the negative outcome.
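The pairwise calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `c_statistic` and the example data are hypothetical, and counting a tied pair as half a concordant pair is an assumption (a common convention that the text above does not specify).

```python
from itertools import product


def c_statistic(outcomes, predicted):
    """Proportion of (positive, negative) pairs ranked correctly.

    outcomes  -- sequence of 1 (positive outcome) or 0 (negative outcome)
    predicted -- predicted probability of a positive outcome per person
    """
    pos = [p for y, p in zip(outcomes, predicted) if y == 1]
    neg = [p for y, p in zip(outcomes, predicted) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative outcome")

    score = 0.0
    # Visit every pair made of one positive and one negative individual.
    for p_pos, p_neg in product(pos, neg):
        if p_pos > p_neg:
            score += 1.0  # concordant pair
        elif p_pos == p_neg:
            score += 0.5  # assumption: a tie counts as half a pair
    return score / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical data: four individuals, two with a positive outcome.
    print(c_statistic([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

In the hypothetical data, three of the four positive/negative pairs are ranked correctly, so the function returns 0.75. Looping over every pair is quadratic in the number of individuals, which is fine for illustration but slow for large datasets.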

Calculating C Values

The formulas for converting the number of nucleotide pairs (or base pairs) to picograms of DNA and vice versa are

genome size (bp) = (0.978 × 10^9) × DNA content (pg)

DNA content (pg) = genome size (bp) / (0.978 × 10^9)

1 pg = 978 Mbp
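As a quick worked example, the two conversion formulas can be written as a pair of Python helpers. The names `BP_PER_PG`, `pg_to_bp` and `bp_to_pg` are illustrative and do not come from any particular library; the only input is the factor of 0.978 × 10^9 bp per pg given above.

```python
# Conversion factor from the formulas above: 1 pg of DNA = 0.978 x 10^9 bp.
BP_PER_PG = 0.978e9


def pg_to_bp(dna_pg: float) -> float:
    """Convert a DNA amount in picograms to base pairs."""
    return dna_pg * BP_PER_PG


def bp_to_pg(genome_bp: float) -> float:
    """Convert a genome size in base pairs to picograms of DNA."""
    return genome_bp / BP_PER_PG


if __name__ == "__main__":
    # Picogram figures quoted earlier in the article, converted as given.
    for name, pg in [
        ("Genlisea margaretae", 0.129),
        ("Arabidopsis thaliana", 0.321),
        ("Trillium hagae", 264.9),
    ]:
        print(f"{name}: {pg} pg = {pg_to_bp(pg) / 1e6:,.1f} Mbp")
```

The conversion is the same whether a figure is a 1C or a 2C amount, so it is worth checking which one a source reports before comparing species.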


The older but still more widely used term "C-value paradox" has since been replaced by "C-value enigma." Unlike the C-value paradox, the C-value enigma is explicitly defined as a set of separate but equally important component questions, such as −

  • What kinds and amounts of non-coding DNA are present in the genomes of various eukaryotes?

  • Where does this non-coding DNA come from, how does it spread through genomes over time, and how is it lost?

  • What effects does this non-coding DNA have on chromosomes, nuclei, cells, and organisms, and what functions, if any, does it serve?

  • Why do certain species have extraordinarily compact chromosomes while others have a lot of non-coding DNA?

Updated on: 31-Mar-2023

