Biopython - Population Genetics



Population genetics plays an important role in evolution theory. It analyses the genetic difference between species as well as two or more individuals within the same species.

Biopython provides Bio.PopGen module for population genetics and mainly supports `GenePop, a popular genetics package developed by Michel Raymond and Francois Rousset.

A simple parser

Let us write a simple application to parse the GenePop format and understand the concept.

Download the genePop file provided by Biopython team in the link given below −https://raw.githubusercontent.com/biopython/biopython/master/Tests/PopGen/c3line.gen

Load the GenePop module using the below code snippet −

>>> from Bio.PopGen import GenePop

Parse the file using GenePop.read method as below −

>>> record = GenePop.read(open("c3line.gen"))

Show the loci and population information as given below −

>>> record.loci_list 
['136255903', '136257048', '136257636'] 
>>> record.pop_list 
['4', 'b3', '5'] 
>>> record.populations 
[[('1', [(3, 3), (4, 4), (2, 2)]), ('2', [(3, 3), (3, 4), (2, 2)]), 
   ('3', [(3, 3), (4, 4), (2, 2)]), ('4', [(3, 3), (4, 3), (None, None)])], 
[('b1', [(None, None), (4, 4), (2, 2)]), ('b2', [(None, None), (4, 4), (2, 2)]), 
   ('b3', [(None, None), (4, 4), (2, 2)])], 
[('1', [(3, 3), (4, 4), (2, 2)]), ('2', [(3, 3), (1, 4), (2, 2)]), 
   ('3', [(3, 2), (1, 1), (2, 2)]), ('4', 
   [(None, None), (4, 4), (2, 2)]), ('5', [(3, 3), (4, 4), (2, 2)])]] 
>>>

Here, there are three loci available in the file and three sets of population: First population has 4 records, second population has 3 records and third population has 5 records. record.populations shows all sets of population with alleles data for each locus.

Manipulate the GenePop file

Biopython provides options to remove locus and population data.

Remove a population set by position,

>>> record.remove_population(0) 
>>> record.populations 
[[('b1', [(None, None), (4, 4), (2, 2)]), 
   ('b2', [(None, None), (4, 4), (2, 2)]), 
   ('b3', [(None, None), (4, 4), (2, 2)])], 
   [('1', [(3, 3), (4, 4), (2, 2)]), 
   ('2', [(3, 3), (1, 4), (2, 2)]), 
   ('3', [(3, 2), (1, 1), (2, 2)]), 
   ('4', [(None, None), (4, 4), (2, 2)]), 
   ('5', [(3, 3), (4, 4), (2, 2)])]]
>>>

Remove a locus by position

>>> record.remove_locus_by_position(0) 
>>> record.loci_list 
['136257048', '136257636'] 
>>> record.populations 
[[('b1', [(4, 4), (2, 2)]), ('b2', [(4, 4), (2, 2)]), ('b3', [(4, 4), (2, 2)])], 
   [('1', [(4, 4), (2, 2)]), ('2', [(1, 4), (2, 2)]), 
   ('3', [(1, 1), (2, 2)]), ('4', [(4, 4), (2, 2)]), ('5', [(4, 4), (2, 2)])]]
>>>

Remove a locus by name

>>> record.remove_locus_by_name('136257636') 
>>> record.loci_list 
['136257048'] 
>>> record.populations 
[[('b1', [(4, 4)]), ('b2', [(4, 4)]), ('b3', [(4, 4)])], 
   [('1', [(4, 4)]), ('2', [(1, 4)]), 
   ('3', [(1, 1)]), ('4', [(4, 4)]), ('5', [(4, 4)])]]
>>>
Advertisements