Data Fragmentation, Replication, and Allocation Techniques for Distributed Database


Data Fragmentation

The process of dividing the database into smaller multiple parts or sub−tables is called fragmentation. The smaller parts or sub−tables are called fragments and are stored at different locations. Data fragmentation should be done in a way that the reconstruction of the original parent database from the fragments is possible. The restoration can be done using UNION or JOIN operations.

Database fragmentation is of three types: Horizontal fragmentation, Vertical fragmentation, and Mixed or Hybrid fragmentation.

Horizontal Fragmentation

It divides a table horizontally into a group of rows to create multiple fragments or subsets of a table. These fragments can then be assigned to different sites in the database. Reconstruction is done using UNION or JOIN operations. In relational algebra, it is represented as σp(T) for any given table(T).

Example

In this example, we are going to see how the horizontal fragmentation looks in a table.

Input :

STUDENT

id	  name	   age	  salary
1	  aman	   21	  20000
2	  naman	   22	  25000
3	  raman	   23	  35000
4	  sonam	   24	  36000

Example

SELECT * FROM student WHERE salary<35000;
SELECT * FROM student WHERE salary>35000;

Output

id	name	age	salary
1	aman	21	20000
2	naman	22	25000
id	name	age	salary
4	soman	24	36000

There are three types of Horizontal fragmentation: Primary, Derived, and Complete Horizontal Fragmentation

A: Primary Horizontal Fragmentation: It is a process of segmenting a single table in a row−wise manner using a set of conditions.

Example

This example shows how the Select statement is used with a condition to provide output.

SELECT * FROM student SALARY<30000;

Output

id	name	age	salary
1	aman	21	20000
2	naman	22	25000

B: Derived Horizontal Fragmentation: Fragmentation that is being derived from primary relation.

Example

This example shows how the Select statement is used with the where clause to provide output.

SELECT * FROM student WHERE age=21 AND salary<30000; 

Output

id	name	age	  salary
1	aman	21	  20000

C: Complete horizontal fragmentation: It derives a set of horizontal fragments to make the table have at least one partition.

Vertical Fragmentation

It divides a table vertically into a group of columns to create multiple fragments or subsets of a table. These fragments can then be assigned to different sites in the database. Reconstruction is done using full outer join operation.

Example

This example shows how the Select statement is used to do the fragmentation and to provide the output.

Input Table :

STUDENT

id	name	age	salary
1	aman	21	20000
2	naman	22	25000
3	raman	23	35000
4	sonam	24	36000

Example

SELECT * FROM name;#fragmentation 1
SELECT * FROM id, age;#fragmentation 2

Output

name
aman
naman
raman
sonam
age
21
22
23
24

Mixed or Hybrid Fragmentation

It is done by performing both horizontal and vertical partitioning together. It is a group of rows and columns in relation.

Example

This example shows how the Select statement is used with the where clause to provide the output.

SELECT * FROM name WHERE age=22;

Output

name     age
naman	 22

Data Replication

Data replication means a replica is made i. e. data is copied at multiple locations to improve the availability of data. It is used to remove inconsistency between the same data which result in a distributed database so that users can do their task without interrupting the work of other users.

Types of data replication :

Transactional Replication

It makes a full copy of the database along with the changed data. Transactional consistency is guaranteed because the order of data is the same when copied from publisher to subscriber database. It is used in server−to−server environments by consistently and accurately replicating changes in the database.

Snapshot Replication

It is the simplest type that distributes data exactly as it appears at a particular moment regardless of any updates in data. It copies the 'snapshot' of the data. It is useful when the database changes infrequently. It is slower to Transactional Replication because the data is sent in bulk from one end to another. It is generally used in cases where subscribers do not need the updated data and are operating in read−only mode.

Merge Replication

It combines data from several databases into a single database. It is the most complex type of replication because both the publisher and subscriber can do database changes. It is used in a server−to−client environment and has changes sent from one publisher to multiple subscribers.

Data Allocation

It is the process to decide where exactly you want to store the data in the database. Also involves the decision as to which data type of data has to be stored at what particular location. Three main types of data allocation are centralized, partitioned, and replicated.

Centralises: Entire database is stored at a single site. No data distribution occurs

Partitioned: The database gets divided into different fragments which are stored at several sites.

Replicated: Copies of the database are stored at different locations to access the data.

Conclusion

This article consists of three parts. The first one has data fragmentation which divides data into subtables and stores them at different locations. Types of data fragmentation are horizontal Fragmentation which divides the table horizontally into groups of rows, followed by vertical fragmentation which divides the table vertically into groups of columns and last one is mixed fragmentation which is done by performing both horizontal and vertical partitioning together. The second part comes with data replication in which data is copied at multiple locations. Types of replication are Transactional Replication which makes a full copy of the database along with the changes that occur, followed by Snapshot Replication which copies the snapshot of data to distribute and last one is Merge Replication which combines data into a single database. third part comes with Data Allocation which tells us where to store the data.

Updated on: 13-Jul-2023

6K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements