- Data Structure
- Networking
- RDBMS
- Operating System
- Java
- MS Excel
- iOS
- HTML
- CSS
- Android
- Python
- C Programming
- C++
- C#
- MongoDB
- MySQL
- Javascript
- PHP
- Physics
- Chemistry
- Biology
- Mathematics
- English
- Economics
- Psychology
- Social Studies
- Fashion Studies
- Legal Studies
- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who
Data Fragmentation, Replication, and Allocation Techniques for Distributed Database
Data Fragmentation
The process of dividing the database into smaller multiple parts or sub−tables is called fragmentation. The smaller parts or sub−tables are called fragments and are stored at different locations. Data fragmentation should be done in a way that the reconstruction of the original parent database from the fragments is possible. The restoration can be done using UNION or JOIN operations.
Database fragmentation is of three types: Horizontal fragmentation, Vertical fragmentation, and Mixed or Hybrid fragmentation.
Horizontal Fragmentation
It divides a table horizontally into a group of rows to create multiple fragments or subsets of a table. These fragments can then be assigned to different sites in the database. Reconstruction is done using UNION or JOIN operations. In relational algebra, it is represented as σp(T) for any given table(T).
Example
In this example, we are going to see how the horizontal fragmentation looks in a table.
Input :
STUDENT
id name age salary 1 aman 21 20000 2 naman 22 25000 3 raman 23 35000 4 sonam 24 36000
Example
SELECT * FROM student WHERE salary<35000; SELECT * FROM student WHERE salary>35000;
Output
id name age salary 1 aman 21 20000 2 naman 22 25000
id name age salary 4 soman 24 36000
There are three types of Horizontal fragmentation: Primary, Derived, and Complete Horizontal Fragmentation
A: Primary Horizontal Fragmentation: It is a process of segmenting a single table in a row−wise manner using a set of conditions.
Example
This example shows how the Select statement is used with a condition to provide output.
SELECT * FROM student SALARY<30000;
Output
id name age salary 1 aman 21 20000 2 naman 22 25000
B: Derived Horizontal Fragmentation: Fragmentation that is being derived from primary relation.
Example
This example shows how the Select statement is used with the where clause to provide output.
SELECT * FROM student WHERE age=21 AND salary<30000;
Output
id name age salary 1 aman 21 20000
C: Complete horizontal fragmentation: It derives a set of horizontal fragments to make the table have at least one partition.
Vertical Fragmentation
It divides a table vertically into a group of columns to create multiple fragments or subsets of a table. These fragments can then be assigned to different sites in the database. Reconstruction is done using full outer join operation.
Example
This example shows how the Select statement is used to do the fragmentation and to provide the output.
Input Table :
STUDENT
id name age salary 1 aman 21 20000 2 naman 22 25000 3 raman 23 35000 4 sonam 24 36000
Example
SELECT * FROM name;#fragmentation 1 SELECT * FROM id, age;#fragmentation 2
Output
name aman naman raman sonam
age 21 22 23 24
Mixed or Hybrid Fragmentation
It is done by performing both horizontal and vertical partitioning together. It is a group of rows and columns in relation.
Example
This example shows how the Select statement is used with the where clause to provide the output.
SELECT * FROM name WHERE age=22;
Output
name age naman 22
Data Replication
Data replication means a replica is made i. e. data is copied at multiple locations to improve the availability of data. It is used to remove inconsistency between the same data which result in a distributed database so that users can do their task without interrupting the work of other users.
Types of data replication :
Transactional Replication
It makes a full copy of the database along with the changed data. Transactional consistency is guaranteed because the order of data is the same when copied from publisher to subscriber database. It is used in server−to−server environments by consistently and accurately replicating changes in the database.
Snapshot Replication
It is the simplest type that distributes data exactly as it appears at a particular moment regardless of any updates in data. It copies the 'snapshot' of the data. It is useful when the database changes infrequently. It is slower to Transactional Replication because the data is sent in bulk from one end to another. It is generally used in cases where subscribers do not need the updated data and are operating in read−only mode.
Merge Replication
It combines data from several databases into a single database. It is the most complex type of replication because both the publisher and subscriber can do database changes. It is used in a server−to−client environment and has changes sent from one publisher to multiple subscribers.
Data Allocation
It is the process to decide where exactly you want to store the data in the database. Also involves the decision as to which data type of data has to be stored at what particular location. Three main types of data allocation are centralized, partitioned, and replicated.
Centralises: Entire database is stored at a single site. No data distribution occurs
Partitioned: The database gets divided into different fragments which are stored at several sites.
Replicated: Copies of the database are stored at different locations to access the data.
Conclusion
This article consists of three parts. The first one has data fragmentation which divides data into subtables and stores them at different locations. Types of data fragmentation are horizontal Fragmentation which divides the table horizontally into groups of rows, followed by vertical fragmentation which divides the table vertically into groups of columns and last one is mixed fragmentation which is done by performing both horizontal and vertical partitioning together. The second part comes with data replication in which data is copied at multiple locations. Types of replication are Transactional Replication which makes a full copy of the database along with the changes that occur, followed by Snapshot Replication which copies the snapshot of data to distribute and last one is Merge Replication which combines data into a single database. third part comes with Data Allocation which tells us where to store the data.