What is a parallel database and how does it work?



A parallel database is one in which multiple processors work in parallel on the database to provide its services.

A parallel database system seeks to improve performance by parallelizing operations such as loading data, building indexes, and evaluating queries. Parallel systems improve processing and I/O speeds by using multiple CPUs and disks in parallel.

Working of parallel database

Let us discuss how a parallel database works, step by step −

Step 1 − Parallel processing divides a large task into many smaller tasks and executes them concurrently on several CPUs, so that the whole task completes more quickly.

Step 2 − The driving force behind parallel database systems is the demand of applications that have to query extremely large databases of the order of terabytes or that have to process a large number of transactions per second.

Step 3 − In parallel processing, many operations are performed simultaneously as opposed to serial processing, in which the computational steps are performed sequentially.
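The steps above can be sketched in a few lines of Python. This is only an illustration of the divide, execute concurrently, and combine pattern, not how any particular database engine is built. The function names (split_into_chunks, parallel_sum, serial_sum) are made up for this example, and a thread pool stands in for the multiple CPUs. In CPython, threads show the structure of the approach rather than a real CPU speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, num_chunks):
    """Divide one large task (a list of items) into roughly equal smaller tasks."""
    size = max(1, -(-len(items) // num_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_sum(values, workers=4):
    """Sum each chunk concurrently, then combine the partial results."""
    chunks = split_into_chunks(values, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, chunks))
    return sum(partial_sums)


def serial_sum(values):
    """Serial processing: a single worker performs every step in sequence."""
    total = 0
    for value in values:
        total += value
    return total
```

Both functions return the same total. The difference is that parallel_sum hands each chunk to a separate worker, while serial_sum walks through every value one after another.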

The working of a parallel database is illustrated in the diagram given below −

Performance measures

There are two main measures of performance of a database system, which are explained below −

  • Throughput − The number of tasks that can be completed in a given time interval. A system that processes a large number of small transactions can improve throughput by processing many transactions in parallel.

  • Response time − The amount of time it takes to complete a single task from the time it is submitted. A system that processes large transactions can improve response time, as well as throughput, by performing the subtasks of each transaction in parallel.
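The two measures can be made concrete with a small, idealized calculation. The sketch below assumes that every task takes the same time, that workers never wait on each other, and that splitting a task adds no overhead. Real systems fall short of these numbers. The function names are hypothetical.

```python
import math


def makespan_ms(num_tasks, task_ms, workers):
    """Idealized time to finish all tasks when `workers` run them side by side."""
    return math.ceil(num_tasks / workers) * task_ms


def throughput_per_second(num_tasks, task_ms, workers):
    """Tasks completed per second over the whole run."""
    return num_tasks / (makespan_ms(num_tasks, task_ms, workers) / 1000)


def split_response_time_ms(task_ms, workers):
    """Idealized response time of one large task split evenly into subtasks."""
    return task_ms / workers
```

Under these assumptions, 100 small transactions of 100 ms each give a throughput of 10 per second on one worker and 40 per second on four workers. A single large transaction of 8000 ms split across four workers has an idealized response time of 2000 ms.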

Benefits of parallel database

The benefits of the parallel database are explained below −

Speed

Speed is the main advantage of parallel databases. The server breaks up a user's database request into parts and sends each part to a separate computer.

Each computer works on its piece, and the outputs are then combined and returned to the user. This speeds up most data requests, so that large databases can be accessed more quickly.
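The request flow described here (split the request, let each computer work on its part, combine the outputs) can be sketched as follows. This is a simplified model under stated assumptions: rows are spread round-robin over in-memory lists that stand in for separate computers, and the names partition_rows, scan_partition and parallel_select are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_rows(rows, num_nodes):
    """Spread rows round-robin across `num_nodes` partitions (one per node)."""
    partitions = [[] for _ in range(num_nodes)]
    for index, row in enumerate(rows):
        partitions[index % num_nodes].append(row)
    return partitions


def scan_partition(partition, predicate):
    """The work one node does: scan only its own rows."""
    return [row for row in partition if predicate(row)]


def parallel_select(partitions, predicate):
    """Send the same filter to every partition, then combine the outputs."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        pieces = pool.map(lambda part: scan_partition(part, predicate), partitions)
        return [row for piece in pieces for row in piece]
```

Because each partition is scanned independently, the combined result arrives in partition order, not in the original row order. A caller that needs ordered output has to sort the combined rows.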

Capacity

As more users request access to the database, network administrators can add more machines to the parallel server, increasing its overall capacity.

For example, a parallel database lets a large online store serve thousands of users who access its information at the same time. With a single server, this level of performance is not feasible.

Reliability

A properly configured parallel database continues to work even if a computer in the cluster fails. The database server detects that a computer is not responding and redirects its work to the other computers.

Many companies, such as online retailers, want their database to be accessible as quickly and as reliably as possible. This is where a parallel database is a good fit.

This approach also lets technicians carry out scheduled maintenance one computer at a time. They send the server a command to take the affected machine out of service, then perform the required maintenance and updates.
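The redirect-on-failure behavior can be illustrated with a minimal sketch. It assumes a simulated cluster in which a node that is down raises an exception instead of timing out. The Node class, the NodeDown exception and execute_with_failover are hypothetical names, not part of any real database product.

```python
class NodeDown(Exception):
    """Raised when a (simulated) node does not respond."""


class Node:
    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive

    def execute(self, request):
        # A real system would detect a missing response via a timeout;
        # here a dead node simply raises.
        if not self.alive:
            raise NodeDown(self.name)
        return f"{self.name} handled {request}"


def execute_with_failover(nodes, request):
    """Try each node in turn and skip the ones that do not respond."""
    for node in nodes:
        try:
            return node.execute(request)
        except NodeDown:
            continue  # redirect the work to the next node
    raise RuntimeError("no node in the cluster responded")
```

With a cluster of node-1 (down), node-2 and node-3, a request is answered by node-2. The request only fails when every node is down.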

Benefits for queries

Parallel query processing can benefit the following types of queries −

  • Select statements that scan large numbers of pages but return only a few rows.

  • Select statements that include union, order by, or distinct, since these queries can populate worktables in parallel, and can make use of parallel sorting.

  • Select statements that use merge joins, since these can use parallel processing for scanning tables and also for sorting and merging.

  • Select statements where the reformatting strategy is chosen by the optimizer, since these can populate worktables in parallel, and can make use of parallel sorting.

  • Create index statements, and alter table ... add constraint clauses that create indexes, such as unique and primary key constraints.
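Several of the query types above rely on parallel sorting. One common way to do this, shown here as a rough sketch and not as the method any specific optimizer uses, is to sort each partition concurrently and then merge the sorted runs. The function names parallel_order_by and parallel_distinct are assumptions made for this example.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def parallel_order_by(partitions, key=None):
    """Sort every partition concurrently, then merge the sorted runs."""
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        sorted_runs = list(pool.map(lambda part: sorted(part, key=key), partitions))
    # heapq.merge combines already-sorted inputs into one sorted stream.
    return list(heapq.merge(*sorted_runs, key=key))


def parallel_distinct(partitions):
    """Remove duplicates by sorting in parallel and dropping repeated neighbors."""
    result = []
    for value in parallel_order_by(partitions):
        if not result or result[-1] != value:
            result.append(value)
    return result
```

The merge step is still sequential, but each partition sort runs independently, which is where an order by or distinct over a large table can gain from parallelism.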

