DBMS - Placing File Records on Disk

Quiz

Storing data means more than just saving it somewhere. It is about organizing the data efficiently such that the stored data can retrieved and used easily. In DBMS, it means figuring out how to place file records on a disk. Although it may seem like a simple task, it involves some clever techniques to handle different types of records. These techniques are used to save space, and make the database faster.

Read this chapter to learn how file records are placed on a disk. We will have specific examples to understand the methods used to store both fixed-length and variable-length records.

Records and File Types

A record in a DBMS is a collection of data values. We often tie them to a specific entity. Think of it like a detailed entry in a contact list. For instance, an EMPLOYEE record may have different fields such as name, employee ID, department, and salary. Each of these fields holds a piece of information about a particular employee.

Placing file records on disk enables better data indexing and searching capabilities. With an organized structure, it becomes easier to locate specific files or retrieve relevant information without wasting time and resources.

Techniques for Placing File Records on Disk

There are several techniques for placing file records on disk, including −

Fixed-Length Records − Every record is the same size. Each field has a predetermined length, which makes it easier to locate data because the position of each field is consistent.
Variable-Length Records − Here the records can differ in size. It happens when some fields hold varying amounts of data. For example, a name that might be 5 characters long for one person and 20 for another.

Let's now discuss each of these techniques in detail.

Fixed-Length Records on Disk

Fixed-length records are straightforward to store because of their uniform size. The following example shows how it works −

Example of Fixed-Length Records

Suppose we have a fixed-length record for employees. It contains the following fields −

Name − 30 characters
Social Security Number (SSN) − 9 characters
Salary − 4 bytes (integer)
Job Code − 4 bytes (integer)
Department − 20 characters

Each fieldâs length is fixed. If we add them up, the total record size will be (30 + 9 + 4 + 4 + 20) = 67 bytes. It is this uniformity that makes it simple to calculate where a specific field is within a record. For instance, the salary field starts at byte 39 (the first 30 bytes are for the name, followed by 9 bytes for SSN).

Limitations of Fixed-Length Records

Fixed-length records are easy to handle, but they can waste space. For example, if a department name is only 5 characters long, the remaining 15 bytes are unused. For thousands of records, this wasted space adds up.

Another issue arises with optional fields. Sometimes some records do not have values for certain fields. Space is reserved for those fields as well. Let us say not every employee has a job code. Even so, 4 bytes will be reserved for that field in every record.

Variable-Length Records on Disk

Variable-length records save space by allowing fields to take up only as much space as they need. But, how do we manage records when their sizes are unpredictable?

Using Separators − We can use separator characters like pipe (|) to separate the fields in a record. So, a record might look like this:

Smith|123456789|45000|Computer Department|

Separators make it clear where each field ends, even if the field sizes vary. This format works well but requires extra processing to find the data.

Storing Field Lengths − Another method is to store the length of each field at the beginning of the record. For example,

30 Smith	9 123456789	6 45000

Here, the numbers before each field indicate its size. This system reads the length, then grabs the corresponding number of bytes.

Practical Example: Handling Optional Fields

Let us say our EMPLOYEE records include an optional field for a middle name. For some employees, this field may be empty. With variable-length records, we can save space by only including the field when it has a value.

In a file with such records −

Record A − Smith|123456789|45000|Computer Department|
Record B − Jones|987654321|52000|HR|Michael

In Record A, the middle name is skipped. This flexibility makes variable-length records more space-efficient, however it complicates how records are processed.

Mixed Records and Real-Life Applications

Sometimes, files contain a mix of record types. If we consider a university database with two types of records −

Student Records − Fields for name, ID, courses, and grades.
Course Records − Fields for course name, instructor, and schedule.

If related student and course records are stored together, their sizes will vary. This is common in real-world databases where different entities need to be linked efficiently.

Example of Mixed Records

In one block, we might have −

Student Record − John Doe|12345|Math: A, History: B|
Course Record − Math|Prof. Smith|MWF 10:00 AM|

The database system keeps track of the record type and adjusts accordingly.

Organization of Records on Disk

When records are placed on a disk, they are grouped into blocks. Blocks are small chunks of data that the disk reads and writes. The way the records are packed into blocks affects the performance.

Unspanned Records − In this method, a record must fit entirely within one block. If a block has extra space left after storing several records, that space remains unused. This approach is simple but wastes some disk space.
Spanned Records − For larger records, the spanned approach allows a single record to stretch across multiple blocks. Here, the pointer at the end of one block tells the system where the rest of the record is stored. This method is more space-efficient but slightly more complex to handle in real life scenario.

Example: Spanned vs. Unspanned

Take a look at the following examples of spanned and unspanned records −

Unspanned Block − Record 1 | Record 2 | Record 3 | Empty Space
Spanned Block − Record 1 | Part of Record 2 (next block has the rest of Record 2)

Optimizing Record Placement

We need to optimize the placement of records to best utilize the disk space and improve the speed. For example, if an employeeâs records are frequently accessed alongside their department details, these can be placed on the same block.

There is another type of optimization called indexing. By creating an index that points to the location of records, the system reduces the time it takes to find specific data.

Conclusion

In this chapter, we covered in detail how file records are placed on disks. We highlighted the difference between fixed-length and variable-length records. Fixed-length records are easier to manage but may waste space. Variable-length records, on the other hand, save space but require more effort to process. Additionally, we looked at how records are grouped into blocks and discussed the trade-offs between spanned and unspanned organizations.

Print Page