How to split or break large files into pieces in Linux?
The split command divides a large file into smaller, more manageable pieces on Linux systems. By default, it writes 1000 lines to each output file and uses 'x' as the filename prefix, so if no prefix is specified the pieces are named xaa, xab, and so on. When no input file is given, or when a hyphen (-) is given in its place, the command reads from standard input.
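As a quick illustration of the defaults and of reading from standard input, the sketch below pipes 2500 generated lines into split. The temporary directory and the use of seq to produce test data are assumptions made for the demonstration only.

```shell
# Work in a throwaway directory so the generated x* files are easy to find.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Feed 2500 lines to split on standard input; "-" stands for stdin.
seq 1 2500 | split -

# With the defaults (1000 lines per file, prefix "x") this yields
# xaa and xab with 1000 lines each, and xac with the remaining 500.
wc -l x*
```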
Syntax
The general syntax of the split command is as follows:
split [OPTION]... [FILE [PREFIX]]
Command Options
| Option | Description |
|---|---|
| -a, --suffix-length=N | Generate suffixes of length N (default is 2) |
| --additional-suffix=SUFFIX | Append an additional suffix to filenames |
| -b, --bytes=SIZE | Put SIZE bytes per output file |
| -C, --line-bytes=SIZE | Put at most SIZE bytes of records per output file |
| -d | Use numeric suffixes starting at 0 instead of alphabetic |
| -l, --lines=NUMBER | Put NUMBER lines per output file |
| -n, --number=CHUNKS | Generate CHUNKS output files |
| --verbose | Print a diagnostic message for each file created |
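Several of these options can be combined in one invocation. The following is a minimal sketch, assuming GNU coreutils split (the --additional-suffix option is not available in every implementation); the file name input.txt and the prefix part_ are placeholders.

```shell
workdir=$(mktemp -d)
seq 1 35 > "$workdir/input.txt"   # 35 lines of sample data

# -l 10                     ten lines per piece
# -d                        numeric suffixes instead of alphabetic ones
# -a 3                      three-character suffixes (000, 001, ...)
# --additional-suffix=.txt  keep a file extension on every piece
# --verbose                 report each file as it is created
split -l 10 -d -a 3 --additional-suffix=.txt --verbose \
    "$workdir/input.txt" "$workdir/part_"

# Lists input.txt plus part_000.txt through part_003.txt.
ls "$workdir"
```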
Basic Examples
Default Split
To split a large file using default settings (1000 lines per file):
split largefile.txt
This creates files named xaa, xab, xac, etc.
Split by Number of Lines
To split a file into pieces with 100 lines each:
split -l 100 largefile.txt split_
This creates files named split_aa, split_ab, etc.
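When the line count is not an exact multiple of the requested size, the last piece simply holds the remainder. A small sketch, using a generated 250-line stand-in for largefile.txt, shows how wc can confirm the result:

```shell
workdir=$(mktemp -d)
seq 1 250 > "$workdir/largefile.txt"   # stand-in for a real large file

split -l 100 "$workdir/largefile.txt" "$workdir/split_"

# split_aa and split_ab hold 100 lines each; split_ac holds the last 50.
wc -l "$workdir"/split_*
```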
Split by File Size
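To split a file into 10 MiB pieces:
split -b 10M largefile.txt chunk_
SIZE accepts a unit suffix. In GNU split, K, M, and G are powers of 1024 (kibibytes, mebibytes, gibibytes), while KB, MB, and GB are powers of 1000, so -b 10M produces pieces of 10,485,760 bytes each (the last piece may be smaller).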
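Splitting by bytes with -b can cut a line in two at a piece boundary, whereas -C (listed in the options table) fills each piece with whole lines up to the size limit. The sketch below compares the two on generated data; the file names and the 256-byte limit are arbitrary choices for illustration.

```shell
workdir=$(mktemp -d)

# 100 lines of exactly 10 bytes each (9 characters plus newline) = 1000 bytes.
for i in $(seq 1 100); do
    printf 'line-%04d\n' "$i"
done > "$workdir/data.txt"

# -b: every piece except the last is exactly 256 bytes, so lines get cut.
split -b 256 "$workdir/data.txt" "$workdir/bytes_"

# -C: each piece holds as many complete lines as fit in 256 bytes (25 lines).
split -C 256 "$workdir/data.txt" "$workdir/lines_"

wc -c "$workdir"/bytes_* "$workdir"/lines_*
```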
Split into Specific Number of Files
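To split a file into exactly 5 pieces of roughly equal size in bytes (lines may be cut at piece boundaries; with GNU split, -n l/5 splits only at line boundaries):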
split -n 5 largefile.txt part_
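The chunk-count mode has a line-aware variant in GNU split, written -n l/N. The sketch below, which assumes GNU coreutils and uses generated data, produces five pieces each way and checks that both sets reassemble to the original.

```shell
workdir=$(mktemp -d)
seq 1 1000 > "$workdir/data.txt"

# Five pieces of roughly equal byte size; a line may be split across pieces.
split -n 5 "$workdir/data.txt" "$workdir/bytes_"

# Five pieces of roughly equal size, cut only at line boundaries.
split -n l/5 "$workdir/data.txt" "$workdir/lines_"

# Either set of pieces concatenates back to the original file.
cat "$workdir"/bytes_* | cmp - "$workdir/data.txt" && echo "byte chunks OK"
cat "$workdir"/lines_* | cmp - "$workdir/data.txt" && echo "line chunks OK"
```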
Using Numeric Suffixes
To use numeric suffixes instead of alphabetic ones:
split -d -l 500 largefile.txt file_
This creates files named file_00, file_01, file_02, etc.
Verbose Output
To see what files are being created during the split process:
split --verbose -l 200 largefile.txt section_
creating file 'section_aa'
creating file 'section_ab'
creating file 'section_ac'
Practical Use Cases
Log file management − Breaking large log files for easier analysis
File transfer − Splitting large files to fit email attachment limits
Backup operations − Creating smaller backup chunks for storage
Data processing − Dividing large datasets for parallel processing
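As one possible illustration of the parallel-processing case, the sketch below splits a log into chunks and counts ERROR lines in each chunk as a background job. The log contents, the file name app.log, and the ERROR pattern are all hypothetical.

```shell
workdir=$(mktemp -d)

# Hypothetical log: every third line is an ERROR entry.
for i in $(seq 1 900); do
    if [ $((i % 3)) -eq 0 ]; then
        echo "ERROR event $i"
    else
        echo "INFO event $i"
    fi
done > "$workdir/app.log"

split -l 300 "$workdir/app.log" "$workdir/log_"

# Count ERROR lines in each chunk as a background job, then wait for all jobs.
for chunk in "$workdir"/log_*; do
    grep -c 'ERROR' "$chunk" > "$chunk.count" &
done
wait

# Add up the per-chunk counts.
total=$(cat "$workdir"/log_*.count | awk '{ s += $1 } END { print s }')
echo "ERROR lines: $total"
```

For real workloads a tool such as xargs -P may be a better fit than hand-managed background jobs; the loop here only shows the idea.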
Reassembling Split Files
To reassemble the pieces into the original file, use the cat command. The shell expands split_* in sorted order, which matches the order in which split assigned its suffixes:
cat split_* > original_file.txt
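It is worth confirming that the reassembled file matches the original before deleting anything. A minimal round-trip sketch, using random test data and placeholder file names, might look like this:

```shell
workdir=$(mktemp -d)

# 100000 bytes of random data as a stand-in for a real file.
head -c 100000 /dev/urandom > "$workdir/original.bin"

split -b 30000 "$workdir/original.bin" "$workdir/split_"
cat "$workdir"/split_* > "$workdir/rebuilt.bin"

# cmp -s exits with status 0 only when the two files are byte-for-byte identical.
if cmp -s "$workdir/original.bin" "$workdir/rebuilt.bin"; then
    echo "files are identical"
else
    echo "files differ" >&2
fi
```

Comparing checksums (for example with sha256sum) is an alternative when the original and the rebuilt file live on different machines.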
Conclusion
The split command is an essential Linux utility for managing large files by dividing them into smaller, more manageable pieces. It offers flexible options for splitting by lines, bytes, or number of chunks, making it valuable for file management, data processing, and system administration tasks.
