csplit - Unix, Linux Command



NAME

csplit - split a file into context-determined pieces.

SYNOPSIS

csplit [options]... FILE PATTERN...

DESCRIPTION

csplit writes pieces of FILE, separated by PATTERN(s), to files 'xx00', 'xx01', ..., and prints the byte count of each piece to standard output. If FILE is -, csplit reads standard input.

OPTIONS

-f PREFIX, --prefix=PREFIX
    Use PREFIX as the output file name prefix.

-b SUFFIX, --suffix-format=SUFFIX
    Use SUFFIX as the output file name suffix. When this option is specified, the suffix string must include exactly one 'printf(3)'-style conversion specification, possibly including format specification flags, a field width, a precision specification, or all of these kinds of modifiers. The format letter must convert a binary integer argument to readable form; thus, only 'd', 'i', 'u', 'o', 'x', and 'X' conversions are allowed. The entire SUFFIX is given (with the current output file number) to 'sprintf(3)' to form the file name suffix for each individual output file in turn. If this option is used, the '--digits' option is ignored.

-n DIGITS, --digits=DIGITS
    Use output file names containing numbers that are DIGITS digits long instead of the default 2.

-k, --keep-files
    Do not remove output files when errors are encountered.

-z, --elide-empty-files
    Suppress the generation of zero-length output files. (In cases where the section delimiters of the input file are supposed to mark the first lines of each of the sections, the first output file will generally be a zero-length file unless you use this option.) The output file sequence numbers always run consecutively starting from 0, even when this option is specified.

-s, -q, --silent, --quiet
    Do not print counts of output file sizes.
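
As a rough illustration of how these options combine, the sketch below splits a small file at its header lines. The file name notes.txt, its contents, and the part_ prefix are invented for the example, and GNU csplit is assumed.

```shell
# Work in a scratch directory so the output files do not clutter anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: three sections, each starting with a "# " header line.
printf '%s\n' '# one' 'a' '# two' 'b' '# three' 'c' > notes.txt

# -s : do not print byte counts
# -z : drop the empty piece that would precede the first header
# -f : use "part_" instead of "xx" as the prefix
# -b : printf-style suffix; %03d is the piece number, ".txt" is literal text
csplit -s -z -f part_ -b '%03d.txt' notes.txt '/^# /' '{*}'

# List the pieces that were written.
ls part_*
```

Because -b is given, any -n value would be ignored; the field width in the format ('%03d') controls the number of digits instead.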

PATTERN

The contents of the output files are determined by the PATTERN arguments, as detailed below. An error occurs if a PATTERN argument refers to a nonexistent line of the input file (e.g., if no remaining line matches a given regular expression). After every PATTERN has been matched, any remaining input is copied into one last output file.

By default, 'csplit' prints the number of bytes written to each output file after it has been created.

The output files' names consist of a prefix ('xx' by default) followed by a suffix. By default, the suffix is an ascending sequence of two-digit decimal numbers from '00' to '99'. In any case, concatenating the output files in sorted order by file name reproduces the original input file, apart from any portions skipped with a %REGEXP% pattern.

By default, if 'csplit' encounters an error or receives a hangup, interrupt, quit, or terminate signal, it removes any output files that it has created so far before it exits.
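
The behaviour described above can be checked with a short session. This is a minimal sketch assuming GNU csplit; the ten-line input file and the fail_ prefix are made up for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'Line %s\n' 1 2 3 4 5 6 7 8 9 10 > sample.txt

# Two regexp patterns give three pieces: lines 1-4, lines 5-7, and the
# remaining lines 8-10. csplit prints one byte count per piece.
counts=$(csplit sample.txt '/Line 5/' '/Line 8/')
echo "$counts"

# The shell expands xx* in sorted order, so concatenating rebuilds the input.
if cat xx* | cmp -s - sample.txt; then roundtrip=ok; else roundtrip=mismatch; fi
echo "round trip: $roundtrip"

# A pattern that never matches is an error. Since -k is not given, the
# pieces already written under the fail_ prefix are removed again.
csplit -s -f fail_ sample.txt '/Line 5/' '/no such text/' 2>/dev/null \
  || echo "csplit failed and cleaned up"
ls fail_* 2>/dev/null || echo "no fail_ files left"
```

Adding -k to the failing command would keep the pieces written before the error.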

N
    Create an output file containing the input up to but not including line N (a positive integer). If followed by a repeat count, also create an output file containing the next N lines of the input file once for each repeat.

/REGEXP/[OFFSET]
    Create an output file containing the current line up to (but not including) the next line of the input file that contains a match for REGEXP. The optional OFFSET is a '+' or '-' followed by a positive integer. If it is given, the input up to the matching line plus or minus OFFSET is put into the output file, and the line after that begins the next section of input.

%REGEXP%[OFFSET]
    Like the previous type, except that it does not create an output file, so that section of the input file is effectively ignored.

{REPEAT-COUNT}
    Repeat the previous pattern REPEAT-COUNT additional times. REPEAT-COUNT can be either a positive integer or an asterisk, meaning repeat as many times as necessary until the input is exhausted.
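
The pattern forms can be combined in one command. The sketch below is one possible use, assuming GNU csplit; the file log.txt and its BEGIN/END markers are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: a preamble followed by two BEGIN ... END blocks.
printf '%s\n' 'preamble' 'BEGIN' 'a' 'END' 'BEGIN' 'b' 'END' > log.txt

# %^BEGIN%   skips everything before the first BEGIN without writing a file.
# /^END/+1   ends a piece one line after the next END, so END is included.
# {*}        repeats the /^END/+1 pattern until the input is exhausted.
# -z         drops the empty piece left over once the input runs out.
csplit -s -z log.txt '%^BEGIN%' '/^END/+1' '{*}'

# Show each piece with its file name.
for piece in xx*; do
  echo "== $piece"
  cat "$piece"
done
```

Without the +1 offset, each END line would instead start the following piece.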

EXAMPLES

Consider a sample file, sample.txt:

$ cat sample.txt
Line 1
Line 2
Line 3
Line 4
Line 5
Line 6
Line 7
Line 8
Line 9
Line 10

Split the file at line 5 and inspect the resulting pieces. csplit prints the byte count of each piece (28 and 43) as it writes it.

$ csplit sample.txt 5
28
43
$ ls
sample.txt xx00 xx01
$ cat xx00
Line 1
Line 2
Line 3
Line 4
$ cat xx01
Line 5
Line 6
Line 7
Line 8
Line 9
Line 10
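
A line-number pattern can also be combined with a repeat count to cut the same sample file into several pieces. The following is a sketch assuming GNU csplit; it recreates sample.txt in a scratch directory so it can be run on its own.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'Line %s\n' 1 2 3 4 5 6 7 8 9 10 > sample.txt

# "4" ends the first piece before line 4 (lines 1-3). "{1}" repeats the
# pattern once more, taking the next 4 lines (lines 4-7). Whatever is
# left (lines 8-10) goes into a final piece.
counts=$(csplit sample.txt 4 '{1}')
echo "$counts"

# Count the lines in each piece.
wc -l xx*
```

A larger repeat count such as '{2}' would ask for lines beyond the end of this ten-line file, which csplit reports as an error.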