How to replace string in a large one line, text file in Linux?

Some software reads an entire input file into memory before processing it. If the input file contains a very long single-line string, the software may crash due to insufficient memory to hold the entire string.

We'll examine two ways to replace a string in a very large one-line file in Linux. Both avoid loading the entire file into memory at once.

Target File

Modern JavaScript build tools often minify all code into a single line. Consider a one-line JavaScript file called original.js that contains an error: it calls fliter instead of filter. We'll correct this mistake using memory-efficient techniques.
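To follow along without a real minified bundle, a small stand-in file is enough. The sketch below is only an illustration: the file content and the workdir variable are hypothetical, and a real original.js would be far larger.

```shell
# Hypothetical stand-in for a minified bundle: a single line of JavaScript.
workdir=$(mktemp -d)
printf '%s\n' 'var a=[1,2,3];var b=a.fliter(function(x){return x>1;});console.log(b);' > "$workdir/original.js"

# wc -l counts newline characters; 1 means the only newline is the final one.
newlines=$(wc -l < "$workdir/original.js")
echo "newlines: $newlines"
```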

Using tr and sed

We can split the long line into smaller segments using tr, then substitute strings using sed, and finally rejoin the segments.

Splitting Long Lines

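sed is the usual tool for substitutions, including in-place edits with sed -i, but it reads its input one line at a time and holds each complete line in memory. For a one-line file, that means the entire file ends up in RAM. To overcome this limitation, we break the line into multiple smaller lines, process them with sed, and join the results back together.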

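The key is choosing a delimiter character that does not appear in the string we want to replace, because a match cannot span two segments. The delimiter should also occur often enough that each segment stays small. The tr command translates its input as a stream, one character at a time, which keeps its memory use low even for very large files.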
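Before committing to a delimiter, it can help to check how often it occurs and how long the resulting segments would be. The following is a minimal sketch, assuming the semicolon as the candidate delimiter; the sample file and the variable names are hypothetical.

```shell
# Hypothetical one-line sample file.
workdir=$(mktemp -d)
printf '%s\n' 'var a=[1,2,3];var b=a.fliter(function(x){return x>1;});console.log(b);' > "$workdir/original.js"

# tr -cd ';' deletes every character except ';', so wc -c counts the semicolons.
semicolons=$(tr -cd ';' < "$workdir/original.js" | wc -c)

# Split on ';' and let awk track the longest segment it sees.
longest=$(tr ';' '\n' < "$workdir/original.js" \
  | awk '{ if (length($0) > max) max = length($0) } END { print max }')

echo "semicolons: $semicolons, longest segment: $longest characters"
```

Both commands stream the file, so the check itself stays cheap. If the longest segment is still very large, a different delimiter is probably a better choice.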

To replace semicolons with newlines, use this command:

$ echo "This is line one;This is line two" | tr ";" "\n"

This is line one
This is line two

If the original file contains newlines, we need to preserve them by swapping semicolons and newlines bidirectionally:

$ echo "This is line one;This is line two" | tr ";\n" "\n;" | tr "\n;" ";\n"

This is line one;This is line two

The output matches the original input, confirming our transformation preserves the file structure.
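Putting the pieces together, the full pipeline places sed between the two tr calls. This is a sketch under a few assumptions: the sample file and workdir are hypothetical, and GNU sed is assumed, since it does not append a newline to a final line that lacks one.

```shell
# Hypothetical one-line sample file containing the typo.
workdir=$(mktemp -d)
printf '%s\n' 'var a=[1,2,3];var b=a.fliter(function(x){return x>1;});console.log(b);' > "$workdir/original.js"

# 1. Swap ';' and newline so that each statement becomes its own short line.
# 2. Fix the typo; in sed's basic regex syntax '.' is escaped and '(' is literal.
# 3. Swap the two characters back to restore the original layout.
tr ';\n' '\n;' < "$workdir/original.js" \
  | sed 's/\.fliter(/.filter(/g' \
  | tr '\n;' ';\n' > "$workdir/fixed.js"

cat "$workdir/fixed.js"
```

The result is written to a new file rather than edited in place, so the original remains available for comparison.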

Using awk

The awk command provides another approach using its gsub function for string substitution. This involves two steps: setting up custom line delimiters and performing the string replacement.

Changing the Line Delimiter

We can replace the default newline character (\n) with any character not present in our target string. This is done by setting the RS (record separator) variable in awk's BEGIN block.

To use semicolons as line delimiters:

$ echo "This is line one;This is line two" | awk 'BEGIN{RS=";"}{print}'

This is line one
This is line two

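The print statement appends the output record separator (a newline by default) after every record, so each segment ends up on its own line. Using printf avoids this: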

$ echo "This is line one;This is line two" | awk 'BEGIN{RS=";"}{printf "%s", $0}'

This is line oneThis is line two

To restore the original semicolon delimiters, we prepend them to all records except the first:

$ echo "This is line one;This is line two" | awk 'BEGIN{RS=";"}{
   if (NR != 1) {
      printf "%c", RS
   }
   printf "%s", $0
}'

This is line one;This is line two

The NR variable tracks the current record number, while RS contains our delimiter character.

Replacing the String

Now we can combine line splitting with string replacement. The gsub function in awk works similarly to sed's substitute command, taking a regular expression pattern and a replacement string.

To replace .fliter( with .filter( in our JavaScript file:

$ awk 'BEGIN{RS=";"} {
   gsub("\\.fliter\\(", ".filter(")
   if (NR != 1) {
      printf "%c", RS
   }
   printf "%s", $0
}' < original.js > fixed.js

Note that awk requires different escaping than sed. Because the pattern is passed as a string, each backslash must be doubled, and both the dot and the opening parenthesis must be escaped so that they match literally.
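After the replacement, it is worth confirming the result without opening the file in an editor. The sketch below reruns the awk program from this section with a counter added (gsub returns the number of substitutions it made) and then runs two streaming checks. The sample file and the variable names are hypothetical.

```shell
# Hypothetical one-line sample file containing the typo.
workdir=$(mktemp -d)
printf '%s\n' 'var a=[1,2,3];var b=a.fliter(function(x){return x>1;});console.log(b);' > "$workdir/original.js"

# Accumulate gsub's return value and report the total on stderr.
awk 'BEGIN{RS=";"} {
   fixed += gsub("\\.fliter\\(", ".filter(")
   if (NR != 1) {
      printf "%c", RS
   }
   printf "%s", $0
}
END { print "replacements: " fixed > "/dev/stderr" }' < "$workdir/original.js" > "$workdir/fixed.js"

# fliter and filter have the same length, so the byte counts should match.
before=$(wc -c < "$workdir/original.js")
after=$(wc -c < "$workdir/fixed.js")
echo "bytes before: $before, after: $after"

# Count leftover typos without loading the whole line: split first, then grep.
leftover=$(tr ';' '\n' < "$workdir/fixed.js" | grep -c '\.fliter(' || true)
echo "remaining typos: $leftover"
```

The size comparison also catches an edge case of this approach: when a file ends exactly with the delimiter and has no trailing newline, awk sees no record after it, so the final delimiter is not written back and the output is one byte shorter.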

Comparison

Method	Memory Usage	Complexity	Best For
tr + sed	Low	Medium	Simple character-based delimiters
awk	Low	Low	Complex pattern matching and replacement
sed alone	High	Low	Small files only

Conclusion

Both tr + sed and awk provide memory-efficient methods to replace strings in extremely large one-line files. The key is splitting the content using delimiters, processing smaller segments, and reconstructing the original format. Choose awk for complex patterns or tr + sed for simpler character-based operations.

Satish Kumar

Updated on: 2026-03-17T09:01:38+05:30
