Text Processing and Stream Editing

Lesson 18/49 | Study Time: 20 Min

Text processing and stream editing are core functionalities in Linux that empower users to manipulate text data efficiently. These operations are critical for tasks such as filtering logs, transforming data formats, automating edits, and extracting information.

Linux offers powerful tools like sed (Stream Editor) and awk (pattern scanning and processing language) that work on streams or files, enabling sophisticated text transformations and reporting, all from the command line. 

What is Stream Editing?

Stream editing involves processing text sequentially from input (like files or pipelines), applying specified transformations or filters, and outputting the result without requiring manual intervention.

sed is a non-interactive stream editor, well suited to substitution, deletion, insertion, and selective printing across large data sets.

Sed - The Stream Editor

Text manipulation in Linux can be handled efficiently with the sed stream editor. The following are essential sed commands for editing and filtering text.


Basic Usage:

text
sed [options] 'command' file


Common Commands:


1. Substitution:

text
sed 's/old-text/new-text/g' filename

Replaces every occurrence of old-text with new-text on each line. Without the g flag, only the first occurrence on each line is replaced.
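To make the effect of the g flag concrete, here is a minimal sketch that runs the same substitution with and without it. The sample input string and the variable names are made up for illustration.

```shell
# Hypothetical sample input with two matches on one line.
input='old-text and old-text again'

# Without the g flag, only the first match on each line is replaced.
first_only=$(printf '%s\n' "$input" | sed 's/old-text/new-text/')

# With the g flag, every match on the line is replaced.
all_matches=$(printf '%s\n' "$input" | sed 's/old-text/new-text/g')

printf '%s\n' "$first_only"   # new-text and old-text again
printf '%s\n' "$all_matches"  # new-text and new-text again
```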


2. Deletion:

text
sed '3d' filename

Deletes the third line.


3. Printing Specific Lines:

text
sed -n '1,5p' filename

Prints lines 1 through 5.
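The deletion and printing commands can be tried on a small, made-up input, as in the sketch below. The -n option matters for printing: it suppresses sed's default output so that only the lines selected by p appear.

```shell
# Hypothetical six-line input, one word per line.
lines=$(printf '%s\n' one two three four five six)

# 3d deletes the third line and passes everything else through.
without_third=$(printf '%s\n' "$lines" | sed '3d')

# -n turns off automatic printing; 1,2p prints only lines 1 through 2.
first_two=$(printf '%s\n' "$lines" | sed -n '1,2p')

printf 'after 3d:\n%s\n' "$without_third"
printf 'after 1,2p:\n%s\n' "$first_two"
```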


4. In-place Editing:

text
sed -i 's/foo/bar/g' filename

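Replaces text in the file directly instead of writing to standard output. No backup is kept unless you give a suffix, for example -i.bak, which saves the original as filename.bak. On BSD and macOS sed, -i requires a suffix argument, so the bare -i form shown here is specific to GNU sed.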
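Because in-place editing overwrites the file, it is worth testing on a scratch copy first. The sketch below uses a temporary directory and a hypothetical sample.txt, and attaches the backup suffix directly to -i, a form that both GNU and BSD sed should accept.

```shell
# Work in a throwaway directory so no real file is touched.
workdir=$(mktemp -d)
printf 'foo one\nfoo two\n' > "$workdir/sample.txt"

# -i.bak edits sample.txt in place and keeps the original as sample.txt.bak.
sed -i.bak 's/foo/bar/g' "$workdir/sample.txt"

edited=$(cat "$workdir/sample.txt")
backup=$(cat "$workdir/sample.txt.bak")
printf 'edited:\n%s\nbackup:\n%s\n' "$edited" "$backup"

# Clean up the scratch directory.
rm -r "$workdir"
```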


Appending and Inserting Lines:


1. Append after a match:

text
sed '/pattern/a New line text' filename


2. Insert before a match:

text
sed '/pattern/i New line text' filename


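The one-line a and i forms shown above are a GNU sed extension; POSIX sed expects a\ or i\ followed by the new text on the next line.

Sed reads its input line by line, applies the commands to each line in turn, and writes the result to standard output. The input file itself is left unchanged unless -i is used.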
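As a sketch of appending and inserting, the example below uses the a\ and i\ forms with the text on the following line, which should work on both GNU and BSD sed. The configuration lines and the added settings are invented for illustration.

```shell
# Hypothetical three-line configuration fragment.
config=$(printf '%s\n' '[server]' 'port=80' '[client]')

# Append a line after every line that matches /port=/.
appended=$(printf '%s\n' "$config" | sed '/port=/a\
timeout=30')

# Insert a line before every line that matches a literal [client].
inserted=$(printf '%s\n' "$config" | sed '/\[client\]/i\
verbose=1')

printf 'appended:\n%s\n' "$appended"
printf 'inserted:\n%s\n' "$inserted"
```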

Awk - The Pattern Scanning and Processing Language

Awk processes input data record by record and field by field. Below is a list of practical awk commands for text analysis and reporting.


Usage:

text
awk 'pattern {action}' filename


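Awk runs the action on every record (by default, a line) that matches the pattern; with no pattern, the action runs on every record. Each record is split into fields, on whitespace by default, which are referenced as $1, $2, and so on. $0 is the whole record and NF is the number of fields in it.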
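A short sketch of field handling, using made-up records: it shows $1, NF, and $NF (the last field), plus the -F option for a separator other than whitespace.

```shell
# Hypothetical whitespace-separated record.
record='alice 42 admin'

first_field=$(printf '%s\n' "$record" | awk '{print $1}')   # alice
field_count=$(printf '%s\n' "$record" | awk '{print NF}')   # 3
last_field=$(printf '%s\n' "$record" | awk '{print $NF}')   # admin

# -F: splits on colons instead of whitespace, as in /etc/passwd-style lines.
login_shell=$(printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' | awk -F: '{print $NF}')

printf '%s %s %s %s\n' "$first_field" "$field_count" "$last_field" "$login_shell"
```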


Common Actions:


1. Print specific fields:

text
awk '{print $1, $3}' filename

Prints the first and third columns of each line, separated by a space.


2. Pattern matching with conditional actions:

text
awk '/error/ {print $0}' filename

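Prints every line that contains "error". Because printing the whole record is awk's default action, awk '/error/' filename does the same thing.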


3. Summation and aggregation:

text
awk '{sum += $3} END {print sum}' filename

Adds up the values in the third column across all records. The END block runs once, after the last record, and prints the total.

Awk also supports variables, arithmetic, built-in functions, and control flow, which makes it a powerful tool for data extraction and reporting.
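The sketch below combines several of these features in one program: a numeric condition as the pattern, an associative array for per-key totals, a counter variable, and printf in the END block. The sales data and its column layout (region, product, quantity) are assumptions made for the example.

```shell
# Hypothetical input: region, product, quantity.
sales=$(printf '%s\n' 'east widget 10' 'west widget 5' 'east gadget 7' 'west gadget 0')

# Skip rows whose quantity is zero, total the rest per region, count kept rows.
summary=$(printf '%s\n' "$sales" | awk '
    $3 > 0 { total[$1] += $3; count++ }
    END {
        printf "east=%d west=%d rows=%d\n", total["east"], total["west"], count
    }')

printf '%s\n' "$summary"   # east=17 west=5 rows=3
```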

Comparing Sed and Awk

Common Text Processing Workflow


1. Use grep to filter input lines.

2. Pipe output to sed for substitution or text cleanup.

3. Use awk to extract fields, perform calculations, or generate reports.

4. Output results or redirect to files.


Example:

bash
grep 'ERROR' logfile | sed 's/ERROR:/Error:/' | awk '{print $1, $5}'
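To see the four steps end to end, the following sketch runs the same pipeline against a small, invented log file and redirects the result to a report file. The log format (date, time, level, two-word message) is an assumption chosen so that $1 and $5 exist on every matching line.

```shell
# Build a hypothetical log file in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/logfile" <<'EOF'
2024-01-01 12:00:00 ERROR: disk full
2024-01-01 12:00:05 INFO: backup started
2024-01-02 08:30:00 ERROR: network down
EOF

# 1. grep keeps only ERROR lines. 2. sed normalizes the label.
# 3. awk prints the date and the fifth field. 4. Output goes to a file.
grep 'ERROR' "$workdir/logfile" \
    | sed 's/ERROR:/Error:/' \
    | awk '{print $1, $5}' > "$workdir/report.txt"

report_text=$(cat "$workdir/report.txt")
printf '%s\n' "$report_text"

# Clean up the scratch directory.
rm -r "$workdir"
```

With this input, the report contains the date and the last message word of each error line. The sed step changes the third field, which awk does not print here, so it would only show up if $3 were added to the print statement.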