Text Processing Commands in Linux

By | 03/12/2012

Linux provides several handy commands for working with text and text files, collectively known as text processing commands. This article discusses three of them: cat, sort and uniq. So boot up your Linux system, open a terminal, and get ready to try out these text processing commands.

(Note: All the examples in this article are expected to work on all flavors of Linux, though they were tested on Ubuntu.)

 


1. Linux cat command

In Linux, the ‘cat’ command gets its name from ‘concatenate’: it joins its input together and writes it to standard output. Combined with the shell's output redirection, it can also create a text file and fill it with text typed on standard input.
Syntax:

$cat > <filename>

It can also append our input text to an existing file.
Syntax:

$cat >> <filename>

The command can even be used to print the content of the text file to standard output.
Syntax:

$cat  <filename>
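The three forms above can also be tried without typing at the terminal. The following is a minimal sketch, assuming a POSIX shell: a here-document stands in for the keyboard input, and the file names other.txt and combined.txt are made up for illustration. The last step joins two files, which is where the name ‘concatenate’ comes from.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create (or overwrite) a file; the here-document replaces typed input.
cat > sampleText.txt <<'EOF'
A sample Text
EOF

# Append a second line to the same file.
cat >> sampleText.txt <<'EOF'
Concatenated Text
EOF

# Print the file to standard output.
cat sampleText.txt

# Join two files into a third one (hypothetical file names).
printf 'Another file.\n' > other.txt
cat sampleText.txt other.txt > combined.txt
cat combined.txt
```

Note that ‘>’ replaces whatever the file contained before, while ‘>>’ keeps the old content and adds to the end.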

These capabilities come from the different ways of invoking the command. Let's look at the usage with the help of some examples.

Create a file from text entered on standard input. Type the following command and press Enter.

$cat >sampleText.txt

The shell now waits for the text that will become the content of the file. Type the text, pressing Enter to start each new line. To signal the end of the input, press ‘Ctrl + D’ at the start of a line, which brings back the command prompt.

Here is the input text we entered:

$cat >sampleText.txt
A sample Text

Now let's check the contents of the file, which can also be done using the ‘cat’ command.

$cat sampleText.txt
A sample Text

So, there is no need to open the file in an editor just to view its contents. The command also supports options, such as ‘-A’.
Syntax:

$cat -A <filename>

With the ‘-A’ option, ‘cat’ shows each tab in the text as ‘^I’ and marks the end of every line with a ‘$’. As an example, if our file contains the text

“	Text with a tab and 3 spaces in the end.   “

Then the output looks like this:

$cat -A tabText.txt
^IText with a tab and 3 spaces in the end.   $
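To reproduce this without an editor, here is a small sketch that builds the file with printf and then inspects it. It assumes GNU cat, as shipped with Ubuntu; other implementations of cat may not accept ‘-A’.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write a leading tab (\t), the sentence, three trailing spaces and a newline.
printf '\tText with a tab and 3 spaces in the end.   \n' > tabText.txt

# -A shows the tab as ^I and puts $ at the end of the line,
# which makes the three trailing spaces visible.
visible=$(cat -A tabText.txt)
printf '%s\n' "$visible"
```

This is handy for spotting stray tabs or trailing whitespace that a plain ‘cat’ would not reveal.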

Let's append some text to the already existing file ‘sampleText.txt’:

$ cat >>sampleText.txt
 Concatenated Text
$ cat sampleText.txt 
A sample Text
 Concatenated Text
$

Another useful ‘cat’ option is ‘-n’, which adds line numbers to the output.
Syntax:

$cat -n <filename>

Example:

$ cat -n sampleText.txt 
     1	A sample Text
     2	 Concatenated Text

Note that the line numbers are added only to the output; the file itself is not modified.
For more interesting options, refer to its man page.
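As a quick check that ‘-n’ leaves the file alone, the sketch below numbers a file, saves the numbered view under a separate (hypothetical) name, numbered.txt, and then prints the original again.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'A sample Text\nConcatenated Text\n' > sampleText.txt

# The numbered view goes to standard output only.
cat -n sampleText.txt

# If a numbered copy is wanted, redirect it to a different file.
cat -n sampleText.txt > numbered.txt

# The original still has no line numbers.
cat sampleText.txt
```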

 

2. Linux sort command

Linux also offers the ‘sort’ command, which sorts the lines of a file. Given an existing file, it prints the lines of that file to standard output in sorted order.
Syntax:

$sort <filename>

Like ‘cat’, ‘sort’ can be combined with redirection to create a new file. In that case, the lines typed on standard input are sorted before they are written to the newly created file.
Syntax:

$sort > <filename>

Let's see this behavior through examples.

First, the case of an existing file:

$ cat > file
4
7
2
$ sort file 
2
4
7

To create a new file with sorted lines:

$ sort > sfile
t
a
r
$ cat sfile 
a
r
t

Let’s see what happens with longer strings:

$ sort > strfile
program
pragma
aroma
$ cat strfile 
aroma
pragma
program

It works great!
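The same results can be had without typing the lines by hand. The sketch below pipes the input into ‘sort’ and also shows one way to sort a file in place. The in-place step uses the ‘-o’ option, which the examples above do not cover; writing ‘sort file > file’ instead would empty the file before ‘sort’ gets to read it.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Feed the lines through a pipe instead of typing them.
printf 'program\npragma\naroma\n' | sort > strfile
cat strfile

# Sort an existing file in place: -o names the output file,
# and it is allowed to be the same as the input file.
printf '4\n7\n2\n' > file
sort -o file file
cat file
```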

There is also the ‘-n’ option, which sorts lines by the numeric value at the start of each line instead of comparing them character by character. For example:

$ cat > list2
3 ghost 67
12 hello 34
6 stret 45

$ sort -n list2
3 ghost 67
6 stret 45
12 hello 34
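To see why ‘-n’ matters here, the following sketch sorts the same three lines with and without it. Setting LC_ALL=C is an addition of this sketch; it pins the character ordering so the result does not depend on the system locale.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '3 ghost 67\n12 hello 34\n6 stret 45\n' > list2

# Plain sort compares characters, so "12" lands before "3".
text_order=$(LC_ALL=C sort list2)
printf '%s\n\n' "$text_order"

# -n compares the leading number of each line as a number.
numeric_order=$(LC_ALL=C sort -n list2)
printf '%s\n' "$numeric_order"
```

Without ‘-n’ the line starting with 12 comes first, because the character ‘1’ sorts before ‘3’ and ‘6’.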

This option is really useful when we want a list sorted by weights, sizes and so on. For instance, here is how we can list the running processes sorted by their process IDs.

$ ps
  PID TTY          TIME CMD
 2081 pts/0    00:00:01 bash
 2627 pts/0    00:00:00 ps
$ ps | sort -n
  PID TTY          TIME CMD
 2081 pts/0    00:00:01 bash
 2628 pts/0    00:00:00 ps
 2629 pts/0    00:00:00 sort
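In the output above, the header line stays on top only because ‘sort -n’ treats the non-numeric text ‘PID’ as zero. With a different ordering, such as a reverse sort, the header would move. One possible way around this, sketched below on a copy of the sample output rather than a live ‘ps’ call, is to read the header separately and sort only the remaining lines.

```shell
# Sample ps output copied from the example, so the result is repeatable.
ps_output='  PID TTY          TIME CMD
 2081 pts/0    00:00:01 bash
 2628 pts/0    00:00:00 ps
 2629 pts/0    00:00:00 sort'

# Read the header line first, print it untouched, then let sort handle
# only the rest of the lines (here in descending PID order).
sorted=$(printf '%s\n' "$ps_output" | {
    IFS= read -r header
    printf '%s\n' "$header"
    sort -nr
})
printf '%s\n' "$sorted"
```

On a real system, the printf at the start of the pipeline would be replaced by ‘ps’ itself.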

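We can also specify the column by which to sort the lines using the ‘-k’ option. Here the listing is sorted by its fifth column, the file size (the ‘total’ line has no fifth column, so it sorts as zero):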

$ ls -lt | sort -n -k 5
-rw-r--r-- 1 rupali rupali  0 2012-12-03 00:09 1sampleFile.txt
total 44
-rw-r--r-- 1 rupali rupali  6 2012-12-03 01:03 file
-rw-r--r-- 1 rupali rupali  6 2012-12-03 01:04 file2
-rw-r--r-- 1 rupali rupali  6 2012-12-03 01:10 sfile
-rw-r--r-- 1 rupali rupali 15 2012-12-03 01:12 strfile2
-rw-r--r-- 1 rupali rupali 21 2012-12-03 01:13 strfile
-rw-r--r-- 1 rupali rupali 21 2012-12-03 01:28 r
-rw-r--r-- 1 rupali rupali 22 2012-12-03 00:58 BlankLine.txt
-rw-r--r-- 1 rupali rupali 26 2012-12-02 23:52 mlbtext.txt
-rw-r--r-- 1 rupali rupali 32 2012-12-03 01:18 list
-rw-r--r-- 1 rupali rupali 33 2012-12-03 00:49 sampleText.txt
-rw-r--r-- 1 rupali rupali 34 2012-12-03 01:19 list2
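A smaller, repeatable version of this can be sketched with three of the lines above saved into a file. The file name listing.txt is made up for the sketch, and the key is written as ‘-k 5,5n’, which limits the comparison to the fifth field alone and makes it numeric for that key.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Three lines in "ls -l" layout, taken from the listing in the article.
cat > listing.txt <<'EOF'
-rw-r--r-- 1 rupali rupali 34 2012-12-03 01:19 list2
-rw-r--r-- 1 rupali rupali  6 2012-12-03 01:03 file
-rw-r--r-- 1 rupali rupali 21 2012-12-03 01:13 strfile
EOF

# Sort by field 5 (the size) only, smallest first.
by_size=$(sort -k 5,5n listing.txt)
printf '%s\n' "$by_size"

# Adding r to the key reverses it: largest file first.
sort -k 5,5nr listing.txt
```

A plain ‘-k 5’ key runs from the fifth field to the end of the line. With a numeric comparison that makes no difference to which number is read, but stating the end field keeps the intent clear.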

Want to sort the list in reverse order? That can be done with the ‘-r’ option:

$ sort -nr list 
87 hello
66 linux
45 abc
23 rcb
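The original contents of ‘list’ are not shown, so the sketch below assumes a file holding the same four lines in an arbitrary unsorted order. It then compares the ascending and descending numeric sorts.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed contents of "list": the four lines from the output, shuffled.
printf '45 abc\n87 hello\n23 rcb\n66 linux\n' > list

# -n alone sorts from the smallest number to the largest.
ascending=$(sort -n list)
printf '%s\n\n' "$ascending"

# Adding -r flips the order, so the largest number comes first.
descending=$(sort -nr list)
printf '%s\n' "$descending"
```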

The ‘sort’ command provides many more useful options, which are described in its man page.

 

3. Linux uniq command

Given sorted input or a sorted file, ‘uniq’ removes the duplicate lines and writes the result to standard output.
Syntax:

$uniq <filename>

An Example:

$ cat sfile 
a
r
t
x
x
z
z

$ uniq sfile 
a
r
t
x
z

Note that ‘uniq’ only removes duplicate lines that are next to each other, so it does not give the expected result on unsorted input. If the input is not sorted, sort it on the fly before passing it to ‘uniq’.

$ cat > ufile
3
6
7
1
6
3
3
$ uniq ufile
3
6
7
1
6
3
$ sort ufile | uniq
1
3
6
7
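For this particular pipeline there is also a shortcut: ‘sort’ has a ‘-u’ option that drops duplicate lines while sorting. The sketch below, which recreates the ufile from the example, checks that both approaches give the same lines.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '3\n6\n7\n1\n6\n3\n3\n' > ufile

# Two steps: sort first, then let uniq drop the adjacent duplicates.
via_pipe=$(sort ufile | uniq)

# One step: sort -u removes duplicates as part of sorting.
via_flag=$(sort -u ufile)
printf '%s\n' "$via_flag"

# Confirm that both approaches agree.
[ "$via_pipe" = "$via_flag" ] && echo "same result"
```

The two-step form is still needed when other ‘uniq’ options, such as counting, are wanted.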

A really interesting option is ‘-c’, which prefixes each output line with the number of times that line occurred consecutively in the input:

$ cat ufile 
3
6
7
1
6
3
3
$ uniq -c ufile 
      1 3
      1 6
      1 7
      1 1
      1 6
      2 3
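Because the file is unsorted, the counts above describe runs of adjacent lines rather than totals. To count how often each value occurs overall, one common pattern is to sort first, count, and then sort the counts. A minimal sketch using the same ufile contents:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '3\n6\n7\n1\n6\n3\n3\n' > ufile

# sort groups equal lines together, uniq -c counts each group,
# and the final sort -nr puts the most frequent value first.
counts=$(sort ufile | uniq -c | sort -nr)
printf '%s\n' "$counts"
```

Here the value 3 tops the list with a count of 3, followed by 6 with a count of 2.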

Conclusion

These were some interesting commands for working with text and text files in Linux. As we have seen, they are handy and can help a lot in otherwise cumbersome scenarios. The main objective of this article is to let people know that such commands exist in Linux; simply knowing about them can make day-to-day work much easier. I highly recommend having a look at their man pages to learn about all the options these commands offer.


About Rupali

Rupali Sharma holds an Honours degree in BE. She holds several years of experience in Linux software development and has worked on several programming languages like C/C++, Assembly etc. She contributes as author and content Editor at MyLinuxBook.
