Linux provides a number of useful commands for working with text and text files, known as text processing commands. This article discusses three of them: cat, sort and uniq. So boot up your Linux system, open a terminal, and get ready to try them out.
(Note: All the examples in this article are expected to work on all flavors of Linux, though they were tested on Ubuntu.)
Text processing commands in Linux
1. Linux cat command
In Linux, the ‘cat’ command gets its name from ‘concatenate’: it concatenates text and displays it. Among its capabilities, it can create a text file and write the input text to it. The input is read from standard input.
$cat > <filename>
It can also append our input text to an existing file.
$cat >> <filename>
The command can even be used to print the content of the text file to standard output.
The different capabilities of ‘cat’ come from the different ways of invoking it. Let's check out the usage with the help of some examples.
Creating a file from text typed on standard input: type ‘cat > sampleText.txt’ and press Enter.
The cursor moves to an empty line and ‘cat’ waits for the text that will become the content of the file. Type the text, pressing Enter to start each new line. To signal the end of the input, press ‘Ctrl + D’ on an empty line, which brings back the command prompt.
Here is the input text we entered:
$ cat > sampleText.txt
A sample text.
Now let's check the contents of the file, which can also be done with the ‘cat’ command.
$ cat sampleText.txt
A sample text.
So there is no need to open the file in an editor just to view its contents. ‘cat’ also supports the ‘-A’ option:
$cat -A <filename>
With the ‘-A’ option, ‘cat’ makes otherwise invisible characters visible: a tab is shown as ‘^I’ and the end of every line is marked with a ‘$’. As an example, suppose our file tabText.txt contains one line that starts with a tab and ends with three spaces:
“ Text with a tab and 3 spaces in the end.   ”
Then ‘cat -A’ shows:
$ cat -A tabText.txt
^IText with a tab and 3 spaces in the end.   $
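The same check can be reproduced without creating a file first, by piping a line into ‘cat’. This is only a minimal sketch: it assumes GNU cat as shipped with Ubuntu (other cat implementations may not accept ‘-A’), and the variable name ‘visible’ is just for illustration.

```shell
# Build a line that starts with a tab and ends with three spaces,
# then let 'cat -A' make the invisible characters visible.
# Assumption: GNU coreutils cat; '-A' is not guaranteed elsewhere.
visible=$(printf '\tText with a tab and 3 spaces in the end.   \n' | cat -A)

# The tab shows up as ^I and the end of the line as $.
printf '%s\n' "$visible"
```

Because the ‘$’ comes after the three spaces, trailing whitespace that is invisible in a normal listing becomes easy to spot.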
Let's append text to the already existing file ‘sampleText.txt’:
$ cat >> sampleText.txt
Concatenated Text
$ cat sampleText.txt
A sample text.
Concatenated Text
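Since the name comes from ‘concatenate’, it is worth seeing ‘cat’ join two files as well. The sketch below is non-interactive: it feeds ‘cat’ through here-documents instead of typing at the terminal. The file names part1.txt, part2.txt and joined.txt and the temporary directory are made up for illustration.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)

# Create a file, then append to it, with the same redirections as above.
cat > "$workdir/part1.txt" <<'EOF'
A sample text.
EOF
cat >> "$workdir/part1.txt" <<'EOF'
Concatenated Text
EOF

# A second file to join with the first one.
cat > "$workdir/part2.txt" <<'EOF'
Text from another file
EOF

# cat writes its arguments one after another; '>' saves the result.
cat "$workdir/part1.txt" "$workdir/part2.txt" > "$workdir/joined.txt"
cat "$workdir/joined.txt"
```

The last command prints three lines: the two lines of part1.txt followed by the single line of part2.txt.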
Another useful ‘cat’ option is ‘-n’, which adds line numbers to the output:
$cat -n <filename>
$ cat -n sampleText.txt
     1  A sample text.
     2  Concatenated Text
Note that the line numbers are added only to the output; the file itself is not modified.
For more interesting options, refer to its man page.
2. Linux sort command
Linux also offers a ‘sort’ command, which sorts the lines of its input. If the file already exists, ‘sort’ displays its contents on standard output in sorted order; the file itself is not changed.
With output redirection, ‘sort’ can also create a new file the same way ‘cat’ does, except that the lines typed in are sorted before they are written to the newly created file.
$sort > <filename>
Let's see the behavior through examples.
First of all, sorting an existing file:
$ cat > file
4
7
2
$ sort file
2
4
7
To create a new file with sorted elements:
$ sort > sfile
t
a
r
$ cat sfile
a
r
t
Let's see what happens with whole words instead of single characters:
$ sort > strfile
program
pragma
aroma
$ cat strfile
aroma
pragma
program
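A related case is sorting a file that already exists and saving the result under the same name. Redirecting with ‘sort strfile > strfile’ does not work, because the shell empties the output file before ‘sort’ reads it. A minimal sketch using the ‘-o’ option of ‘sort’ instead (the temporary directory is an assumption made to keep the example self-contained):

```shell
# Recreate the unsorted word list in a throwaway directory.
workdir=$(mktemp -d)
printf 'program\npragma\naroma\n' > "$workdir/strfile"

# -o names the output file; sort reads all of its input before
# writing, so using the same file for input and output is safe.
sort -o "$workdir/strfile" "$workdir/strfile"

cat "$workdir/strfile"
```

After this, the file holds the words in the order aroma, pragma, program.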
It works great!
There is also the ‘-n’ option, which sorts the lines by the numeric value at the start of each line instead of comparing them character by character.
As an example,
$ cat > list2
3 ghost 67
12 hello 34
6 stret 45
$ sort -n list2
3 ghost 67
6 stret 45
12 hello 34
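To see why ‘-n’ matters, it helps to sort the same numbers with and without it. The following is a small sketch; the sample numbers and the variable names ‘plain’ and ‘numeric’ are chosen only for illustration, and LC_ALL=C is set so that the plain ordering does not depend on the locale.

```shell
# Plain sort compares character by character, so "10" and "100"
# end up before "9"; -n compares the numeric values instead.
plain=$(printf '9\n100\n10\n' | LC_ALL=C sort)
numeric=$(printf '9\n100\n10\n' | LC_ALL=C sort -n)

printf 'plain sort:\n%s\n' "$plain"      # 10, 100, 9
printf 'numeric sort:\n%s\n' "$numeric"  # 9, 10, 100
```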
This command is really useful in scenarios where we want a list sorted according to some weights, sizes, etc. To highlight this, here is how we can list the running processes sorted by their process IDs.
$ ps
  PID TTY          TIME CMD
 2081 pts/0    00:00:01 bash
 2627 pts/0    00:00:00 ps
$ ps | sort -n
  PID TTY          TIME CMD
 2081 pts/0    00:00:01 bash
 2628 pts/0    00:00:00 ps
 2629 pts/0    00:00:00 sort
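In the output above, the header line stays on top only because ‘sort -n’ treats a line that does not start with a number as zero; with ‘-r’ it would sink to the bottom. One possible way to keep the header in place regardless of the sort options is to read it separately. This is just a sketch: it uses canned, ps-like sample text (hypothetical values) instead of a live ‘ps’ so that the result is reproducible.

```shell
# Canned ps-like sample with the data rows deliberately out of order.
sample='PID TTY TIME CMD
2627 pts/0 00:00:00 ps
2081 pts/0 00:00:01 bash'

# Read the header line first and print it unchanged; sort then
# only sees the remaining lines of the pipe.
sorted=$(printf '%s\n' "$sample" | {
  IFS= read -r header
  printf '%s\n' "$header"
  sort -n
})

printf '%s\n' "$sorted"
```

With a live process list, the same braces can follow ‘ps |’ in place of the printf.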
We can also specify the column to sort by using the ‘-k’ option. Here the listing is sorted numerically on the fifth column, the file size:
$ ls -lt | sort -n -k 5
-rw-r--r-- 1 rupali rupali  0 2012-12-03 00:09 1sampleFile.txt
total 44
-rw-r--r-- 1 rupali rupali  6 2012-12-03 01:03 file
-rw-r--r-- 1 rupali rupali  6 2012-12-03 01:04 file2
-rw-r--r-- 1 rupali rupali  6 2012-12-03 01:10 sfile
-rw-r--r-- 1 rupali rupali 15 2012-12-03 01:12 strfile2
-rw-r--r-- 1 rupali rupali 21 2012-12-03 01:13 strfile
-rw-r--r-- 1 rupali rupali 21 2012-12-03 01:28 r
-rw-r--r-- 1 rupali rupali 22 2012-12-03 00:58 BlankLine.txt
-rw-r--r-- 1 rupali rupali 26 2012-12-02 23:52 mlbtext.txt
-rw-r--r-- 1 rupali rupali 32 2012-12-03 01:18 list
-rw-r--r-- 1 rupali rupali 33 2012-12-03 00:49 sampleText.txt
-rw-r--r-- 1 rupali rupali 34 2012-12-03 01:19 list2
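The ‘-k’ option can be tried on a smaller input as well. In the sketch below, the name/size pairs are hypothetical sample data. Note that ‘-k 2’ alone means "from field 2 to the end of the line", so the example writes ‘-k 2,2’ to limit the key to the second field only.

```shell
# Hypothetical name/size pairs, one per line, separated by a space.
sizes='list 32
file 6
strfile 21'

# -k 2,2 restricts the sort key to the second field;
# -n compares that field as a number.
by_size=$(printf '%s\n' "$sizes" | sort -n -k 2,2)

printf '%s\n' "$by_size"
```

The lines come out ordered by size: file 6, strfile 21, list 32.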
To sort the list in reverse order, use the ‘-r’ option:
$ sort -nr list
87 hello
66 linux
45 abc
23 rcb
The ‘sort’ command provides many more useful options, which are described in its man page.
3. Linux uniq command
Given sorted input or a sorted file, ‘uniq’ removes the duplicate lines and writes the result to standard output.
$ cat sfile
a
r
t
x
x
z
z
$ uniq sfile
a
r
t
x
z
Note that ‘uniq’ only removes duplicate lines that are next to each other, so it does not give the expected result on unsorted input. If the input is not sorted, sort it on the fly before passing it to ‘uniq’.
$ cat > ufile
3
6
7
1
6
3
3
$ uniq ufile
3
6
7
1
6
3
$ sort ufile | uniq
1
3
6
7
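When the only goal is a sorted list without duplicates, ‘sort’ can do both steps itself with its ‘-u’ option. A short sketch comparing the two approaches on the same numbers as in ufile (the variable names are only for illustration):

```shell
# The same numbers as in ufile, fed through a pipe.
input='3
6
7
1
6
3
3'

piped=$(printf '%s\n' "$input" | sort | uniq)  # two commands
short=$(printf '%s\n' "$input" | sort -u)      # one command

printf '%s\n' "$short"
[ "$piped" = "$short" ] && echo "both approaches give the same result"
```

The pipe through ‘uniq’ is still needed when options of ‘uniq’ itself, such as ‘-c’, are wanted.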
A really interesting option is ‘-c’, which prefixes each output line with the number of times that line occurred in a row:
$ cat ufile
3
6
7
1
6
3
3
$ uniq -c ufile
      1 3
      1 6
      1 7
      1 1
      1 6
      2 3
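Because ‘uniq -c’ counts only consecutive repeats, getting the total count of each value requires sorting first. A common way to combine the commands from this article is sketched below, again with the numbers from ufile; the final ‘sort -nr’ puts the most frequent value on top. The variable names are made up for this example.

```shell
# The same numbers as in ufile.
input='3
6
7
1
6
3
3'

# sort groups equal lines together, uniq -c counts each group,
# and sort -nr orders the groups by count, largest first.
counts=$(printf '%s\n' "$input" | sort | uniq -c | sort -nr)

printf '%s\n' "$counts"
```

The first line of the output reports that the value 3 occurs three times; the exact amount of padding in front of the counts depends on the ‘uniq’ implementation.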
These were a few interesting commands for working with text and text files in Linux. As one can see, they are really handy in otherwise cumbersome scenarios. The main objective of this article is to let people know that such commands exist in Linux; just knowing about them makes day-to-day work much easier. I highly recommend having a look at their man pages to learn about all the options of these commands.