bzip2 and bunzip2

Different compression commands use different compression algorithms. bzip2 uses the Burrows-Wheeler block-sorting text compression algorithm together with Huffman coding. We have already covered the similar commands below:

13 Zip and Unzip command examples in Linux/Unix

7 Linux/Unix gzip and gunzip command examples

In this post, we will see how to use the bzip2 command.

Syntax for bzip2 command in Linux

bzip2 [OPTIONS] [FILENAME]

The bzip2 command compresses each file into a corresponding .bz2 file, by default replacing the original file with original_name.bz2. The compressed files retain the permissions, modification date and, when possible, ownership of the original files.

 

Note: bzip2 generally achieves a better compression ratio than gzip, at the cost of being slower. It is therefore commonly used to compress backup archives such as tar files in order to save disk space.

Example 1: Compress a single file in a directory. As noted above, this replaces the original file.

sujit@sujitkumar:~/default$ ls -l apac*
-rw-r--r-- 1 sujit sujit 637 Mar 27 13:45 apache2
sujit@sujitkumar:~/default$ bzip2 apache2
sujit@sujitkumar:~/default$ ls -l apac*
-rw-r--r-- 1 sujit sujit 407 Mar 27 13:45 apache2.bz2

Example 2: The compressed file is normally named original_name.bz2, but we can give it a custom name by using the ‘-c’ option, which writes the compressed data to standard output, and redirecting that output to a file.

sujit@sujitkumar:~/default$ bzip2 -kvc apache2 > apac.bz2
apache2: 1.565:1, 5.111 bits/byte, 36.11% saved, 637 in, 407 out.
sujit@sujitkumar:~/default$ ls -l apac*
-rw-rw-r-- 1 sujit sujit 407 Mar 27 15:55 apac.bz2
-rw-r--r-- 1 sujit sujit 637 Mar 27 13:45 apache2
-rw-r--r-- 1 sujit sujit 407 Mar 27 13:45 apache2.bz2
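
Because ‘-c’ writes to standard output, it also fits into pipelines and leaves the input file in place even without ‘-k’. The following is a minimal sketch, assuming a throwaway directory and hypothetical file names (sample.txt, custom.bz2, piped.bz2) that are not part of the example above:

```shell
# Work in a scratch directory; all file names here are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 1000 > sample.txt

# -c sends the compressed stream to stdout; sample.txt is left in place.
bzip2 -c sample.txt > custom.bz2

# With no file name, bzip2 compresses stdin to stdout.
seq 1 1000 | bzip2 > piped.bz2

# -dc decompresses to stdout, so the result can be compared directly.
bzip2 -dc custom.bz2 | cmp - sample.txt && echo "round trip ok"
ls
```

This form is handy when the data comes from another command rather than from a file on disk.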

Example 3: To compress a file without deleting the original, use the ‘-k’ (keep) option:

sujit@sujitkumar:~/default$ bzip2 -kv apache2
apache2: 1.565:1, 5.111 bits/byte, 36.11% saved, 637 in, 407 out.
sujit@sujitkumar:~/default$ ls -l apache2*
-rw-r--r-- 1 sujit sujit 637 Mar 27 13:45 apache2
-rw-r--r-- 1 sujit sujit 407 Mar 27 13:45 apache2.bz2

Example 4: By default, decompression does not overwrite existing files. Since apache2 already exists here, decompressing apache2.bz2 fails. To force bzip2 to overwrite the existing apache2 file, use the ‘-f’ option.

sujit@sujitkumar:~/default$ bzip2 -vd apache2.bz2
bzip2: Output file apache2 already exists.
sujit@sujitkumar:~/default$ bzip2 -vdf apache2.bz2
apache2.bz2: done
sujit@sujitkumar:~/default$
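
When overwriting is not wanted, another approach is to decompress to standard output and redirect it to a new name. The sketch below shows both routes side by side; it assumes a scratch directory and hypothetical files notes.txt and notes.restored.txt:

```shell
# Work in a scratch directory; all file names here are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "original contents" > notes.txt
bzip2 -k notes.txt              # keeps notes.txt and creates notes.txt.bz2

# Without -f, bzip2 refuses to overwrite notes.txt and exits non-zero.
if ! bzip2 -d notes.txt.bz2 2>/dev/null; then
    echo "refused: notes.txt already exists"
fi

# Option 1: write the decompressed data elsewhere and keep both files.
bzip2 -dc notes.txt.bz2 > notes.restored.txt

# Option 2: force the overwrite; notes.txt.bz2 is removed afterwards.
bzip2 -df notes.txt.bz2
ls
```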

Example 5: To compress or decompress files whose names start with ‘-’, put ‘--’ before the file name so that it is not parsed as an option.

sujit@sujitkumar:~/default$ bzip2 -v -- -testfile
-testfile: 1.565:1, 5.111 bits/byte, 36.11% saved, 637 in, 407 out.
sujit@sujitkumar:~/default$ ls -l -- -*
-rw-rw-r-- 1 sujit sujit 407 Mar 27 14:36 -testfile.bz2
sujit@sujitkumar:~/default$ bunzip2 -v -- -testfile.bz2
-testfile.bz2: done
sujit@sujitkumar:~/default$ ls -l -- -*
-rw-rw-r-- 1 sujit sujit 637 Mar 27 14:36 -testfile
sujit@sujitkumar:~/default$

Note 1: bunzip2 can be used instead of ‘bzip2 -d’ to decompress files compressed with bzip2.

Note 2: The ‘--’ option can be used with almost all UNIX commands when dealing with files whose names start with a dash ‘-’.
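
A common alternative is to prefix the name with ‘./’, which keeps it from being read as an option. A small sketch, assuming a scratch directory and a hypothetical file named -dashfile:

```shell
# Work in a scratch directory; -dashfile is a hypothetical file name.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "data" > ./-dashfile

# The leading ./ means the argument no longer starts with '-'.
bzip2 ./-dashfile
ls -l ./-dashfile.bz2
bunzip2 ./-dashfile.bz2
ls -l ./-dashfile
```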

Example 6: To test the integrity of a compressed file without actually decompressing it, we use the ‘-t’ option.

sujit@sujitkumar:~/default$ bzip2 -tv apache2.bz2
apache2.bz2: ok
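
In scripts, the exit status of ‘bzip2 -t’ is usually more useful than its message. The sketch below assumes a scratch directory and hypothetical files, and simulates damage by truncating a copy of the archive:

```shell
# Work in a scratch directory; all file names here are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 5000 > data.txt
bzip2 -k data.txt

# Simulate damage by keeping only the first 20 bytes of the archive.
head -c 20 data.txt.bz2 > truncated.bz2

for f in data.txt.bz2 truncated.bz2; do
    # -t exits 0 for an intact file and non-zero otherwise.
    if bzip2 -t "$f" 2>/dev/null; then
        echo "$f: ok"
    else
        echo "$f: FAILED integrity check"
    fi
done
```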

Example 7: We can compress all the files within a directory by using ‘tar’ with the ‘-j’ option, which filters the archive through bzip2.

sujit@sujitkumar:~$ tar cvjf default.tar.bz2 default
default/
default/bootlogd
default/ntfs-3g
default/nss
default/irqbalance
default/whoopsie
default/acpid
default/rsync
default/ssh
default/puppetmaster
default/puppetqd
default/cron
default/rsyslog
default/apache2
default/devpts
default/apport
default/keyboard
default/grub
default/rcS
default/useradd
default/locale
default/ufw
default/-testfile
default/ntpdate
default/halt
default/tomcat7
default/dbus
default/crda
default/console-setup
sujit@sujitkumar:~$ ls
default default.tar.bz2
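
The same ‘-j’ option is used to list or extract such an archive. A minimal sketch, assuming a scratch directory with hypothetical names (demo, restore) rather than the default directory used above:

```shell
# Work in a scratch directory; all names here are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir demo
echo "one" > demo/a.conf
echo "two" > demo/b.conf
tar cjf demo.tar.bz2 demo

# t lists the archive contents without extracting anything.
tar tjf demo.tar.bz2

# x extracts; -C selects the directory to extract into.
mkdir restore
tar xjf demo.tar.bz2 -C restore
ls restore/demo
```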

Example 8: The bzip2 command compresses files in blocks, and each block carries a 32-bit CRC that is used to detect damage when the integrity of the compressed file is checked.

Damaged files, though very unlikely with bzip2, can often be partly recovered using the ‘bzip2recover’ command. bzip2recover searches for blocks in a .bz2 file and writes each block out into its own .bz2 file, with sequence numbers in the file names: “rec00001file.bz2”, “rec00002file.bz2”, etc. You can then use bzip2 -t to test the integrity of the resulting files, and decompress those that are undamaged.

bzip2 -dc rec*file.bz2 > decfile

Keeping the compressed blocks small when compressing huge files therefore limits how much data is lost if the file gets corrupted, because each damaged block can be discarded on its own. The block size is selected with ‘-1’ (100k) through ‘-9’ (900k, the default). The ‘-s’ option reduces memory use and, during compression, selects a 200k block size.
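
The effect of the block size options ‘-1’ to ‘-9’ can be checked on sample data. This is only a sketch, assuming a scratch directory and a generated file big.txt; the sizes it prints depend on the input:

```shell
# Work in a scratch directory; all file names here are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 200000 > big.txt           # roughly 1.3 MB of sample text

# -1 uses 100k blocks, -9 (the default) uses 900k blocks.
bzip2 -1 -c big.txt > small-blocks.bz2
bzip2 -9 -c big.txt > large-blocks.bz2

# The fourth header byte records the block size digit.
head -c 4 small-blocks.bz2; echo
head -c 4 large-blocks.bz2; echo

# Compare the resulting sizes in bytes.
wc -c small-blocks.bz2 large-blocks.bz2

# Both archives decompress to the same data.
bzip2 -dc small-blocks.bz2 | cmp - big.txt && echo "small blocks: ok"
bzip2 -dc large-blocks.bz2 | cmp - big.txt && echo "large blocks: ok"
```

Smaller blocks mean more independent pieces for bzip2recover to salvage, usually in exchange for a somewhat larger archive.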

Example 9: Decompress bzip2-compressed files. Either “bzip2 -d” or bunzip2 can be used to decompress .bz2 files and obtain the original data.

bzip2 -d file.bz2 
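
Two related forms are worth knowing: bzcat prints the decompressed data without touching the archive, and ‘-k’ keeps the .bz2 file after decompression. A short sketch, assuming a scratch directory and a hypothetical file named file:

```shell
# Work in a scratch directory; "file" is a hypothetical file name.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "some text" > file
bzip2 file                       # produces file.bz2 and removes file

bzcat file.bz2                   # print the contents, leave file.bz2 untouched
bunzip2 -k file.bz2              # restore "file" and keep file.bz2 as well
ls
```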

Hands-on practice always sparks new ideas. Happy learning!

 

author_sujit
