In another article, "compression tools on linux gzip vs bzip2 vs lzma vs compress", I compared those tools. Some are good for speed, while others are good for saving space or for particular types of files, so each one saves either space or CPU time. Nowadays, CPU resources are generally sized to meet peak-hour usage, so plenty of capacity sits idle at other times.

How do we get zipping tools to utilize more CPU resources?

Multiple zipping processes, using xargs

Here is an example; details are in "Control and run multiple processes in bash".

/usr/bin/find /home/backups/archivelogs -type f -not -name "*.bz2" -print0 | xargs -0 -n 1 -P 5 nice bzip2

In the case above, there are usually thousands of small log files that need to be zipped before being archived. By using 5 parallel processes, I easily got the zipping done about 5 times faster.
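The same pattern can be wrapped in a small function that sizes the worker pool from the CPU count. This is only a sketch: the function name `compress_parallel` and its arguments are made up for illustration, and it assumes GNU `find`, `xargs` and `nproc` are available.

```shell
# Hypothetical helper (not from the original article):
#   compress_parallel DIR [JOBS] [TOOL] [EXT]
# Compresses every regular file under DIR that does not already end in .EXT,
# running up to JOBS copies of TOOL at once, each at low CPU priority.
compress_parallel() {
    local dir="$1"
    local jobs="${2:-$(nproc)}"   # default: one worker per CPU
    local tool="${3:-bzip2}"
    local ext="${4:-bz2}"
    # -print0 / -0 keep file names with spaces or newlines intact;
    # -r skips running the tool when find returns nothing.
    find "$dir" -type f -not -name "*.${ext}" -print0 |
        xargs -0 -r -n 1 -P "$jobs" nice "$tool"
}
```

For example, `compress_parallel /home/backups/archivelogs 5` would mirror the command above, while leaving out the second argument uses every CPU.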

Multiple zipping threads

What about a single big file?
There is a multi-threaded zipping tool called pbzip2. It is available in many Linux distributions.
The timings below were taken after the file had been loaded into memory, so disk reads do not skew the results.
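The article does not say how the file was preloaded. One common approach, shown here purely as an assumption, is to read the file once so that later reads are served from the Linux page cache (provided there is enough free memory). The helper name `warm_cache` is hypothetical.

```shell
# Hypothetical helper: read a file once and discard the output,
# so the kernel keeps its contents in the page cache.
warm_cache() {
    cat -- "$1" > /dev/null
}
```

Calling `warm_cache M` before each timed run would put every tool on the same footing.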

bzip2 and time cost, single thread

$ time bzip2 M
real    0m49.041s
user    0m48.709s
sys    0m0.268s

bzip2 -d and time cost

$ time bzip2 -d M.bz2
real    0m16.593s
user    0m16.055s
sys    0m0.524s

pbzip2 and time cost, multiple threads

$ time pbzip2 M
real    0m11.787s
user    1m29.173s
sys    0m1.445s

pbzip2 -d and time cost

$ time pbzip2 -d M.bz2
real    0m3.469s
user    0m25.821s
sys    0m1.021s
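To put the four runs side by side, the ratios can be worked out from the figures above. The snippet below is just a sketch of that arithmetic; the `ratio` function is a made-up helper, and the inputs are the "real" and "user" times converted to seconds.

```shell
# ratio A B: print A / B rounded to two decimals
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Wall-clock ("real") time, bzip2 vs pbzip2
ratio 49.041 11.787   # compression speedup   -> 4.16
ratio 16.593 3.469    # decompression speedup -> 4.78

# CPU ("user") time, pbzip2 vs bzip2 (1m29.173s = 89.173s)
ratio 89.173 48.709   # total CPU spent       -> 1.83

# pbzip2 user time over real time: rough average number of busy cores
ratio 89.173 11.787   # -> 7.57
```

So pbzip2 finished about four times sooner while spending nearly twice the total CPU time, which is the trade this article is after: spend idle CPU to save wall-clock time.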

More examples for pbzip2

Example: pbzip2 -b15vk myfile.tar
Example: pbzip2 -p4 -r -5 myfile.tar second*.txt
Example: tar cf myfile.tar.bz2 --use-compress-prog=pbzip2 dir_to_compress/
Example: pbzip2 -d -m500 myfile.tar.bz2
Example: pbzip2 -dc myfile.tar.bz2 | tar x
Example: pbzip2 -c < myfile.txt > myfile.txt.bz2
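Because pbzip2 writes files that plain bzip2 can read, a script can use pbzip2 where it is installed and fall back to bzip2 elsewhere. A minimal sketch, with `pick_bzip2` as a hypothetical helper name:

```shell
# Hypothetical helper: print the bzip2-compatible compressor to use.
# Prefers pbzip2 when it is on the PATH, otherwise falls back to bzip2.
pick_bzip2() {
    if command -v pbzip2 > /dev/null 2>&1; then
        echo pbzip2
    else
        echo bzip2
    fi
}

compressor=$(pick_bzip2)
echo "using $compressor"
# Example use (not run here):
#   tar cf myfile.tar.bz2 --use-compress-program="$compressor" dir_to_compress/
```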


pigz, a multi-threaded implementation of gzip, does even better on speed, but with a little less space saving.

$ time pigz M
real    0m3.552s
user    0m26.017s
sys    0m0.403s

$ time pigz -d M
real    0m1.224s
user    0m1.890s
sys    0m0.172s

File size comparison between pigz and pbzip2

Original file size
-rw-r--r-- 1 test test 363407360 Dec 19 11:24 M
pbzip2 file size
-rw-r--r-- 1 test test 142629350 Dec 19 11:24 M.bz2
pigz file size
-rw-r--r-- 1 test test 152936211 Dec 19 11:24 M.gz
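The "little less space saving" can be quantified from the byte counts in the listing. The following is only a sketch of the arithmetic, with `pct` as a made-up helper:

```shell
# pct PART WHOLE: print PART as a percentage of WHOLE, two decimals
pct() {
    LC_ALL=C awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.2f\n", 100 * part / whole }'
}

pct 142629350 363407360                  # M.bz2 as % of original -> 39.25
pct 152936211 363407360                  # M.gz  as % of original -> 42.08
pct $((152936211 - 142629350)) 142629350 # how much larger M.gz is than M.bz2 -> 7.23
```

On this file, the pigz output is roughly 7% larger than the pbzip2 output.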



