Welcome to the Linux Foundation Forum!

GZ or BZ2?

Hi. I'd like to adopt a default compression format for my files. It has to be the most linux friendly, have the highest compression rate and security. I immediately thought of tar.gz and tar.bz2, but I don't know their main differences. Which one should I choose?

Comments

  • mfillpot
    mfillpot Posts: 2,177
    A simple comparison review of linux compression tools is at http://wj32.wordpress.com/2008/04/27/comparison-of-compression-programs-on-linux-2625tar/ , you also have xz (http://tukaani.org/xz/) and 7zip which is not installed in most base installations.

    Here are the results of my own test of various compression algorithms.

    File: linux-2.6.36-rc4
    Uncompressed Size: 454M

    tar.gz
    Time: 25.546s
    New Size: 88M
    Compression Percent: 80.6%

    7zip
    Time: 4m 0.327s
    New Size: 61M
    Compression Percent: 86.6%

    xz
    Time: 6m 6.009s
    New Size: 59M
    Compression Percent: 87.0%

    tar.bz2
    Time: 1m 41.552s
    New Size: 52M
    Compression Percent: 88.5%

    From what I have heard, xz is supposed to be much better than bz2 or gz, but it may just be my archive contents that restricted the compression. In the end you will have to create compressed archives with the various algorithms on a realistic sample and decide which best fits your speed, compression and system load needs.
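
    If you want to repeat this kind of test on your own data, here is a rough sketch. It is not the exact set of commands behind the numbers above. It assumes a POSIX shell, tar, and whichever of gzip, bzip2 and xz happen to be installed; the function name bench_compressors and the sample.tar.* file names are made up for illustration.

    ```shell
    # Hypothetical helper: archive a directory with each available compressor
    # and print the elapsed whole seconds and the resulting size in bytes.
    bench_compressors() {
        src_dir=$1    # directory to archive
        out_dir=$2    # existing directory where the test archives are written
        for tool in gzip bzip2 xz; do
            # Skip compressors that are not installed on this system.
            command -v "$tool" >/dev/null 2>&1 || continue
            case $tool in
                gzip)  ext=gz ;;
                bzip2) ext=bz2 ;;
                xz)    ext=xz ;;
            esac
            archive="$out_dir/sample.tar.$ext"
            start=$(date +%s)
            # tar writes the archive to stdout; the compressor reads it from stdin.
            tar -cf - -C "$src_dir" . | "$tool" -c > "$archive"
            end=$(date +%s)
            size=$(wc -c < "$archive" | tr -d ' ')
            echo "$tool $((end - start))s $size bytes"
        done
    }
    ```

    Call it as bench_compressors /path/to/sample /tmp/bench (the second directory must already exist). Timing with date +%s is only accurate to the second, so use a sample large enough to take a few seconds per tool.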
  • Scifer wrote:
    It has to be the most linux friendly, have the highest compression rate and security.
    What exactly do you refer to when you mention security? Do you want to password encrypt these archives, or are you looking for a format that is somewhat more resistant to corruption?

    I know that the rar format can add a defined amount of redundancy to an archive, thus making the archive more robust when it comes to corruption; the catch is that the rar format isn't open. I don't think that GNU tar has the same feature built in, but third-party utilities might provide something similar.

    When it comes to efficiency, all tests I have seen so far indicate that the compression ratio of bz2 is superior to that of its older counterpart gz, but that efficiency boost comes at the price of requiring more CPU time. I don't have anything to contribute regarding other formats (xz, 7z, etc.); see Matthew's excellent answer for some raw numbers on this issue.
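
    On the corruption point: gzip cannot repair a damaged archive, but damage can at least be detected. A minimal sketch, assuming GNU coreutils (sha256sum) and gzip are available; make_checksum and check_archive are invented names for this example, and this only detects damage, it does not add any redundancy.

    ```shell
    # Hypothetical helper: record a checksum next to the archive
    # so that later damage can be detected.
    make_checksum() {
        sha256sum "$1" > "$1.sha256"
    }

    # Hypothetical helper: succeeds (exit status 0) only if the stored
    # checksum still matches and gzip's own integrity test passes.
    check_archive() {
        sha256sum -c "$1.sha256" >/dev/null 2>&1 && gzip -t "$1" 2>/dev/null
    }
    ```

    Run make_checksum on a freshly created .tar.gz, keep the .sha256 file with it, and run check_archive with the same path before relying on the archive.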
  • mfillpot wrote:
    A simple comparison review of linux compression tools is at wj32.wordpress.com/2008/04/27/comparison...ms-on-linux-2625tar/ , you also have xz (tukaani.org/xz/) and 7zip which is not installed in most base installations.
    Your help is greatly appreciated. What I understood from Tel's comment on the wj32 review is that bz2 has a higher compression ratio while gz has a higher decompression speed, which is exactly what I suspected. And considering that both are the most commonly used on Linux, I'm picking them as my default compression formats.
  • jabirali wrote:
    What exactly do you refer to when you mention security? Do you want to password encrypt these archives, or are you looking for a format that is somewhat more resistant to corruption?
    By security I meant encryption, redundancy and recovery features.
  • jabirali wrote:
    the catch is that the rar format isn't open.
    So it's off my list.
  • I use both gz and bz2 for compression. Both are Linux-friendly, so you don't have to worry about that.
    Let's compare the two formats:

    file.tar.gz : generally larger, but it takes less time to decompress
    extraction command: tar -zxvf file.tar.gz

    file.tar.bz2: generally smaller, but takes more time to decompress
    extraction command: tar -jxvf file.tar.bz2

    Now let's assess your needs: If you want to distribute your file online, I recommend using bz2, since the smaller file downloads faster. If you are running on a lower-end machine with a fast internet connection, then by all means use gz. This will allow for faster decompression, but download time will suffer.

    To conclude: Both formats are (basically) usable across Linux systems; you just have to decide which format meets your needs best.
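
    For completeness, the matching commands to create and list an archive differ from the extraction commands only in the first letter of the options. A small sketch, assuming a tar that supports the z and j options (GNU tar does); the roundtrip function and its argument order are made up for this example.

    ```shell
    # Hypothetical helper: pack a directory, list the archive, unpack it elsewhere.
    # $1 = tar compression letter (z for gzip, j for bzip2)
    # $2 = archive file to create
    # $3 = directory to pack
    # $4 = directory to unpack into
    roundtrip() {
        # -c creates the archive; -C changes directory first so that only
        # the last path component is stored inside the archive.
        tar -c"$1"f "$2" -C "$(dirname "$3")" "$(basename "$3")"
        # -t lists the contents without extracting anything.
        tar -t"$1"f "$2"
        # -x extracts; -C selects the target directory.
        mkdir -p "$4"
        tar -x"$1"f "$2" -C "$4"
    }
    ```

    For example, roundtrip z backup.tar.gz ./mydir ./restored uses gzip, and roundtrip j backup.tar.bz2 ./mydir ./restored uses bzip2 (if it is installed).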
