Product: NetBackup System Administrator's Help  

How Much Compression Can You Expect?

The degree to which a file can be compressed depends on the type of data it contains. A backup usually involves more than one type of data, such as stripped and unstripped binaries, ASCII text, and repeating nonunique strings. The more of the data that is favorable to compression, the more compression you obtain.


Note    When compression is not used, it is normal for the server to receive slightly more data than is on the client (on UNIX, as reported by du or df). The difference is due to client disk fragmentation and the file headers that the client adds.

Compression Specifications 

Types of data that compress well:

Programs, ASCII files, and unstripped binaries (typically compressed to about 40% of the original size).

Best-case compression:

Files composed of repeating, nonunique strings can sometimes be compressed to 1% of their original size.

Types of data that do not compress well:

Stripped binaries (usually compressed to only about 60% of the original size).

Worst-case compression:

Files that are already compressed become slightly larger if compressed again. On UNIX clients, if files of this type exist and share a unique file extension, exclude them (and any others with the same extension) from compression by adding the extension in the NetBackup host properties UNIX Client > Client Settings dialog.

The UNIX Client host property that excludes files from compression corresponds to adding a COMPRESS_SUFFIX = .suffix option to the bp.conf file.
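
The best-case and worst-case behavior described above can be observed with any general-purpose compressor. The following is a minimal sketch that uses Python's standard zlib module as a stand-in; it is an assumption for illustration only and is not the algorithm NetBackup uses, so the exact percentages will differ. Random bytes are used to approximate a file that is already compressed, and the helper name compressed_fraction is hypothetical.

```python
import os
import zlib


def compressed_fraction(data: bytes) -> float:
    """Return the compressed size as a fraction of the original size."""
    return len(zlib.compress(data)) / len(data)


# Best case: one short string repeated many times.
repetitive = b"backup " * 100_000

# Worst-case stand-in: random bytes behave like already-compressed data.
incompressible = os.urandom(700_000)

best = compressed_fraction(repetitive)
worst = compressed_fraction(incompressible)

print(f"repetitive data:     {best:.2%} of original size")
print(f"incompressible data: {worst:.2%} of original size")
```

With zlib, the repetitive input shrinks to well under 1% of its original size, while the random input comes out slightly larger than it went in. That growth is why excluding already-compressed files by suffix is worthwhile.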

Effect of file size:

File size has no effect on the amount of compression. It takes longer, however, to compress many small files than a single large one.

Client resources required:

Compression requires client CPU time and as much memory as the administrator configures.

Effect on client speed:

Compression uses as much CPU as is available and affects other applications that need the CPU. On fast CPUs, however, I/O rather than CPU speed is the limiting factor.

Effect on total backup time:

On the same set of data, backups can take three or more times as long with compression.

Files that are not compressed:

NetBackup does not compress:

Files that are equal to or less than 512 bytes, because that is the tar block size.

On UNIX clients, files whose names end with a suffix specified with the COMPRESS_SUFFIX = .suffix option in the bp.conf file.

On UNIX clients, files with any of the following suffixes:

.arc or .ARC

.arj or .ARJ

.au or .AU

.cpt or .CPT

.cpt.bin or .CPT.BIN

.F

.F3B

.gif or .GIF

.gz or .GZ

.hqx or .HQX

.hqx.bin or .HQX.BIN

.jpeg or .JPEG

.jpg or .JPG

.lha or .LHA

.lzh

.pak or .PAK

.iff or .IFF

.pit or .PIT

.pit.bin or .PIT.BIN

.scf or .SCF

.sea or .SEA

.sea.bin or .SEA.BIN

.sit or .SIT

.sit.bin or .SIT.BIN

.tiff or .TIFF

.Y

.zip or .ZIP

.zom or .ZOM

.zoo or .ZOO

.z or .Z
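
The exclusion rules above can be sketched as a small check. This is an illustrative model only, not NetBackup code: the function names parse_compress_suffixes and is_skipped are hypothetical, the built-in list is a short subset of the suffixes listed above, and case-sensitive matching and support for more than one COMPRESS_SUFFIX line are assumptions.

```python
TAR_BLOCK_SIZE = 512  # files this size or smaller are not compressed

# Illustrative subset of the built-in suffix list (not the complete list).
BUILTIN_SUFFIXES = (".zip", ".ZIP", ".jpg", ".JPG", ".gif", ".GIF",
                    ".z", ".Z", ".lzh", ".F", ".Y")


def parse_compress_suffixes(bp_conf_text):
    """Collect suffixes from COMPRESS_SUFFIX lines in bp.conf-style text."""
    suffixes = []
    for line in bp_conf_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "COMPRESS_SUFFIX" and value.strip():
            suffixes.append(value.strip())
    return suffixes


def is_skipped(name, size, extra_suffixes=()):
    """Return True if the rules above say the file is not compressed."""
    if size <= TAR_BLOCK_SIZE:
        return True
    # str.endswith accepts a tuple and matches any of its members.
    return name.endswith(BUILTIN_SUFFIXES + tuple(extra_suffixes))


# Hypothetical bp.conf content; the server name and .bz2 entry are examples.
conf = "SERVER = master1\nCOMPRESS_SUFFIX = .bz2\n"
extra = parse_compress_suffixes(conf)

print(is_skipped("notes.txt", 300, extra))    # True: 512 bytes or smaller
print(is_skipped("photo.JPG", 4096, extra))   # True: built-in suffix
print(is_skipped("dump.bz2", 4096, extra))    # True: COMPRESS_SUFFIX entry
print(is_skipped("report.txt", 4096, extra))  # False: eligible for compression
```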


VERITAS Software Corporation
www.veritas.com