Compression Saves Bandwidth, Disk Space

Copyright © 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003 Amer Neely

In my last article (September) we discussed finding files on Internet FTP sites using Archie. Before we get too far away from finding and downloading files we must talk about file compression. This concept should be understood before you wander too far into cyberspace. As I mentioned in my last article a lot of the things you will hear about and run into on the Internet involve concepts that have been around for many years, and it is important for newcomers to get a good handle on them. We must learn to swim before we surf.

What is it?

File compression is something done to files so they take up less room on your hard disk (or floppy, or whatever). This may not seem too important to you, especially if you've just bought a new system with a large hard disk. It's common nowadays for systems to come with 2 GB and larger drives.

Why do it?

However, consider what it would be like if you had tens of thousands of files on your computer, all of which you were making available to anyone on the Internet to access and download. This is a typical FTP site. Wouldn't you want to reduce the amount of space each file took, if possible? We're talking about a reduction of 50% or more in many cases. At 50%, that is essentially the same as doubling the size of your hard drive.
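
To put a rough number on that, here is a minimal Python sketch using the standard zlib module, which implements Deflate, the same compression method used by zip and gzip. The sample text and the helper name percent_saved are made up for illustration; real savings depend entirely on what is in the file.

```python
import zlib

def percent_saved(data: bytes) -> float:
    """Return how much smaller the data gets after compression, as a percentage."""
    packed = zlib.compress(data, 9)  # 9 = maximum compression
    return 100.0 * (1 - len(packed) / len(data))

# Hypothetical sample: plain English text with a lot of repetition.
sample = b"We must learn to swim before we surf. " * 200

print(f"original: {len(sample)} bytes")
print(f"saved:    {percent_saved(sample):.0f}%")
```

Text this repetitive compresses far better than a typical file would, so treat the figure it prints as a best case rather than the norm.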

That is the reason for file compression. As an added benefit, consider that since the files are smaller, they take less time to transfer over the Internet (or any medium for that matter). In English that means it takes less time for you to download the files you've found on an FTP site.
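
The time saved is simple arithmetic, sketched below in Python. The modem speed and the file size are assumptions chosen for the example, not figures from this article, and the calculation ignores protocol overhead and line noise.

```python
def transfer_seconds(size_bytes: int, bits_per_second: int) -> float:
    """Time to move a file over a link, ignoring protocol overhead."""
    return size_bytes * 8 / bits_per_second

MODEM_BPS = 28_800          # assumed modem speed (bits per second)
original = 1_000_000        # a hypothetical file of about 1 MB
compressed = original // 2  # the 50% reduction discussed above

print(f"uncompressed: {transfer_seconds(original, MODEM_BPS):.0f} s")
print(f"compressed:   {transfer_seconds(compressed, MODEM_BPS):.0f} s")
```

Halve the file and you halve the wait, whatever the speed of the link.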

How do you tell if a file is compressed?

Most compressed files are easy to identify by their extension. Some example filename extensions of compressed files are:

.zip    DOS or Windows zip file
.exe    possibly a DOS or Windows self-extracting file *
.Z      Unix ** compress
.gz     Unix ** Gzip
.tar    Unix ** (not compressed but a tape archive)
.sit    Macintosh StuffIt
.lzh    DOS (not in common use)
.arc    DOS (not in common use)
.arj    DOS (not in common use)
.zoo    DOS (not in common use)
.pak    DOS (not in common use)
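
If you would rather let the computer do the looking up, the list above can be turned into a small lookup table. This is only a sketch: the function name describe is invented for the example, and an extension is a hint about a file, not proof of what is inside it.

```python
import os

# Extensions and descriptions taken from the list above.
KNOWN_EXTENSIONS = {
    ".zip": "DOS or Windows zip file",
    ".exe": "possibly a DOS or Windows self-extracting file",
    ".Z": "Unix compress",
    ".gz": "Unix gzip",
    ".tar": "Unix tape archive (not compressed)",
    ".sit": "Macintosh StuffIt",
    ".lzh": "DOS (not in common use)",
    ".arc": "DOS (not in common use)",
    ".arj": "DOS (not in common use)",
    ".zoo": "DOS (not in common use)",
    ".pak": "DOS (not in common use)",
}

def describe(filename: str) -> str:
    """Guess what kind of compressed file this is from its extension."""
    ext = os.path.splitext(filename)[1]
    # Unix compress uses a capital .Z; every other extension is matched in lower case.
    if ext != ".Z":
        ext = ext.lower()
    return KNOWN_EXTENSIONS.get(ext, "not a known compressed type")

print(describe("PKZ204G.EXE"))
print(describe("notes.txt"))
```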

Self-extracting?

There is also the possibility that a file is both compressed and self-extracting. In this case, it will have a .exe extension, so all you have to do is run it. There should be a note on the site where you found the file saying that it is indeed a compressed file. Sometimes, however, there isn't. An example of a self-extracting file is the distribution file for PKZIP - PKZ204G.EXE. It has to be self-extracting: how else would you uncompress it when you don't have the utility to begin with? Many files are distributed this way. If you find one, it is best to put it into its own directory or folder before running it. If you're not sure whether it is self-extracting, move it to a temporary directory anyway before running it.

** The three Unix examples have DOS versions of programs that will correctly uncompress them. See below to find out where to get these utilities.

You thought we were done with archives!

No, it seems we can't get away from them. Just to throw one more curve at you, a feature of PKZIP and WinZip is that they can compress a number of files into one compressed file. This is a compressed archive. Most utilities and applications are made up of a number of files, not just one. They can all be bundled into one large compressed archive and, when uncompressed (or extracted), expand back to their original sizes.
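
For the curious, here is roughly what building a compressed archive looks like using Python's standard zipfile module instead of PKZIP or WinZip. It is a sketch under that assumption, and the function name bundle is invented for the example.

```python
import os
import zipfile

def bundle(paths, archive_path):
    """Compress several files into one .zip archive and return its size in bytes."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            # arcname stores just the file name, not the full directory path.
            archive.write(path, arcname=os.path.basename(path))
    return os.path.getsize(archive_path)
```

Calling bundle(["article1.txt", "article2.txt"], "articles.zip") with your own file names (these are hypothetical) produces a single articles.zip containing both files.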

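So to recap, we can have:

a compressed file (one file made smaller, such as .Z or .gz)
an archive (several files bundled together but not compressed, such as .tar)
a compressed archive (several files bundled and compressed, such as .zip)
a self-extracting compressed file or archive (.exe)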

Where do you get compression software?

DOS and Windows

For DOS users, the most popular is PKZIP, named after its developer, Phil Katz. For Windows users, the most popular is WinZip, which is available from many Web sites and almost any FTP site. The PKZIP distribution file is PKZ204G.EXE, the self-extracting file mentioned above.

If you don't have it already, download this file and put it in its own directory before running it. As mentioned above it is a self-extracting compressed archive and will automagically uncompress the dozen or so files it contains. The two programs you will be most concerned with are PKUNZIP.EXE and PKZIP.EXE. You use PKUNZIP to uncompress zipped files, and you use PKZIP to compress files. Read the documentation files to learn how to use them. DO NOT DOWNLOAD ANY VERSION OF PKZIP NUMBERED 3.x. IT IS A HACKED COPY AND CONTAINS A VIRUS.

The WinZip file will install itself when you run it. It has a number of options you can set during installation, or after. One nice feature for Windows 3.x users is to have it make itself available from File Manager.

The TUCOWS software repository has compression utilities for almost every operating system.

What do you do with compressed files?

Once you have installed the compression software on your system, you can now deal with compressed files. You may have some already, or you may want to go out and find something on an FTP site.

I can't stress enough the importance of putting a compressed file into its own directory or folder before you extract it. It may contain hundreds of files, and if you just extract it into an existing directory with other files in it, you will be guaranteed a long night of computer work if you want to delete or move them.

Assuming you have a file on your system that you know is compressed you have a number of options.

Windows users have it pretty good. As with most applications, you can just double-click on the file (either zipped or self-extracting) from within File Manager or Explorer, and WinZip will let you look at the contents of it before doing the actual extraction, or uncompression. This is useful for reading any documentation or pre-installation files that come with it.

When you are ready to go ahead, select the files you want to extract (usually all of them), and tell WinZip where you want to extract the files to. This is an important step, because if you just let it extract to the default directory, it may not be where you want the files.
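
The same look-first, then extract-somewhere-safe routine can be sketched with Python's standard zipfile module. The function names peek and extract_to_own_directory are invented for the example; the second one deliberately refuses to extract into a directory that already exists, which is the habit recommended above.

```python
import os
import zipfile

def peek(archive_path):
    """List the names of the files inside a zip archive without extracting them."""
    with zipfile.ZipFile(archive_path) as archive:
        return archive.namelist()

def extract_to_own_directory(archive_path, target_dir):
    """Extract everything into a brand-new directory and return what landed there."""
    # exist_ok=False raises FileExistsError rather than mixing the
    # extracted files in with whatever is already in target_dir.
    os.makedirs(target_dir, exist_ok=False)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
    return sorted(os.listdir(target_dir))
```

Only extract archives from sources you trust, just as you would only run a self-extracting .exe from a source you trust.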

DOS users have to explode the whole file first (in its own directory) before reading any of the files. Or, if you are a real connoisseur, you might have a shell program to do this. Don't forget - if it's a self-extracting file (.exe), you don't need anything else, just run the program.

Once you have all the files you need from the zip file, move the zip file to a floppy disk if it will fit. If it won't fit on one disk, PKZIP and WinZip will let you span as many disks as needed. Note that I said move (copy and delete), since the whole point of compressing files is to conserve space. Why keep the original files on your hard drive as well as the compressed version? If you have the compressed file on a floppy disk, it's your backup in case you need it.

If you're not concerned about that, delete the compressed file.

When / why would you compress your own files?

You would compress files of your own for the same reason the maintainer of an FTP site would - to reduce the amount of space they take up. Not that you have tens of thousands of files (or maybe you do!), but some examples of when to consider it would be:

For the purposes of this article we'll concentrate on file compression as it applies to Internet scenarios.

E-mail attachments

As you may know, files can be sent to someone else as an attachment to an e-mail message. Compressed files are binary, and must be sent as an attachment. But this method of sending a file to someone should only be used for small files, say anything up to about 200 KB (about 200,000 bytes). Anything larger and you should be looking at FTP to send it.

Another situation presents itself right here. This article is number eight in a series. Each article is a separate file, so if I wanted to send all these files to someone else I could send each one individually. A better solution would be to bundle them up into an archive with PKZIP or WinZip. Then I only need to send one message with one attachment.

FTP

Remember that FTP is used for transferring files on the Internet. That includes uploading files from your computer to someone else's, not just downloading. And, since compressed files are smaller, they take less time to transfer. So before sending someone your latest 3 MB novel to read, exhibit good netiquette and compress it first. Besides, it is probably broken up into separate chapters anyway, so bundling all the files into one compressed archive makes more sense. On the Internet, bandwidth is a scarce resource, and compression helps conserve it.

When not to use compression

OK, now I'm going to really throw you for a loop and tell you not to use compression - but only in specific cases. If you've been using the Internet for any length of time, you've run into image files of the GIF (Graphics Interchange Format) type (.gif). They make up the bulk of all the images you see when you visit a World Wide Web site. If you are a graphic artist, or just like looking at images on your computer, you may have a collection of GIF (pronounced "jif") files on your hard drive.

A funny thing might happen if you try to compress a GIF file with PKZIP - it gets bigger! That's because a GIF file is already compressed internally, so PKZIP finds little left to squeeze, while every .zip file carries some extra bookkeeping data of its own at the beginning and end. Depending on the content of the GIF file, it might compress slightly, but not enough to be worthwhile. In tests I performed using maximum compression, I achieved only a 1% reduction in file size, and I've seen GIF files actually grow after being "compressed".
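
You can see the effect for yourself with a few lines of Python and the standard zlib module. This is a sketch, not a test on real images: the pseudo-random bytes below are only a stand-in for data that has already been compressed (such as the inside of a GIF), and the helper name change_in_size is invented for the example.

```python
import random
import zlib

def change_in_size(data: bytes) -> int:
    """Bytes gained (positive) or saved (negative) by compressing the data."""
    return len(zlib.compress(data, 9)) - len(data)

rng = random.Random(1996)  # fixed seed so every run uses the same bytes
# Stand-in for already-compressed data: bytes with no pattern left to exploit.
already_compressed = bytes(rng.getrandbits(8) for _ in range(50_000))
plain_text = b"A funny thing might happen. " * 2_000

print("plain text:        ", change_in_size(plain_text), "bytes")
print("already compressed:", change_in_size(already_compressed), "bytes")
```

The plain text shrinks dramatically, while the pattern-free data saves next to nothing and typically comes out a few bytes larger once the compressor adds its own header.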

A better solution with GIF files might be to convert them to JPEG (Joint Photographic Experts Group) format. JPEG ("jay-peg") files are highly compressed image files and are very popular. Image viewers and converters for GIF and JPEG files are available at the TUCOWS Ontario mirror site.

Shut down

If you want to learn more about compression a good starting point would be the "Compression FAQ" in the newsgroup news.answers.

For lots more compression utilities, see the TUCOWS (The Ultimate Collection Of Winsock Software) Ontario mirror site.

A JPEG FAQ (Frequently Asked Questions) is posted regularly to the news.answers newsgroup. Read it first before converting all your GIFs to JPEGs.

This has hopefully given you a good start on file compression. Next issue we'll continue on with more Internet stuff. If you have a suggestion for an upcoming article let me know. In the meantime have fun and bcnu...


Amer Neely