Hashes #1

Hashes are similar to arrays, except they are indexed by a string instead of a number.

Also known as Associative Arrays, these guys are extremely useful and powerful.

Hashes are identified by using the percent symbol: %WeekDays.

They use a format you might be familiar with known as key-value. This format is used in configuration files, XML files, and many other structured documents.

While arrays use @ (perhaps to remind us they are @rrays), hashes use % (perhaps to remind us they use a key/value structure - squinting helps.)

The key names an attribute of something; the value is what that attribute holds.

To use a file listing as an example:

-rwxr-xr-x@ 1 user  staff  1725 Apr  3 13:09 arrays1.pl
-rwxr-xr-x@ 1 user  staff   834 Apr 21 13:42 arrays2.pl
-rwxr-xr-x@ 1 user  staff  1156 Apr 24 16:51 arrays6.pl
-rwxr-xr-x@ 1 user  staff  1288 Apr  5 13:20 cities.pl
-rwxr-xr-x@ 1 user  staff   431 Apr  5 20:50 cities2.pl

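The keys (attributes) would be: permissions, link count, owner, group, size, date, time, and file name.

Whereas values would be (for the first file): -rwxr-xr-x@, 1, user, staff, 1725, Apr 3, 13:09, and arrays1.pl.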

An odd quality (?) of hashes is that they are unordered. In other words, we might define a hash (see our %WeekDays example below) in a particular order, but when we print it out it will most likely not be in that same order.

You might ask 'What's the point of that?', but being unordered doesn't mean you can't access them individually. In fact hashes are very quick to access, and stay fast no matter how many values you put in them.

Hashes have no beginning or end, so they cannot be pushed or popped like arrays. But it has been said that you aren't really thinking in Perl until you start thinking in terms of hashes.

Here's what a hash of weekdays and their abbreviations might look like:

my %WeekDays = (
    "Sunday","SU",
    "Monday","MO",
    "Tuesday","TU",
    "Wednesday","WE",
    "Thursday","TH",
    "Friday","FR",
    "Saturday","SA");

I used to work in an academic library, so those are 2-letter abbreviations we used (I guess librarians are lazy too.)

That is kind of hard to read (imagine it being written all on 1 line). Perl lets us use the "fat comma" (=>) in place of the regular one.

my %WeekDays = (
    "Sunday"    =>  "SU",
    "Monday"    =>  "MO",
    "Tuesday"   =>  "TU",
    "Wednesday" =>  "WE",
    "Thursday"  =>  "TH",
    "Friday"    =>  "FR",
    "Saturday"  =>  "SA"
    );

Much easier to read and understand. A bit of formatting doesn't hurt either. You can format Perl pretty much as you like - whatever is easy to read and maintain (indenting, tabs, and new lines). Statements typically end with a semicolon.

In our example above, the weekdays are the keys and the abbreviations are the values.

Note: a hash is always made up of pairs, so the list you build it from should have an even number of elements (key-value, remember?).
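One way to see the pairing for yourself is to flatten the hash back into a plain list and count. This is only a minimal sketch; the names @flat, $elements, and $pairs are made up for the illustration.

```perl
use strict;
use warnings;

my %WeekDays = (
    "Sunday"    => "SU",
    "Monday"    => "MO",
    "Tuesday"   => "TU",
    "Wednesday" => "WE",
    "Thursday"  => "TH",
    "Friday"    => "FR",
    "Saturday"  => "SA",
);

# In list context a hash flattens to key, value, key, value, ...
my @flat = %WeekDays;

# Count the list elements and the number of keys (pairs).
my $elements = scalar(@flat);
my $pairs    = scalar(keys %WeekDays);

print "$elements elements make up $pairs key-value pairs\n";
```

Seven pairs give fourteen list elements, which is why the count is always even.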

To access a value in a hash, we use the key as the index:

print "The abbreviation for Sunday is: $WeekDays{Sunday}\n";

The dollar sign $ is used when referring to a single (scalar) value in a hash, just as with an array element. We use braces { } to enclose the key. Because this key is a single word with no spaces, we don't have to put quotes around it.
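To show where the quotes do matter, here is a small sketch. The %Hours hash and its keys are hypothetical, invented only to have one key with a space in it.

```perl
use strict;
use warnings;

# %Hours and its contents are made up for this illustration.
my %Hours = (
    "Monday"         => "8-5",
    "Monday evening" => "6-9",
);

# A simple one-word key can be written bare inside the braces.
print "Monday: $Hours{Monday}\n";

# A key containing a space has to be quoted.
my $evening = $Hours{"Monday evening"};
print "Monday evening: $evening\n";
```

The second lookup is stored in a variable first, which keeps the quoted key out of the double-quoted print string.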

To walk through the whole hash is similar to walking through an array.

But we need to introduce a new Perl term: keys.


foreach my $key (keys %WeekDays) {
    print "$key -- $WeekDays{$key}\n";
}

Well, we told you hashes were "unordered". If you do have a need to keep the hash in the original insert order, this can be done using the module Tie::IxHash, but that is beyond the scope of this hash, er, document.
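If a module is more than you need, one simple alternative (this is not Tie::IxHash, just a plain array kept alongside the hash) is to store the keys in the order you want and loop over that array instead. A rough sketch, assuming a helper array called @DayOrder, which is a made-up name:

```perl
use strict;
use warnings;

my %WeekDays = (
    "Sunday"    => "SU",
    "Monday"    => "MO",
    "Tuesday"   => "TU",
    "Wednesday" => "WE",
    "Thursday"  => "TH",
    "Friday"    => "FR",
    "Saturday"  => "SA",
);

# @DayOrder is a hypothetical helper array recording the order we want.
my @DayOrder = qw(Sunday Monday Tuesday Wednesday Thursday Friday Saturday);

# Walk the array (which is ordered) and use each element as a hash key.
my @lines;
foreach my $key (@DayOrder) {
    push @lines, "$key -- $WeekDays{$key}";
}
print "$_\n" for @lines;
```

The hash itself is still unordered; the array is what carries the order.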

This is one reason hashes are not used as widely as arrays in loops - it is easier to control an array by incrementing or decrementing a number. Not to say hashes do not have their place. As the programmer, you have a choice on how to manage your data - either as an array or a hash, depending on what the data is and how you need to access it.

Just to show you some of the features of using a hash, how would you find out if ThorpDay was in your hash?

We use exists (big surprise) to see if a key is present in a hash.


if (exists($WeekDays{ThorpDay})) {
    print "Found $WeekDays{ThorpDay}\n";
}
else {
    print "'ThorpDay' is not here.\n";
}
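One thing worth knowing is that exists looks at the key only, not at what is stored under it. A small sketch, using a hypothetical %Closed hash where the key is present but the value is undef:

```perl
use strict;
use warnings;

# %Closed is made up for this illustration: the key is there, the value is undef.
my %Closed = ( "ThorpDay" => undef );

# exists asks "is the key in the hash?"; defined asks "does it hold a value?"
my $has_key   = exists $Closed{ThorpDay}  ? 1 : 0;
my $has_value = defined $Closed{ThorpDay} ? 1 : 0;

print "key present: $has_key, value defined: $has_value\n";
```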

If we had used an array to hold the week days and abbreviations we could have used a similar technique to see if a value existed for a particular index. But we would have had to do it in a loop, checking each element in the array for a match.

print "Yes, we have $WeekDays[2][0]\n" if exists($WeekDays[2][0]);
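To make the "do it in a loop" part concrete, here is roughly what searching an array for a day name could look like. The array @WeekDayPairs and the variables $wanted and $found are hypothetical names for this sketch, and only three days are included to keep it short.

```perl
use strict;
use warnings;

# @WeekDayPairs is a made-up array-of-arrays version of the data:
# each element holds [ name, abbreviation ].
my @WeekDayPairs = (
    [ "Sunday",  "SU" ],
    [ "Monday",  "MO" ],
    [ "Tuesday", "TU" ],
);

my $wanted = "ThorpDay";
my $found  = 0;

# With an array we have to check each element until one matches.
foreach my $pair (@WeekDayPairs) {
    if ($pair->[0] eq $wanted) {
        $found = 1;
        last;
    }
}

print $found ? "Found $wanted\n" : "'$wanted' is not here.\n";
```

With a hash the whole loop collapses into a single exists test on the key.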

I like hashes better in situations like this.

Another way to loop over a hash is using each:


while (my ($key,$value) = each(%WeekDays)) {
    print "$key --> $value\n";
}

Remember we said hashes are 'unordered'.

We could also use values to loop over a hash and get just values (DUH).


foreach my $value (values %WeekDays) {
    print "$value\n";
}

All this talk of keys and values brings up a question: can we invert or reverse the keys and values in a hash?

Surprisingly no. JUST KIDDING!

Yes, like an array can be reversed, the keys and values in a hash can be as well.

Again, surprisingly, we use the reverse function to reverse keys and values.

However, beware of duplicate values.

In hashes, keys MUST be unique, but values may be duplicated. If two keys share the same value and you then reverse the hash, one of the duplicates will be lost.



my %WeekDaysRev = reverse %WeekDays;
foreach my $key (sort keys %WeekDaysRev) {
    print "$key ==> $WeekDaysRev{$key}\n";
}
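Our weekday abbreviations are all different, so nothing was lost. To see what the duplicate warning means in practice, here is a sketch with a hypothetical %Abbrev hash in which two days share the one-letter value "T":

```perl
use strict;
use warnings;

# %Abbrev is made up for this illustration; two keys share the value "T".
my %Abbrev = (
    "Tuesday"  => "T",
    "Thursday" => "T",
    "Friday"   => "F",
);

my %AbbrevRev = reverse %Abbrev;

# Three pairs went in, but "T" can only be a key once.
my $before = scalar(keys %Abbrev);
my $after  = scalar(keys %AbbrevRev);
print "Before: $before pairs, after: $after pairs\n";

# Which day survives under "T" depends on the hash's internal order.
print "T ==> $AbbrevRev{T}\n";
```

Because hashes are unordered, you can't count on which of the two days ends up under "T".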

Just for fun we also sorted our output, to show that it can be done.

Multi-dimensional hashes