Arrays #5

So far we have introduced creating and sorting arrays containing strings or numbers. And our arrays have been pretty simple - single words or numbers.

But of course the real world is not typically made of single words or numbers, and as programmers we are sometimes called upon to deal with some pretty complex data.

Keeping it simple, but getting a little more complex, what happens when our array of strings contains words with upper-case letters? What do you think should happen?

In your text editor, start a new file and name it 'arrays2.pl'. Save it in the same directory as your original (arrays1.pl) file.

Now add the following code:

#!/usr/bin/perl
use strict;
use warnings;

my @foo = ("bravo","Charlie","echo","alpha","Foxtrot","Delta");
print "\nOriginal array:\n";
foreach (@foo) {
    print "$_\n";
}
print "\nSorted:\n";
my @sortedfoo = sort @foo;
foreach (@sortedfoo) {
    print "$_\n";
}

Note that it begins with the same lines as 'arrays1.pl', and we use the same sorting procedure. In fact the ONLY difference is that some of the words in our array are capitalized.

Run the script. ('perl arrays2.pl')

You will notice that the words are NOT in the order you might expect.

Programs do what we tell them to do; not what we want them to do

But they are sorted exactly the way the 'sort' function is defined to work. In computers, every character (glyph) has a particular numerical value (for the characters used here, the ASCII value). By default, 'sort' compares strings character by character using these values.

Upper-case letters have a smaller numerical code than lower-case letters, therefore they appear first when sorted in ascending order.

The Wikipedia page on ASCII has a table listing every character and its code.

You will see that digits appear before upper-case letters, and various punctuation symbols are spread between the different sets.
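If you would like to see these codes for yourself, here is a small stand-alone sketch (not part of 'arrays2.pl') that uses Perl's built-in 'ord' function, which returns the numerical code of the first character of a string. The hash name '%code' and the choice of sample characters are just for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ord() returns the numerical code of the first character of a string.
# %code is a name made up for this example.
my %code;
foreach my $char ('0', 'A', 'Z', 'a', 'z') {
    $code{$char} = ord($char);
    print "'$char' has code $code{$char}\n";
}
```

The digit has the smallest code, the upper-case letters come next, and the lower-case letters come last, which is why 'Charlie' sorts ahead of 'alpha' in a default sort.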

So, the question now is 'How do we sort words alphabetically when upper-case and lower-case letters have different numerical codes?'

If all the words were upper-case or lower-case, it would be easy. So we will do just that!

We will add a new Perl function to our code as follows:

print "\nSorted the right way:\n";
@sortedfoo = sort {lc($a) cmp lc($b)} @foo;
foreach (@sortedfoo) {
    print "$_\n";
}

You will see that this is very similar to the previous sort code but with the addition of 'lc' ahead of '$a' and '$b'.

The 'lc' stands for 'lower case', and does exactly that: it returns a lower-case copy of $a and of $b for the comparison. The words stored in the array are not changed.

We could have used 'uc' (upper case) as well.
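As a quick check, here is a stand-alone sketch that sorts the same words once with 'lc' and once with 'uc'. The array names '@by_lc' and '@by_uc' are made up for this example, and 'join' is used only to put each list on one line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @foo = ("bravo","Charlie","echo","alpha","Foxtrot","Delta");

# Compare lower-case copies of each pair of words.
my @by_lc = sort { lc($a) cmp lc($b) } @foo;

# Compare upper-case copies instead.
my @by_uc = sort { uc($a) cmp uc($b) } @foo;

# join() glues the items together with ", " between them.
print "Using lc: ", join(", ", @by_lc), "\n";
print "Using uc: ", join(", ", @by_uc), "\n";
print "Original: ", join(", ", @foo), "\n";
```

For plain words like these, both versions give the same order (alpha, bravo, Charlie, Delta, echo, Foxtrot), and the words keep their original capitalization because only the copies used for comparing are changed.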

Two more functions that deal with case-sensitivity are 'ucfirst' and 'lcfirst'.

They do kind of what you think - return a copy of the string with its first letter changed to upper case or lower case. The rest of the string is left alone.
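Here is a short stand-alone sketch of both functions applied to the words we have been using. The variable names '$word', '$upper_first' and '$lower_first' are just for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @foo = ("bravo","Charlie","echo","alpha","Foxtrot","Delta");

foreach my $word (@foo) {
    # ucfirst() upper-cases only the first letter of its copy.
    my $upper_first = ucfirst($word);
    # lcfirst() lower-cases only the first letter of its copy.
    my $lower_first = lcfirst($word);
    print "$word: ucfirst gives $upper_first, lcfirst gives $lower_first\n";
}
```

So 'bravo' becomes 'Bravo' with 'ucfirst' and stays 'bravo' with 'lcfirst', while 'Charlie' stays 'Charlie' with 'ucfirst' and becomes 'charlie' with 'lcfirst'.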

Add the code below to print out the words entirely in upper case:

print "\nAll CAPITALIZED:\n";
foreach (@foo) {
    print uc($_);
    print "\n";
}

We haven't changed the content of our original '@foo' array.
We have only printed the contents differently.

But you may be asking 'Why do we need 2 print statements?'

We could easily combine both those statements into 1, but it involves something about printing strings we haven't covered yet. Apologies for some extra typing. We will cover it later in a 'Strings' page.

This brings up a good question: we have already seen a couple of ways to create arrays. Can we join arrays together?

Of course we can - this is Perl!

We already have 2 arrays: '@foo' and '@sortedfoo'. How do we merge them together?

Add this code to 'arrays2.pl':

print "\nTwo arrays merged:\n";
my @foos = (@foo, @sortedfoo);
foreach (@foos) {
    print "$_\n";
}

That was too easy! Since we now have a new array (@foos) we can access any element in it by the proper index. Remember that indexes start at 0, so the 8th item has index 7. Printing it looks like this:

print "\nEighth item: $foos[7]\n";
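To convince yourself that the merged array is one flat list, here is a stand-alone sketch that rebuilds the two arrays, merges them, and prints a few items by index. The variable '$count' is a name made up for this example; 'scalar(@foos)' gives the number of items in the array.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @foo = ("bravo","Charlie","echo","alpha","Foxtrot","Delta");
my @sortedfoo = sort { lc($a) cmp lc($b) } @foo;

# Merging copies the items of both arrays, in order, into one new array.
my @foos = (@foo, @sortedfoo);

# scalar() on an array gives the number of items it holds.
my $count = scalar(@foos);
print "Number of items: $count\n";

# Indexes run from 0 to $count - 1.
print "First item: $foos[0]\n";
print "Eighth item: $foos[7]\n";
print "Last item: $foos[$count - 1]\n";
```

The six words of '@foo' occupy indexes 0 to 5 and the six sorted words occupy indexes 6 to 11, so index 7 is the second word of the sorted list.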

Next: multi-dimensional arrays