Trick or Treat #4

#0 | #1 | #2 | #3 | #4

Report Writing

True to its name Practical Extraction and Report Language, Perl provides a way to take the data we've been parsing, and presenting it in a formatted report.

It's a 'template' maker, good for presenting data that follows a structure. A typical use might be to generate emails informing your staff of some new changes at work; or perhaps creating a year-end financial report of sales in different departments.

Mind you it doesn't get very fancy, but for simple printing or viewing it is quite suitable. Anyway, you'll be able to see for yourself what it can do.

First we have to get the data we are going to use, then design how the report will look using that data.

For this we'll use some year-end financial data for a company with several different branches.

The information we need may be in a file; it doesn't matter for this exercise what the format is, but the one I use is formatted a particular way we are familiar with - fields separated with vertical bars (|), and records on separate lines.

The important thing is to have the data available, so that we can parse it and get it into our script.

A bonus is that we already know how to parse it and get particular parts out of it (Files #5 and #6).

Let's get right into designing our report format. Would you be surprised to discover that the command we need is format?

The structure of the format command is as follows:

format [name] =
    format for line 1
    variable/s for line 1
    format for line 2
    variable/s for line 2
    .
    .
    .
    .
.

Starting with a name (minus brackets), we then alternate a line laying out the format of the data, followed by a line containing the arguments (variables) to fill in the data.

If you've ever done a 'mail-merge' at work you understand somewhat how this works, using different fields to hold variable data.

We have several ways to format the actual data. For example we can define how many characters to use; whether they are justified left or right or centered; dollar amounts can be aligned on the decimal point; several lines can be treated as one chunk; etc.

Here's how we want our report to look:

     MezoPico Software Consulting
          2015 Yearly Report

            Q1      Q2      Q3      Q4      YTD
Branch 1    $       $       $       $       $
Branch 2    $       $       $       $       $
Branch 3    $       $       $       $       $
Branch 4    $       $       $       $       $
----------------------------------------------
YTD Total   $       $       $       $       $


Here are some of the formatting characters we can use:

Here is how our format might look:

format STDOUT_TOP =
                            @||||||||||||||||||||||||||||
$CompanyName
                                 @||||||||||||||||||
$ReportTitle

.
format STDOUT =
                Q1              Q2              Q3              Q4          YTD
Ottawa      @#####.##       @#####.##      @#####.##       @#####.##     @######.##
$BR1Q1,$BR1Q2,$BR1Q3,$BR1Q4,$BR1YTD
London      @#####.##       @#####.##      @#####.##       @#####.##     @######.##
$BR2Q1,$BR2Q2,$BR2Q3,$BR2Q4,$BR2YTD
Toronto     @#####.##       @#####.##      @#####.##       @#####.##     @######.##
$BR3Q1,$BR3Q2,$BR3Q3,$BR3Q4,$BR3YTD
Montreal    @#####.##       @#####.##      @#####.##       @#####.##     @######.##
$BR4Q1,$BR4Q2,$BR4Q3,$BR4Q4,$BR4YTD
          -------------------------------------------------------------------------
YTD       $@######.##     $@######.##    $@######.##     $@######.##   $@#######.##
$Q1T,$Q2T,$Q3T,$Q4T,$YTD
.

Finally we end the format with a single . (dot). It must be the FIRST and ONLY character on the line.

The code for your format can be anywhere in your script, but putting it at the beginning or end would make it easier to find later, if you have to make some changes to it.

Now, all we need is some data to put into our variables. To do that I've put some 'dummy' data into a file, which is then read by the script (not shown here - that's for you to figure out), populating the variables:

$CompanyName = "MezoPico Software Consulting";
$ReportTitle = "2015 Yearly Report";
$BR1Q1 = 4587.03;
$BR1Q2 = 4600.88;
$BR1Q3 = 4598.5;
$BR1Q4 = 4997.54;
$BR1YTD = $BR1Q1 + $BR1Q2 + $BR1Q3 + $BR1Q4;
$BR2Q1 = 5500;
$BR2Q2 = 5774.72;
$BR2Q3 = 5209.77;
$BR2Q4 = 5644.88;
$BR2YTD = $BR2Q1 + $BR2Q2 + $BR2Q3 + $BR2Q4;
$BR3Q1 = 4399.8;
$BR3Q2 = 4377.78;
$BR3Q3 = 4503.99;
$BR3Q4 = 4465.43;
$BR3YTD = $BR3Q1 + $BR3Q2 + $BR3Q3 = $BR3Q4;
$BR4Q1 = 8734.1;
$BR4Q2 = 8820.34;
$BR4Q3 = 8794.45;
$BR4Q4 = 8723.23;
$BR4YTD = $BR4Q1 + $BR4Q2 + $BR4Q3 + $BR4Q4;
$Q1T = $BR1Q1 + $BR2Q1 + $BR3Q1 + $BR4Q1;
$Q2T = $BR1Q2 + $BR2Q2 + $BR3Q2 + $BR4Q2;
$Q3T = $BR1Q3 + $BR2Q3 + $BR3Q3 + $BR4Q3;
$Q4T = $BR1Q4 + $BR2Q4 + $BR3Q4 + $BR4Q4;
$YTD = $Q1T + $Q2T + $Q3T + $Q4T;

Now to execute our format, we use an old FORTRAN function:

the write command.


write STDOUT;

And here's our hard-earned report:

As I said, nothing fancy, but it does the job.

With a bit of work, the report can be saved to a file (perhaps for emailing).

The name for our template is STDOUT which is a standard file handle pointing to the screen. To print the output to a file instead, we need to do 2 things:

  1. rename our format
  2. point our output to a different file handle

In this case I've simply renamed them from STDOUT and STDOUT_TOP to YTD and YTD_TOP.

Then later I've changed the writing code to this:

open ($YTD,">","MezoPico 2015YTD.txt") or die "Can't open file: $!\n";
$^ = YTD_TOP;
$~ = YTD;
write;
close $YTD;

Two special variables in there control how format works:

  1. $^ is the current top-of-form format name
  2. $~ is the current format name

Other Special format Variables

These other variables can be used in your reports, perhaps if you have multiple pages.