A main() for Perl

In this post, I would like to address the question of a main() for Perl. Most Perl scripts don’t have such a subroutine. Either the file is a module that gets included somewhere else, or the script runs standalone and, traditionally, everything is simply written at the top level of the file:

#!/usr/bin/perl

use strict;
use warnings;
use Getopt::Long;

my %opts = (); # save all cmd line options in this hash
GetOptions ( # read cmd line args
    \%opts,
    "name=s",
);
if (exists($opts{name}) && $opts{name} eq "foo") {
    foo();
} else {
    print "I don't know what to do!\n";
    exit(1);
}
exit(0);

sub foo {
    print "Foo was called!\n";
    return 42;
}

While the above approach is okay, I’d rather have something like this:

#!/usr/bin/perl

use strict;
use warnings;
use Getopt::Long;

sub main {
    my %opts = (); # save all cmd line options in this hash
    GetOptions ( # read cmd line args
        \%opts,
        "name=s",
    );
    if (exists($opts{name}) && $opts{name} eq "foo") {
        foo();
    } else {
        print "I don't know what to do!\n";
        return 1;
    }
    return 0;
}

sub foo {
    print "Foo was called!\n";
    return 42;
}

exit(main());

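Note the exit(main()) at the end: the return value of main() becomes the exit code of the script. The variables are no longer visible to every subroutine in the file, which prevents a whole class of stupid bugs where a subroutine accidentally reads or modifies them. It also forces the developer to be more disciplined, because main logic can no longer be intermingled with everything else like so: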

# includes
# some main logic
# more includes
# subroutines
# variable declarations needed for main
# more main logic
# ...
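
To illustrate the kind of bug meant here, the following is a minimal sketch. The subroutine greet() and the variable $name are made up for this example. The author of greet() forgot to unpack the argument, and because a file-scoped $name happens to exist, Perl accepts the code even under strict and warnings:

```perl
#!/usr/bin/perl

use strict;
use warnings;

my $name = "foo"; # file-scoped: visible in every sub defined below

sub greet {
    # Bug: "my ($name) = @_;" was forgotten, so the file-scoped
    # $name is used instead of the argument.
    return "Hello, $name!";
}

print greet("bar"), "\n"; # prints "Hello, foo!"
```

Had $name been declared inside main(), the same mistake would be rejected at compile time, because strict would complain that $name is not declared inside greet().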

To further enhance encapsulation, consider the following code:

#!/usr/bin/perl

package Main; # declare package

use strict;
use warnings;
use Getopt::Long;

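# run main() only when executed from the cmd line (no caller) and pass its
# return value on as the exit code; when loaded via require, nothing runs
exit(__PACKAGE__->main(@ARGV)) unless caller();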

sub main {
    my %opts = ();
    GetOptions (
        \%opts,
        "name=s",
    );
    if (exists($opts{name}) && $opts{name} eq "foo") {
        foo();
    } else {
        print "I don't know what to do!\n";
        return 1;
    }
    return 0;
}

sub foo {
    print "Foo was called!\n";
    return 42;
}

The above code effectively converts the script into a Perl package. This has no consequences when the script is used standalone: when the file is run directly, caller() returns nothing at file level, so main() gets executed as before. When the file is loaded with require, however, caller() returns the loading package, main() is skipped, and we can treat the script like any normal package and use its subroutines. A perfect example is unit testing:

use strict;
use warnings;
use Test::Most;
use File::Basename;

use lib dirname(__FILE__); # search for modules in the directory of this file
require "package_main.pl"; # load our Main package without running main()

is(Main::foo(), 42, "Return value is fine.");

done_testing();

One could argue that foo() should be defined in a package of its own if you want to unit-test it. While this may be true, there are often situations where doing so just for testing feels like overkill. Here I’m envisioning a scenario where foo() only gets called by main(), so it makes sense to have both in the same package (Main). I usually test the command line options as well, so I write main() in such a way that it either receives its arguments from the caller (unit testing) or reads them from the actual command line (Getopt).
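
One way to get a main() that works with both sources of arguments is sketched below. It is only an illustration, not the one true layout: it swaps GetOptions for GetOptionsFromArray (also part of Getopt::Long), which parses an array of your choice instead of the global @ARGV. The extra return code 2 for unparsable options is an assumption added for this sketch.

```perl
package Main;

use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# main() parses whatever argument list it is handed, so a test can pass
# options directly while the script itself passes @ARGV.
sub main {
    my @args = @_;
    my %opts = ();
    GetOptionsFromArray(\@args, \%opts, "name=s")
        or return 2; # assumption: 2 signals unparsable options
    if (exists($opts{name}) && $opts{name} eq "foo") {
        foo();
        return 0;
    }
    print "I don't know what to do!\n";
    return 1;
}

sub foo {
    print "Foo was called!\n";
    return 42;
}

# In the script:  exit(main(@ARGV)) unless caller();
# In a test:      is(Main::main("--name", "foo"), 0, "foo is accepted");
#                 is(Main::main("--name", "bar"), 1, "bar is rejected");
```

Note that main() is called as a plain function here rather than as a class method, so that @_ contains nothing but the arguments to parse.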

Another useful consequence of this “package-script” approach is the possibility to chain multiple scripts or parts of them. This is a very specific use case, but occasionally the end user asks: “Can I have a script which does A, a script which does B and a script which does A & B together?” There are obviously other ways to organize the code for this scenario, but in my experience this approach works very well. It reduces the number of lines of code and makes reviewing easier because the code sits where you expect it to be.
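
The “A & B together” script could look roughly like the sketch below. The packages ScriptA and ScriptB are hypothetical stand-ins. In a real setup each of them would be its own package-script, loaded with require as in the unit test above. They are inlined here only to keep the example self-contained.

```perl
use strict;
use warnings;

# Stand-ins for two package-scripts (hypothetical names). Normally:
#   require "script_a.pl";
#   require "script_b.pl";
{
    package ScriptA;
    sub main { print "A done\n"; return 0; }
}
{
    package ScriptB;
    sub main { print "B done\n"; return 0; }
}

package Main;

# Run A, then B; stop at the first step that reports a failure.
sub main {
    for my $step (\&ScriptA::main, \&ScriptB::main) {
        my $rc = $step->();
        return $rc if $rc != 0;
    }
    return 0;
}

# The combined script would end with: exit(main()) unless caller();
```

Because neither A nor B runs its own main() when it is merely loaded, the combined script decides in which order they run and what happens when one of them fails.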