A main() for Perl
In this post, I would like to address the question of a main() for Perl.
Most Perl scripts don’t have such a subroutine.
Either the script is a module that gets included somewhere else, or it runs standalone, and the traditional way is to write everything into the global scope:
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;

my %opts = ();    # save all command line options in this hash
GetOptions(       # read command line arguments
    \%opts,
    "name=s",
);

if (exists($opts{name}) && $opts{name} eq "foo") {
    foo();
} else {
    print "I don't know what to do!\n";
    exit(1);
}
exit(0);

sub foo {
    print "Foo was called!\n";
    return 42;
}
While the above approach is okay, I’d rather have something like this:
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;

sub main {
    my %opts = ();    # save all command line options in this hash
    GetOptions(       # read command line arguments
        \%opts,
        "name=s",
    );

    if (exists($opts{name}) && $opts{name} eq "foo") {
        foo();
    } else {
        print "I don't know what to do!\n";
        return 1;
    }
    return 0;
}

sub foo {
    print "Foo was called!\n";
    return 42;
}

exit(main());
Note the exit(main()) at the end: the return value of main() becomes the exit status of the script.
Variables have disappeared from the global scope, which usually prevents stupid bugs.
It also forces the developer to be more disciplined, because main logic can no longer be intermingled with non-main logic like so:
# includes
# some main logic
# more includes
# subroutines
# variable declarations needed for main
# more main logic
# ...
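To illustrate the kind of bug meant here, the following is a minimal sketch; the names greet_buggy, greet, $name and $user are made up for this example. A file-scoped lexical is visible to every subroutine defined after it, so a sub that forgets to unpack its arguments still compiles under strict and silently uses the global. A variable declared inside main() cannot leak like that. The sketch ends with a plain main() call so both calls can be observed; a real script would use exit(main()) as above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# File-scoped lexical: visible to every sub defined below it.
my $name = "global";

sub greet_buggy {
    # The argument was never unpacked from @_, yet this compiles:
    # $name resolves to the file-scoped variable above.
    return "Hello, $name!";
}

sub greet {
    my ($name) = @_;    # the parameter is taken from the caller
    return "Hello, $name!";
}

sub main {
    my $user = "foo";   # lexical to main(), invisible to other subs
    print greet_buggy($user), "\n";    # prints "Hello, global!"
    print greet($user), "\n";          # prints "Hello, foo!"
    return 0;
}

main();
```

If $name lived inside main() instead, greet_buggy() would fail to compile under strict, and the mistake would surface immediately.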
To further enhance encapsulation, consider the following code:
#!/usr/bin/perl
package Main;    # declare package
use strict;
use warnings;
use Getopt::Long;

# treat as package unless called from the command line;
# when run directly, main()'s return value becomes the exit status
exit(__PACKAGE__->main(@ARGV)) unless caller();

sub main {
    my %opts = ();
    GetOptions(
        \%opts,
        "name=s",
    );

    if (exists($opts{name}) && $opts{name} eq "foo") {
        foo();
    } else {
        print "I don't know what to do!\n";
        return 1;
    }
    return 0;
}

sub foo {
    print "Foo was called!\n";
    return 42;
}
The above code effectively converts the script into a Perl package.
This has no consequences when the script is used standalone: caller() returns nothing at the top level of a file that is executed directly, so main() runs.
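The behaviour of caller() that this relies on can be observed with a small sketch. The helper name frames_seen_in_sub is hypothetical and exists only for this demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reports how many values caller() yields
# when it is evaluated inside a subroutine call.
sub frames_seen_in_sub {
    my @info = caller();    # (package, filename, line) of the call site
    return scalar @info;
}

# At the top level of a file that is run directly there is no caller,
# so the list is empty and "unless caller()" lets main() run.
my @top = caller();
print "top level: ", scalar(@top), " values\n";
print "inside sub: ", frames_seen_in_sub(), " values\n";
```

When the same file is loaded with require, the top-level caller() describes the require statement and is therefore true, which is why the guarded main() call is skipped in that case.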
But we can also treat this script like any normal package and use the subroutines it defines.
A perfect example would be unit testing:
use strict;
use warnings;
use Test::Most;
use File::Basename;
use lib dirname(__FILE__);    # let require find files in the directory of this file
require "package_main.pl";    # load our Main package
is(Main::foo(), 42, "Return value is fine.");
done_testing();
One could argue that foo() should be defined in a package of its own if you want to unit-test it.
While this may be true, there are often situations where it feels excessive to do so just for testing.
Here I am envisioning a scenario where foo() is only called by main(), so it makes sense to have both in the same package (Main).
I also usually test the command line options, so I write the script in such a way that main() either receives its arguments from the caller (unit testing) or reads them from the actual command line (Getopt::Long).
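One way to get this behaviour, sketched here as an assumption rather than as the exact code I use, is GetOptionsFromArray from Getopt::Long, which parses a given array instead of the global @ARGV. The return status 2 for malformed options is made up for this sketch. The loop at the end drives main() with explicit argument lists the way a test would; in the actual script that place is taken by the guarded call that passes @ARGV.

```perl
#!/usr/bin/perl
package Main;
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

sub main {
    my ($class, @args) = @_;    # called as a class method: Main->main(...)
    my %opts;

    # Parse the list we were handed instead of the global @ARGV.
    GetOptionsFromArray(\@args, \%opts, "name=s")
        or return 2;            # hypothetical status for malformed options

    if (defined $opts{name} && $opts{name} eq "foo") {
        return 0;
    }
    return 1;
}

# Drive main() with explicit argument lists, as a unit test would.
for my $argv (["--name", "foo"], ["--name", "bar"]) {
    my $status = __PACKAGE__->main(@$argv);
    print "main(@$argv) returned $status\n";
}
```

A test file can then call Main->main("--name", "foo") and compare the returned status without touching @ARGV at all.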
Another useful consequence of this “package-script” approach is the possibility to chain multiple scripts or parts of them. This is a very specific use case, but occasionally the end user asks for it: “Can I have a script which does A, a script which does B and a script which does A & B together?”. There are obviously other ways to organize the code for this scenario, but in my experience this approach works very well. It reduces the number of code lines and makes reviewing easier, because the code sits where you expect it to be.
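A rough sketch of the "A & B together" case follows. The packages ScriptA and ScriptB and the sub run_both are hypothetical stand-ins: in practice each package-script would live in its own file and be loaded with require, but they are inlined here so the example is self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-ins for two package-scripts that each expose a main().
package ScriptA;
sub main { print "doing A\n"; return 0; }

package ScriptB;
sub main { print "doing B\n"; return 0; }

package main;

# Run A, then B, and stop at the first non-zero status.
sub run_both {
    for my $script (qw(ScriptA ScriptB)) {
        my $status = $script->main();
        return $status if $status != 0;
    }
    return 0;
}

my $status = run_both();
print "combined status: $status\n";
```

Because each script keeps its logic behind main(), the combined script stays a few lines long and the individual scripts remain usable on their own.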