A contact sheet for your website
A contact sheet for your website
I’m in the throes of a major redesign of the site of my comic book, Académie des Chasseurs de Primes. Like any of those redesign, it involves a lot of CSS whack-a-mole. Fine-tuning one page throws a second one slightly off another, and fixing that second one causes unforseen effects on a third one. And so on, and so bloody forth.
Generally, to discover those oopsies, I have to navigate the whole site. Bah, humbug. Wouldn’t it be much efficient to have a single document showing all of the site’s page? Something like a contact sheet for the website, if you will. Well, let’s see how hard that would be.
Step 1. Generate a sitemap
If we want to look at all our pages, we must have a listing of them. One of the sane ways of doing that is to have a sitemap for the website. Fortunately, as the ACP website uses Dancer, all I have to do is to add Dancer::Plugin::SiteMap to the application and, hop, self-generated sitemap. As a pleasant side-effect, I also have a sitemap to feed Google. Sweet.
Step 2. Read the links of the sitemap
Using WWW::Mechanize, it’s a breeze.
my $mech = Test::WWW::Mechanize->new;
$mech->get_ok( "$root_url/sitemap" );
my @links = $mech->links;
In the code snippet, I actually went a step further and used Test::WWW::Mechanize. As I’m going to crawl the site, why not formulate the script as a test suite to that we can get our contact sheet and a health report on the whole thing? There’s no point in simply being awesome if we can be mega-awesome, right?
Step 3. Grab the screenshots
For this part, we’ll use Selenium via WWW::Selenium.
my $sel = Test::WWW::Selenium->new(
host => "localhost",
port => 4444,
browser => "*firefox",
browser_url => $root_url,
);
$sel->start;
for my $url ( map { $_->url } @links ) {
$sel->open_ok( $url ) or next;
( my $screenshot_file = $url . ".png" ) =~ s#/#_#g;
# Selenium wants an absolute path
$sel->capture_entire_page_screenshot( file( $screenshot_file )->absolute );
}
$sel->stop;
Yes, in those few lines, we visited all of our website’s pages and generated a screenshot of every single one of them. Automation. I love it.
Step 4. Generate the thumbnails
For this I went with the good old GD module. We take the pictures, scale them down to a width of 300 pixels and save the result in a second file.
my $screenshot = GD::Image->newFromPng( $screenshot_file );
my ( $width, $height ) = $screenshot->getBounds;
my $scale = 300 / $width;
my $thumbnail = GD::Image->new( 300, $scale * $height );
$thumbnail->copyResampled($screenshot, ( 0 ) x 4,
300, $scale * $height,
$width, $height,
);
( my $thumbfile = $screenshot_file ) =~ s/(?=.png)/_thumbnail/;
print { file( $thumbfile )->openw } $thumbnail->png;
Step 5. Put the contact sheet together
Since I’m working on my Template::Caribou, I’m using it to generate the contact sheet. As Caribou is a work in progress, the following will not work out-of-the-box for you, but just for giggles, here’s what the template looks like:
package ContactSheet;
use 5.10.0;
use Moose;
use Method::Signatures;
use Template::Caribou::Utils;
use Template::Caribou::Tags::HTML;
with 'Template::Caribou';
has root_url => (
is => 'ro',
);
has shots => (
traits => [ 'Array' ],
is => 'ro',
default => sub { [] },
handles => {
all_shots => 'elements',
add_shot => 'push',
},
);
template 'main' => sub {
my ($self) = @_;
html {
head {
style {
say ::RAW q[
body {
background: lightgrey;
}
div {
width: 300px;
margin: 10px;
display: inline-block;
vertical-align: top;
text-align: center;
}
];
}
};
body {
show( 'shoot', @$_ ) for $self->all_shots;
}
};
};
template 'shoot' => method ( $url, $screen, $thumb ) {
div {
div {
anchor $self->root_url . "/$url" => sub {
$url;
}
};
anchor $screen => sub {
image $thumb, sub { };
};
}
};
Step 6. Putting it together
And we are done. We assemble the different parts together in the script
’website_contactsheet.pl
‘:
#!/usr/bin/env perl
use 5.10.0;
use strict;
use warnings;
use Test::More;
use Test::WWW::Mechanize;
use Test::WWW::Selenium;
use Path::Class qw/ file /;
use GD::Image;
use lib '.';
use ContactSheet;
my $root_url = shift or die;
my $mech = Test::WWW::Mechanize->new;
$mech->get_ok( "$root_url/sitemap" );
my @links = $mech->links;
my $sel = Test::WWW::Selenium->new(
host => "localhost",
port => 4444,
browser => "*firefox",
browser_url => $root_url,
);
$sel->start;
my $template = ContactSheet->new( root_url => $root_url );
for my $url ( map { $_->url } @links ) {
# selenium doesn't like the .xml for some reason
next if $url eq '/sitemap.xml';
$sel->open_ok( $url ) or next;
( my $screenshot_file = $url . ".png" ) =~ s#/#_#g;
# Selenium wants an absolute path
$sel->capture_entire_page_screenshot( file( $screenshot_file )->absolute );
my $screenshot = GD::Image->newFromPng( $screenshot_file );
my ( $width, $height ) = $screenshot->getBounds;
my $scale = 300 / $width;
my $thumbnail = GD::Image->new( 300, $scale * $height );
$thumbnail->copyResampled($screenshot, ( 0 ) x 4,
300, $scale * $height,
$width, $height,
);
( my $thumbfile = $screenshot_file ) =~ s/(?=.png)/_thumbnail/;
print { file( $thumbfile )->openw } $thumbnail->png;
$template->add_shot( [ $url, $screenshot_file, $thumbfile ] );
}
$sel->stop;
diag "creating index file";
print { file('index.html')->openw }
$template->render('main');
done_testing;
Run it:
$ ./website_contactsheet.pl http://enkidu:3000
ok 1 - GET http://enkidu:3000/sitemap
ok 2 - open, /
ok 3 - open, /albums
ok 4 - open, /feeds/acp.atom
ok 5 - open, /magasin
ok 6 - open, /magasin/panier
ok 7 - open, /magasin/paypal/confirmation
ok 8 - open, /magasin/paypal/retour
ok 9 - open, /primes
ok 10 - open, /primes/chienbine
ok 11 - open, /primes/figurines
ok 12 - open, /primes/icones
ok 13 - open, /rentree
ok 14 - open, /sitemap
# creating index file
1..14
Et voilà.
We now have a contact sheet where I can see at a glance… that I still have a lot of work to do. *sigh*