Hacking Thy Fearful Symmetry

Hacker, hacker coding bright

Making Simple Things Easy

created: February 9, 2013, last updated: February 10, 2013

Hai!

There is this web page listing a bunch of PDF files. Is there a way to get them all and, while we are at it, collate 'em into a single document?

Kthxbai


#!/usr/bin/env perl

# usage: $0 <the_url>

use 5.16.0;

use Web::Query;
use LWP::Simple;
use Path::Tiny;
use List::AllUtils qw/ reduce /;
use CAM::PDF;

# fold every downloaded PDF into the first one, then save the result
( reduce { $a->appendPDF($b); $a } @{
    wq( $ARGV[0] )
    ->find('a')
    # keep only the links pointing at PDF files
    ->filter( sub {
        ( $_[1]->attr('href') // '' ) =~ /\.pdf$/i;
    })
    # download each PDF to a temp file (as raw bytes) and load it
    ->map( sub {
        my $temp = Path::Tiny->tempfile;
        $temp->spew_raw( get( $_[1]->attr('href') ) );
        CAM::PDF->new($temp);
    })
}) ->cleanoutput('aggregate.pdf');

You're welcome.
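
One caveat: the script hands each href to get() as-is, so it quietly assumes the page uses absolute links. Below is a minimal sketch of how relative links could be resolved first, assuming the URI module is installed (it comes along with LWP). The helper name absolutize is hypothetical and the example.com URLs are made up for illustration.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper (not part of the original script): turn an href
# found on the page into an absolute URL, using the page's URL as base.
sub absolutize {
    my ( $href, $page_url ) = @_;
    return URI->new_abs( $href, $page_url )->as_string;
}

# made-up URLs, for illustration only
my $page = 'http://example.com/docs/index.html';

# a relative link is resolved against the page's location
print absolutize( 'files/report.pdf', $page ), "\n";

# an absolute link comes back unchanged
print absolutize( 'http://example.org/other.pdf', $page ), "\n";
```

In the script, that would amount to giving get() the result of absolutize( $_[1]->attr('href'), $ARGV[0] ) instead of the bare href.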
