Hacking Thy Fearful Symmetry

Hacker, hacker coding bright

Making Simple Things Easy

February 9, 2013


There is that web page listing a bunch of PDF files. Is there a way to get them all and, while we are at it, collate 'em into a single document?


#!/usr/bin/env perl 

# usage: $0 <the_url>

use 5.16.0;

use Web::Query;
use LWP::Simple;
use Path::Tiny;
use List::AllUtils qw/ reduce /;
use CAM::PDF;

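# Fetch the page, download every linked PDF, and fold them into one document.
( reduce { $a->appendPDF($b); $a } @{
    wq( $ARGV[0] )
    ->find('a')
    ->filter( sub {
        # keep only the links that point at PDF files
        ( $_[1]->attr('href') // '' ) =~ /\.pdf$/;
    })
    ->map( sub {
        # the hrefs are assumed to be absolute URLs
        my $temp = Path::Tiny->tempfile;
        $temp->spew_raw( get( $_[1]->attr('href') ) );
        CAM::PDF->new( "$temp" );
    })
}) ->cleanoutput('aggregate.pdf');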

You're welcome.
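
For the curious, the only mildly sneaky bit is the reduce: the block appends the next document to the accumulator, then returns the accumulator so the fold keeps rolling. Here is a minimal sketch of that idiom on its own, with no network and no PDFs. ToyDoc and its append method are hypothetical stand-ins for CAM::PDF and appendPDF, invented purely for illustration.

```perl
use strict;
use warnings;
use List::Util qw/ reduce /;

# Hypothetical stand-in for CAM::PDF: a "document" is just a list of pages.
package ToyDoc {
    sub new {
        my ( $class, @pages ) = @_;
        return bless { pages => [@pages] }, $class;
    }

    # Like appendPDF in the script, this modifies the invocant in place.
    sub append {
        my ( $self, $other ) = @_;
        push @{ $self->{pages} }, @{ $other->{pages} };
        return;
    }

    sub pages { return @{ $_[0]{pages} } }
}

my @docs = (
    ToyDoc->new(qw/ a1 a2 /),
    ToyDoc->new('b1'),
    ToyDoc->new(qw/ c1 c2 /),
);

# $a is the accumulator, $b the next element. Returning $a from the
# block is what carries the growing document into the next iteration.
my $merged = reduce { $a->append($b); $a } @docs;

print join( ' ', $merged->pages ), "\n";    # prints "a1 a2 b1 c1 c2"
```

Note that the result is the first document itself, grown in place, rather than a fresh copy. For a throwaway script that is exactly what we want.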


About the author

Yanick Champoux
Perl necrohacker, ACP writer, orchid lover. Slightly bonkers all around. he/him/his