How to Download and Migrate Attachments and Images from a Large WordPress.com Blog

The default WordPress importer can only transfer as many images as it can fetch within PHP’s execution time limit, which makes it almost impossible to migrate a large WordPress.com site with lots of media assets and attachments.

I wrote a simple PHP script that extracts all attachment URLs from a WordPress export file and stores them in a text file with one URL per line:

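<?php
// Collect the URL of every attachment item in the WordPress export (WXR) file.
$a = array();
$x = simplexml_load_file('export.xml');

foreach ( $x->channel->item as $item ) {
        // post_type and attachment_url live in the WordPress export namespace;
        // adjust the version in the URI if your export declares a different xmlns:wp.
        $wp = $item->children('http://wordpress.org/export/1.2/');
        if ( (string) $wp->post_type === 'attachment' )
                $a[] = (string) $wp->attachment_url;
}

// Write one URL per line.
file_put_contents( 'export_media.txt', implode( "\n", $a ) );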
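
Before downloading anything, it can be worth checking the list. The following is a minimal sketch, not part of the original script: `clean_url_list` is a hypothetical helper name, and it simply drops blank lines and duplicate URLs so that the line count matches the number of files to expect.

```shell
# Hypothetical helper: remove blank lines and duplicate URLs from a list.
#   $1: input file (e.g. export_media.txt)
#   $2: output file for the cleaned list
clean_url_list() {
    grep -v '^[[:space:]]*$' "$1" | sort -u > "$2"
}
```

For example, `clean_url_list export_media.txt export_media_clean.txt` followed by `wc -l < export_media_clean.txt` prints the number of unique attachment URLs. The cleaned file name is only an illustration; the commands in the rest of this post work on either file.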

We can then use the export_media.txt file together with wget and xargs to download all attachments, like this:

$ xargs -n 1 wget -p -nc < export_media.txt

or using curl:

$ xargs -n 1 curl -O < export_media.txt

Note that the wget command will keep the correct folder structure, whereas curl -O saves every file into the current directory under its remote file name.
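
If only curl is available and the folder structure matters, one possible approach is to derive a local path from each URL and let curl create the directories. This is a sketch under assumptions: `url_to_path` and `fetch_with_dirs` are hypothetical helper names, the local path is simply the URL without its scheme and query string, and `--create-dirs` is the curl option that creates missing directories for the `-o` target.

```shell
# Hypothetical helper: map a URL to a relative local path by dropping
# the scheme (https://) and any query string (?w=300).
url_to_path() {
    path="${1#*://}"
    printf '%s\n' "${path%%\?*}"
}

# Hypothetical helper: download every URL listed in the file given as $1,
# recreating the host/path directories locally.
fetch_with_dirs() {
    # The "|| [ -n "$url" ]" part keeps the last line even when the file
    # has no trailing newline, which is how the PHP script writes it.
    while IFS= read -r url || [ -n "$url" ]; do
        [ -n "$url" ] || continue   # skip blank lines
        curl --silent --show-error --fail --create-dirs \
             -o "$(url_to_path "$url")" "$url"
    done < "$1"
}
```

Calling `fetch_with_dirs export_media.txt` would then place each file under a directory named after its host, followed by the path from the URL.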
