The default WordPress importer can only transfer as many images as it manages to fetch before PHP's maximum execution time runs out, which makes it almost impossible to migrate a large WordPress.com site with lots of media assets and attachments.
I wrote a simple PHP script that extracts all attachment URLs from a WordPress export file and stores them in a text file with one URL per line:
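<?php
// Collect the URL of every attachment item in the export file.
$a = array();
$x = simplexml_load_file( 'export.xml' );

foreach ( $x->channel->item as $item ) {
    // The wp: elements live in the export namespace.
    $wp = $item->children( 'http://wordpress.org/export/1.2/' );
    if ( $wp && (string) $wp->post_type === 'attachment' ) {
        $a[] = (string) $wp->attachment_url;
    }
}

// Write one URL per line.
file_put_contents( 'export_media.txt', implode( "\n", $a ) );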
We can then use the export_media.txt file together with wget and xargs to download all attachments, like this:
$ xargs -n 1 wget -p -nc < export_media.txt
or using curl:
$ xargs -n 1 curl -O < export_media.txt
Note that the wget command will keep the correct folder structure, whereas curl -O saves every file into the current directory.
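If you prefer curl but still want the files sorted into folders, a small wrapper can approximate the layout wget produces. The following is only a minimal sketch and not part of the original script: url_to_path and download_all are hypothetical helper names, and it assumes that mirroring the URL as host/path/file is the layout you want.

```shell
#!/bin/sh
# Hypothetical helper: turn an attachment URL into a local path that
# mirrors the remote layout (host/dir/file).
url_to_path() {
    path=${1#*://}      # strip the scheme (http:// or https://)
    path=${path%%\?*}   # strip any query string
    printf '%s\n' "$path"
}

# Hypothetical helper: download every URL listed in the file given as $1,
# skipping files that already exist (similar in spirit to wget -nc).
download_all() {
    while IFS= read -r url; do
        [ -n "$url" ] || continue
        dest=$(url_to_path "$url")
        [ -e "$dest" ] && continue
        # --create-dirs makes curl create the missing parent folders for -o.
        curl --fail --silent --show-error --create-dirs -o "$dest" "$url"
    done < "$1"
}

# Usage: download_all export_media.txt
```

Save the functions in a file, source it, and call download_all export_media.txt from the folder where the media should end up.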