---
author: ''
category: ''
date: 2012/09/03 21:47:01
description: ''
link: ''
priority: ''
slug: improved-wordpresscom-importer
tags: python,nikola
title: Improved Wordpress.com Importer and a Question
type: text
updated: 2012/09/03 21:47:01
url_type: ''
---

Thanks to the cooperation of Humitos, who gave me his WordPress backup, I made some improvements to the wordpress.com import feature of Nikola, my static website/blog generator.

So, if you were to try ``nikola_wordpress_importer`` from master now, it would:

1. Not crash ;-)
2. Download attachments
3. Fix links to attachments so they work on the new site

However, I am now unsure of what exactly is in wordpress.com's export XML file. The posts themselves are in this form::

    Muchas gracias Nico por hacer el video este. Groso, quedó muy bueno.

    [youtube=http://www.youtube.com/watch?hl=es&v=882qxARXa6c]

Two things jump out at me:

1. That's not HTML
2. WTF is that youtube thing?

I am having some success processing it as markdown, since that handles the paragraph breaks and some other stuff. Maybe the youtube embedding is done with a markdown extension? Does anyone know?
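
In the meantime, here is a minimal sketch of one way to deal with the ``[youtube=...]`` marker before handing the text to markdown. It assumes the marker is a WordPress.com-style shortcode rather than a markdown extension, which the export file itself does not confirm. The helper name ``replace_youtube_markers`` and the iframe template are made up for illustration and are not part of Nikola's importer.

```python
import re
from urllib.parse import urlparse, parse_qs

# Matches markers such as [youtube=http://www.youtube.com/watch?hl=es&v=ID]
YOUTUBE_RE = re.compile(r'\[youtube=(?P<url>[^\]\s]+)\]')

# Assumed embed markup; adjust the size and attributes to taste.
EMBED_TEMPLATE = (
    '<iframe width="560" height="315" '
    'src="https://www.youtube.com/embed/{video_id}" '
    'frameborder="0" allowfullscreen></iframe>'
)


def replace_youtube_markers(text):
    """Hypothetical helper: swap [youtube=URL] markers for iframe embeds."""

    def _embed(match):
        # The export is XML, so ampersands may arrive escaped as &amp;
        url = match.group('url').replace('&amp;', '&')
        query = parse_qs(urlparse(url).query)
        video_ids = query.get('v')
        if not video_ids:
            # No "v" parameter found: leave the marker untouched
            return match.group(0)
        return EMBED_TEMPLATE.format(video_id=video_ids[0])

    return YOUTUBE_RE.sub(_embed, text)
```

Running this over the post body first and then processing the result as markdown should keep the paragraph handling, since markdown passes block-level HTML such as the iframe through unchanged. Markers without a ``v`` parameter are left as they are, so nothing is silently lost.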