Improved WordPress.com Importer and a Question

Thanks to the cooperation of Humitos, who gave me his WordPress backup, I made some improvements to the WordPress.com import feature of Nikola, my static website/blog generator.

So, if you were to try to use nikola_wordpress_importer from master now, it would:

  1. Not crash ;-)
  2. Download attachments
  3. Fix links to attachments so they work on the new site
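To give an idea of what "fixing links" means, here is a minimal sketch. It is not the importer's actual code: the function names, the `/wp-content/uploads` prefix and the example URL are all made up for illustration, and it assumes each attachment ends up as a flat file under that prefix.

```python
import posixpath
from urllib.parse import urlparse


def local_path_for(url, prefix="/wp-content/uploads"):
    """Map a remote attachment URL to a (hypothetical) path on the new site."""
    filename = posixpath.basename(urlparse(url).path)
    return posixpath.join(prefix, filename)


def rewrite_attachment_links(content, attachment_urls):
    """Replace every known attachment URL in a post body with its local path."""
    # Longest URLs first, so a URL that is a prefix of another one
    # does not clobber the longer match.
    for url in sorted(attachment_urls, key=len, reverse=True):
        content = content.replace(url, local_path_for(url))
    return content
```

Calling `rewrite_attachment_links(post_body, urls)` with the list of attachment URLs found in the export would return the post body pointing at the downloaded copies.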

However, I am now unsure of what exactly is in wordpress.com's export XML file. The posts themselves are in this form:

Muchas gracias Nico por hacer el video este. Groso, quedó muy bueno.

[youtube=http://www.youtube.com/watch?hl=es&v=882qxARXa6c]

Two things jump out at me:

  1. That's not HTML
  2. WTF is that youtube thing?

I am having some success processing it as markdown, since that handles the paragraph breaks and some other stuff. Maybe the youtube embedding is done with a markdown extension?

Does anyone know?
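In the meantime, one possible workaround is to expand the youtube marker by hand before handing the text to markdown. The sketch below assumes the marker always looks like the sample above (`[youtube=URL]` with the video id in the `v` query parameter, and the URL not XML-escaped); the function names and the iframe markup are my own choices, not something taken from the export format.

```python
import re
from urllib.parse import parse_qs, urlparse

# Matches markers like [youtube=http://www.youtube.com/watch?v=ID]
YOUTUBE_MARKER = re.compile(r"\[youtube=(?P<url>[^\]\s]+)\]")


def youtube_embed(match):
    """Turn one [youtube=...] match into an iframe, or leave it untouched."""
    query = parse_qs(urlparse(match.group("url")).query)
    video_ids = query.get("v")
    if not video_ids:
        # No video id found: keep the original text rather than guess.
        return match.group(0)
    return (
        '<iframe width="560" height="315" '
        'src="https://www.youtube.com/embed/{0}" '
        'frameborder="0" allowfullscreen></iframe>'
    ).format(video_ids[0])


def expand_youtube_markers(text):
    """Replace every [youtube=...] marker in a post body with embed HTML."""
    return YOUTUBE_MARKER.sub(youtube_embed, text)
```

Since markdown passes raw HTML blocks through, running `expand_youtube_markers(post_body)` first and then the markdown conversion should leave the embed intact.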
