WordPress to Octopress

Octopress

OK. This should be one of the last posts about Octopress. I just need this to sum up the modifications for documentation purposes.


Tools

These are some of the commands and tools I used to convert the posts and images for Octopress.

Export from WordPress

Using exitwp you can convert your posts from WordPress to Markdown files. I needed to apply a patch because the script ran into an error when I used it on the file exported from WordPress version 3.3.1:

$ python exitwp.py
reading: wordpress-xml/blog.xml
Traceback (most recent call last):
  File "exitwp.py", line 296, in <module>
    data=parse_wp_xml(wpe)
  File "exitwp.py", line 61, in parse_wp_xml
    root=tree.parse(file)
  File "/usr/lib/python2.6/xml/etree/ElementTree.py", line 586, in parse
    parser.feed(data)
  File "/usr/lib/python2.6/xml/etree/ElementTree.py", line 1245, in feed
    self._parser.Parse(data, 0)
xml.parsers.expat.ExpatError: unbound prefix: line 107, column 1

I found the fix for this in the comments of the project's GitHub issue tracker:

exitwp.py @ 086b379
@@ -53,7 +53,7 @@ def parse_wp_xml(file):
             'content':"{http://purl.org/rss/1.0/modules/content/}",
             'wfw':"{http://wellformedweb.org/CommentAPI/}",
             'dc':"{http://purl.org/dc/elements/1.1/}",
-            'wp':"{http://wordpress.org/export/1.2/}",
+            'wp':"{http://wordpress.org/export/1.1/}",
             'atom':"{http://www.w3.org/2005/Atom}"
         }
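Before editing the namespace table by hand, it can help to check which export version your own file actually declares. The following is a minimal sketch, not part of exitwp: the function names are hypothetical, and it uses a plain regular expression instead of an XML parser because the parser is the very thing that fails here.

```python
import re

# Matches the URI bound to the "wp" prefix,
# e.g. xmlns:wp="http://wordpress.org/export/1.1/".
WP_NS_PATTERN = re.compile(r'xmlns:wp\s*=\s*["\']([^"\']+)["\']')


def find_wp_namespace(xml_text):
    """Return the URI declared for the wp prefix, or None if there is none."""
    match = WP_NS_PATTERN.search(xml_text)
    return match.group(1) if match else None


def namespace_entry(xml_text):
    """Wrap the URI in braces, the form used in the namespace table of the patch."""
    uri = find_wp_namespace(xml_text)
    return "{%s}" % uri if uri else None
```

To use it, read the export file (for example wordpress-xml/blog.xml) into a string and pass it to namespace_entry; the result is the value to compare against the 'wp' entry in exitwp.py.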

After that I had plenty of Markdown files to play with…

Commands

Center all images that use the photo-tag plugin (this relies on GNU sed, and -i rewrites the files in place):

sed -i -e 's/^\({% photo .*%}\)/<center>\n\1\n<\/center>/' *.markdown
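If GNU sed is not available, the same substitution can be sketched in Python. This is an illustrative equivalent under my own naming (center_photo_tags is hypothetical), not something the photo-tag plugin provides; it wraps every line that starts with a photo tag in a center element.

```python
import re

# A photo tag at the start of a line, up to the last "%}" on that line.
PHOTO_TAG = re.compile(r'^(\{% photo .*%\})', re.MULTILINE)


def center_photo_tags(markdown_text):
    """Wrap each line-leading {% photo ... %} tag in <center> ... </center>."""
    return PHOTO_TAG.sub(r'<center>\n\1\n</center>', markdown_text)
```

Like the sed command, this is not idempotent: running it twice on the same text wraps the tags twice, so apply it once to a fresh copy of the posts.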

Thumbnails

After taking a backup of all images, it was time to work on them.

The photo-tag plugin expects a default name for the thumbnail: the original filename with a trailing _m inserted before the extension (for example, photo.jpg becomes photo_m.jpg).

Create thumbnail copies of all files:

for i in $(find . -type f); do extension="${i##*.}"; cp "${i}" "${i%.*}_m.${extension}"; done
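The naming rule in the loop above can also be written out in Python, which makes the _m convention explicit and copes with spaces in filenames. This is a sketch with hypothetical function names; unlike the shell loop, it skips files that already carry the _m suffix, so running it twice does not create _m_m copies.

```python
import shutil
from pathlib import Path


def thumbnail_path(image_path):
    """Insert _m before the extension: photo.jpg -> photo_m.jpg."""
    p = Path(image_path)
    return p.with_name(p.stem + "_m" + p.suffix)


def copy_as_thumbnails(root):
    """Copy every file under root to its _m name and return the new paths."""
    root = Path(root)
    # Build the list first so freshly created copies are not picked up again.
    originals = sorted(
        p for p in root.rglob("*")
        if p.is_file() and not p.stem.endswith("_m")
    )
    created = []
    for original in originals:
        target = thumbnail_path(original)
        shutil.copy2(original, target)
        created.append(target)
    return created
```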

Resize all thumbnails:

# Duplicate the images first (the same copy step as above, run from _posts)
cd _posts
for i in $(find . -type f); do extension="${i##*.}"; cp "${i}" "${i%.*}_m.${extension}"; done

Create a smaller thumbnail

You need exiftool and ImageMagick (which provides convert) for this one:

for i in $(find . -type f -iname '*_m.*'); do
   # Scale every thumbnail copy that is wider than 320 pixels down to a width of 320 pixels.
   if [[ $(exiftool "${i}" | grep -e '^Image Width' | awk '{ print $4 }') -gt 320 ]]; then convert "${i}" -scale 320 "${i}"; fi
done

Alternative

find . -type f -iname '*_m.*' -exec sh -c 'w=$(exiftool "$1" | grep -e "^Image Width" | awk "{ print \$4 }"); if [ "${w:-0}" -gt 320 ]; then convert "$1" -scale 320 "$1"; fi' sh {} \;
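The decision both commands make is simple arithmetic: leave an image alone if it is at most 320 pixels wide, otherwise shrink it to a width of 320 and keep the aspect ratio. A small sketch of that rule follows; scaled_size is a hypothetical helper, and ImageMagick's own rounding may differ from this by a pixel.

```python
def scaled_size(width, height, max_width=320):
    """Return the (width, height) an image would have after the resize step."""
    if width <= max_width:
        # Narrow enough already: the commands above skip these files.
        return width, height
    # Scale the height by the same factor as the width.
    new_height = max(1, round(height * max_width / width))
    return max_width, new_height
```

For example, a 640x480 picture comes out as 320x240, while a 300x200 picture is left untouched.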

Annoyances

There’s one quite critical error that can cost you hours to track down if you mess up (unless you’re good at Chinese).

At one point, when I tried to generate the HTML output with rake generate, I got this error message:

$ rake generate
## Generating Site with Jekyll
overwrite source/stylesheets/screen.css
/usr/share/ruby-rvm/gems/ruby-1.9.3-p194/gems/maruku-0.6.0/lib/maruku/input/parse_doc.rb:22:in `<top (required)>': iconv will be deprecated in the future, use String#encode instead.
Configuration from .../_config.yml
Building site: source -> public
/usr/share/ruby-rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/psych.rb:203:in `parse': (<unknown>): did not find expected key while parsing a block mapping at line 2 column 1 (Psych::SyntaxError)
     from /usr/share/ruby-rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/psych.rb:203:in `parse_stream'
     from /usr/share/ruby-rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/psych.rb:151:in `parse'
     from /usr/share/ruby-rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/psych.rb:127:in `load'
     from /usr/share/ruby-rvm/gems/ruby-1.9.3-p194/gems/jekyll-0.11.2/lib/jekyll/convertible.rb:33:in `read_yaml'
     from /usr/share/ruby-rvm/gems/ruby-1.9.3-p194/gems/jekyll-0.11.2/lib/jekyll/post.rb:39:in `initialize'
     from ../plugins/preview_unpublished.rb:23:in `new'
     from ../plugins/preview_unpublished.rb:23:in `block in read_posts'
     from ../plugins/preview_unpublished.rb:21:in `each'
     from ../plugins/preview_unpublished.rb:21:in `read_posts'
     from /usr/share/ruby-rvm/gems/ruby-1.9.3-p194/gems/jekyll-0.11.2/lib/jekyll/site.rb:128:in `read_directories'
     from /usr/share/ruby-rvm/gems/ruby-1.9.3-p194/gems/jekyll-0.11.2/lib/jekyll/site.rb:98:in `read'
     from /usr/share/ruby-rvm/gems/ruby-1.9.3-p194/gems/jekyll-0.11.2/lib/jekyll/site.rb:38:in `process'
     from /usr/share/ruby-rvm/gems/ruby-1.9.3-p194/gems/jekyll-0.11.2/bin/jekyll:250:in `<top (required)>'
     from /usr/share/ruby-rvm/gems/ruby-1.9.3-p194/bin/jekyll:19:in `load'
     from /usr/share/ruby-rvm/gems/ruby-1.9.3-p194/bin/jekyll:19:in `<main>'
     from /usr/share/ruby-rvm/gems/ruby-1.9.3-p194/bin/ruby_noexec_wrapper:14:in `eval'
     from /usr/share/ruby-rvm/gems/ruby-1.9.3-p194/bin/ruby_noexec_wrapper:14:in `<main>'

You have almost no chance of guessing what the problem is from this output. The issue is in one of your articles. An article in Chinese pointed me in the right direction. I had recently added categories to some posts and used the wrong syntax to do so. Instead of this syntax (the correct one):

Categories:
- Tech
- Life

I’ve used this one (the wrong one):

Categories: Tech, Life

Having this in a single post or Markdown file is enough to trigger the error. Fix it and run rake generate again; after that the site should build fine.
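Since the stack trace never names the offending file, a small script can search for it. The sketch below follows the diagnosis in this post and flags posts whose front matter has a categories key with a comma-separated value on the same line. It is a heuristic, not a YAML validator; the function names are hypothetical, and it assumes front matter delimited by --- lines and posts ending in .markdown.

```python
import re
from pathlib import Path

# A "categories:" key followed on the same line by a comma-separated value.
# A bracketed list such as "categories: [Tech, Life]" is not flagged.
INLINE_CATEGORIES = re.compile(r'^categories:\s*[^\[\s][^\n]*,', re.IGNORECASE)


def front_matter_lines(text):
    """Return the lines between the opening and closing --- markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    for end, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return lines[1:end]
    return []


def has_inline_categories(text):
    """True if the front matter uses the 'Categories: Tech, Life' form."""
    return any(INLINE_CATEGORIES.match(line) for line in front_matter_lines(text))


def suspicious_posts(posts_dir):
    """List the posts in posts_dir that use the inline categories form."""
    return sorted(
        path for path in Path(posts_dir).glob("*.markdown")
        if has_inline_categories(path.read_text(encoding="utf-8"))
    )
```

Calling suspicious_posts on the _posts directory returns the files to inspect first. If it finds nothing, the syntax error has a different cause and the front matter of each post needs a closer look.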