As I related previously, I just upgraded to Wordpress from
Blogger. I’m using Wordpress 2.2 under Debian Etch. There were a few
snags in the upgrade, so I’d like to tell you about my solutions to
them.
Preparing for the Jump
The first thing I found was that Wordpress 2.2 has a nice new Blogger
import feature, but it requires that you are using the “new” Blogger
(called “Beta” for quite some time), not the old. Since I had some
Blogger template modifications, I had been resisting the urge to
upgrade. This forced my hand, however. I used the Blogger control
panel to upgrade my template, then added back my modifications for
shaded blockquote and code boxes. I also had previously added some
JavaScript code that handled “Read More..” links in posts, but I left
this out of the upgraded template.
Authentication Failures
Next, I logged into my Wordpress control panel, and went to
Manage->Import->Blogger. There
was an “Authorize” link, which I clicked on, but the first time I did
this it failed Google authentication with the message “We were not
able to gain access to your account. Try starting over”. When I
clicked on “Clear account information” and tried again, the same error
would be displayed. After a little digging on the Wordpress support
forums, I found a couple of solutions. One is to edit the PHP file
responsible for the import; the other is to use wordpress.com as a
waypoint for imports to your own installation. I chose the former,
since this is my own server and editing a file is much quicker than
the other method. The file is wp-admin/import/blogger.php, relative to
your web root, so if you have a site hosted in /var/www/foo, the file
will be /var/www/foo/wp-admin/import/blogger.php. Edit this file,
search for the text Host: www2.blogger.com (around line 84), and
change it to Host: www.blogger.com. That’s it. After that change, the
importer worked like a charm. One note: this was fixed in Wordpress
version 2.3, so if you are using that version or higher, you should be
fine. The importer also won’t change your Blogger blog in any way; it
just reads the data, so it’s safe to experiment with if you’re not
sure Wordpress is for you.
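If you would rather script that one-line edit than open an editor, here is a minimal sketch in Python. It assumes the two Host strings quoted above appear verbatim in the file; the function names and the sample line are made up for illustration, and the sample is not the real contents of blogger.php.

```python
from pathlib import Path

OLD_HOST = "Host: www2.blogger.com"
NEW_HOST = "Host: www.blogger.com"

def patch_blogger_host(source):
    """Return (patched_source, number_of_replacements)."""
    count = source.count(OLD_HOST)
    return source.replace(OLD_HOST, NEW_HOST), count

def patch_file(path):
    """Patch the file in place, keeping a .orig backup; return the replacement count."""
    path = Path(path)
    original = path.read_text(encoding="utf-8")
    patched, count = patch_blogger_host(original)
    if count:
        path.with_name(path.name + ".orig").write_text(original, encoding="utf-8")
        path.write_text(patched, encoding="utf-8")
    return count

# Demo on an in-memory string (a made-up line, not the real file contents).
sample = 'send_header("Host: www2.blogger.com");'
patched, count = patch_blogger_host(sample)
print(count, patched)
```

To apply it for real you would call patch_file() with the path to your own copy, for example /var/www/foo/wp-admin/import/blogger.php. A count of zero means the string was not found, which is what I would expect on 2.3 or later.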
Post Formatting
The next issue related to the fact that I edited my blog posts using
Emacs (I now use the Firefox extension It’s All Text to integrate more
smoothly with Emacs). In text and HTML modes with auto-fill enabled,
Emacs inserts soft newlines, which are not meant as true line breaks
but are used just to format the text on-screen. Blogger’s HTML post
editor ignores these linebreaks in pasted text (as it should), but the
Wordpress editor does not; it converts these pseudo-linebreaks to real
(hard) newlines. The end result was that every one of my imported
posts had bizarre formatting when it was displayed on-screen, and a
bunch of hyperlinks broke where the link text spanned a line boundary.
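To make the problem concrete, here is a small Python sketch of the transformation the posts needed: join the wrapped lines inside each paragraph while keeping blank-line paragraph breaks. This is only an illustration of the idea, not what any Wordpress plugin does internally, and the function name is my own.

```python
import re

def unwrap_soft_breaks(text):
    """Join wrapped lines within each paragraph; keep blank-line paragraph breaks."""
    # Paragraphs are separated by one or more blank lines.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return "\n\n".join(
        " ".join(line.strip() for line in paragraph.splitlines())
        for paragraph in paragraphs
    )

# A link whose text was wrapped across two lines by auto-fill.
wrapped = 'See <a href="http://example.com/">my earlier\npost</a> for details.\n\nNext\nparagraph.'
print(unwrap_soft_breaks(wrapped))
```

Note that a naive pass like this would also flatten anything inside pre or code blocks, so it is not something to run blindly over posts that contain code listings.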
After more digging, I found a Wordpress plugin that was tailor-made to
deal with this problem: Alex King’s WP Unformatted plugin. Once I had
this installed in my wp-content/plugins directory, I could set a
custom field in each new post with a key of sponge and a value of 1.
The effect of this is to cause Wordpress to ignore the linebreaks in
pasted text. This works great for new posts, but what about my
100-some-odd existing posts? This is where GUIs really fall down: it
would take me forever to manually set the required custom field in
each of my posts. Not to be deterred, I jumped into the MySQL shell
and started poking around the database used for the blog. I found that
the “geek_postmeta” table (‘geek_’ is my database table prefix; yours
may differ, and is specified in your wp-config.php file) held each
post’s custom field data. I drilled in on one of the posts where I had
set the ‘sponge’ field manually. Notice the row with the meta_key of
sponge and the meta_value of 1:
mysql> describe geek_postmeta;
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| meta_id    | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| post_id    | bigint(20)   | NO   | MUL | 0       |                |
| meta_key   | varchar(255) | YES  | MUL | NULL    |                |
| meta_value | longtext     | YES  |     | NULL    |                |
+------------+--------------+------+-----+---------+----------------+
4 rows in set (0.00 sec)
mysql> select * from geek_postmeta where post_id = 122;
+---------+---------+-------------------+--------------------...-+
| meta_id | post_id | meta_key          | meta_value         ... |
+---------+---------+-------------------+--------------------...-+
|     356 |     122 | blogger_blog      | blog.unixlore.net  ... |
|     357 |     122 | blogger_author    | Doug               ... |
|     358 |     122 | blogger_permalink | /2007/11/great-fire... |
|     405 |     122 | sponge            | 1                  ... |
+---------+---------+-------------------+--------------------...-+
4 rows in set (0.00 sec)
mysql>
I needed one more piece of information to proceed: the post IDs of
all my posts. The post ID is the number displayed to the left of each
post in the Wordpress admin panel, but you can also use the MySQL
shell to get the lowest and highest post IDs:
mysql> select MIN(ID) from geek_posts;
+---------+
| MIN(ID) |
+---------+
|       3 |
+---------+
1 row in set (0.00 sec)
mysql> select MAX(ID) from geek_posts;
+---------+
| MAX(ID) |
+---------+
|     141 |
+---------+
1 row in set (0.00 sec)
I was now ready to script a solution. This one is in Perl, but feel
free to substitute your favorite language:
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
my ($sth,$sql);
# RaiseError makes DBI die on a failed connection or query.
my $dbh = DBI->connect('DBI:mysql:wordpress:localhost','user','pass',
                       { RaiseError => 1 });
# Prepare the statement once; the ? placeholder receives each post ID.
$sql = "INSERT INTO geek_postmeta (post_id,meta_key,meta_value) " .
       "VALUES (?,'sponge','1')";
$sth = $dbh->prepare($sql);
for (my $i=3; $i<=141; ++$i) {
    # Skip the posts where the custom field was already set manually.
    next if ($i == 122 || $i == 129 || $i == 123);
    print "Adding sponge=1 to post ID $i\n";
    $sth->execute($i);
}
$dbh->disconnect;
You need to modify a few things to use this code:
- You need to specify your database username and password in the DBI->connect line.
- You need to change the table prefix geek_ to match your own in the line that begins $sql = "INSERT INTO geek_postmeta…
- You need to edit the starting and ending post IDs in the for loop. Mine were 3 and 141, respectively.
- You may or may not need a line like next if ($i == 122 || $i == 129 || $i == 123). I used this to skip the post IDs where I had already set the custom field manually. Just edit it appropriately or comment it out if you haven’t manually changed any posts.
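If you take up the offer to use another language, here is a rough Python sketch of a variation on the same idea. Instead of looping over an ID range and hard-coding a skip list, it asks the database to insert the sponge row only for posts that do not already have one. To keep the example self-contained it runs against an in-memory SQLite database standing in for MySQL, with cut-down versions of the geek_posts and geek_postmeta tables; the function names and demo rows are my own, and I have not run the statement against a real Wordpress database.

```python
import sqlite3

def add_sponge_flag(conn, prefix="geek_"):
    """Add sponge=1 to every post that lacks it; return the number of rows added."""
    posts, meta = prefix + "posts", prefix + "postmeta"
    before = conn.total_changes
    conn.execute(f"""
        INSERT INTO {meta} (post_id, meta_key, meta_value)
        SELECT p.ID, 'sponge', '1' FROM {posts} AS p
        WHERE NOT EXISTS (SELECT 1 FROM {meta} AS m
                          WHERE m.post_id = p.ID AND m.meta_key = 'sponge')
    """)
    conn.commit()
    return conn.total_changes - before

def make_demo_db():
    """Build a tiny stand-in database: three posts, one already flagged by hand."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE geek_posts (ID INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE geek_postmeta ("
        "meta_id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "post_id INTEGER, meta_key TEXT, meta_value TEXT)"
    )
    conn.executemany("INSERT INTO geek_posts (ID) VALUES (?)", [(3,), (4,), (122,)])
    conn.execute(
        "INSERT INTO geek_postmeta (post_id, meta_key, meta_value) "
        "VALUES (122, 'sponge', '1')"
    )
    conn.commit()
    return conn

conn = make_demo_db()
print("rows added:", add_sponge_flag(conn))
print("rows added on second run:", add_sponge_flag(conn))
```

The appeal of this form is that it is safe to re-run and never touches an ID that has no post behind it. Back up the database before trying anything like it on a live blog.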
Permalink Structure
Another issue I ran across had to do with permalinks. Blogger was
using the following link structure:
http://blog.unixlore.net/year/month/post-name.html
This was important to me because I have a few posts that get quite a
bit of traffic, and I did not want users to get 404 errors from broken
links. I duplicated the Blogger permalink structure by specifying a
custom format in the Wordpress control panel, using the format
specifier /%year%/%monthnum%/%postname%.html. I also had to create an
.htaccess file in my web root. Its contents, shown below, were
displayed on-screen once I set the permalink format (your installation
may be able to create the file for you if permissions are set
appropriately):
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
After all this, there were a few posts where the permalinks did not
match, most notably where the word ‘a’ was in the post title
(Wordpress includes the ‘a’, Blogger does not). You can individually
edit a post’s permalink in Wordpress by changing the Post Slug from
the post editor as needed.
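Hunting for those mismatches by hand gets tedious, so here is a minimal Python sketch of how the check could be automated. It assumes you have already pulled each post’s year, month, slug and original Blogger path (the blogger_permalink custom field shown earlier) out of the database; the function names and sample rows are invented for illustration.

```python
def wordpress_permalink(year, month, slug):
    """Build the path produced by /%year%/%monthnum%/%postname%.html."""
    return f"/{year:04d}/{month:02d}/{slug}.html"

def find_mismatches(posts):
    """posts: iterable of (post_id, year, month, slug, blogger_path) tuples.

    Returns (post_id, blogger_path, wordpress_path) for every post whose
    Wordpress permalink differs from the old Blogger one.
    """
    mismatches = []
    for post_id, year, month, slug, blogger_path in posts:
        expected = wordpress_permalink(year, month, slug)
        if expected != blogger_path:
            mismatches.append((post_id, blogger_path, expected))
    return mismatches

# Made-up rows: the second one differs because of the 'a' in the title.
sample_posts = [
    (10, 2007, 3, "hello-world", "/2007/03/hello-world.html"),
    (11, 2007, 11, "building-a-widget", "/2007/11/building-widget.html"),
]
for post_id, old, new in find_mismatches(sample_posts):
    print(post_id, old, new)
```

Each post it reports is one whose Post Slug needs editing so that the old Blogger link keeps working.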
The Blogroll
The final issue I came across was how to transfer my blogroll and
lists of links I had in my Blogger sidebar. These were specified as
HTML unordered lists in my Blogger template:
<ul>
<li><a href="http://blog.unixlore.net/2006/...</li>
<li><a href="http://blog.unixlore.net/2006/...</li>
<li><a href="http://blog.unixlore.net/2006/...</li>
<li><a href="http://blog.unixlore.net/2006/...</li>
</ul>
The easiest way to transfer these lists over is to place a text widget
into the site’s sidebar using the control panel (go to
Presentation->Widgets), then edit that specific text widget and just
paste the unordered list directly into the widget’s text area, adding
an appropriate title. If you want to be able to categorize your links,
you’ll need to convert the list into OPML first (I used this site to
convert my list of HTML links into OPML). Copy/paste the OPML into a
file, then import it via Blogroll->Import Links. The OPML for the
above list of links looks like this:
<opml version="1.1">
<head>
<title/>
<dateModified/>
<ownerName>mmpower</ownerName>
<ownerEmail>foo@example.com</ownerEmail>
</head>
<body>
<outline type="link" url="http://blog.unixlore.net/... />
<outline type="link" url="http://blog.unixlore.net/... />
<outline type="link" url="http://blog.unixlore.net/... />
<outline type="link" url="http://blog.unixlore.net/... />
</body>
</opml>
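If you would rather not depend on a conversion site, the same transformation can be sketched in a few lines of Python using only the standard library. This is an illustration rather than a tested importer input: it mirrors the outline attributes shown above (type and url) and adds a text attribute holding the link text, which is my assumption about what is useful to keep. The class and function names are my own.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

class LinkCollector(HTMLParser):
    """Collect (url, link text) pairs from the <a> tags in a fragment of HTML."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a href=...> tag.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

def html_list_to_opml(html):
    """Convert the links in an HTML list to a minimal OPML document string."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    opml = ET.Element("opml", version="1.1")
    ET.SubElement(opml, "head")
    body = ET.SubElement(opml, "body")
    for url, text in collector.links:
        ET.SubElement(body, "outline", type="link", text=text, url=url)
    return ET.tostring(opml, encoding="unicode")

# Placeholder links, not the ones from my sidebar.
sample = '<ul><li><a href="http://example.com/one">One</a></li></ul>'
print(html_list_to_opml(sample))
```

Save the output to a file and feed it to Blogroll->Import Links as described above; if the importer rejects it, compare the attributes against a file produced by a known-good converter.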
That’s pretty much everything. Probably the hardest part was dealing
with the unformatted posts; if you use the various visual post
editors, you probably won’t run into that problem.