Migrating From Blogger to Wordpress
Feb 29th, 2008 by Doug
As I related previously, I just upgraded to Wordpress from
Blogger. I’m using Wordpress 2.2 under Debian Etch. There were a few
snags in the upgrade, so I’d like to tell you about my solutions to
them.
Preparing for the Jump
The first thing I found was that Wordpress 2.2 has a nice new Blogger import feature, but it requires that you are using the “new” Blogger (called “Beta” for quite some time), not the old. Since I had some Blogger template modifications, I had been resisting the urge to upgrade, but this forced my hand. I used the Blogger control panel to upgrade my template, then added back my modifications for shaded blockquote and code boxes. I had also previously added some JavaScript code that handled “Read More..” links in posts, but I left this out of the upgraded template.
Authentication Failures
Next, I logged into my Wordpress control panel and went to Manage->Import->Blogger. There was an “Authorize” link, which I clicked, but the first time I did this it failed Google authentication with the message “We were not able to gain access to your account. Try starting over”. When I clicked on “Clear account information” and tried again, the same error was displayed. After a little digging on the Wordpress support forums, I found a couple of solutions. One is to edit the PHP file responsible for the import; the other is to use wordpress.com as a waypoint for imports to your own installation. I chose the former, since this is my own server and editing a file is much quicker than the other method. The file is wp-admin/import/blogger.php, relative to your web root, so if you have a site hosted in /var/www/foo, the file will be /var/www/foo/wp-admin/import/blogger.php. Edit this file, search for the text Host: www2.blogger.com (around line 84), and change it to Host: www.blogger.com. That’s it. After that change, the importer worked like a charm. One note: this was fixed in Wordpress version 2.3, so if you are using that version or higher, you should be fine. The importer also won’t change your Blogger blog in any way; it just reads the data, so it’s safe to experiment with if you’re not sure Wordpress is for you.
Post Formatting
The next issue related to the fact that I edited my blog posts using Emacs (I now use the Firefox extension It’s All Text! to integrate more smoothly with Emacs). In text and HTML modes where you are using Emacs’ auto-fill, Emacs inserts soft newlines: linebreaks added only to wrap the text on-screen, not to mark the end of a paragraph. Blogger’s HTML post editor ignores these linebreaks in pasted text (as it should), but the Wordpress editor does not; it treats these pseudo-linebreaks as real (hard) newlines. The end result was that every one of my imported posts had bizarre formatting when it was displayed on-screen - this even broke a bunch of hyperlinks where the link text spanned a line boundary.
After more digging, I found a Wordpress plugin that was tailor-made to deal with this problem - Alex King’s WP Unformatted plugin. Once I had this installed in my wp-content/plugins directory, I could set a custom field in each new post with a key of sponge and a value of 1.
The effect of this is to cause Wordpress to ignore the linebreaks in pasted text. This works great for new posts, but what about my 100-some-odd existing posts? This is where GUIs really fall down - it would take me forever to manually set the required custom field in each of my posts. Not to be deterred, I jumped into the MySQL shell and started poking around the database used for the blog. I found that the “geek_postmeta” table (‘geek_’ is my database prefix; yours will be different, and is specified in your wp-config.php file) held each post’s custom field data. I drilled in on one of the posts where I had set the ‘sponge’ field manually. Notice the row with the meta_key of sponge and the meta_value of 1:
mysql> describe geek_postmeta;
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| meta_id    | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| post_id    | bigint(20)   | NO   | MUL | 0       |                |
| meta_key   | varchar(255) | YES  | MUL | NULL    |                |
| meta_value | longtext     | YES  |     | NULL    |                |
+------------+--------------+------+-----+---------+----------------+
4 rows in set (0.00 sec)
mysql> select * from geek_postmeta where post_id = 122;
+---------+---------+-------------------+--------------------...-+
| meta_id | post_id | meta_key          | meta_value         ... |
+---------+---------+-------------------+--------------------...-+
|     356 |     122 | blogger_blog      | blog.unixlore.net  ... |
|     357 |     122 | blogger_author    | Doug               ... |
|     358 |     122 | blogger_permalink | /2007/11/great-fire... |
|     405 |     122 | sponge            | 1                  ... |
+---------+---------+-------------------+--------------------...-+
4 rows in set (0.00 sec)
mysql>
I needed one more piece of information to proceed - the post IDs of all my posts. The post ID is the number displayed to the left of each post in the Wordpress admin panel, but you can also use the MySQL shell to get the first and last post IDs:
mysql> select MIN(ID) from geek_posts;
+---------+
| MIN(ID) |
+---------+
|       3 |
+---------+
1 row in set (0.00 sec)
mysql> select MAX(ID) from geek_posts;
+---------+
| MAX(ID) |
+---------+
|     141 |
+---------+
1 row in set (0.00 sec)
I was now ready to script a solution. This one is in Perl, but feel free to substitute your favorite language:
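#!/usr/bin/perl
use strict;
use warnings;
use DBI;
my ($sth,$sql);
# RaiseError makes DBI die on any database error instead of failing silently.
my $dbh = DBI->connect('DBI:mysql:wordpress:localhost','user','pass',
                       { RaiseError => 1 });
# Prepare the statement once; the post ID is bound on each execute().
$sql = "INSERT INTO geek_postmeta (post_id,meta_key,meta_value) "
     . "VALUES (?,'sponge','1')";
$sth = $dbh->prepare($sql);
for (my $i=3; $i<=141;++$i) {
    # Skip the posts where the custom field was already set by hand.
    next if ($i == 122 || $i == 129 || $i == 123);
    print "Executing SQL: INSERT INTO geek_postmeta ",
          "(post_id,meta_key,meta_value) VALUES ('$i','sponge','1')\n";
    $sth->execute($i);
}
$dbh->disconnect;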
You need to modify a few things to use this code:
- You need to specify your database username and password in the DBI->connect line.
- You need to change the table prefix geek_ to match your own in the line that begins $sql = "INSERT INTO geek_postmeta…
- You need to edit the starting and ending post IDs in the for loop. Mine were 3 and 141, respectively.
- You may or may not need a line like next if ($i == 122 || $i == 129 || $i == 123). I used this to skip the post IDs where I had already set the custom field manually. Just edit it appropriately or comment it out if you haven’t manually changed any posts.
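If you would rather not hard-code the ID range and the skip list, the database itself can report which posts still lack the field. The following is only a minimal sketch of that idea in Python, not the script described above. It uses an in-memory SQLite database as a stand-in for MySQL so that it runs anywhere, and the names add_sponge_fields and demo_database are made up for illustration. In a real Wordpress database the posts table also holds pages and attachments, so you may want to narrow the query further.

```python
import sqlite3


def add_sponge_fields(conn, prefix="geek_"):
    """Add sponge=1 to every post that lacks it; return the post IDs touched."""
    posts, meta = f"{prefix}posts", f"{prefix}postmeta"
    # Select only posts with no existing 'sponge' row, so nothing is duplicated.
    rows = conn.execute(
        f"SELECT ID FROM {posts} WHERE ID NOT IN "
        f"(SELECT post_id FROM {meta} WHERE meta_key = 'sponge') ORDER BY ID"
    ).fetchall()
    ids = [row[0] for row in rows]
    # Placeholders keep the post ID out of the SQL string itself.
    conn.executemany(
        f"INSERT INTO {meta} (post_id, meta_key, meta_value) "
        "VALUES (?, 'sponge', '1')",
        [(post_id,) for post_id in ids],
    )
    conn.commit()
    return ids


def demo_database():
    """Build a tiny stand-in database (assumption: simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE geek_posts (ID INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE geek_postmeta ("
        "meta_id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "post_id INTEGER, meta_key TEXT, meta_value TEXT)"
    )
    conn.executemany("INSERT INTO geek_posts (ID) VALUES (?)",
                     [(3,), (122,), (141,)])
    # Post 122 already has the field set by hand.
    conn.execute(
        "INSERT INTO geek_postmeta (post_id, meta_key, meta_value) "
        "VALUES (122, 'sponge', '1')"
    )
    return conn
```

Because the query skips posts that already have a sponge row, running the function a second time inserts nothing, which makes it safe to re-run after importing more posts.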
Permalink Structure
Another issue I ran across had to do with permalinks. Blogger was using the following link structure:
http://blog.unixlore.net/year/month/post-name.html
Keeping this structure was important to me because I have a few posts that get quite a bit of traffic, and I did not want users to get 404 errors from broken links. I duplicated the Blogger permalink structure by specifying a custom format in the Wordpress control panel, using the format specifier /%year%/%monthnum%/%postname%.html.
I also had to create an .htaccess file in my web root with the following contents. Wordpress displayed them on-screen once I set the permalink format; your installation may be able to create the file for you if permissions are set appropriately.
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
After all this, there were a few posts where the permalinks did not match, most notably where the word ‘a’ was in the post title (Wordpress includes the ‘a’, Blogger does not). You can individually edit a post’s permalink in Wordpress by changing the Post Slug from the post editor as needed.
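One way to find the mismatched posts without clicking through each one is to compare the blogger_permalink value that the importer stored in the postmeta table with the path Wordpress would generate for the same post. Here is a minimal sketch in Python, assuming you have already pulled the post ID, date, slug and Blogger path out of the database yourself. The function names and the sample rows in the usage note are hypothetical.

```python
def wordpress_permalink(year, month, slug):
    """Build the path for the /%year%/%monthnum%/%postname%.html structure."""
    # %monthnum% is the two-digit month, so May becomes "05".
    return f"/{year:04d}/{month:02d}/{slug}.html"


def find_mismatches(posts):
    """Return (post_id, blogger_path, wordpress_path) for paths that differ.

    posts is an iterable of (post_id, year, month, slug, blogger_path) tuples.
    """
    mismatches = []
    for post_id, year, month, slug, blogger_path in posts:
        expected = wordpress_permalink(year, month, slug)
        if expected != blogger_path:
            mismatches.append((post_id, blogger_path, expected))
    return mismatches
```

For example, given the made-up rows (10, 2007, 5, "installing-a-widget", "/2007/05/installing-widget.html") and (11, 2007, 6, "hello-world", "/2007/06/hello-world.html"), only post 10 is reported, and its Post Slug would need to be changed to installing-widget.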
The Blogroll
The final issue I came across was how to transfer my blogroll and lists of links I had in my Blogger sidebar. These were specified as HTML unordered lists in my Blogger template:
<ul>
<li><a href="http://blog.unixlore.net/2006/...</li>
<li><a href="http://blog.unixlore.net/2006/...</li>
<li><a href="http://blog.unixlore.net/2006/...</li>
<li><a href="http://blog.unixlore.net/2006/...</li>
</ul>
The easiest way to transfer these lists over is to place a text widget into the site’s sidebar using the control panel (go to Presentation->Widgets), then edit that specific text widget and paste the unordered list directly into the widget’s text area, adding an appropriate title. If you want to be able to categorize your links, you’ll need to convert the list into OPML first (I used this site to convert my list of HTML links into OPML). Copy/paste the OPML into a file, then import it via Blogroll->Import Links. The OPML for the above list of links looks like this:
<opml version="1.1">
<head>
<title/>
<dateModified/>
<ownerName>mmpower</ownerName>
<ownerEmail>foo@example.com</ownerEmail>
</head>
<body>
<outline type="link" url="http://blog.unixlore.net/... />
<outline type="link" url="http://blog.unixlore.net/... />
<outline type="link" url="http://blog.unixlore.net/... />
<outline type="link" url="http://blog.unixlore.net/... />
</body>
</opml>
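If you would rather do the conversion locally than rely on a web-based converter, it is small enough to script. The following Python sketch uses only the standard library. It is not the converter mentioned above, the names LinkCollector and links_to_opml are made up for illustration, and the text attribute on each outline is an assumption (it holds the link text and is not shown in the truncated output above).

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class LinkCollector(HTMLParser):
    """Collect (url, link text) pairs from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def links_to_opml(html):
    """Turn an HTML list of links into a minimal OPML 1.1 document string."""
    collector = LinkCollector()
    collector.feed(html)
    opml = ET.Element("opml", version="1.1")
    ET.SubElement(opml, "head")
    body = ET.SubElement(opml, "body")
    for url, text in collector.links:
        # Assumption: the link text goes into the outline's text attribute.
        ET.SubElement(body, "outline", type="link", text=text, url=url)
    return ET.tostring(opml, encoding="unicode")
```

Save the returned string to a file and it can be imported the same way as the converter’s output. Anchors without an href attribute are skipped.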
That’s pretty much everything. Probably the hardest part was dealing with the unformatted posts; if you use the various visual post editors, you probably won’t run into that problem.