Google Apps migration

I've been running the domain sandberg.pp.se for several years. Ever since the domain first saw the light of day, mail has been managed by a local server and UUCPssh. I had been planning to migrate critical services, such as e-mail, to a managed facility for quite some time. When UUCPssh decided to shut down in late September 2009, I realized that I would soon be without a backup MX. This meant that the migration suddenly became urgent.

For various reasons I chose to use Google Apps on the domain. Price was obviously an important factor, but I'm still a bit worried about vendor lock-in.

Setting up Google Apps

Setting up Google Apps was pretty straightforward. It turns out that this was the easiest part of the entire process. Apart from reading the Terms of Use and other documents written in legalese, registering the domain with Google and reconfiguring the DNS server was a piece of cake.

Administration is pretty straightforward too, but creating large numbers of users in the web interface is tedious. I wish there was a command line tool for that...

Migration

It turns out that the most difficult part of the entire process was the actual data migration.

Mail

I must admit that I was a bit naïve when I thought that this was going to be easy. I started to migrate mail using mailutil from the University of Washington to copy mail from my server to the IMAP server at Google. Now, that actually worked for the most part, but...

It turns out that GMail has the very, very annoying habit of displaying (some) mail in the order the mail arrived at their server. I didn't notice this at first, since I had copied some older messages to test if everything worked. For some reason, messages older than ~1 year appear to be displayed correctly in GMail, but the time stamp on newer messages is set to the time the message arrived at the server.

Now, this seems really stupid, but it makes sense not to use the date field in the message, since it is sometimes forged by spammers and not always correct anyway. Using the arrival time only works well if the delay between sending the message and its arrival at the destination is small. This obviously works fairly well in the common case where messages are delivered using SMTP. However, when migrating you want to be able to push messages by some other means, e.g. IMAP. It turns out that someone thought it would be a brilliant idea to use this concept for IMAP as well as SMTP. As a result, messages copied to the server using IMAP were displayed in the web interface as having arrived at the time they were copied to the server.
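To make the two notions of "when" concrete, here is a minimal Python sketch that pulls both time stamps out of a raw message: the Date header written by the sender, and the time stamp in the topmost Received header, which the last server to handle the message added. The function names and the sample message are made up for illustration, and this only shows the general idea, not how GMail actually decides what to display.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def header_date(raw):
    """Return the sender-supplied Date header as a datetime."""
    msg = message_from_string(raw)
    return parsedate_to_datetime(msg["Date"])


def arrival_date(raw):
    """Return the time stamp of the topmost Received header, or None.

    Each relay prepends its own Received header, so the first one in the
    message was written by the last server that handled it.
    """
    msg = message_from_string(raw)
    received = msg.get_all("Received", [])
    if not received:
        return None
    # The time stamp follows the last semicolon of the header value.
    stamp = received[0].rsplit(";", 1)[-1].strip()
    return parsedate_to_datetime(stamp)


# Hypothetical message: written in March, handed to the server in October.
SAMPLE = (
    "Received: from mail.example.org by mx.example.net;"
    " Mon, 05 Oct 2009 18:30:00 +0200\n"
    "Date: Tue, 03 Mar 2009 09:15:00 +0100\n"
    "From: alice@example.org\n"
    "To: bob@example.net\n"
    "Subject: test\n"
    "\n"
    "Hello\n"
)

if __name__ == "__main__":
    sent = header_date(SAMPLE)
    arrived = arrival_date(SAMPLE)
    print("Date header:", sent.isoformat())
    print("Arrival:    ", arrived.isoformat())
    print("Difference: ", arrived - sent)
```

For normal SMTP delivery the difference is a matter of seconds. For a message pushed during a migration it can be months, which is exactly the case where sorting by arrival time falls apart.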

Having invalid reception times, even though this was only visible in the web interface, is clearly not acceptable. So, how can this be solved?
  • Test/buy the Premier Edition and use the online migration tool.
  • Use the email uploader tool for Windows from Google.
The first alternative would work, but I don't want the hassle of testing Premier for one month and then canceling the subscription. The second alternative didn't work on a large scale, since it was only available for Windows and only supported offline message stores.

Fortunately, the mail uploader was open source, so I quickly discovered that the Google Apps Mail Migration API was actually available on the free version. This was of course not mentioned in the API documentation. In fact, the documentation clearly stated that the Premier Edition was required.

To sum things up, I hacked together a small script that took care of most of the mail migration. It is available under the hacks section. I haven't been able to figure out how to set the 'replied' flag on imported messages yet, so that flag won't be migrated.
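This is not the script from the hacks section, just a minimal Python sketch of what the reading half of such a script could look like. It assumes the local mail is stored in mbox format (which may not match the actual setup), and the dictionary keys are hypothetical names chosen for illustration, not property names from the migration API. The upload step is left out entirely.

```python
import mailbox
from email.utils import parsedate_to_datetime


def describe_message(msg):
    """Collect per-message state a migration script would want to carry over.

    mbox flags: R = read, O = old, D = deleted, F = flagged, A = answered.
    """
    flags = msg.get_flags()
    date = msg["Date"]
    return {
        "date": parsedate_to_datetime(date) if date else None,
        "unread": "R" not in flags,
        "starred": "F" in flags,
        # The 'replied' state lives here; this is the flag that could not
        # be transferred.
        "answered": "A" in flags,
    }


def walk_mbox(path):
    """Yield (state, raw_bytes) for every message in the mbox at `path`."""
    box = mailbox.mbox(path)
    try:
        for msg in box:
            yield describe_message(msg), msg.as_bytes()
    finally:
        box.close()
```

Each yielded pair holds the message state plus the untouched raw message, which is what an uploader would need to send along so that the original Date header is preserved.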

Calendar

TBD

Contacts

TBD
