Michael Crosby


How to back up Gmail the right way

So like many people, my life is stored in my Gmail. I trust Google to keep my data safe, but since so many things, from contacts to many different online accounts, are registered under my Gmail address, it is important to have that data whenever I need it. There are some services online that will back up your Gmail, but the backup is still stored on their servers. I feel a lot better when it is stored locally on my own computers, so that I have a copy of my emails whenever I need them. So this is how you do it.

Got Your Back

Got Your Back is a Python app that connects to your Gmail via OAuth and pulls down your messages as simple text files. They have an .eml extension, but you can open all of these files with a text editor and see the contents of your emails. It is fast and also does incremental backups: after the initial backup of your account, Got Your Back only downloads the messages that it does not already have, instead of downloading everything each time you do a backup. I use a simple cron job to get all my new messages every day, automatically, without having to think about it.
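Because the backed-up messages are plain-text .eml files, you are not limited to a text editor; Python's standard `email` module can read them too. This is only a minimal sketch, not part of Got Your Back: the function name `summarize_eml` and the path you pass it are my own placeholders.

```python
import email
from email import policy


def summarize_eml(path):
    """Return a few headers and the text body of one .eml file."""
    with open(path, "rb") as handle:
        message = email.message_from_binary_file(handle, policy=policy.default)

    # get_body() picks the preferred text part; it can be None for odd messages.
    body_part = message.get_body(preferencelist=("plain", "html"))
    body = body_part.get_content() if body_part is not None else ""

    return {
        "from": message["From"],
        "subject": message["Subject"],
        "date": message["Date"],
        "body": body,
    }
```

Calling `summarize_eml("some-message.eml")` on any file from the backup gives you a small dictionary you can print or search.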

Installing

You can get Got Your Back from its download page. They have downloads for Unix and Windows systems. I have a server running Debian that handles all my account backups, so this tutorial will focus on running gyb (Got Your Back) on Unix systems. Download the source package or use Subversion to check out the source onto your computer:

svn checkout http://got-your-back.googlecode.com/svn/trunk/ got-your-back-read-only

If you download the zip, extract it to whatever folder you want and cd into the directory. Because gyb uses OAuth to log in to your account, you will need to run the script the first time on a computer with a desktop. Since I use a server over ssh to download my backups, I need to get the OAuth token on a computer with a web browser first.

Getting OAuth Tokens

So now that we are in the got-your-back directory, run this command to authenticate and get an estimate of the size and number of messages that need to be backed up.

python gyb.py --email youremail@gmail.com --estimate

Make sure that you enter your correct email address. If you are already logged into your email account on your computer and you are trying to back up a different account, you will need to log out first. gyb will tell you that it is going to launch a browser, so just hit Enter and you will see a Gmail page asking if you want to authorize gyb to access your account. Hit yes and then go back to the command line. Now hit Enter again if it is still waiting, and wait for it to estimate your mailbox's size and number of messages.

So now it should give you an estimate. I had about 28,000 messages to download, at roughly 800 MB. It does not download attachments. If you do an ls now you will see a new file, named after your email address with a .cfg extension. This file holds your OAuth token, so if you are going to run the backups on a server over ssh, you will have to copy it over to your server for the backups to work. Either zip, tar, or 7z the entire gyb directory with your email's config file in it and transfer it to your server. If you are just going to run it on the same computer, you don't have to do anything yet.
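If you would rather script the packaging step, here is a rough sketch using Python's standard `tarfile` module. The function name `pack_gyb_dir` and both paths are hypothetical; the only thing taken from the steps above is that the directory must contain the .cfg token file before you ship it.

```python
import tarfile
from pathlib import Path


def pack_gyb_dir(gyb_dir, archive_path):
    """Create a .tar.gz of gyb_dir, refusing if no .cfg token file is inside."""
    gyb_dir = Path(gyb_dir)
    cfg_files = sorted(gyb_dir.glob("*.cfg"))
    if not cfg_files:
        raise FileNotFoundError(f"no .cfg token file found in {gyb_dir}")

    # Write the archive outside gyb_dir so it does not try to include itself.
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname keeps paths in the archive relative to the directory name.
        archive.add(gyb_dir, arcname=gyb_dir.name)
    return [path.name for path in cfg_files]
```

The returned list tells you which token files went into the archive. Since the .cfg file holds your OAuth token, treat the archive like a password and only copy it over a secure channel such as scp.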

Initial backup

So now that you have the config file and gyb on whatever computer you want to set up the automated backups on, let's run the initial backup. First, let's make the gyb.py file executable.

chmod 755 gyb.py

Then run this command to perform the initial backup.

python gyb.py --email youremail@gmail.com --backup

This will perform the initial backup, so depending on how many messages your estimate reported, this may take a while.
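Once the backup finishes, a quick sanity check is to count the .eml files on disk and compare that number with the estimate. The sketch below assumes only that the messages end up as .eml files somewhere under a directory you point it at; the function name and the directory are placeholders, not something gyb provides.

```python
import os


def count_backed_up_messages(backup_dir):
    """Count .eml files under backup_dir and sum their sizes in bytes."""
    count = 0
    total_bytes = 0
    for folder, _subdirs, files in os.walk(backup_dir):
        for name in files:
            if name.lower().endswith(".eml"):
                count += 1
                total_bytes += os.path.getsize(os.path.join(folder, name))
    return count, total_bytes
```

Don't expect the byte total to match the estimate exactly; the message count is the more useful number to compare.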

Cron job for daily backups

So now that we have an initial backup, we need to do an incremental backup every day for all of the new messages that we receive. Let's add a cron job with crontab that will run gyb.py every day. Remember where that file is on your computer, because we will need its full path.

crontab -e

Then input this into your crontab file with the proper changes.

0 2 * * * python /home/michael/got-your-back-read-only/gyb.py --email youremail@gmail.com --backup

So what this does is make a backup at 2am every day, running gyb.py with your specified email. Easy and done. It will now do incremental backups of only your new messages, and you will always be able to get that data because it is backed up locally on your computer or server.
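If the crontab syntax looks cryptic: an entry is five schedule fields (minute, hour, day of month, month, day of week) followed by the command, so `0 2 * * *` means minute 0 of hour 2 on every day. Here is a small illustrative sketch that splits an entry into those parts; the helper `split_cron_line` is my own and has nothing to do with cron or gyb themselves.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron_line(line):
    """Split a crontab entry into its five schedule fields and the command."""
    parts = line.strip().split(None, 5)
    if len(parts) < 6:
        raise ValueError("expected five schedule fields followed by a command")
    schedule = dict(zip(CRON_FIELDS, parts[:5]))
    return schedule, parts[5]


entry = (
    "0 2 * * * python /home/michael/got-your-back-read-only/gyb.py "
    "--email youremail@gmail.com --backup"
)
schedule, command = split_cron_line(entry)
print(schedule["hour"], schedule["minute"])
```

To run the backup at a different time, change the first two fields of the entry and leave the three asterisks alone.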
