Gmail downloader



Posted by: stak
Posted on: 2009-10-15 23:04:13

If anybody else would like to download the contents of their Gmailbox into nice little RFC822 message files, here's a quick-and-dirty (read: no error detection or handling) Java program to do it. It requires Java 1.6 for the console stuff (System.console()), but can be ported to 1.5 or 1.4 in a pinch.

import java.util.*;
import java.io.*;
import java.net.*;
import javax.net.*;
import javax.net.ssl.*;

public class GmailDownloader {
    public static void main( String[] args ) throws Exception {
        int start = 1;
        int end = -1;
        if (args.length > 0) {
            start = Integer.parseInt( args[0] );
            if (args.length > 1) {
                end = Integer.parseInt( args[1] );
            }
        }

        Console console = System.console();
        String username = console.readLine( "Enter username: " );
        String password = new String( console.readPassword( "Enter password: " ) );

        SocketFactory sf = SSLSocketFactory.getDefault();
        Socket socket = sf.createSocket( "imap.gmail.com", 993 );
        InputStream in = socket.getInputStream();
        BufferedReader br = new BufferedReader( new InputStreamReader( in ) );

        OutputStream out = socket.getOutputStream();
        PrintWriter pw = new PrintWriter( out );
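        br.readLine(); // discard the server greeting

        // Log in; the single readLine() assumes the server answers with exactly one line.
        pw.print( "A LOGIN " + username + " " + password + "\r\n" );
        pw.flush();
        br.readLine();

        // Select the All Mail folder. The fourth response line is assumed to be
        // "* <n> EXISTS", which carries the message count; the other lines are skipped.
        pw.print( "B SELECT \"[Gmail]/All Mail\"\r\n" );
        pw.flush();
        for (int i = 0; i < 3; i++) br.readLine();
        String numMsgs = br.readLine();
        for (int i = 0; i < 3; i++) br.readLine();
        if (end < 0) {
            StringTokenizer st = new StringTokenizer( numMsgs );
            st.nextToken(); // skip the leading "*"
            end = Integer.parseInt( st.nextToken() );
            System.out.println( "Found " + end + " messages" );
        }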

        System.out.println( "Downloading messages from [" + start + "] to [" + end + "]" );
        for (int i = start; i <= end; i++) {
            System.out.println( "Downloading message " + i );
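            pw.print( "ZZ FETCH " + i + " RFC822\r\n" );
            pw.flush();
            br.readLine(); // discard the untagged FETCH header line
            // Zero-pad the file name to 12 characters, e.g. 00000042.msg
            String fn = i + ".msg";
            while (fn.length() < 12) {
                fn = "0" + fn;
            }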
            PrintWriter file = new PrintWriter( new File( fn ) );
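            // Copy lines to the file until a line ending in ")" is followed by the
            // tagged "ZZ OK" completion; that ")" closes the FETCH response and is
            // not part of the message.
            outer: while (true) {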
                String s = br.readLine();
                while (s.endsWith( ")" )) {
                    String t = br.readLine();
                    if (t.startsWith( "ZZ OK" )) {
                        if (s.length() > 1) {
                            file.println( s.substring( 0, s.length() - 1 ) );
                        }
                        break outer;
                    }
                    file.println( s );
                    s = t;
                }
                file.println( s );
            }
            file.close();
        }

        pw.print( "C LOGOUT\r\n" );
        pw.flush();

        in.close();
        out.close();
        socket.close();
    }
}


The above code is in the public domain, so feel free to do whatever with it. To use:

javac GmailDownloader.java
java GmailDownloader


If it dies partway through and you want to resume, just add the message number you want to start at as a parameter:

java GmailDownloader 42
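
If you don't remember where it stopped, the resume point can be worked out from the files already on disk. The following is only a sketch, not part of the program above: the class name ResumePoint and the method resumeFrom are made up for illustration, and it assumes the files were written by the downloader with its zero-padded NNNNNNNN.msg names. It returns the highest message number found rather than the next one, on the assumption that the last file may be incomplete if the download died partway through a message.

```java
import java.io.File;

public class ResumePoint {
    // Hypothetical helper: given the file names in the download directory,
    // return the message number to pass to GmailDownloader when resuming.
    public static int resumeFrom(String[] names) {
        int highest = 0;
        if (names == null) {
            return 1; // directory unreadable or missing: start from the beginning
        }
        for (String name : names) {
            // The downloader pads names to 12 characters, i.e. 8 digits + ".msg".
            if (!name.matches("\\d{8,9}\\.msg")) {
                continue;
            }
            int n = Integer.parseInt(name.substring(0, name.length() - 4));
            if (n > highest) {
                highest = n;
            }
        }
        // Re-download the highest-numbered message, since it may be truncated.
        return Math.max(highest, 1);
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(resumeFrom(dir.list()));
    }
}
```

Run it in the directory holding the .msg files and pass the number it prints as the first argument to GmailDownloader.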

Posted by Eric at 2009-10-16 08:28:04
Am I just being bloody ignorant, or is there something wrong with the way the POP3 or IMAP interfaces do things?
Posted by stak at 2009-10-16 09:03:04
What do you think is wrong with the interfaces?
Posted by stak at 2009-10-17 19:57:30
Update: made a modification to the end-of-message detection (the part with the close-paren), since some messages had the close-paren tacked onto the end of the last line instead of on a new line by itself. This fixes a problem where the downloader would get "stuck" on those messages.
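
For anyone curious, the check described in this update can be pulled out into a standalone method. This is a sketch that follows the loop in the posted program, not a general IMAP parser: the class name EndOfMessage and the method extractBody are made up for illustration, and it works on lines already read into a list instead of reading from the socket. A line ending in ")" only counts as the end of the message when the next line is the tagged OK response.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EndOfMessage {
    // Hypothetical helper: 'lines' are the response lines that follow the
    // FETCH header line; 'tag' is the command tag ("ZZ" in the program above).
    public static List<String> extractBody(List<String> lines, String tag) {
        List<String> body = new ArrayList<String>();
        for (int i = 0; i < lines.size(); i++) {
            String s = lines.get(i);
            boolean closesFetch = s.endsWith(")")
                    && i + 1 < lines.size()
                    && lines.get(i + 1).startsWith(tag + " OK");
            if (closesFetch) {
                // Drop the trailing ")"; keep the rest of the line if there is any.
                if (s.length() > 1) {
                    body.add(s.substring(0, s.length() - 1));
                }
                break;
            }
            // An ordinary line, including one that merely happens to end in ")".
            body.add(s);
        }
        return body;
    }

    public static void main(String[] args) {
        List<String> response = Arrays.asList(
                "Subject: hi", "", "see (attached)", "bye)", "ZZ OK Success");
        for (String line : extractBody(response, "ZZ")) {
            System.out.println(line);
        }
    }
}
```

Like the original, this still misreads a message whose body contains a line ending in ")" directly followed by a line starting with "ZZ OK"; using the literal size from the FETCH header line would avoid that, but it is beyond a quick-and-dirty tool.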

(c) Kartikaya Gupta, 2004-2010. User comments owned by their respective posters. All rights reserved. Secure site.