Connecting Google Reader and podget
For some time, I’ve had a Perl script that runs regularly, backing up my Google Reader subscriptions using the standard OPML format:
#!/usr/bin/perl
#
# Usage:
# backup-google-reader-opml file-to-write-to.opml google.user.name@domain google-password
use strict;
use warnings;
use WWW::Mechanize;
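my $mech = WWW::Mechanize->new();
# Load the Google Reader page, which presents the login form.
$mech->get("http://reader.google.com")
    or die "Cannot reach Google Reader Homepage";
# Log in using the first form on the page.
$mech->submit_form(
    form_number => 1,
    fields      =>
    {
        Email  => $ARGV[1],
        Passwd => $ARGV[2]
    }
)
    or die "Cannot submit form";
# Fetch the OPML export and write it to the file named on the command line.
$mech->get("http://www.google.com/reader/subscriptions/export");
$mech->save_content($ARGV[0]);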
More recently, I wrote another script (this time in Python) that takes this OPML, extracts the URLs of all feeds tagged ‘podcast’, and outputs a serverlist file for podget (an automated console-based podcast downloader). This lets me subscribe to a podcast in Google Reader and have it added to the download list automatically. The script looks like this:
#!/usr/bin/python
#
# Pass in the OPML file as the first command-line parameter. Will output the
# podget serverlist on stdout.
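import re
import sys
import xml.dom.minidom
doc = xml.dom.minidom.parse(sys.argv[1])
# Matches leading non-word characters (punctuation, whitespace) in a title.
p = re.compile(r'^\W+')
# The 'podcast' tag appears as an outline whose text is "podcast";
# the feeds carrying that tag are the outlines nested inside it.
for outline in doc.getElementsByTagName("outline"):
    if outline.getAttribute("text") == "podcast":
        for subOutline in outline.getElementsByTagName("outline"):
            title = subOutline.getAttribute("title")
            title = p.sub("", title)
            # One line per feed: URL, the fixed category, then the title.
            print(subOutline.getAttribute("xmlUrl") + " NoCategory " + title)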
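To make the expected input and output concrete, here is a small self-contained sketch of the same filtering logic wrapped in a function and run against an inline sample. The sample OPML is an assumption about the export's shape, inferred from what the script looks for (an outline whose text is "podcast" containing feed outlines with title and xmlUrl attributes). The example.* URLs, the feed titles, and the function name podcast_serverlist are all made up for illustration.

```python
import re
import xml.dom.minidom

# Hypothetical sample: a "podcast" folder with two feeds and a "news"
# folder with one. Real exports may carry more attributes than this.
SAMPLE_OPML = """<opml version="1.0">
  <head><title>Example subscriptions</title></head>
  <body>
    <outline title="podcast" text="podcast">
      <outline text="Example Show" title="Example Show" type="rss"
               xmlUrl="http://example.com/show/feed.xml"/>
      <outline text="Another Show" title="** Another Show" type="rss"
               xmlUrl="http://example.org/another.rss"/>
    </outline>
    <outline title="news" text="news">
      <outline text="Example News" title="Example News" type="rss"
               xmlUrl="http://example.net/news.xml"/>
    </outline>
  </body>
</opml>
"""

# Leading punctuation or whitespace is stripped from each title.
LEADING_NON_WORD = re.compile(r'^\W+')


def podcast_serverlist(opml_text, tag="podcast"):
    """Return one 'URL NoCategory title' line per feed filed under `tag`."""
    doc = xml.dom.minidom.parseString(opml_text)
    lines = []
    for outline in doc.getElementsByTagName("outline"):
        if outline.getAttribute("text") != tag:
            continue
        # Feeds carrying the tag are nested inside the tag's outline.
        for feed in outline.getElementsByTagName("outline"):
            title = LEADING_NON_WORD.sub("", feed.getAttribute("title"))
            lines.append(feed.getAttribute("xmlUrl") + " NoCategory " + title)
    return lines


for line in podcast_serverlist(SAMPLE_OPML):
    print(line)
```

With this sample, only the two feeds under the "podcast" outline are printed, and the leading "** " is removed from the second title. Returning a list rather than printing directly makes the logic easy to check before pointing it at a real export.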
Feel free to use and adapt to your needs.