Managing Mail and Safari RSS Subscriptions from the Command Line

Safari can subscribe to RSS feeds; so can Mail. Podcast Producer publishes an RSS (XML) feed, as do the blog and wiki services in Mac OS X Server. And of course, RSS and Atom support is built into practically every blogging and wiki tool on the market. Those doing mass deployment and scripting work can take advantage of this by automatically connecting users to these feeds and caching the information found in them. If you have 40,000 students, or even 250 employees, it is easier to send a script to those computers than to open Mail or Safari on each one and subscribe to an RSS feed.

Additionally, pubsub offers what I like to call Yet Another Scripting Interface to RSS (the acronym is meant to sound a bit like yessir). pubsub caches feeds both in a SQLite database and as XML files. Because the data is cached on the client, it can be parsed more quickly than with other tools, allowing a single system to do much more than if the feed were being pulled over the Internet each time.

Using pubsub

We’ll start by looking at some simple RSS management from the command line, to better understand the underpinnings of Mac OS X’s built-in RSS functionality. The PubSub framework stores feeds and their associated content in a SQLite database. Interacting with that database directly can be a bit burdensome, so the easiest way to manage RSS in Mac OS X is the pubsub command. First, let’s take a look at all of the RSS feeds the current user is subscribed to by opening Terminal and typing pubsub followed by the list verb:

pubsub list

You should then see the title and URL of each RSS feed that Mail and Safari are subscribed to, along with how long each article is kept (the expiry option) and the interval at which the applications check for updates (the refresh option). You can also see each application that can be managed with pubsub by running the same command with clients appended to the end (clients are how pubsub refers to applications whose subscriptions it can manage):

pubsub list clients

To look at only the feeds in Safari:

pubsub list client com.apple.safari

And Mail:

pubsub list client com.apple.mail

Each of the above commands provides a URL for the feed. That URL can be used to show each entry, or article, in the feed. Apple consistently calls these episodes within PubSub, in the databases, and on the Podcast Producer server side of things, but somehow calls them entries here (consistency, people). To see a list of entries for a given URL:

pubsub list http://googleenterprise.blogspot.com/atom.xml

Episodes are listed as 40-character hex keys, similar to the other identifier schemes Apple uses. To then see each episode, or entry, use the list verb, followed by entry and then that key:

pubsub list entry 5fcef167d77c8c00d7ff041a869d45445cc4ae42

To subscribe to a feed, use the --client option to identify which application to subscribe in, along with the subscribe verb followed by the URL of the feed:

pubsub --client com.apple.mail subscribe https://krypted.com//LameAssFeed.xml

To unsubscribe, simply use pubsub followed by the unsubscribe verb and then the URL of the feed:

pubsub unsubscribe http://example.com/UninterestingFeed.xml
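
Tying this back to mass deployment: a short script pushed out with Apple Remote Desktop or a login hook can subscribe every user to a standard set of feeds. Below is a minimal sketch; the feed URLs are placeholders, and you would swap the client to com.apple.safari if you want the subscriptions to show up there instead:

#!/bin/bash
# Hypothetical list of feeds that every user should be subscribed to
FEEDS="http://example.com/announcements.xml
http://example.com/helpdesk.xml"

# Run as the logged-in user so the subscriptions land in their PubSub database
for feed in $FEEDS; do
  pubsub --client com.apple.mail subscribe "$feed"
done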

Offline Databases and Imaging

While the commands above can be run against a typical running system, they cannot be run against a SQLite database sitting in each of your users’ home folders, nor against a database in a user template home on a client being imaged. To facilitate imaging, you can therefore run sqlite3 commands against the database directly. The database is stored in ~/Library/PubSub/Database/Database.sqlite3.

To see the clients (the equivalent of `pubsub list clients`):

sqlite3 /Volumes/Image/Username/Library/PubSub/Database/Database.sqlite3 'SELECT * FROM clients'

To see each feed:

sqlite3 /Volumes/Image/Username/Library/PubSub/Database/Database.sqlite3 'SELECT * FROM feeds'

To see each entry:

sqlite3 /Volumes/Image/Username/Library/PubSub/Database/Database.sqlite3 'SELECT * FROM entries'
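
To simply count the cached entries, a quick sanity check that caching is actually happening, a standard COUNT works against the same table:

sqlite3 /Volumes/Image/Username/Library/PubSub/Database/Database.sqlite3 'SELECT COUNT(*) FROM entries'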

To see the column headers for each:

sqlite3 /Volumes/Image/Username/Library/PubSub/Database/Database.sqlite3 'PRAGMA TABLE_INFO(Clients)';

sqlite3 /Volumes/Image/Username/Library/PubSub/Database/Database.sqlite3 'PRAGMA TABLE_INFO(Feeds)';

sqlite3 /Volumes/Image/Username/Library/PubSub/Database/Database.sqlite3 'PRAGMA TABLE_INFO(Subscriptions)';

sqlite3 /Volumes/Image/Username/Library/PubSub/Database/Database.sqlite3 'PRAGMA TABLE_INFO(Entries)';

sqlite3 /Volumes/Image/Username/Library/PubSub/Database/Database.sqlite3 'PRAGMA TABLE_INFO(Enclosures)';

sqlite3 /Volumes/Image/Username/Library/PubSub/Database/Database.sqlite3 'PRAGMA TABLE_INFO(Authors)';

sqlite3 /Volumes/Image/Username/Library/PubSub/Database/Database.sqlite3 'PRAGMA TABLE_INFO(Contents)';

sqlite3 /Volumes/Image/Username/Library/PubSub/Database/Database.sqlite3 'PRAGMA TABLE_INFO(SyncInfo)';
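
Alternatively, sqlite3’s .schema command dumps the definitions of every table in one pass, which is a quick way to see all of the columns at once:

sqlite3 /Volumes/Image/Username/Library/PubSub/Database/Database.sqlite3 '.schema'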

To narrow any of these queries down to a specific row, add a WHERE clause followed by the column within the table you’d like to match. For example, to see only the article with the identifier 5b84e609317fb3fb77011c2d26efd26a337d5d7d:

sqlite3 --line /Volumes/Image/Username/Library/PubSub/Database/Database.sqlite3 'SELECT * FROM entries WHERE identifier="5b84e609317fb3fb77011c2d26efd26a337d5d7d"'

Note: sqlite3’s -line option (also accepted as --line) prints each column of a row on its own line, which makes the output much easier to read.
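
If you only need a couple of fields, for example to capture every feed URL so users can be re-subscribed after imaging, you can name the columns in the SELECT. This is a sketch that assumes the Feeds table includes title and url columns; verify the actual column names with the PRAGMA statements above:

sqlite3 /Volumes/Image/Username/Library/PubSub/Database/Database.sqlite3 'SELECT title, url FROM feeds'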

Dumping pubsub to Be Parsed by Other Tools

pubsub can also be used to supply feeds to other tools for parsing. You can extract only the entries matching specific patterns or text, or email yourself that they appeared, without a lot of fanfare. To dump a feed’s entire cached data, specify the dump verb followed by the URL rather than an entry identifier:

pubsub dump http://googleenterprise.blogspot.com/atom.xml

Once dumped, the XML can easily be parsed by other tools. To dump a specific entry to XML for parsing by another tool, use syntax similar to the list entry syntax:

pubsub dump entry 5fcef167d77c8c00d7ff041a869d45445cc4ae42

Because these feeds have already been cached on the local client, and because some require authentication or other expensive (in terms of script run time) steps to aggregate or search, looking at the cached data is often the cheaper way to do so.
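
As a quick illustration of handing the dump off to another tool, the following pipes a dump through xmllint and greps out the title elements. This is only a sketch: it assumes dump writes to standard output, and grepping raw tag names is fragile for feeds that nest titles inside other elements:

pubsub dump http://googleenterprise.blogspot.com/atom.xml | xmllint --format - | grep -i '<title'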

Instant refreshes can also be performed using pubsub’s refresh verb followed by a URL:

pubsub refresh http://googleenterprise.blogspot.com/atom.xml
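
To refresh every subscribed feed in one pass, the URLs from pubsub list can be fed back into refresh. This sketch assumes the feed URLs appear as plain http strings in the list output:

pubsub list | grep -o 'http[^[:space:]]*' | while read url; do pubsub refresh "$url"; done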

Also, feeds are cached to ~/Library/PubSub/Feeds, where each feed is nested within a folder named for its unique ID (in the feeds table output, the second column holds that unique ID while the first is the row number). Each episode, or post, can then be read by entry ID; those entries are basic XML files.
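
Because the cache is just folders of XML on disk, standard tools work against it directly. For example, to list the cached feeds and then search every cached entry for a string (the search term below is a placeholder):

ls ~/Library/PubSub/Feeds

grep -rl "placeholder search term" ~/Library/PubSub/Feeds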

You can also still programmatically interface with RSS using curl. For example:

curl --silent "http://${server}.myschool.org/search/cpg?query=%22random+curse+word%22&catAbbreviation=cpg&addThree=&format=rss" | grep "item rdf:about=" | cut -c 18-100 | sed -e 's/"//g' | sed -e 's/>//g'