Syndication/formats/metafeed

From Steeple

How are we to find out what feeds an institution provides? We don't want to crawl a website and extract the feeds: instead we need a systematic way of discovering the feeds. For systematic resource discovery, we propose to use a 'metafeed', which is a sort of 'master feed' or 'root feed' or 'feed of feeds' (that tells us where to look for the feeds).

The following image shows a master feed (for the organisation itself) linking out to departmental master feeds, which in turn link to standard RSS feeds.

Image:Master feed.png

Equally well, there could just be one master feed linking to rss feeds, or there could be more levels in this hierarchy.

Often such a metafeed is expressed in the opml format, an outline-style format used by feed readers to exchange a collection of feeds. However, instead of using opml, we advocate using rss or atom, because opml is weakly defined and its usage is not uniform. (Of course, a syndication application may well be able to read opml as well, but here we are working on a common, easily usable, optimal format.)

One advantage of choosing rss/atom rather than opml is more uniformity: both the metafeed and the feeds themselves have essentially the same structure, and can be validated and parsed in the same way.


1 RSS

To create a metafeed (i.e. a feed linking to several rss/atom feeds), we start with an example in rss.

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" 
    xmlns:atom="http://www.w3.org/2005/Atom" 
    xmlns:s="http://purl.org/steeple"
    xmlns:media="http://search.yahoo.com/mrss/">
 
<channel>
 
 <title>Podcasts from the University of Oxford</title>
 <atom:id>ox.ac.uk/oit/45204</atom:id>
 <lastBuildDate>Tue, 13 Oct 2009 14:31:48 +0100</lastBuildDate>
 <description>This is the master podcast feed from Oxford University</description>
 
 <atom:link rel="self" href="http://www.ox.ac.uk/.../this_feed.xml"  />
 <atom:link rel="alternate" href="http://www.ox.ac.uk/.../some_page.html" />
 <!-- the following link element is mandatory in an RSS feed -->
 <link>http://www.ox.ac.uk/.../some_page.html</link>
 
 <atom:category scheme="http://purl.org/steeple/organisation" label="Oxford University" />
 <atom:category scheme="http://purl.org/steeple/feedtype"  term="root" />
 
 <media:thumbnail url="http://.../logo.jpg" width="300" height="300" />
 
 <item>
   <title>Distinguished Lecture Series 2009</title>
   <guid isPermaLink="false">ox.ac.uk/oit/45204/234</guid>
   <pubDate>Wed, 07 Oct 2009 13:49:54 +0100</pubDate>
   <atom:updated>2009-09-13T18:30:00Z</atom:updated>
   <description>Distinguished Lecture Series 2009 featuring 10 outstanding speakers</description>
 
   <atom:link type="application/rss+xml" href="http://rss.oucs.ox.ac.uk/modlang/lectures/rss20.xml" rel="http://purl.org/steeple/subfeed" />
   <atom:link type="application/atom+xml" href="http://rss.oucs.ox.ac.uk/modlang/lectures/atom.xml" rel="http://purl.org/steeple/subfeed"  />
   <atom:link rel="alternate" type="text/html" href="http://www.mod-langs.ox.ac.uk" />
   <!-- the following link element is optional in an RSS feed -->
   <link>http://rss.oucs.ox.ac.uk/modlang/lectures/rss20.xml</link>
 
   <!-- additional institutional classification here: -->
   <atom:category scheme="http://purl.org/steeple/division" label="Humanities" term="human" />
   <atom:category scheme="http://purl.org/steeple/department" label="" />
   <atom:category scheme="http://purl.org/steeple/group" label="" />
 
   <media:thumbnail url="http://.../anotherlogo.jpg" width="300" height="300" />
 
 </item>
 
 <item>
<!--
 Another item
-->
 </item>
 
</channel>
 
</rss>
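As a rough illustration of how an aggregator might read such a metafeed, the following Python sketch lists the subfeed links found in each item. It uses only the standard library. The sample document is a cut-down copy of the example above, the function name `rss_subfeeds` is made up for this sketch, and the `example.org` address is a placeholder for the elided channel link.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
SUBFEED_REL = "http://purl.org/steeple/subfeed"

# Cut-down RSS metafeed in the shape of the example above.
# The channel <link> is a placeholder, not a real address.
SAMPLE_RSS = """<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
 <title>Podcasts from the University of Oxford</title>
 <link>http://www.example.org/</link>
 <description>This is the master podcast feed from Oxford University</description>
 <item>
   <title>Distinguished Lecture Series 2009</title>
   <atom:link type="application/rss+xml" href="http://rss.oucs.ox.ac.uk/modlang/lectures/rss20.xml" rel="http://purl.org/steeple/subfeed" />
   <atom:link type="application/atom+xml" href="http://rss.oucs.ox.ac.uk/modlang/lectures/atom.xml" rel="http://purl.org/steeple/subfeed" />
   <atom:link rel="alternate" type="text/html" href="http://www.mod-langs.ox.ac.uk" />
 </item>
</channel>
</rss>
"""


def rss_subfeeds(xml_text):
    """Return (item title, MIME type, href) for every subfeed link in an RSS metafeed."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    found = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        for link in item.findall(ATOM + "link"):
            # Only links carrying the subfeed relation point to further feeds;
            # rel="alternate" links point to web pages and are skipped.
            if link.get("rel") == SUBFEED_REL:
                found.append((title, link.get("type"), link.get("href")))
    return found


subfeeds = rss_subfeeds(SAMPLE_RSS)
for title, mime, href in subfeeds:
    print(title, mime, href)
```

The `type` attribute lets a consumer pick whichever of the rss or atom versions it prefers when both are offered.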

2 Atom

We now give a second example of a metafeed, this time using atom rather than rss. Like the rss version, and in contrast to opml, it provides almost everything you need in a standardised format that can be validated. As far as syndication is concerned, the atom version is essentially equivalent to the rss version.

Explanatory comments are given in-line.

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"
           xmlns:s="http://purl.org/steeple"
           xmlns:media="http://search.yahoo.com/mrss/">
 
 <title>Podcasts from the University of Oxford</title>
 <subtitle>This is the master podcast feed from Oxford University</subtitle>
 <link rel="self" href="http://.../this_feed.xml"  />
 
 <!-- "link rel=alternate" for the top-level metafeed would point to the Oxford web portal -->
 <link rel="alternate" href=".../some_page.html" />
 <updated>2009-09-13T23:55:18Z</updated>
 
 <!-- feed <id> --> 
 <id>ox.ac.uk/oit/45204</id>
 
 <!-- Author/email should be specified, so that feed owners can be contacted, and receive feedback email. -->
 <author>
 <name>OXITEMS, University of Oxford</name>
 <email>oxitems@...</email>
 </author>
 
 <!-- 
Categories for institution / division / department / group. The categories can be specified here, or on an entry-item level below.
For the organisational master feed, only the 'organisation' is required. If this was a master feed for 
a department, the corresponding categories would also be present. The 'division' is whatever structure
is present between 'organisation' and 'department', it can be omitted. 
-->
 <category scheme="http://purl.org/steeple/organisation" label="Oxford University" term="http://www.ox.ac.uk" />
 <category scheme="http://purl.org/steeple/feedtype"  term="root" />
 
 <!-- top level image goes here, e.g. Oxford Logo -->
 <media:thumbnail url=".../logo.jpg" width="75" height="50" />
 
 <entry>
   <title>Distinguished Lecture Series 2009</title>
   <id>ox.ac.uk/oit/45204/234</id>
   <updated>2009-09-13T18:30:00Z</updated>
   <published>2009-09-13T23:55:18Z</published>
   <summary>Distinguished Lecture Series 2009 featuring 10 outstanding speakers</summary>
 
   <!-- The key part is the linking to further feeds. 
             We use <atom:link> with appropriate type/rel to link to rss and atom version of the feed,
             depending on which feed types are available. 
             Note that these links may link to actual podcast feeds, or may link to the next level of master feeds.
    -->
   <link type="application/rss+xml" href="http://rss.oucs.ox.ac.uk/modlang/lectures/rss20.xml" rel="http://purl.org/steeple/subfeed" />
   <link type="application/atom+xml" href="http://rss.oucs.ox.ac.uk/modlang/lectures/atom.xml" rel="http://purl.org/steeple/subfeed"  />
 
   <!-- The "link rel=alternate" points to the part of the oxford web portal hosting these feeds, 
             or perhaps to a departmental site. -->
   <link rel="alternate" type="text/html" href="http://www.mod-langs.ox.ac.uk" />
 
   <!-- entry level image goes here, e.g. series logo -->
   <media:thumbnail url=".../logo.jpg" width="300" height="300" />
 
   <!-- Categories for institution can be added here and will override categories set above if there is duplication. -->
   <category scheme="http://purl.org/steeple/division" label="Humanities Division" term="human" />
   <category scheme="http://purl.org/steeple/department" label="Some department" />
   <category scheme="http://purl.org/steeple/group" label="Some group within department" />
 </entry>
 
  <!-- More entries can be added here ... -->
</feed>

3 Questions

3.1 Do you recommend atom or rss?

Note that we do not advocate atom over rss or vice versa, as we feel that this discussion is unhelpful in the present context. (For a detailed comparison, see here.) We propose that both atom and rss can be used. In practice, only the basic atom/rss elements differ; most of the more specific fields use either Yahoo Media RSS or atom constructs in both versions.

The basic fields correspond as follows:

RSS 2.0    item    title   description   guid   pubDate     lastBuildDate (channel)
Atom 1.0   entry   title   summary       id     published   updated

The rss version uses rss constructs where possible. However, some key elements (such as atom:content and atom:link in the items themselves) still use atom constructs, so you can't do a 'pure' rss version. You might still use rss if you are more familiar with it. If you use atom, you don't need any rss, so your master feed is just atom; your other feeds might still be rss if that's what you've got. The master feed and the feeds it links to don't have to use the same format, so basically you choose what is more convenient for you!
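To make the field correspondence in the table above concrete, here is a minimal Python sketch that reads either an rss or an atom metafeed into the same dictionary shape. The common field names, the two mapping dictionaries and the function `normalise_entries` are assumptions made for this illustration; dates are returned as the raw strings found in the feed (RFC 822 style in rss, RFC 3339 style in atom) and are not converted.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Common field name -> element name, following the correspondence table.
RSS_FIELDS = {"title": "title", "summary": "description", "id": "guid", "published": "pubDate"}
ATOM_FIELDS = {"title": "title", "summary": "summary", "id": "id", "published": "published"}

RSS_SAMPLE = """<rss version="2.0"><channel><item>
<title>Distinguished Lecture Series 2009</title>
<guid isPermaLink="false">ox.ac.uk/oit/45204/234</guid>
<pubDate>Wed, 07 Oct 2009 13:49:54 +0100</pubDate>
<description>Distinguished Lecture Series 2009 featuring 10 outstanding speakers</description>
</item></channel></rss>"""

ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"><entry>
<title>Distinguished Lecture Series 2009</title>
<id>ox.ac.uk/oit/45204/234</id>
<published>2009-09-13T23:55:18Z</published>
<summary>Distinguished Lecture Series 2009 featuring 10 outstanding speakers</summary>
</entry></feed>"""


def normalise_entries(xml_text):
    """Return one dict per item/entry, using the same keys for rss and atom."""
    root = ET.fromstring(xml_text)
    if root.tag == "rss":
        nodes, fields, prefix = root.iter("item"), RSS_FIELDS, ""
    elif root.tag == "{%s}feed" % ATOM_NS:
        prefix = "{%s}" % ATOM_NS
        nodes, fields = root.iter(prefix + "entry"), ATOM_FIELDS
    else:
        raise ValueError("not an RSS 2.0 or Atom 1.0 document")
    # Missing elements come back as None rather than raising.
    return [{name: node.findtext(prefix + tag) for name, tag in fields.items()}
            for node in nodes]


rss_entry = normalise_entries(RSS_SAMPLE)[0]
atom_entry = normalise_entries(ATOM_SAMPLE)[0]
print(rss_entry["id"] == atom_entry["id"])
```

Because the steeple-specific parts (subfeed links, categories, thumbnails) use the same atom and media constructs in both formats, only these few basic fields need a per-format mapping.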

3.2 Why such-and-such element?

Why do you not use <pubDate> in the channel? Because there is no atom equivalent, and it is probably not tracked by feed producers anyway!

So how do the institutional categories work? We have got four levels of institutional structure:

  1. organisation,
  2. division,
  3. department,
  4. group.

which are indicated like this:

 <atom:category scheme="http://purl.org/steeple/organisation" label="Oxford University" />
 <atom:category scheme="http://purl.org/steeple/division" label="Humanities" term="human" />
 <atom:category scheme="http://purl.org/steeple/department" label="English Department" />
 <atom:category scheme="http://purl.org/steeple/group" label="Medieval English" />

These categories can be present both in the channel part of the feed and in the items. To give some examples:

  • Master feed: Typically, only the organisation will be present in the channel of the master feed, and the 'division' and 'department' might be used in the item.
  • Research group feed: All four categories might be in the channel of the feed, because they are the same for all items.
  • Search results: If search results are rendered as rss, then all four categories would typically be in the item, because each item returned could belong to a different organisation/division etc.
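One way a consumer might combine the two levels is sketched below in Python: categories from the channel act as defaults, and categories on an item override them for the same scheme (the override rule is the one stated in the comment in the atom example). The helper names `categories` and `effective_categories` and the cut-down sample feed are assumptions for this illustration.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
STEEPLE = "http://purl.org/steeple/"
LEVELS = ("organisation", "division", "department", "group")

SAMPLE = """<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
 <title>Master feed</title>
 <atom:category scheme="http://purl.org/steeple/organisation" label="Oxford University" />
 <item>
   <title>Distinguished Lecture Series 2009</title>
   <atom:category scheme="http://purl.org/steeple/division" label="Humanities" term="human" />
   <atom:category scheme="http://purl.org/steeple/department" label="English Department" />
 </item>
</channel>
</rss>"""


def categories(element):
    """Map level name -> label for the atom:category children of one element."""
    result = {}
    for cat in element.findall(ATOM + "category"):
        scheme = cat.get("scheme", "")
        level = scheme[len(STEEPLE):]
        if scheme.startswith(STEEPLE) and level in LEVELS:
            # Fall back to the term when no human-readable label is given.
            result[level] = cat.get("label") or cat.get("term")
    return result


def effective_categories(channel, item):
    """Channel categories are defaults; item categories override them."""
    merged = categories(channel)
    merged.update(categories(item))
    return merged


channel = ET.fromstring(SAMPLE).find("channel")
item = channel.find("item")
resolved = effective_categories(channel, item)
print(resolved)
```

For a research group feed, where all four categories sit in the channel, the same function simply returns the channel values for every item.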

What does 'division' mean? Essentially any structure that a higher education institution (HEI) may have which sits in between the top-level organisation and the departments.

3.3 What about OPML?

Because opml isn't properly specified, it's hard to validate. The use of url/htmlurl/xmlurl isn't necessarily standardised. There are also things that would possibly need to be 'invented' from scratch, such as placing an image as a top level image, or a detailed category scheme. So we don't advocate OPML. Having said that, here is a version of the feed in opml.

3.4 How does all of this relate to PCP-style atom feeds?

Incidentally, the new Apple Podcast Producer 2 (PCP2), shipped with Snow Leopard, also uses atom for the podcast library. The above use of the atom specification was written without reference to PCP2, but once I've looked at PCP2, we'll probably revise it a little to harmonise the use of the atom format for this purpose.

3.5 What if you get cyclical references?

When writing an aggregator, you need to make sure that cyclical references between master feeds don't lead to loops.
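A common way to guard against such loops is to remember every feed URL already visited. The Python sketch below shows the idea on an in-memory link graph; `collect_feeds`, the `fetch_subfeeds` callback and the file names in `LINKS` are all hypothetical, and a real aggregator would fetch and parse each feed instead of looking it up in a dictionary.

```python
from collections import deque


def collect_feeds(root_url, fetch_subfeeds):
    """Breadth-first walk over master feeds that visits each URL at most once."""
    seen = {root_url}
    queue = deque([root_url])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for child in fetch_subfeeds(url):
            # A URL that was already queued or visited is skipped,
            # so a link back to an earlier master feed cannot cause a loop.
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return order


# Hypothetical link graph: dept-a.xml links back to the root master feed.
LINKS = {
    "org.xml": ["dept-a.xml", "dept-b.xml"],
    "dept-a.xml": ["lectures.xml", "org.xml"],
    "dept-b.xml": ["lectures.xml"],
    "lectures.xml": [],
}

visited = collect_feeds("org.xml", lambda url: LINKS.get(url, []))
print(visited)
```

This only detects repeats when the same feed is always referenced by the same URL string; if a feed can be reached under several addresses, comparing feed ids instead of URLs may be more robust.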
