Denise Howell has been analyzing the implications of copyright law for RSS/Atom syndication on her new Lawgarithms blog at ZDNet. Clearly stating that many of the issues are currently unsettled, Denise encourages "participatory law" and asks readers to provide "reasons why RSS publishing isn't like free magazine publishing"... There are quite a few reasons, I think. As one digs into the law of syndication, it becomes apparent that there are many, many interesting issues to be dealt with. For instance, one very significant difference is:
The publisher of a magazine has total control over the presentation of content in that magazine. However, the publisher of an RSS/Atom syndication feed has very little, if any, "control" over the presentation of data published in a feed.
The implication of this difference is that we might be making a mistake when we compare web and RSS/Atom syndicated content to normal text-based paper publications. It might be more useful to consider such content as though it were computer programs or even music!
XML, HTML, RSS, Atom, etc. are all members of a class of markup languages that quite intentionally do *not* have fixed presentation formats or styles. Thus, a content creator can never be sure what their content will look like when presented to a user. A given chunk of HTML may have one appearance in Internet Explorer 6 yet look slightly different when viewed with IE7, Firefox, or Flock. The same content will look drastically different if viewed using the "text-only" Lynx browser or via the browser on a cell phone or PDA. Additionally, two people, both using the same browser software, might see different presentations as a result of different browser preferences, screen resolutions, or the use of local CSS style sheets. In each case, the presentation of the encoded data is adjusted to address the limitations of the display device, user preferences, etc. Such adjustment does NOT happen, and in most cases cannot happen, with content in magazines. Once ink is laid on paper, the only changes expected are things like yellowing of the paper over time or fading of the ink...
The plasticity or indeterminateness of presentation which is inherent in online content presents a drastically different environment than the one that exists for the "free magazine" publisher that Denise talks about on her site. One big difference comes in the process of recognizing when content has been copied. When you run the page of a magazine through a copier, the result is something that is easily recognized as a direct copy. However, if I do a "print screen" while using the text-only Lynx browser on a typical web page, it is likely that the results will be so different from what that page's creator saw and intended, that it would be difficult for them to recognize the page without more work. Of course, the same "difficulty" in recognition could arise with a paper-based magazine if I copied it by retyping the text and then used different styles in printing it. This would, of course, not be a "copy" but rather, the production of a "derivative work" that was based on the paper magazine. The interesting thing here is that in the online world, ALL presentations of "copies" are actually derivative works -- not mere copies. In fact, in the online world, the original form of a work, the thing that gets "copied," is just binary data which cannot be directly viewed by humans.
Now, much has been said about the "implied license" to "copy" otherwise protected works on the Internet since doing so is a technical requirement of using the works. It seems settled that copying data over network links, into screen buffers, system caches, etc. is not prohibited. However, I think we need to recognize that there is at least an additional implied license here, and that is a license to produce a derivative work -- based on the HTML, RSS, Atom, etc. which is produced by a copyright holder or publisher. This implied license exists since the published content cannot be used without producing a derivative work -- potentially a derivative work that differs greatly from what the publisher expected.
So, there are at least two differences between an RSS/Atom feed and a magazine. The feed, like all web content that is encoded in XML and HTML derivatives, comes with two implied licenses which are not associated with the magazine or most other kinds of printed works:
- A limited license to "copy" when doing so is facilitative to viewing the content and is required for viewing
- A limited license to produce derivative works (i.e. determine presentation format and appearance)
These two differences are substantial and thus it might be best for us to stop thinking of paper or printing-related analogs when looking for insight into how copyright applies to web content or Internet syndication feed content. It might be more appropriate, for instance, for us to think about such content as being more like:
- Computer programs: They must be "copied" from disk into memory in order to run, and their results depend not only on the data and instructions in the programs themselves but also on the capabilities of the devices on which they run, personal preferences (e.g. choices of screen resolution), and so on. Note: The fact that many web pages and even syndication feeds contain scripts (sub-programs) strengthens this analogy.
- Music: While sheet music doesn't need to be "copied" in order to be performed, the results of any performance will be determined by the skill of the performer, personal tastes, the instruments used, etc. -- things not under the control of the composer of the sheet music.
- Scripts for plays, movies, etc: Similar to sheet music. Some "performances" will be unrecognizable to the original author...
Of course, copyright law and our natural sense of what is fair are drastically different when applied to printed materials like magazines than they are when applied to things like computer programs, sheet music and scripts. For instance, we might expect that a program writer, composer or playwright has a right to prevent non-required copying; however, it would be very odd to say that copyright law itself forbids the use of software on "slow" computers or forbids "bad" performances of a piece of sheet music. The point here is that in areas other than printed works, we allow the users of copyrighted content a great deal of latitude in how the content is presented or performed. Yet, much of the debate about the use of copyrighted syndication feeds involves people's objections to the way in which their syndicated content is presented...
In the area of performance variation, it might be that syndicated content is much more like sheet music than it is like even a computer program. (Even though XML and HTML are essentially "programs" which are executed or interpreted by other programs.) The similarity to sheet music comes in the aspect of "selective" execution or performance. Imagine a bit of sheet music for piano that is written for two hands. It has both a harmony and a bass line. Yet, performers will often just play the harmony -- without the bass. This is very similar to what often happens with syndicated content. Usually, not all of the content in a feed is actually "performed" or presented to the user.
If you look at the standards for either RSS or Atom, you'll see that many fields are defined in both, yet most aggregators will only present to their users a subset of the fields that actually appear in feeds. While a feed might include entries with attachments, several different dates, FOAF elements, Dublin Core extensions, Media RSS content, etc., many aggregators will only present a single date, a title, a link, and the text from the description or content elements. Some aggregators will normally present all of the content in the content or description elements but will filter out embedded scripts, images, applets, etc. This is expected behavior in much the same way that we expect that the performer of a piece of sheet music will often omit one or more of the recorded parts.
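To make the "subset of fields" point concrete, here is a minimal sketch in Python of what such an aggregator does. Everything in it is an assumption made up for illustration: the sample feed, the SHOWN_FIELDS list, and the function names present_items and ignored_fields are hypothetical and are not taken from any real aggregator or real feed.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed, invented for illustration only.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example Feed</title>
    <item>
      <title>First story</title>
      <link>http://example.com/first</link>
      <pubDate>Mon, 02 Oct 2006 09:00:00 GMT</pubDate>
      <dc:creator>A. Writer</dc:creator>
      <category>law</category>
      <enclosure url="http://example.com/a.mp3" length="1234" type="audio/mpeg"/>
      <description>Some text for the story.</description>
    </item>
  </channel>
</rss>"""

# Assumption: this imaginary aggregator "performs" only these four elements.
SHOWN_FIELDS = ("title", "link", "pubDate", "description")


def present_items(rss_text):
    """Return one dict per item holding only the fields that get displayed."""
    root = ET.fromstring(rss_text)
    presented = []
    for item in root.iter("item"):
        view = {}
        for name in SHOWN_FIELDS:
            value = item.findtext(name)
            if value is not None:
                view[name] = value.strip()
        presented.append(view)
    return presented


def ignored_fields(rss_text):
    """List the item-level elements that are in the feed but never displayed."""
    root = ET.fromstring(rss_text)
    ignored = set()
    for item in root.iter("item"):
        for child in item:
            if child.tag not in SHOWN_FIELDS:
                # Namespaced tags look like "{uri}creator"; keep the local name.
                ignored.add(child.tag.rsplit("}", 1)[-1])
    return sorted(ignored)


if __name__ == "__main__":
    for view in present_items(SAMPLE_RSS):
        print(view)
    print("Not presented:", ignored_fields(SAMPLE_RSS))
```

In this sketch the Dublin Core creator, the category and the enclosure are all valid parts of the feed, yet none of them is ever shown to the reader. Nothing was altered in the feed itself; the "performer" simply left those parts unplayed.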
An extreme example of "partial performance" can be seen in Dave Winer's "NewsRivers". For instance, in order to address the needs of low-bandwidth or small-screen devices like cell phones or PDAs, Winer takes a complex, multi-element feed from the New York Times and distills it to little more than a link and some heavily abbreviated text. While in the original feed each story carried at least one date, Dave clusters all stories for a single date together. Also, many fields of data that were in the original are not replicated in his "rivers." Is this selective display permitted? Leaving aside the question of modifying the contents of a field (such as the abbreviation of the content field), this sort of selective rendering of content is exactly what is intended by those who wrote the RSS and Atom specifications. Those specifications say much about how to create a valid syndication feed; however, they intentionally say virtually nothing about expectations concerning how the feeds will be formatted or presented.
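The kind of distillation described above can be sketched in a few lines of Python. To be clear, this is not Winer's code and I have no knowledge of how his "rivers" are actually built; the story data, the 60-character limit, and the names abbreviate and build_river are all hypothetical, chosen only to illustrate clustering by date and keeping just a link plus shortened text.

```python
from email.utils import parsedate_to_datetime

# Hypothetical stories, invented for illustration only.
STORIES = [
    {"title": "Story A", "link": "http://example.com/a",
     "pubDate": "Mon, 02 Oct 2006 09:00:00 GMT",
     "description": "A fairly long description that will not fit on a small screen."},
    {"title": "Story B", "link": "http://example.com/b",
     "pubDate": "Mon, 02 Oct 2006 15:30:00 GMT",
     "description": "Short text."},
    {"title": "Story C", "link": "http://example.com/c",
     "pubDate": "Tue, 03 Oct 2006 08:00:00 GMT",
     "description": "Another description."},
]


def abbreviate(text, limit=60):
    """Collapse whitespace and cut the text at a word boundary near `limit`."""
    text = " ".join(text.split())
    if len(text) <= limit:
        return text
    return text[:limit].rsplit(" ", 1)[0] + "..."


def build_river(stories, limit=60):
    """Group stories by calendar date, keeping only a link and shortened text."""
    river = {}
    for story in stories:
        # RSS 2.0 dates use the RFC 822 style, which email.utils can parse.
        day = parsedate_to_datetime(story["pubDate"]).date().isoformat()
        river.setdefault(day, []).append(
            {"link": story["link"], "text": abbreviate(story["description"], limit)}
        )
    return river


if __name__ == "__main__":
    for day, entries in build_river(STORIES).items():
        print(day)
        for entry in entries:
            print("  ", entry["text"], "->", entry["link"])
```

Note that this sketch does two different things. Dropping the title and the per-story time is selective display; shortening the description changes the contents of a field, which is the separate question set aside above.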
Naturally, one asks: If one is permitted to eliminate items from the presentation of a syndicated feed, is one permitted to insert items or elements when presenting a feed? If so, are there limits to this right?
My personal feeling is that most of the issues in this area will eventually hinge on questions of intent. The difference between "white hat" and "black hat" uses is usually not found in what was done but rather in why it was done. If you remove ad links or images as part of the process of adapting a piece of content to some specific device or display environment, you're probably OK. However, if you remove the ads and replace them with others, then you've probably got a problem. Similarly, removing elements in such a way that you eliminate attribution to the original publisher or author is likely to be a problem... We'll need to work through a long set of examples before we have patterns for what is reasonable and what is not. The "implied license to perform" web content is limited in just the same way that the "implied license to copy" such content is limited.
Hopefully, I've illuminated at least a few differences between a syndication feed and a "magazine" as Denise requested. I've also tried to suggest that we might all be making a mistake by trying to find similarities between online works and works on paper. The interesting question at this point is: "So what?" What are the implications of these differences and how will they affect what we do and how we do it in the world of syndication? Hopefully, Denise (who is a lawyer) will be able to provide us some guidance here.