Bill de hÓra has blogged an email exchange that he and I had some time ago in which we discussed some of the design decisions we've made at PubSub.com. The particular focus of the discussion was on our mixed use of XML text-based encodings and ASN.1 binary encodings... As Bill mentions, I've received a good bit of heat over the years for supporting the mixed use of XML and ASN.1. Apparently, it is hard for folk to understand that it is possible to be an "XML-supporter" while still being an "ASN.1-supporter." But after almost 20 years of this debate (it really got started with ASN.1 vs SGML back in the '80s), I strongly feel that both types of encoding have a place in many systems. As I've commented elsewhere, on the XML-DEV list, I feel that the XML vs ASN.1 debate is really one of "computer science" vs the "human sciences." ASN.1 is better "computer science" while XML is better "human science." ASN.1 is typically more compact and efficient than XML, while XML is usually much easier for people to produce and thus is often better for solving interop problems. Both styles of encoding have their place; you simply have to be very careful in deciding which to use for each kind of interface you expose.
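To make the compactness trade-off concrete, here is a minimal sketch that encodes the same small record twice: once as XML and once as a hand-rolled tag-length-value byte string. The binary format is only in the spirit of ASN.1's binary encodings; it is not real BER or PER, and the record fields (`id`, `score`, `title`) and tag numbers are invented for illustration, not taken from PubSub.com.

```python
import struct
import xml.etree.ElementTree as ET


def encode_xml(entry_id: int, score: int, title: str) -> bytes:
    """Encode the record as a small XML document (easy for people to read)."""
    root = ET.Element("entry")
    ET.SubElement(root, "id").text = str(entry_id)
    ET.SubElement(root, "score").text = str(score)
    ET.SubElement(root, "title").text = title
    return ET.tostring(root, encoding="utf-8")


def encode_tlv(entry_id: int, score: int, title: str) -> bytes:
    """Encode the record as tag-length-value triples (hypothetical format).

    Each field is: 1-byte tag, 1-byte length, then the value bytes.
    Values longer than 255 bytes are not supported in this sketch.
    """
    def tlv(tag: int, value: bytes) -> bytes:
        return struct.pack(">BB", tag, len(value)) + value

    return (
        tlv(1, struct.pack(">I", entry_id))   # 4-byte unsigned integer
        + tlv(2, struct.pack(">H", score))    # 2-byte unsigned integer
        + tlv(3, title.encode("utf-8"))       # UTF-8 text
    )


def decode_tlv(data: bytes) -> dict:
    """Walk the byte string and recover the three fields by tag number."""
    fields, pos = {}, 0
    while pos < len(data):
        tag, length = struct.unpack_from(">BB", data, pos)
        fields[tag] = data[pos + 2 : pos + 2 + length]
        pos += 2 + length
    return {
        "id": struct.unpack(">I", fields[1])[0],
        "score": struct.unpack(">H", fields[2])[0],
        "title": fields[3].decode("utf-8"),
    }


xml_bytes = encode_xml(42, 7, "Hello")
tlv_bytes = encode_tlv(42, 7, "Hello")
print(len(xml_bytes), len(tlv_bytes))
```

The binary form is shorter and cheap to parse, but nobody can type it by hand or debug it by eye, which is the "human science" side of the argument.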
If you're interested in learning more about how PubSub works, see Bill's "Under the hood at PubSub" or the text below:
Continue reading "XML, ASN.1 and the PubSub.com architecture" »
Bill Burnham observes that RSS is "A Big Success In Danger of Failure". He writes that as the number of RSS "channels" explodes into the millions[1], "What RSS desperately needs are enhancements that will allow users to take advantage of the breadth of RSS feeds without being buried in irrelevant information." Burnham recognizes that search services, like that provided by PubSub.com, are a step in the right direction, but suggests that searches' "closely related cousins, classification and taxonomies, may have just what it takes" to allow users to improve the signal-to-noise ratio in what they read. If we could only come up with a universally accepted taxonomy, and if we could train authors to accurately assign categories or subjects to their posts, it would be much easier to select relevant material from the sea of blog postings. But while it may sound intuitively obvious that classification and taxonomy systems should be a simple solution to the problem of selecting content, the painful reality learned over decades of research and experimentation in the fields of Library Science and Information Retrieval is that indexing and classification are terribly hard problems. What we've learned about this problem is very simply stated: There is not, and never will be, a single classification scheme or taxonomy that can satisfy the needs of all, or even most, users. It doesn't even look like we'll be lucky enough to discover an "80%" solution...
Nonetheless, there *is* a way to approach classification and taxonomy that can lead to a solution... It requires that we focus more on identifying "subjects" than on classifications, taxonomies or "topics". That probably sounds a bit obscure... I'll try to explain below:
Continue reading "Making Categorization work: (PSI) Published Subject Indicators" »
As Sergey Brin recently said in an Editor & Publisher interview, "online advertising, especially contextual advertising, is evolving rapidly." One of the most exciting and effective domains for such advertising will be in publish/subscribe services like those we are pioneering at PubSub.com. Because a subscription to PubSub.com implies a "persistent" personal interest in some kind of content, ads placed with us won't suffer the traditional problems pointed out by Fred Wilson: "Contextual advertising doesn't tell you much about a person's behavior." That is, while an ad might appear on pages with the keyword "mortgage," you can't really tell if those who saw your ad were really interested in mortgages. On the other hand, an ad displayed to someone who has explicitly subscribed to "mortgage rates connecticut" is guaranteed to be seen by someone with a declared, persistent interest in information about mortgage rates in Connecticut. Publish/Subscribe systems, by collecting explicit statements from subscribers about the content they wish to see, are able to deliver targeted content much more effectively and accurately than the existing alternatives can.
Continue reading "Publish/Subscribe: Untapped paid-search opportunity" »
Is TrackBack obsolete? Is it just an artifact of the limitations of the early blogosphere -- before there was support for real-time matching of blog content such as that provided by PubSub.com?
TrackBack allows writers to build links from the blogs or items about which they write back to their comments. The technology exists in order to overcome a fundamental limitation of the web as we know it -- web links are uni-directional. Thus, while it is easy to link from your blog to an entry in another blog and thus easy for readers of your blog to navigate to the other blog, HTTP provides no way for readers of the other blog to discover your link to that blog. For someone to be able to "trackback" from an item in one blog to all the other blogs that refer to it requires either the use of a protocol like TrackBack or a service like PubSub.com that builds the "back links" automatically.
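The idea of building "back links" automatically can be sketched as inverting a map of outbound links. This is only an illustration of the concept, not a description of how PubSub.com or the TrackBack protocol is implemented; the function name and the example URLs are hypothetical.

```python
from collections import defaultdict


def build_backlinks(outbound: dict[str, list[str]]) -> dict[str, set[str]]:
    """Invert a {source page: [pages it links to]} map.

    The result maps each target page to the set of pages that link to it,
    which is exactly the information a uni-directional web link lacks.
    """
    backlinks: dict[str, set[str]] = defaultdict(set)
    for source, targets in outbound.items():
        for target in targets:
            backlinks[target].add(source)
    return dict(backlinks)


# Hypothetical blog entries and the links each one contains.
outbound_links = {
    "http://blog-a.example/post1": ["http://blog-c.example/item9"],
    "http://blog-b.example/post7": [
        "http://blog-c.example/item9",
        "http://blog-a.example/post1",
    ],
}

backlinks = build_backlinks(outbound_links)
print(sorted(backlinks["http://blog-c.example/item9"]))
```

With TrackBack, each author has to notify the target blog; with an index like this, a service that already reads every entry can answer "who links here?" without any cooperation from the linking author.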
Continue reading "Is Trackback Obsolete? PubSub Enables the bi-directional web" »
Danny Ayers has used the new "Referenced URI" feature at PubSub.com to create a PubSub subscription that tracks blog entries containing references to his blog at: http://dannyayers.com/ . Since this blog entry references Danny's blog, he'll soon see an entry in the RSS feed we maintain for him that points to this entry. Thus, what Danny now has is the equivalent of "Automatic TrackBack." That is, we've created bi-directional hyperlinks between our blogs by using the PubSub.com matching engine to link us together.
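A rough sketch of what matching on a referenced URI might look like: extract the links from each new entry and check them against the URIs that subscribers have registered. This is an assumption-laden illustration, not the actual PubSub.com matching engine; the prefix-matching rule, the `match_subscriptions` function, and the subscriber names are all hypothetical.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in an HTML fragment."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def match_subscriptions(entry_html: str, subscriptions: dict[str, str]) -> list[str]:
    """Return the subscribers whose registered URI is referenced by the entry.

    `subscriptions` maps a subscriber name to the URI they want to track.
    A link matches if it starts with the registered URI (an assumed rule).
    """
    parser = LinkExtractor()
    parser.feed(entry_html)
    return sorted(
        name
        for name, uri in subscriptions.items()
        if any(link.startswith(uri) for link in parser.links)
    )


subscriptions = {
    "danny": "http://dannyayers.com/",
    "someone-else": "http://example.org/",
}
entry = '<p>See <a href="http://dannyayers.com/">Danny\'s blog</a> for details.</p>'
matched = match_subscriptions(entry, subscriptions)
print(matched)
```

Each matched subscriber would then get the new entry added to their feed, which is what turns a one-way link into something the linked-to author can see.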
Continue reading "Referencing Danny's URI" »