
April 30, 2006

Comments

Randy Charles Morin

You keep saying it's "not broken". But I keep seeing that PubSub and Technorati continue to fail to index blogs that are obviously pinging both regularly. I even ping manually from time to time to force an update, and this still doesn't work.

Anybody can see this for themselves.

http://www.pubsub.com/site_stats.php?site=kbcafe.com

Note the days with zero posts. I post a lot, 5-20 times per day or more, every day. Each of those zeros represents 5-20 dropped pings. Now click on the inlink and outlink totals for the last few days. Blankness. So, there's no actual data behind these numbers.

Now let's examine Technorati...

http://www.technorati.com/search/www.kbcafe.com%2Fiblogthere4im

My primary blog, which is updated several times per day and pings Technorati with each post, hasn't been updated in Technorati's index in 53 days.

As anybody can tell, there's a disconnect between what Bob is saying and the end result. Am I still a bit light on data?

[Randy, we're having some "issues" with our LinkRank and LinkCount applications. At this point, the statistics we're publishing don't accurately reflect the items we're picking up, processing and "indexing" from the feeds we read. Yes, this is embarrassing... Please note that the various statistics applications we host are layered on top of the basic publish/subscribe content-routing system. Thus, an issue with the statistics does not necessarily indicate an issue with the underlying system. (It might, but in this case it doesn't.) So, it would be reasonable to say that our stats are temporarily screwed up; however, that doesn't make it reasonable to suggest that the core matching system is broken. In fact, in recent weeks, the core system has been running massively better than it has in a very long time.

bob wyman]

Dossy Shiobara

Bob, I read your response to Randy. Is the PubSub "search subscription" service also "layered on top of the basic pub/sub content-routing system"? I ask because it's broken as well.

http://atom.pubsub.com/2e/79/86e2eeb2212ae45c8ba91cde29.xml

It says last updated "2006-05-18T11:09:36-04:00" but the most recent entry was published "2006-05-15T09:00:02-04:00" -- the most recent entry on my blog is http://dossy.org/archives/000282.html from 5/16 9:05 AM US/Eastern -- two days ago, now.

This entry appears in the feed many times, 18 out of the 31 results, in fact:

del.icio.us/dossy links since May 1, 2006 at 09:00 AM
http://dossy.org/archives/000278.html

Results aren't sorted in reverse chronological order (most recent first), which is odd, too.

And, the whole promise of PubSub.com's service is that it searches ALL of the web, right? If it's not even picking up my own blog's entries in a timely fashion, what level of confidence should I have that it's not missing LOTS of other sites that I don't even know about, yet? The whole aspect of "searching the future and discovering new sources" of PubSub.com gets thrown into doubt if there's little confidence in its accuracy.
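For anyone who wants to check a feed for the three symptoms described above (a stale newest entry, repeated entries, and results that aren't newest-first), here is a minimal Python sketch. It assumes the feed is Atom 1.0 with `updated` and `published` elements, which may not match what PubSub actually serves. The function name `audit_feed` and the sample feed are hypothetical; only the two timestamps and the 000278 URL are taken from the comment, and the other values are placeholders.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from datetime import datetime

# Assumption: Atom 1.0 namespace; an older Atom 0.3 feed would need different names.
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample feed, made up for illustration only.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <updated>2006-05-18T11:09:36-04:00</updated>
  <entry>
    <link href="http://dossy.org/archives/000278.html"/>
    <published>2006-05-01T09:00:00-04:00</published>
  </entry>
  <entry>
    <link href="http://example.org/some-other-post"/>
    <published>2006-05-15T09:00:02-04:00</published>
  </entry>
  <entry>
    <link href="http://dossy.org/archives/000278.html"/>
    <published>2006-05-01T09:00:00-04:00</published>
  </entry>
</feed>"""


def audit_feed(xml_text):
    """Return (duplicate link counts, newest-first flag, feed-vs-entry lag)."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall(ATOM + "entry"):
        link = entry.find(ATOM + "link").get("href")
        published = datetime.fromisoformat(entry.findtext(ATOM + "published"))
        entries.append((link, published))

    # Links that appear more than once in the results.
    counts = Counter(link for link, _ in entries)
    duplicates = {link: n for link, n in counts.items() if n > 1}

    # True only if every entry is at least as recent as the one after it.
    dates = [published for _, published in entries]
    newest_first = all(a >= b for a, b in zip(dates, dates[1:]))

    # Gap between the feed's own "updated" stamp and its newest entry.
    feed_updated = datetime.fromisoformat(root.findtext(ATOM + "updated"))
    lag = feed_updated - max(dates) if dates else None
    return duplicates, newest_first, lag


if __name__ == "__main__":
    duplicates, newest_first, lag = audit_feed(SAMPLE_FEED)
    print("duplicates:", duplicates)
    print("newest first:", newest_first)
    print("lag:", lag)
```

Run against a saved copy of a real feed, a large lag combined with many duplicates would point at the matching or delivery layer rather than at the blog's own pings, though this sketch can't tell those two apart.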

Dossy Shiobara

I think I know why my 5/16 wine entry isn't showing up: no use of the word "dossy" anywhere except in the metadata (dc:creator, and URLs). I think PubSub search should match on the URLs though -- would let me find out who's linking to me without waiting for click-throughs on the links to see them in my referer reports. I guess I'm using Technorati for that, though.
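To illustrate the guess above, here is a small Python sketch of the difference between matching a subscription term against an entry's text only and matching it against the metadata (creator and URLs) as well. This is not how PubSub's matcher is known to work; the `Entry` class, the `matches` function, and the placeholder title and body are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical, simplified view of a feed entry."""
    title: str
    body: str
    creator: str = ""
    urls: list = field(default_factory=list)


def matches(entry, term, include_metadata=False):
    """Case-insensitive substring match of term against the entry."""
    term = term.lower()
    haystack = [entry.title, entry.body]
    if include_metadata:
        # Also search the creator name and any URLs attached to the entry.
        haystack += [entry.creator, *entry.urls]
    return any(term in text.lower() for text in haystack)


# Placeholder content: the word "dossy" appears only in the metadata.
wine_post = Entry(
    title="Wine notes",
    body="Tasting notes from this week.",
    creator="Dossy Shiobara",
    urls=["http://dossy.org/archives/000282.html"],
)

if __name__ == "__main__":
    print(matches(wine_post, "dossy"))
    print(matches(wine_post, "dossy", include_metadata=True))
```

With text-only matching the post is missed; with URL matching included it is found, which is the behavior that would make link discovery possible without waiting on referer reports.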
