
April 17, 2005

Comments

Ben Hyde

I would be eternally grateful if it were possible to track the evolution of that fit over time. See if the fog in the blog sphere is condensing or not.

Cameron Marlow

Interesting. I'm assuming that these are new links made to a specific site in one day? Or is this for hosts? Would you mind divulging the top-ranking sites on both charts? It's hard to believe that someone can be making hundreds of links per day.

To get rid of the messy tail on these distributions, you can use either logarithmic binning or the cumulative distribution function. Lada Adamic gives a good introduction here:

http://www.hpl.hp.com/research/idl/papers/ranking/ranking.html

This will give you a much smoother fit, and also make it more apparent whether or not a power law is really in effect. The difference between a power law and many other distributions can be completely determined by the tail.
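As a rough illustration of the two techniques mentioned above, here is a minimal Python sketch. It is not taken from the original post or from Adamic's write-up; the function names `ccdf` and `log_binned`, the choice of base-2 bins, and the sample `links_per_day` values are all assumptions made for the example.

```python
import math
from collections import Counter


def ccdf(values):
    """Empirical complementary CDF: sorted (x, P(X >= x)) pairs."""
    n = len(values)
    counts = Counter(values)
    points = []
    remaining = n  # number of observations that are >= the current x
    for x in sorted(counts):
        points.append((x, remaining / n))
        remaining -= counts[x]
    return points


def log_binned(values, base=2):
    """Histogram over geometrically growing bins [1, b), [b, b^2), ...

    Returns (geometric bin centre, density) pairs, where density is
    count / (n * bin width), so wide bins in the tail are not overweighted.
    """
    n = len(values)
    counts = Counter()
    for v in values:
        if v < 1:
            raise ValueError("values must be >= 1")
        # Find the bin index by repeated multiplication rather than
        # math.log, which can be off by one at exact powers of the base.
        k, upper = 0, base
        while v >= upper:
            upper *= base
            k += 1
        counts[k] += 1
    result = []
    for k in sorted(counts):
        lo, hi = base ** k, base ** (k + 1)
        result.append((math.sqrt(lo * hi), counts[k] / (n * (hi - lo))))
    return result


# Hypothetical sample: new links made per day by ten sources.
links_per_day = [1, 1, 1, 1, 2, 2, 3, 5, 8, 120]
for x, p in ccdf(links_per_day):
    print(x, p)
for centre, density in log_binned(links_per_day):
    print(round(centre, 2), density)
```

Plotting either output on log-log axes should make the tail far less noisy than a raw histogram. If the data really follow a power law with exponent a, the binned densities fall on a line of slope -a, while the cumulative version falls on a line of slope -(a - 1).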

App

These data are about "pub". How about "sub"? What kind of statistics can you draw for them, given that you have a lot of them?

[Bob Wyman wrote: If you are suggesting that we should provide analysis of the subscriptions that we store and service, I'm sorry, but we really don't think that would be appropriate. We are exceptionally cautious about anything that might even hint at an encroachment on the privacy of our users. While all sorts of fascinating data can be extracted from the subscription database, we think it is best to leave those trails unstudied until we find a way to do such studies without infringing on our users' actual or perceived rights to, and expectations of, privacy.]
