Fight information overload the same way we fight spam

December 28th, 2008

Information overload is a real problem in our age. I don’t remember how many years ago I stopped adding new blogs and other XML feeds to my feed aggregator because I couldn’t read all of the posts in the blogs to which I was already subscribed.

About a year ago I had an idea: the same Bayesian statistical theory that is successfully used to fight spam could also be applied to feed aggregation. I actually started building my own feed aggregator based on this idea, but I didn’t have enough time for it, so the project was put on hold.

The main idea is pretty simple, and I don’t see why none of the existing aggregators use it already. Basically, you let users say whether a particular blog post was interesting to them or not. It should be a binary yes-or-no choice. You then apply Bayesian theory to the words in the posts that were marked as interesting or not, and in this way build a statistical profile of what each user likes and dislikes.
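To make the idea more concrete, here is a minimal sketch of such a per-user profile, written as a simple naive Bayes classifier with add-one smoothing. It is not the code of my frozen aggregator; the names `InterestProfile`, `train`, `probability_interesting` and `tokenize` are made up for this illustration, and a real system would likely need better tokenization and tuning.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class InterestProfile:
    """Hypothetical per-user profile built from yes/no votes on posts."""

    def __init__(self):
        # Word frequencies and post counts, kept separately for posts
        # marked interesting (True) and not interesting (False).
        self.word_counts = {True: Counter(), False: Counter()}
        self.post_counts = {True: 0, False: 0}

    def train(self, text, interesting):
        """Record one binary vote from the user for the given post text."""
        self.word_counts[interesting].update(tokenize(text))
        self.post_counts[interesting] += 1

    def probability_interesting(self, text):
        """Estimate the probability that the user will like this post."""
        total_posts = sum(self.post_counts.values())
        if total_posts == 0:
            return 0.5  # nothing learned yet, so no preference either way
        vocabulary = set(self.word_counts[True]) | set(self.word_counts[False])
        words = tokenize(text)
        log_scores = {}
        for label in (True, False):
            # Smoothed prior: how often the user votes this way at all.
            score = math.log((self.post_counts[label] + 1) / (total_posts + 2))
            total_words = sum(self.word_counts[label].values())
            denominator = total_words + len(vocabulary) + 1
            for word in words:
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((self.word_counts[label][word] + 1) / denominator)
            log_scores[label] = score
        # Turn the two log scores into a probability, clamped to avoid overflow.
        diff = max(min(log_scores[False] - log_scores[True], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))
```

An aggregator could then call `train` whenever the user votes on a post, and show only posts whose `probability_interesting` is above some cutoff (0.5 would be a natural starting point, though the right value is a guess).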

After some initial training, such a system can start showing you only the posts you are really interested in and hide the ones you probably won’t like, thus dramatically reducing information overload. What’s great here is that even if you miss some posts that you would have liked, there is a good chance that, because blogs link to each other, you will still see a link to such a post in another post that the system considers interesting for you.

There is so much that can be done with this. The system can suggest blogs (or individual posts) that you are not subscribed to but are highly likely to enjoy, based on your statistical profile.

This can even become a social network where you can find people with interests similar to yours by comparing your statistical profiles.
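One possible way to compare two profiles, sketched below, is cosine similarity over the word counts of the posts each person marked as interesting. This is only one option among many, and the function name `profile_similarity` and the choice of plain word counts are assumptions made for this example.

```python
import math
from collections import Counter


def profile_similarity(words_a, words_b):
    """Cosine similarity between two word-count profiles (0.0 to 1.0)."""
    # Only words that appear in both profiles contribute to the dot product.
    shared = set(words_a) & set(words_b)
    dot = sum(words_a[word] * words_b[word] for word in shared)
    norm_a = math.sqrt(sum(count * count for count in words_a.values()))
    norm_b = math.sqrt(sum(count * count for count in words_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile is not similar to anything
    return dot / (norm_a * norm_b)


# Hypothetical profiles: words from posts each user marked as interesting.
alice = Counter({"python": 5, "bayesian": 3, "spam": 1})
bob = Counter({"python": 4, "bayesian": 2, "feeds": 2})
carol = Counter({"fashion": 6, "gossip": 2})
```

Here `profile_similarity(alice, bob)` comes out much higher than `profile_similarity(alice, carol)`, which is zero because the two profiles share no words.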

And this is just for starters.
