MemeStreams | MemeStreams Discussion

This page contains all of the posts and discussion on MemeStreams referencing the following web page: XML.com: Working with Bayesian Categorizers [Nov. 19, 2003]. You can find discussions on MemeStreams as you surf the web, even if you aren't a MemeStreams member, using the Threads Bookmarklet.

XML.com: Working with Bayesian Categorizers [Nov. 19, 2003]
by eiron at 12:41 am EST, Dec 6, 2003

] There's been some discussion in the blog world about
] using a Bayesian categorizer to enable a person to
] discriminate along various interest/non-interest axes. I
] took a run at this recently and, although my experiments
] haven't been wildly successful, I want to report them
] because I think the idea may have merit.

It seems like a nice idea, but this approach probably wouldn't work well with blogs. Spam may be easily classified by Bayesian filters. Two blog entries, however, could easily share many common keywords yet hold significantly different levels of interest for the reader.
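To make the keyword-overlap point concrete, here is a minimal sketch of a word-count naive Bayes scorer. It is not the code from the XML.com article; the labels, training text, and the two sample entries are all made up for illustration. It only shows that two entries with the same known keywords get identical scores, whatever a reader would actually think of them.

```python
import math
from collections import Counter


def train(docs):
    """docs: list of (label, text). Returns per-label word counts and doc counts."""
    word_counts = {}
    doc_counts = Counter()
    for label, text in docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, doc_counts


def log_scores(text, word_counts, doc_counts):
    """Log-probability score per label, with add-one (Laplace) smoothing."""
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are skipped
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return scores


def classify(text, word_counts, doc_counts):
    scores = log_scores(text, word_counts, doc_counts)
    return max(scores, key=scores.get)


# Hypothetical training data: one "interesting" and one "boring" entry.
word_counts, doc_counts = train([
    ("interesting", "bayesian filter perl library categorizer"),
    ("boring", "lunch weather vacation photos"),
])

# Two made-up entries that share keywords but differ in real interest.
entry_a = "new bayesian filter perl library released"
entry_b = "bayesian filter perl library rant with nothing new"

label_a = classify(entry_a, word_counts, doc_counts)
label_b = classify(entry_b, word_counts, doc_counts)
```

Both entries land in the same category with the same score, because the scorer only sees which known words appear, not what is being said about them.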

Also, if you're going to go through the trouble of structuring a set of articles so that they can be parsed by some filter, effectively restricting the article database to a single system, you might as well be using MemeStreams, or at least its methods. I believe the results would be more worthwhile.

RE: XML.com: Working with Bayesian Categorizers [Nov. 19, 2003]
by Decius at 4:13 pm EST, Dec 6, 2003

eiron wrote:
] It seems like a nice idea, but this approach probably
] wouldn't work well with blogs. Spam may be easily
] classified by Bayesian filters. Two blog entries,
] however, could easily share many common keywords yet
] hold significantly different levels of interest for the
] reader.

I agree... However, thanks for recommending the article. These Perl libraries are useful for something else: figuring out when articles on MemeStreams are associated with the same subject...