


Nasser Manesh


This is the first time I've heard about Kosmix, and two paragraphs into your blog post I was already sold. I completely identify with the problem you're trying to solve -- it is a pain point, and I'm very glad to see somebody has decided to work on it. More power to you guys! I will write about you.


Just out of curiosity, how does this differ from what ask.com does (http://www.ask.com/web?q=george+carlin&search=search&qsrc=0&o=0&l=dir) or powerset (http://www.powerset.com/explore/go/george-carlin)?


I tried "Storage Deduplication", which is the industry I work in, and I have to say that Kosmix's results are way better than Ask's and Powerset's. Good job!

White Eagle

It looks like the Glue pages from Yahoo India.

Anand Rajaraman

Jeremy: Take a look at the pages from Ask, Powerset and Kosmix for the same topic, e.g., George Carlin or arthritis. That should explain the difference.

White Eagle: Yahoo Glue has the same idea in terms of 2-dimensional layout. The difference is in the details -- the number and variety of the kinds of information that show up in each topic page, and the algorithms that decide what should show up on each page.



I really like your blog and follow it regularly. You have some interesting insights into search and advertising. I've played with Kosmix a few times.

But the fundamental problem I face with Kosmix-like sites is that the noise-to-signal ratio is too high. What I like about Google (and other search engines have eventually followed Google's design) is the simplicity and the lower noise-to-signal ratio.

If the user is exposed to too much information, in this case *clickable* links, then IMO the average probability of each link getting clicked would be very low. For instance, I queried for "lake chelan" as I'm interested in finding a log cabin for the 4th of July weekend.

At Kosmix, I get a zillion links that tell me all sorts of different things.

Whereas in Google, there is an ad at the top that talks about reservations (I will click it if I'm interested in making a reservation) and there is a visitor center link in the algo results (which I will click if I'm in research mode on what to see).


The problem Google tries to solve is minimizing this noise-to-signal ratio. That is a fundamental aspect of ranking: show the user as much relevant info as possible at the top, so that he can click, leave Google, and carry on with his business.

You can probably characterize this ratio in terms of the user's dwell time on the search results and the number of clicks he has to make to reach his desired destination. IMO, Google scores low on both dwell time and the # of clicks compared to other search engines, making for a very simple user experience.

I guess with as many information sources as you have, you could try to build a personalized search engine that exposes only those the user might be interested in. You could change his experience for a given query based on what he did on the previous query, dynamically adapting to his online behaviour.

Anand Rajaraman

Krishna: Thanks for your comment, it raises an important point. If your goal is purely to make reservations, then you are looking for a needle (the reservation page), and Google or Yahoo search is indeed your best bet.

On the other hand, the Kosmix topic page gives you an immersive 360-degree view of Lake Chelan: images, videos, what people are saying, and so on. Exploration always entails some noise; that goes with the territory! And it's part of the fun too.


BTW, the RightTrips link you posted still uses our old product, not the new alpha, so that's not what this article is about. We'll be migrating RightTrips to the new platform soon.


Hello - I ran into your blog a week or so ago and found your articles, and the problems you discuss, very interesting.

I just tried a couple of searches on Kosmix - both obvious, but one of them a loaded word (e.g. Java). The results were quite good and pertinent, but I also think the amount of information generated was overwhelming, as the page seemed "too busy" (i.e. too much presented on one page).

Now, as an end user, I may end up being interested in a lot of that info, but if I had no idea of the topic to begin with, I may want a "gentler introduction". For example, I may want a synopsis, then be able to drill down to get more info. Here it was sort of like the floodgates were opened immediately and I still had the job of sorting through stuff - which I would love to have offloaded on to your app (granted, all of the info is much more meaningful than what you get in a web search).

Anyway, excellent blog!



I came here after seeing a reference on GigaOM to the needle-and-haystack analogy. I think the truth is that whether we want to zero in on something or explore depends on the situation. For instance, I feel an overload on my nervous system with too much stuff on the Net, and follow either a "need to know" basis, or some days, I go "random walking" like one does on mountain holidays -- and discover lovely little trails of knowledge.
Key point: An algorithm is a good way to organise the stuff, but can only go so far. I would settle for a fine mix between the Wiki and Kosmix models with the former using the latter...of course, as a media guy, I never believed in Loser Generated Content, which is what I call UGC. Would you like user generated algorithms?
cheers. Madhavan.

Elad Kehat

Very cool product. The idea is sound, and the execution in terms of UI is good IMO.

However, the relevance of your results isn't that good when the query isn't very specific. I think you'll have to work hard on disambiguation. As Krishna commented above, the signal-to-noise ratio is important, and when the topic queried is ambiguous, there's a lot on the page that isn't relevant to me.

I'm really interested in how you're figuring out which applications are relevant to the topic. I had a related problem in my startup (hivesight.com), and would love to compare solutions.


> the Kosmix approach does not work for every query ... How do we make the distinction clear in the product itself? Can we carve out a separate niche from search engines?

Get acquired by Microsoft or Google and let them worry about when it works :)

Jason Adams

Being able to submit feedback on each part of the results would be nice. "Was this page helpful?" I have found that some parts are very helpful and others are so wrong I hesitate to say the page was helpful at all. If I could pinpoint the part that wasn't helpful and reinforce the part that was, I'd be more willing to provide feedback after searching.

Anand Rajaraman

Jason: That's a great observation. We plan to allow feedback on parts of the page shortly. Agree it would be very useful.


> The problem we're solving is fundamentally different from search

No, it's not fundamentally different from search. It is fundamentally different from what search has become over the past 5 or 10 years. But prior to the rise of internet search engines, "search" itself, as a discipline or field of study (i.e. "information retrieval") did include exactly what you are doing here.


Wow, this is such a cool product. It has features that are similar to Clutsy.com, which I have started using (it has a two-dimensional model). I'm keen to see how this product develops and will definitely be making more use of it, as I get very irritated with search engines like Google and their sometimes poor relevancy.

The comments to this entry are closed.