Note: This post is about a new product we're testing at my company Kosmix.
Search engines are great at finding the needle in a haystack. And that's perfect when you are looking for a needle. Often though, the main objective is not so much to find a specific needle as to explore the entire haystack.
When we're looking for a single fact, a single definitive web page, or the answer to a specific question, then the needle-in-haystack search engine model works really well. Where it breaks down is when the objective is to learn about, explore, or understand a broad topic. For example:
- You're planning to hike the Continental Divide Trail.
- A loved one was recently diagnosed with arthritis.
- You read The Da Vinci Code and have an irresistible urge to learn more about the Priory of Sion.
- Saddened by George Carlin's death, you want to reminisce over his career.
The web contains a trove of information on all these topics. Moreover, the information of interest is not just facts (e.g., Wikipedia), but also opinion, community, multimedia, and products. What's missing is a service that organizes all the information on a topic so that you can explore it easily. The Kosmix team has been working for the past year on building just such a service, and we put out an alpha yesterday. You enter a topic, and our algorithms assemble a "topic page" for that topic. Check out the pages for Continental Divide Trail, arthritis, Priory of Sion, and George Carlin.
The problem we're solving is fundamentally different from search, and we've taken a fundamentally different approach. As I've written before, the web has evolved from a collection of documents that neatly fit in a search engine index to a collection of rich interactive applications such as Facebook, MySpace, YouTube, and Yelp. Instead of serving results from an index, Kosmix builds topic pages by querying these applications and assembling the results on the fly into a 2-dimensional grid. We have partnered with many of the services that appear on the topic pages, and we use publicly available APIs in other cases.
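To make the "query the applications, then assemble" idea concrete, here is a minimal sketch. It is not our production code: the three source functions are hypothetical stand-ins that return canned strings instead of calling a real partner API, and `build_topic_page` is a name made up for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for partner applications.
# A real source would call that application's API here.
def video_source(topic):
    return [f"video about {topic}"]

def reviews_source(topic):
    return [f"review mentioning {topic}"]

def reference_source(topic):
    return [f"encyclopedia entry for {topic}"]

SOURCES = {
    "videos": video_source,
    "reviews": reviews_source,
    "reference": reference_source,
}

def build_topic_page(topic, sources=SOURCES):
    """Query every source in parallel and collect results by module name."""
    page = {}
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        futures = {name: pool.submit(fn, topic) for name, fn in sources.items()}
        for name, future in futures.items():
            try:
                page[name] = future.result()
            except Exception:
                # A failing source yields an empty module instead of
                # breaking the whole page.
                page[name] = []
    return page

if __name__ == "__main__":
    print(build_topic_page("arthritis"))
```

The point of the sketch is only the shape of the work: nothing is read from a prebuilt index, and the page is whatever the queried applications return at request time.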
Here are some of the challenging problems that we had to tackle in building this product:
- Figuring out which applications are relevant to a topic. For example, Boorah, Yelp, and Google Maps are relevant to the topic "restaurants 94041". WebMD, Mayo Clinic, and RightHealth are relevant to "arthritis". If we called each application for every query, the page would look very confusing, and our partners would get unhappy very quickly! I'll write more on how we do this in a separate post, but it's very, very cool indeed.
- Figuring out the related topics shown in the "Related in the Kosmos" section of each topic page. For example, you can start from the Priory of Sion and laterally explore Rosslyn Chapel or the Madonna of the Rocks.
- Figuring out the placement of, and the space allocated to, each element in the 2-dimensional grid. Going from one dimension (linear list) to two dimensions (grid) turns out to be quite a challenge, both from an algorithmic and from a UI design point of view.
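To give a feel for the grid problem in the last item, here is a toy sketch of one possible heuristic. It is an assumption made for illustration, not a description of what Kosmix does: each module carries a made-up relevance score and a height in grid rows, and a greedy pass places the highest-scored modules first, each into whichever column is currently shortest.

```python
def layout_grid(modules, num_columns=3):
    """Greedy layout: higher-scored modules first, each into the shortest column.

    `modules` is a list of (name, score, height) tuples. The scores and
    heights are hypothetical inputs for this sketch.
    """
    columns = [[] for _ in range(num_columns)]
    heights = [0] * num_columns
    for name, score, height in sorted(modules, key=lambda m: -m[1]):
        # Pick the leftmost of the currently shortest columns.
        target = heights.index(min(heights))
        columns[target].append(name)
        heights[target] += height
    return columns

if __name__ == "__main__":
    modules = [
        ("reference", 0.9, 3),
        ("videos", 0.7, 2),
        ("reviews", 0.4, 1),
        ("forums", 0.3, 2),
    ]
    print(layout_grid(modules, num_columns=2))
    # prints [['reference', 'forums'], ['videos', 'reviews']]
```

Even this toy version shows why two dimensions are harder than a ranked list: the most relevant module lands top-left, but where everything else ends up depends on how much space its neighbors take, not just on its own score.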
In this alpha, we've taken a first stab at tackling these challenges. We are still several months from having a product that we feel is ready to launch, but we decided to put this public alpha out there to gather user feedback and tune our service. Many aspects of the product will evolve between now and then: Do we have the right user interaction model for topic exploration? Do we put too much information on the topic page? Should we present it very differently? How do we combine human experts with our algorithms?
Most importantly, the Kosmix approach does not work for every query! Our goal is to organize information around topics, not answer arbitrary search queries. How do we make the distinction clear in the product itself? Can we carve out a separate niche from search engines?
We hope to gain insight into all these questions and more from this alpha. Please use it and give us your feedback!