Datawocky

On Teasing Patterns from Data, with Applications to Search, Social Media, and Advertising

Stop Email Overload and Break Silos Using Wikis, Blogs, and IM

Email is the central nervous system of most modern organizations, from startups to large corporations. Every communication, from the most important (planning for the big client meeting tomorrow) to the most trivial (fresh donuts in the kitchen) takes place through the corporate email system. The results: email overload and lowered productivity for the entire organization. Employees are tethered to their email via Blackberries even over the weekend, leading to communications burnout.

The biggest single reason for this is the inherent nature of email itself: it is a point-to-point communication medium. The sender has to decide both the content of the message and who the recipients are. If the recipient list is too large, it contributes to email overload. If it is too small, that could lead to communication gaps and "informational silos" in the organization, where one group in the company doesn't really know what the other group is doing. Another problem is that each email message is a single unit, making it hard to track conversations among multiple parties. Many email readers thread conversations, but that is done at a syntactic rather than a semantic level. Finally, putting everything in email makes it difficult to build institutional memory.

We hit the email wall at my company Kosmix recently. When we were less than 30 people, managing by email worked reasonably well. The team was small enough that everyone knew what everyone else was doing. Frequent hallway conversations reinforced relationships. However, once we crossed the 30-person mark, we noticed problems creeping in. We started hearing complaints of email overload and too many meetings. And despite the email overload and too many meetings, people still felt that there was a communication problem and a lack of visibility across teams and projects. We were straining the limits of email as the sole communications mechanism.

We knew something had to be done. But what? Sri Subramaniam, our head of engineering, proposed a bold restructuring of our internal communications. He led an effort that resulted in us relying less on email and more on wikis, blogs, and instant messaging. Here's how we use these technologies every day in running our business.

Blogs for Status Reports

Each employee and each project has a dedicated blog. People can post as often as they wish to their personal or project blog, but they are required to post at least one weekly status update. All blogs are visible to everyone in the company. Anyone can subscribe to the feed for any particular team or individual blog. So for example, Josh in engineering can follow the blog of Mike in sales, if he's curious what Mike is up to. This results in complete 360 degree visibility throughout the organization. People can also post comments on these blogs. Someone might post a problem they are facing, and others can post comments providing suggestions. This results in automatic grouping of conversations based on topics of interest.

The biggest advantage of the blog approach is that it is a publish-subscribe mechanism. I don't need to decide who to direct my communication to;  I just post on my blog. Anyone in the company who is interested in what I'm doing can subscribe to my blog to be notified of updates. And if someone just has a passing interest, they can always read my blog periodically without subscribing to it. This approach also breaks silos, for example, between engineering and marketing, or between marketing and sales. Sometimes the best product ideas come from sales people. And sometimes the best sales ideas come from engineers.

No one is required to read any particular blog, with two exceptions:

  1. Managers are expected to read the status updates of their team members and post feedback.
  2. People working on a project are expected to read each other's blogs.

The blog approach has reduced email overload at Kosmix and even reduced the number of time-consuming "status update" meetings.  Most important, the blog serves as an institutional memory -- an electronic record of our business. Conversations do not get lost in the ether but are recorded and can be searched at any time in the future by new people on a project or new company employees.

The Wiki for Persistent Information

While blogs are great for status updates and discussions around ideas, they are not the best place to put items that serve as reference material: for example, documentation, specs, reports, and so on. The problem is that blogs are in reverse chronological order, and each blog can have just one author, preventing collaborative editing. For these situations, we use a wiki. The internal corporate wiki has sections corresponding to each project and each functional group in the company. Documentation, specs, and reports go into the wiki.

The other critical section on the wiki is the Team section. Every employee has a homepage on the wiki, with a recent photo and a description of their responsibilities at work and interests outside of work. As the team grows, and you see a new face at the office, this is a quick way of finding out who that person is.

Instant Messaging for Spontaneous Discussions

As Kosmix has grown, we now have people working from more than one physical location. In addition, we promote a culture of people working from home whenever it is compatible with their job responsibilities. Thus, we need a substitute for the face-to-face hallway conversations that cannot happen because someone is working from home or from another location. Email is not the best option because it is asynchronous and thus loses the spontaneity of a hallway chat.

Instant messaging fills this need very well indeed. The entire Kosmix team is on IM. Each team member is required to set the "status" message on their IM client during normal working hours to indicate where they are working from. They can also post a "Do not disturb" message to indicate that they don't welcome interruptions at the moment. Instant messaging leads to quick resolution of many issues without spawning interminable email threads.

If I needed to have everyone in the company on my IM buddy list, that would be a very long buddy list indeed. To avoid this problem, every team member's IM handle and status are displayed on their wiki homepage. You can initiate an IM session with anyone from their wiki homepage.

Convention over Configuration

The effects of the communication restructuring have been immediate and very visible. They include a lot less email and almost none on weekends; better communication among people; and 360 degree visibility for every member of the Kosmix team. Since instituting these changes, everyone on the team feels more productive and more knowledgeable about the company, and has more spare time to spend on things outside of work.

Kosmix is certainly not the first company to use internal blogs, wikis, and IM for corporate communication. Google has been using blogs for status reports for a while now. The big difference is the conventions we have established about how we use these tools. For example, one of the common complaints I've heard about Google's use of internal blogs is that most employees feel no one reads their blog. I've heard of one case where an employee posted, for several weeks, a status report that read: "Is anyone reading this?" By establishing conventions around expected read-write patterns, we have avoided this problem so far at Kosmix.

No doubt as Kosmix grows further, even this model will break down at some point and we will have to look for new communication models. I'll post an update when that happens! In the meantime, please do share your experiences of innovative corporate communication practices.

Implementation Notes

We use twiki for our wiki and blog software at Kosmix. The wiki functionality in twiki is great, but it took quite a bit of customization work from our indefatigable Sri Subramaniam to make it work well as a blogging platform too. We are planning to release our twiki tweaks as open source in the next couple of months, once we have a chance to package them neatly.

Another great option for blogs is WordPress, which allows you to host blogs internal to your company. We went with twiki because of the integrated wiki/blogging solution.

We have standardized on Yahoo! Instant Messenger for instant messaging. However, other IM products such as MSN Instant Messenger and Google Talk have comparable functionality. I would suggest you pick the one most people in your company already use for personal communication.

July 21, 2008

Why Google Doesn't Provide Earnings Forecasts

Most public companies provide forecasts of revenue and earnings in the upcoming quarters. These forecasts (sometimes called "guidance") form the basis of the work most stock analysts do to make buy and sell recommendations. Much to the consternation of these analysts, Google is among the few companies that have refused to follow this practice. As a result, estimates of Google's revenue by analysts using publicly available data, like comScore numbers, have often been spectacularly wrong. Today's earnings call may be no different.

A Google executive once explained to me why Google doesn't provide forecasts. To understand it, you have to think about the engineers at Google who work on optimizing AdWords. How do they know they're doing a good job? We know that Google is constantly bucket-testing tweaks to their AdWords algorithms. An ad optimization project is considered successful if it has one of two results:

  • Increase revenue per search (RPS), while not using additional ad real estate on the search results page (SERP).
  • Reduce the ad real estate on each SERP, while not reducing RPS.

The tricky cases are the ones that increase RPS while also using more ad real estate. It then becomes a judgment call whether they should be rolled out across the site. If Google were to make earnings forecasts, the thinking went, there would be a huge temptation to roll out tweaks in the gray area to make the numbers. As the quarters roll by, the area of the page devoted to ads would keep steadily increasing, leading to longer-term problems with customer retention.
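
To make the rule concrete, here's a minimal sketch of the rollout logic as I understand it (my own formalization, with made-up field names and a single scalar for "ad real estate"; this is not anything Google has published):

```python
from dataclasses import dataclass

# A toy version of the rollout rule described above. The field names and the
# idea of collapsing ad real estate into one number are my simplifications.

@dataclass
class BucketTestResult:
    rps_delta: float      # change in revenue per search vs. the control bucket
    ad_area_delta: float  # change in ad real estate on the SERP vs. the control bucket

def rollout_decision(r: BucketTestResult) -> str:
    if r.rps_delta > 0 and r.ad_area_delta <= 0:
        return "ship"            # more revenue without extra ad real estate
    if r.ad_area_delta < 0 and r.rps_delta >= 0:
        return "ship"            # less ad real estate without losing revenue
    if r.rps_delta > 0 and r.ad_area_delta > 0:
        return "judgment call"   # the gray area discussed above
    return "reject"

print(rollout_decision(BucketTestResult(rps_delta=0.02, ad_area_delta=0.0)))   # ship
print(rollout_decision(BucketTestResult(rps_delta=0.03, ad_area_delta=0.05)))  # judgment call
```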

Of course, this doesn't mean there is no earnings pressure. In reality, whether they issue guidance or not, Google's stock price does depend on whether they continue to deliver robust revenue and earnings growth. So implicitly, there is always pressure to beat the estimates. And for the first time, as Google's stock has taken a hammering in recent months, I've heard about hiring slowdowns at Google. So there is definitely pressure to cut costs as well. It will be interesting to observe the battle between idealism and expediency play itself out, with its progress reflected in the ad real estate on Google's search results. It's easy to be idealistic with the wind behind your back; the true test is whether you retain the idealism in the face of headwinds. Time will tell.

This brings us to today's earnings call. In my experience, the best predictor of Google earnings has been Efficient Frontier's excellent Search Engine Performance Report. EF is the largest ad agency for SEM advertisers and manages the campaigns of several large advertisers on Google, Yahoo, and Microsoft. As I had noted earlier, in Q1 an estimate based on their report handily beat other forecasts, most of which use comScore data. (Disclosure: My fund Cambrian Ventures is an investor in EF.)

EF's report for Q2, released this morning, indicates a strong quarter for Google. Google gained more than its fair share of advertising dollars in Q2 2008: for every new dollar spent on search advertising, $1.10 was spent on Google, at the expense of Yahoo and Microsoft. In addition, Google's average cost-per-click (CPC) increased by 13.8% in Q2 2008 versus Q2 2007, while click volume and CTR also increased. There was strong growth overseas as well, which should help earnings given the weak dollar.

I don't have the time right now to do the math and figure out whether the robust performance was sufficient to beat the Street's estimates. You should read the report for yourself and make that call.

Update: Google's results, although robust, were below expectations. The biggest moment in the earnings call for me was this quote from Sergey (via Silicon Alley Insider):

Sergey said the company may have overdone its quality control efforts in the quarter (reducing the number of ads), and the reversal of this could provide a modest accelerator to Q3

Quality efforts "overdone"? Apparently those pressures are telling after all, and Google is going to abandon their principles a wee bit to venture into the gray zone. Is this the start of a slippery slope?

July 17, 2008 in Advertising, Search

The Real Long Tail: Why both Chris Anderson and Anita Elberse are Wrong

A new study by Anita Elberse, published in the Harvard Business Review, raises questions about the validity of Chris Anderson's Long Tail theory. In case you're related to Rip Van Winkle: the Long Tail theory suggests that the dramatically lower distribution costs for media (such as music and movies) enabled by the internet have the potential to reshape the demand curve for media. Traditionally, these businesses have been hits-driven, with the majority of revenue and profits attributable to a small number of items (the hits). Anderson argues that the internet's ability to serve niches cost-effectively increases the demand for items further down the "tail" of the demand curve, making the aggregate demand for the tail comparable to that for the head.

Anderson's insight resonated instantly with the digerati. It is said that Helen of Troy's face launched a thousand ships; the Long Tail theory certainly launched more than a thousand startups, all with an obligatory Long Tail slide in their investor pitches. Recently, however, there has been a creeping suspicion that the data don't support the theory; the backlash has been spearheaded, among others, by Lee Gomes of the Wall Street Journal. In her piece, Anita Elberse does a deep dive into the data and concludes that the Long Tail theory is flawed.

Anderson has posted a rebuttal on his blog, pointing out a problem with Elberse's analysis: defining the head and tail in percentage terms. There is some truth to Anderson's rebuttal. But the heart of Elberse's criticism lies not in the definition of the head and the tail. It's in using McPhee's theory of exposure to conclude that positive feedback effects reinforce the popularity of hits, while dooming items in the tail to perpetual obscurity. She presents data from Quickflix, an Australian movie rental service, showing that movies in the tail are rated lower on average than movies in the head. Thus, movies in the tail are destined to remain in the tail. Elberse exhorts media executives to concentrate their resources on backing a small set of potential blockbusters, rather than fritter them away on niches.

The big problem with this argument is that it conflates cause and effect. Before the internet, distribution was expensive, and there was no way for consumers to provide instant feedback on products. Consumers had little choice in the matter of which items were readily available and which were hard to find. Thus, the hits were picked by a few studio executives, publishers, or record producers who "greenlighted" projects they thought had hit potential. But when distribution is cheap, and consumer feedback loops are in place, the items that a lot of consumers like become popular and move into the head. It's not that items in the tail are inherently rated lower; items are in the tail precisely because they are rated lower.

It's as if we're comparing two systems of government, a hereditary aristocracy and a democracy, by comparing the sizes of the ruling elite in the two cases. That misses the point entirely. What matters is not the size of the ruling elite; it's how they got there. So the big change wrought by the internet is not so much to change the shape of the demand curve for media products, as Anderson claims; nor is it the case that nothing has changed whatsoever, as Elberse posits. The big change is not in what fraction of the demand is in the head; it's in how the items that are in the head got there in the first place. Any change in the shape of the curve itself is incidental.

There's another market where we are seeing this phenomenon play out: the market for Facebook (and MySpace) apps. In earlier years, it took a lot of capital to get a company off the ground. The companies that got funded were the ones with good business plans who could convince VCs to take the plunge based on the people, the plan, and potentially some intellectual property. But it doesn't take much capital to write a Facebook app, leading to a proliferation of them. This paves the way for the expected inversion. Facebook users don't use the apps that VCs fund. Instead, Facebook users decide which apps they like, and VCs fund the ones, such as Slide and RockYou, that gain popularity.

It is instructive to look at the Facebook app trends study published by Roger Margoulas and Ben Lorica at O'Reilly Research. The study shows that at last count, there were close to 30,000 Facebook apps. Usage, however, is highly concentrated among the top few apps, a classic example of a hits-driven industry (see graph) -- no long tail. However, these hits have been produced by the collective action of millions of Facebook users, rather than by a small set of savvy media executives. And there's a lot of churn: new applications join the winners, and old winners die and are buried in the tail.

The real Long Tail created by the internet is not the long tail of consumption, but the long tail of influence. Earlier, the ability to influence which items became winners and which losers rested with a few media executives. Now every social network user has some potential influence, however small, on the result. The long tail of influence, combined with instant feedback loops, leads to a short tail of consumption. The Facebook app market is a leading indicator of the path the entire media industry will take in years to come.

Update: Chris Anderson has posted a rebuttal in the Comments. Thanks Chris! Please do read his comment and my response. Chris points out that Facebook apps still follow a power law distribution. But it doesn't matter how long the tail is; what matters is how heavy it is. The area under the long tail is a function of both length and depth, and depends crucially on the power law exponent. For the mathematically minded, the details are here.
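
For readers who don't want to click through, here is the gist in my own notation (a back-of-the-envelope sketch, not the linked derivation): if the item at popularity rank $n$ accounts for demand proportional to $n^{-\alpha}$, then the total demand sitting beyond rank $k$ in a catalog of $N$ items is approximately

$$ \mathrm{Tail}(k, N) = \sum_{n=k+1}^{N} c\,n^{-\alpha} \approx c \int_{k}^{N} x^{-\alpha}\,dx = \frac{c}{\alpha - 1}\left(k^{1-\alpha} - N^{1-\alpha}\right), \qquad \alpha \neq 1. $$

For $\alpha > 1$ this quantity stays bounded no matter how large the catalog $N$ grows, so the tail is long but light; for $\alpha \le 1$ the tail mass keeps growing with $N$ and can rival the head. The exponent, not the mere existence of a power law, decides the argument.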

July 09, 2008 in Social Media, Venture Capital

Searching for a Needle or Exploring the Haystack?

Note: This post is about a new product we're testing at my company Kosmix.

Search engines are great at finding the needle in a haystack. And that's perfect when you are looking for a needle. Often though, the main objective is not so much to find a specific needle as to explore the entire haystack.

When we're looking for a single fact, a single definitive web page, or the answer to a specific question, then the needle-in-haystack search engine model works really well. Where it breaks down is when the objective is to learn about, explore, or understand a broad topic. For example:

  • You're planning to hike the Continental Divide Trail.
  • A loved one has recently been diagnosed with arthritis.
  • You read The Da Vinci Code and have an irresistible urge to learn more about the Priory of Sion.
  • Saddened by George Carlin's death, you want to reminisce over his career.

The web contains a trove of information on all these topics. Moreover, the information of interest is not just facts (e.g., Wikipedia), but also opinion, community, multimedia, and products. What's missing is a service that organizes all the information on a topic so that you can explore it easily. The Kosmix team has been working for the past year on building just such a service, and we put out an alpha yesterday. You enter a topic, and our algorithms assemble a "topic page" for that topic. Check out the pages for Continental Divide Trail, arthritis, Priory of Sion, and George Carlin.

The problem we're solving is fundamentally different from search, and we've taken a fundamentally different approach. As I've written before, the web has evolved from a collection of documents that neatly fit in a search engine index, to a collection of rich interactive applications. Applications such as Facebook, MySpace, YouTube, and Yelp. Instead of serving results from an index, Kosmix builds topic pages by querying these applications and assembling the results on-the-fly into a 2-dimensional grid. We have partnered with many of the services that appear in the results pages, and use publicly available APIs in other cases.
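
To make the "query the applications and assemble the results on the fly" idea concrete, here's a minimal sketch of the fan-out pattern (the fetcher functions are hypothetical stand-ins for real API clients; this is not Kosmix's actual code):

```python
import concurrent.futures

# Toy illustration of federated fan-out: query several partner services in
# parallel for a topic and keep whatever comes back in time, instead of
# serving results out of a crawled index.

def fetch_wikipedia(topic): return {"source": "wikipedia", "type": "facts", "topic": topic}
def fetch_youtube(topic):   return {"source": "youtube", "type": "videos", "topic": topic}
def fetch_yelp(topic):      return {"source": "yelp", "type": "reviews", "topic": topic}

FETCHERS = [fetch_wikipedia, fetch_youtube, fetch_yelp]

def build_topic_page(topic, timeout=2.0):
    """Fan out to every service in parallel and collect the modules for the page."""
    modules = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(FETCHERS)) as pool:
        futures = [pool.submit(f, topic) for f in FETCHERS]
        try:
            for fut in concurrent.futures.as_completed(futures, timeout=timeout):
                try:
                    modules.append(fut.result())
                except Exception:
                    pass  # a failing partner shouldn't break the whole page
        except concurrent.futures.TimeoutError:
            pass  # drop whatever didn't answer in time
    return modules

print(build_topic_page("Priory of Sion"))
```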

Here are some of the challenging problems that we had to tackle in building this product:

  1. Figuring out which applications are relevant to a topic. For example, Boorah, Yelp, and Google Maps are relevant to the topic "restaurants 94041". WebMD, Mayo Clinic, and RightHealth are relevant to "arthritis". If we called each application for every query, the page would look very confusing, and our partners would get unhappy very quickly! I'll write more on how we do this in a separate post, but it's very, very cool indeed. (A toy illustration follows this list.)
  2. Figuring out related topics for the "Related in the Kosmos" section on each topic page. For example, you can start from the Priory of Sion and laterally explore Rosslyn Chapel or the Madonna of the Rocks.
  3. Figuring out the placement and space allocation to each element in the 2-dimensional grid. Going from one dimension (linear list) to two dimensions (grid) turns out to be quite a challenge, both from an algorithmic and from a UI design point of view.
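
As a flavor of what problem 1 involves, here's a toy relevance filter (purely my own illustration with invented category labels and thresholds, not how Kosmix actually does it): score each known service against the topic's inferred categories and call only the services that clear a threshold.

```python
# Toy version of problem 1: decide which services to call for a topic by
# matching the topic's inferred categories against the categories each
# service is known to cover.

SERVICE_CATEGORIES = {
    "Yelp":        {"restaurants", "local"},
    "Google Maps": {"local", "travel"},
    "WebMD":       {"health"},
    "Mayo Clinic": {"health"},
    "YouTube":     {"entertainment", "travel", "health"},
}

def relevant_services(topic_categories, threshold=0.5):
    """Return (service, score) pairs whose category overlap clears the threshold."""
    picks = []
    for service, cats in SERVICE_CATEGORIES.items():
        overlap = len(cats & topic_categories) / len(topic_categories)
        if overlap >= threshold:
            picks.append((service, round(overlap, 2)))
    return sorted(picks, key=lambda p: -p[1])

# "restaurants 94041" might be categorized as {restaurants, local};
# "arthritis" as {health}.
print(relevant_services({"restaurants", "local"}))  # Yelp, Google Maps
print(relevant_services({"health"}))                # WebMD, Mayo Clinic, YouTube
```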

In this alpha, we've taken a first stab at tackling these challenges. We are still several months from having a product that we feel is ready to launch, but we decided to put this public alpha out there to gather user feedback and tune our service. Many aspects of the product will evolve between now and then: Do we have the right user interaction model for topic exploration? Do we put too much information on the topic page? Should we present it very differently? How do we combine human experts with our algorithms?

Most importantly, the Kosmix approach does not work for every query! Our goal is to organize information around topics, not answer arbitrary search queries. How do we make the distinction clear in the product itself? Can we carve out a separate niche from search engines?

We hope to gain insight into all these and more questions from this alpha. Please use it and provide your feedback!

June 26, 2008 in Search

Why SMS GupShup is Bigger than Twitter

Matt Marshall at VentureBeat liked my post on SMS GupShup, and asked me to write a follow-up guest post for VB. That post appears on VentureBeat today. Leaving aside questions of technology and scaling, I ask why SMS GupShup is bigger and growing faster than Twitter. My hypothesis:

Microblogging is a nice-to-have in developed economies, like the US. It's a must-have in developing economies like India, China, and Egypt.

In essence, microblogging is semi-synchronous publish-subscribe messaging. It’s publish-subscribe because it decouples senders and their reader(s), who can choose which senders to follow at any point in time. It is semi-synchronous because readers can choose either to follow it synchronously (via various desktop tools, or their mobiles), or read it later. In the Western world, the penetration of PCs is almost universal, so we have other PC-dependent messaging options such as blogging (asynchronous publish-subscribe); email (asynchronous point-to-point); instant messaging (synchronous point-to-point). Yes, none of them offers quite what Twitter does, but the majority of people in the majority of situations can make do with the conventional options.

Contrast this with the situation in third-world nations: PC penetration is incredibly low, while mobile penetration is incredibly high. For example, India has about 40 million PCs but 10 times as many cell phones. This makes short text messages sent via SMS the main written communication mechanism. Blogging, email, and IM are just not options, so microblogging becomes the main form of publishing, communication, and self-expression.

You can read the full post on VentureBeat.

June 19, 2008 in India, Mobile

India's SMS GupShup Has 3x The Usage Of Twitter And No Downtime

I recently started using Twitter and have become a big fan of the service. I've been appalled by the downtime the service has endured, but sympathetic because I assumed the growth in usage was so fast that much might be excused. Then I read this TechCrunch post on the Twitter usage numbers, and sympathy turned to bafflement -- because I'm intimately familiar with SMS GupShup, a startup in India that boasts usage numbers much, much higher than Twitter's, but has scaled without a glitch.

I'll let the numbers speak for themselves:

  • Users: Twitter (1+ million), SMS GupShup (7 million)
  • Messages per day: Twitter (3 million); SMS GupShup (10+ million)

Actually, these numbers don't even tell the whole story. India is a land of few PCs and many mobile phones. Thus, almost all GupShup messages are posted via mobile phones using SMS. And almost every GupShup message is delivered simultaneously to the website and to the mobile phones of followers via SMS. That's why SMS is in the name of the service. Contrast this with Twitter, where the majority of the posting and reading is done through the web. Twitter has said in the past that sending messages via the SMS gateway is one of their most expensive operations, so the fact that only a small fraction of their users use the SMS option makes their task a lot easier than GupShup's.

So I sat down with Beerud Sheth, co-founder of Webaroo, the company behind GupShup (the other founder Rakesh Mathur is my co-founder from a prior company, Junglee). I wanted to understand why GupShup scaled without a hitch while Twitter is having fits. Beerud tells me that GupShup runs on commodity Linux hardware and uses MySQL, the same as Twitter. But the big difference is in the architecture: right from day 1, they started with a three-tier architecture, with JBoss app servers sitting between the webservers and the database.

GupShup also uses an object architecture (called the "objectpool") that allows each task to be componentized and run separately -- this helps immensely with reliability (it can automatically handle machine failure) and scalability (it can scale dynamically to handle increased load). The objectpool model allows each module to be run as multiple parallel instances, each of them doing a part of the work. They can be run on different machines and can be started or stopped independently, without affecting each other. So the "receiver", the "sender", and the "ad server" all run as multiple instances. As traffic scales, they can just add more hardware -- no re-architecting. If one machine fails, the instance is restarted on a different machine.
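
Here's a stripped-down sketch of that process model (my own rendering of the idea, not GupShup's code): each module runs as several independent worker processes pulling from a shared queue, so instances can be added, stopped, or restarted without touching each other.

```python
import multiprocessing as mp

# Each module ("receiver", "sender", "ad server", ...) runs as N independent
# worker processes consuming from a shared queue. Scaling up is just starting
# more instances; a crashed instance can be restarted elsewhere because no
# instance holds state the others depend on.

def worker(module_name, task_queue):
    while True:
        task = task_queue.get()
        if task is None:              # poison pill: stop this instance only
            break
        print(f"[{module_name}] handled {task}")

if __name__ == "__main__":
    queue = mp.Queue()
    # Two instances of the "sender" module; adding capacity means starting more.
    instances = [mp.Process(target=worker, args=("sender", queue)) for _ in range(2)]
    for p in instances:
        p.start()
    for msg in ["msg-1", "msg-2", "msg-3", "msg-4"]:
        queue.put(msg)
    for _ in instances:               # shut each instance down independently
        queue.put(None)
    for p in instances:
        p.join()
```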

In read/write applications, the database is often the bottleneck. To avoid this problem, the GupShup database is sharded: the tables are broken into parts, with, for example, users A-F in one shard, G-K in another, and so on. The shards are periodically rebalanced as the database grows. The JBoss middle-tier contains the logic that hides this detail from the webserver tier.
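
Here's a minimal sketch of what routing around such a sharded user table might look like in the middle tier (the shard boundaries and connection strings are invented for illustration; the real logic presumably also handles rebalancing and replication):

```python
import bisect

# Toy shard router for a user table partitioned by the first letter of the
# username, as in the A-F / G-K example above. Rebalancing amounts to moving
# these boundary letters and migrating the affected rows.

SHARDS = [
    ("F", "mysql://db-shard-1/users"),   # users A-F
    ("K", "mysql://db-shard-2/users"),   # users G-K
    ("Z", "mysql://db-shard-3/users"),   # users L-Z
]
BOUNDARIES = [upper for upper, _ in SHARDS]

def shard_for_user(username: str) -> str:
    """Return the connection string for the shard holding this user's rows."""
    first = username[0].upper()
    idx = bisect.bisect_left(BOUNDARIES, first)
    return SHARDS[idx][1]

print(shard_for_user("anand"))    # db-shard-1
print(shard_for_user("gupshup"))  # db-shard-2
print(shard_for_user("zara"))     # db-shard-3
```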

I'm not familiar with the details of Twitter's architecture, beyond knowing they use Ruby on Rails with MySQL. It appears that the biggest difference between Twitter and GupShup is 3-tier versus 2-tier. RoR is fantastic for turning out applications quickly, but the way Rails works, the out-of-the-box approach leads to a two-tier architecture (webserver talking directly to database). We all learned back in the 90's that this is an unscalable model, yet it is the model for most Rails applications. No amount of caching can help a 2-tier read/write application scale. The middle-tier enables the database to be sharded, and that's what gets you the scalability. I believe Twitter has recently started using message queues as a middle-tier to accomplish the same thing, but they haven't partitioned the database yet -- which is the key step here.

I don't intend this as a knock on RoR, rather on the way it is used by default. At my company Kosmix we use an RoR frontend for a website that serves millions of page views every day; we use a 3-tier model where the bulk of the application logic resides in a middle-tier coded in C++. Three-tier is the way to go to build scalable web applications, regardless of the programming language(s) you use.

Update: VentureBeat has a follow-up guest post by me, with some more details on SMS GupShup. Also my theory on why SMS GupShup is growing faster than Twitter: Microblogging is a nice-to-have in places with high PC penetration, like the US, but a must-have in places with very low PC penetration, like India.

Disclosure: My fund Cambrian Ventures is an investor in Webaroo, the company behind SMS GupShup. But these are my opinions as a database geek, not as an investor.

June 14, 2008 in India, Internet Infrastructure, Mobile

Change the algorithm, not the dataset

Mayank Bawa over at the Aster Data blog has posted a great riff on one of my favorite themes: using simple algorithms to analyze large volumes of data rather than more sophisticated algorithms that cannot scale to large datasets.

Often, we have a really cool algorithm (say, support-vector machines or singular value methods) that works only on main-memory datasets. In such cases, the only possibility is to reduce the data set to a manageable size through sampling. Mayank's post illustrates the dangers of such sampling: in businesses such as advertising, sampling can make the pattern you're trying to extract so weak that even the more powerful algorithm cannot pick it up.

For example, say 0.1% of users exhibit a certain kind of behavior. If you start with 100 million users and then take a 1% sample, you might think you are OK because you still have 1 million users in your sample. But now just 1,000 users in the sample exhibit the desired behavior, which may be too small a group for any algorithm to pick out from the noise. In fact, it's likely to be below the support thresholds of most algorithms. The problem is, these 0.1% of users might represent a big unknown revenue opportunity.
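
Here's the same arithmetic as a tiny script (the numbers are the illustrative ones above, plus a simulated sampling step; nothing here is real data):

```python
import random

# A behavior shown by 0.1% of 100 million users, observed through a 1% sample.

population = 100_000_000
rare_rate = 0.001      # 0.1% of users exhibit the behavior
sample_rate = 0.01     # 1% sample taken so the data fits in memory

positives_full = int(population * rare_rate)                  # 100,000 users
expected_positives_sample = int(positives_full * sample_rate) # ~1,000 users

# Simulate how many of the rare users actually survive the sampling step.
random.seed(42)
survivors = sum(1 for _ in range(positives_full) if random.random() < sample_rate)

print(f"rare users in full dataset : {positives_full:,}")
print(f"rare users after sampling  : {survivors:,} (expected ~{expected_positives_sample:,})")
# Slice those ~1,000 survivors by even one more attribute (region, ad shown,
# time of day) and each bucket is down to a few dozen examples -- far too few
# to separate the pattern from the noise.
```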

Moral of the story: use the entire dataset, even if it is many terabytes. If your algorithm cannot handle a dataset that large, then change the algorithm, not the dataset.

June 12, 2008 in Data Mining

How Google Measures Search Quality

This post continues my prior post, Are Machine-Learned Models Prone to Catastrophic Errors? You can think of these as a two-post series based on my conversation with Peter Norvig. As that post describes, Google has not cut over to the machine-learned model for ranking search results, preferring a hand-tuned formula. Many of you wrote insightful comments on this topic; here I'll give my take, based on some other insights I gleaned during our conversation.

The heart of the matter is this: how do you measure the quality of search results? One of the essential requirements to train any machine learning model is a set of observations (in this case, queries and results) that are tagged with "scores" measuring the goodness of the results. (Technically this requirement applies only to so-called "supervised learning" approaches, but those are the ones we are discussing here.) Where to get this data?

Given Google's massive usage, the simplest way to get this data is from real users. Try different ranking models on small percentages of searches, and collect data on how users interacted with the results. For example, how does a new ranking model affect the fraction of users who click on the first result? The second? How many users click through to page 2 of results? Once a user clicks out to a result page, how long before they click the back button to come back to the search results page?
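
As a concrete (and entirely hypothetical) illustration of the kind of metrics being described, here's how one might compute a few of them from a click log; the log format below is invented for this sketch:

```python
from statistics import mean

# Each record: (query_id, clicked_rank, seconds_until_back_button or None).
# A rank greater than 10 means the user went past page 1 of results.
click_log = [
    ("q1", 1, 45.0),
    ("q1", 3, 8.0),
    ("q2", 1, None),    # the user never came back -- presumably satisfied
    ("q3", 12, 5.0),
]

first_result_clicks = sum(1 for _, rank, _ in click_log if rank == 1)
past_page_one = sum(1 for _, rank, _ in click_log if rank > 10)
dwell_times = [t for _, _, t in click_log if t is not None]

print(f"share of clicks on result #1 : {first_result_clicks / len(click_log):.0%}")
print(f"share of clicks past page 1  : {past_page_one / len(click_log):.0%}")
print(f"mean dwell before back-click : {mean(dwell_times):.1f}s")
```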

Peter confirmed that Google does collect such data, and has scads of it stashed away on their clusters. However -- and here's the shocker -- these metrics are not very sensitive to new ranking models! When Google tries new ranking models, these metrics sometimes move, sometimes not, and never by much. In fact, Google does not use such real usage data to tune their search ranking algorithm. What they really use is a blast from the past: they employ armies of "raters" who rate search results for randomly selected "panels" of queries using different ranking algorithms. These manual ratings form the gold standard against which ranking algorithms are measured -- and eventually released into service.

It came as a great surprise to me that Google relies on a small panel of raters rather than harness their massive usage data. But in retrospect, perhaps it is not so surprising. Two forces appear to be at work. The first is that we have all been trained to trust Google and click on the first result no matter what. So ranking models that make slight changes in ranking may not produce significant swings in the measured usage data. The second, more interesting, factor is that users don't know what they're missing.

Let me try to explain the latter point. There are two broad classes of queries search engines deal with:

  • Navigational queries, where the user is looking for a specific uber-authoritative website. e.g., "stanford university". In such cases, the user can very quickly tell the best result from the others -- and it's usually the first result on major search engines.
  • Informational queries, where the user has a broader topic. e.g., "diabetes pregnancy". In this case, there is no single right answer. Suppose there's a really fantastic result on page 4 that provides better information than any of the results on the first three pages. Most users will not even know this result exists! Therefore, their usage behavior does not actually provide the best feedback on the rankings.

Such queries are one reason why Google has to employ in-house raters, who have been instructed to look at a wider window than the first 10 results. But even such raters can only look at a restricted window of results. And using such raters also makes the training set much, much smaller than could be gathered from real usage data. This fact might explain Google's reluctance to fully trust a machine-learned model. Even tens of thousands of professionally rated queries might not be sufficient training data to capture the full range of queries that are thrown at a search engine in real usage. So there are probably outliers (i.e., black swans) that might throw a machine-learned model way off.

I'll close with an interesting vignette. A couple of years ago, Yahoo was making great strides in search relevance, while Google apparently was not improving as fast. Recall then that Yahoo trumpeted data showing their results were better than Google's. Well, the Google team was quite amazed, because their data showed just the opposite: their results were better than Yahoo's. They couldn't both be right -- or could they? It turns out that Yahoo's benchmark contained queries drawn from Yahoo search logs, and Google's benchmark likewise contained queries drawn from Google search logs. The Yahoo ranking algorithm performed better on the Yahoo benchmark and the Google algorithm performed better on the Google benchmark.

Two learnings from this story: one, the results depend quite strongly on the test set, which again speaks against machine-learned models. And two, Yahoo and Google users differ quite significantly in the kinds of searches they do. Of course, this was a couple of years ago, and both companies have evolved their ranking algorithms since then.

June 11, 2008 in Data Mining, Search

Angel, VC, or Bootstrap?

Note: I wrote this piece a couple of weeks back, inspired by Greg Linden's blog post (see below). Inc then picked up the piece and asked me not to publish it until it appeared on the Inc website. The article appears on the Inc website today with some minor edits.

Greg Linden was one of the key developers behind Amazon's famous recommendations system -- the system that recommends books, movies, and other products to Amazon customers based on their purchase history. He subsequently went to Stanford and picked up an MBA. In January 2004, he launched a startup named Findory to provide everyone with a personalized online newspaper. You cannot imagine anyone who could be more qualified to make a startup like this a success. Yet Findory shut down in November 2007. In a brilliant post-mortem, Greg says his big mistake was to bootstrap his company while trying to raise funding from venture capital firms; he just couldn't convince them to invest. He should have raised his funding from angel investors instead.

This is an important decision every startup founder has to make -- where to raise their funding. The three viable sources at the very early stages of a company are:

  • Friends and family. Yourself, if you can afford it.
  • Angel investors. Usually wealthy individuals, but includes outfits such as Y Combinator. (My firm Cambrian Ventures is also in this category, although we are currently not actively seeking investments; we're too busy running our own company Kosmix.)
  • Venture Capital (VC).

To understand which option is best for your startup, you need to understand how investors evaluate companies. While investors evaluate companies across a range of criteria, three that stay consistent are: Team, Technology, and Market. Angels and VCs evaluate them in different ways. Here's how.

How Venture Capitalists Evaluate Startups

  • Market. Venture Capitalists want to invest in companies that produce meaningful returns in the context of their fund size, which typically is in the hundreds of millions of dollars. To interest a VC firm, a company needs to be attacking a large market opportunity. If you cannot make a credible case that your startup idea will lead to a company with at least $100 million in revenue within 4-5 years, then a VC is not the right fit for you. It's often OK to use consumer traction as a substitute for market opportunity -- many VCs will accept a large and rapidly growing user base as sufficient proof that there is a potentially large market opportunity.
  • Team. Venture Capitalists use simple pattern matching to classify teams into two buckets. A founding team is deemed "backable" if it includes one or more seasoned executives from successful or fashionable companies (such as Google) or entrepreneurs whose track record includes at least one past hit. Otherwise the team is considered "non-backable."
  • Technology. Venture Capitalists are not always great at evaluating technology. To them, technology is either a risk (the team claims their technology can do X; is that really true?) or an entry barrier (is the technology hard enough to develop to prevent too many competitors from entering the market?) If your startup is developing a nontrivial technology, it helps to have someone on the team who is a recognized expert in the technology area -- either as a founder or as an outside advisor.

Here's the rule of thumb: to qualify for VC financing, you need to pass the Market Opportunity test and at least one of the other two tests. Either you have a backable team, or you have nontrivial technology that can act as an entry barrier.

How Angels Evaluate Startups

There are many kinds of angels, but I recommend picking only one kind: someone who has been a successful entrepreneur and has a deep interest in the market you are attacking or the technology you are developing. Other kinds of angels are usually not very high value. Here's how angels evaluate the three investment criteria:

  • Market. It's all right if the market is unproven, but both the team and the angel have to believe that within a few months, the company can reach a point where it can either credibly show a large market opportunity (and thus attract VC funding), or develop technology valuable enough to be acquired by an established company.
  • Team. The team needs to include someone the angel knows and respects from a prior life.
  • Technology. The technology is something the angel has prior expertise in and is comfortable evaluating without all the dots connected.

Here's the angel rule of thumb: you need to pass any 2 out of the 3 tests (team/technology, technology/market, or team/market). I have funded all 3 of these combinations, resulting in either subsequent VC financing (e.g., Aster Data, Efficient Frontier,  TheFind), or quick acquisitions (Transformic, Kaltix -- both acquired by Google).

I've written about the stories behind the Aster Data investment and the Transformic investment previously on my blog. In both cases, notice how my personal relationship with the founders, as well as my passionate belief in the technology, played big roles in the investment decisions.

Friends and Family or Bootstrap

This is the only option if you cannot satisfy the criteria for either VC or angel. But beware of remaining too long in this "bootstrap mode." An outside investor provides a valuable sounding board and prevents the company from becoming an echo chamber for the founder's ideas. An angel or VC can look at things with the perspective that comes from distance. Sometimes an outside investor can force something that's actually good for the founder's career: shut the company down and go do something else. That decision is very hard to make without an outside investor. My advice is to bootstrap until you can clear either the angel or the VC bar, but no longer.

Back now to Greg Linden and Findory. By my reckoning, Findory passes the team and technology tests from an angel's point of view -- if you pick an angel investor who has some passion for personalization technology. The company doesn't pass any of the VC tests. Given this, Greg should definitely have raised angel funding. My guess is that this route would likely have led to a sale of the company to one of many potential suitors: Google, Yahoo, or Microsoft, among many others. Of course, hindsight is always 20/20! I have deep respect for Greg's intellect and passion and wish him better luck in his future endeavors.

For further reading, I highly recommend Paul Graham's excellent article How to Fund a Startup.

June 08, 2008 in Venture Capital

Twittering live from All Things D Conference

Kara Swisher and Walt Mossberg sure know how to throw a conference. The All Things D conference organized by them is my all-time favorite among tech conferences. This year's edition, D6, is happening at Carlsbad (near San Diego) this Tuesday through Thursday. As in the past, there's a stellar line-up this year too, including Bill Gates, Steve Ballmer, Jeff Bezos, and Mark Zuckerberg.

The highlight at last year's conference was Bill Gates and Steve Jobs sharing the stage after many, many years. A big part of the charm is the eclectic mix of attendees and speakers, including George Lucas and Martha Stewart.

I'm spending the day today driving down from the Bay Area to Carlsbad. If you read this blog and are at the conference, please say hello.

I'll be twittering live from the conference. If you'd like to follow me, my twitter id is anand_raj.

May 26, 2008 in Mobile
