



"Team B got much better results, close to the best results on the Netflix leaderboard!!"

I am deeply skeptical of this claim. If this is indeed true, why is this team not on the leaderboard? If they got close to 9% improvement, what is stopping them from blending with results from other published algorithms and claiming the $$$?

Tuomo Stauffer

I have to say I'm not amazed. We did the same kind of ratings a long time ago, guess where: right, in insurance, as part of risk management. Is a redhead under 30 less of a risk than a blond of the same age? Better still, what will their next accident be, and how costly? Trying to predict human behavior, its causes, and its results has been around a long time. The same was done, for example, for the ships we insured worldwide: we had more than just the shipping company's information; we also did background checks and collected information on the captain, the crew, the companies using the ships or shipping the goods, and so on. The more information, the more accurate the risk estimates.
To the topic: the more information you get, external or internal, the better the estimate. That is proven in my mind (and in your insurance rates).
Now, this creates an interesting dilemma: how much information can you get, and how much information collection will the targets tolerate? Another subject!


This article pinpoints something that has been true for a long time: more data usually beats better algorithms. Therefore, assuming that the data mining algorithms are not the issue (assuming good science behind them, which I have found in all the major software vendors), the issue then becomes the quality of the interactive visualization tool that allows end users to make better decisions. Fed Chairman Bernanke, when at Princeton, published a paper that is complementary to this issue.

Will Dwinnell

Regarding "More data usually beats better algorithms"

I would say rather that "more data and better algorithms are two ways to seek better performance". Which (by itself) will provide the greater improvement can only be decided on a case-by-case basis. The experience directly described in this article is, after all, only a single observation.


How is A better than B without benchmarking
and without criteria for comparison?


Is X quantity of data better than the best algorithm?
I still want something better!
So, more DATA = more required BANDWIDTH.
Is it better to have trillions of terabytes of data with a bandwidth of gigabytes per second?
It NEVER terminates! It got worse!

Jebs House

I'm not sure what the hold-up is... maybe they have rethought their stance on how this is actually going to make the company any money. Or perhaps their lawyers pointed out the liability of providing agents a platform to stick their feet in their mouths. Whatever it is, it's hardly something I'd claim as being "Well done".



Noclegi Karpacz

Great, useful blog, so I must subscribe to it. Finally I found what I was searching for...

The comments to this entry are closed.