Microsoft’s announcement of its “Bing” search engine was a dream come true for many bloggers. Content at last! Finally, something useful to write about! Finding Fault has been observing the feeding frenzy.1
Mostly, the bloggers do one or two searches, and get a screenshot or two to drive traffic to their blog, as if nobody else could repeat the same search and see for themselves. Then they ignore Yahoo and declare Google or Bing the winner of their benchmark.
Finding Fault is hereby pleased to find fault and point out three major flaws that we have observed in numerous blog postings.
The First Small Flaw
The bloggers are judging Bing’s results by how Google-like they are. They do a Google search and a Bing search, then they look to see how similar they are. If Bing gives you essentially the same hits as Google does, then Bing must be good.2
Did we judge Google by how similar to Altavista it was?
That was a rhetorical question, but we will forgive you for answering no.
We judged Google by noticing that oftentimes we weren’t sure what we were looking for, but Google seemed to know.
That was why Google impressed. It made Google Googly. It was the quantum leap.
Yahoo eventually claimed to have caught up with Google, and now, apparently, so has Bing, some bloggers report.
But we see no new quantum leap. And without this new quantum leap, we expect no search victory over Google. The blogger benchmarks are close to worthless.
The Second Bigger Flaw
The second Googly property of Google was that it seemed to give us the truth, so far as anybody could define truth. The only people who seemed to complain about Google lying were the spammers and the search engine optimizers, which only made the rest of us even more sure that Googliness was Truthliness.
And so far, Google in its searches has not let us down. If we are looking for something bad about Google, we generally do a Google search, with the implicit belief that Google won’t lie to us even about itself.3
If you don’t realize how unique a quality this is, you haven’t been paying attention.
Yahoo, too, has adopted this philosophy.4
Finding Fault has seen no reason to believe that Bing at its core has this essential quality of Truthliness. It does look as though it does, but we suspect Bing is faking sincerity.
Why do we suspect this?
Because, ultimately, Bing is just another of Microsoft’s faces. We won’t bore you by listing the many ways in which Microsoft and its officers and employees have acquired a reputation for insincerity.
This is why we at Finding Fault don’t trust Bing, and this is why we think the bloggers who evaluate Bing on the basis of a few search results now, today, are missing an important point.
They should be looking for reasons why Microsoft and its officers and employees won’t in the long run give us the sanitized truth according to Microsoft.5 If we see anybody find such a reason, we will let you know.
The Third Fatal Flaw
The biggest flaw in the blogger benchmarks of Bing and Google, the fatal one, is that they make the wrong assumptions, look at things in isolation, and don’t do real research.
They are looking at search as a service in isolation and as a service provided to customers. This is wrong, wrong, wrong. The service that Google ultimately provides is not search to its users, but rather, advertising to its customers. This is why Google engineers obsess over the number of kbytes sent to the browser, and over the grey-ness of text and the blue-ness of links, and over page load times greater than a small fraction of a second. Because, ultimately, the search service must attract viewers for the advertising that Google sells, and cause them to not use it just once, but to keep coming back, and keep clicking.
And all this is why you cannot meaningfully do one or two tests, either yourself or with a random sample of users. Suppose you ask your random sample to test both Google and Bing and tell you which gives them better results. And suppose, just for the sake of argument, all of the users say they get better results with Bing. Would that mean anything?
Unfortunately, no. They are using Bing because you told them to.6
If they are doing something you told them to, then it doesn’t matter how good or bad it is—you don’t know what they would do if you didn’t tell them what to do. If 100% of your sample of users prefer Bing when told to use it, but only 12% of them would use it if you didn’t tell them to use it, then your test results mean nothing.
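The arithmetic above can be put in one small sketch. The 100% and 12% figures are the hypothetical ones from the text, not measurements:

```python
# Toy illustration of the flaw: stated preference under instruction
# versus organic choice. All numbers are hypothetical.

def preference_rates(stated_prefers_bing, organic_uses_bing, n_users):
    """Return the share of users preferring Bing in each condition."""
    return stated_prefers_bing / n_users, organic_uses_bing / n_users

n = 100
stated, organic = preference_rates(100, 12, n)
print(f"Told to use Bing, {stated:.0%} say they prefer it.")
print(f"Left alone, only {organic:.0%} actually use it.")
# The gap between the two numbers is exactly what the overt test cannot see.
```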
In fact, just the users’ knowing that you are studying them may skew the results and make them unreliable.7
What you ought to do, Dear Blogger, is take a large enough random sample of users, let them do searches over a period of weeks, make them use Google and Bing and Yahoo and that other new one that sounds like Quill, and also Ask, and Mahalo, and a few others. This way they don’t know which search engine you are really evaluating, so they have no incentive to try to please you.
Collect a lot of data. Analyze it and publish results, so your subjects don’t feel they wasted their time.
Then throw it all away, because it means nothing.8
Now, after the experiment is over, you must surreptitiously monitor how these subjects do searches. This is the real, covert experiment. Which search engines do they now prefer to use when they think nobody is watching? And how has this changed since before the experiment?
And most importantly: Which ads do they click on?
The test subjects are now trying to please only themselves, not you, and their behavior now will tell you far more than their behavior during the overt experiment.
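The covert comparison described above amounts to a before/after tally of which engine each subject reaches for. A minimal sketch, with invented log data standing in for the surreptitious observations:

```python
from collections import Counter

# Hypothetical covert search logs: which engine each of 100 subjects
# used, observed before and after the overt experiment. The counts
# are invented for illustration only.
before = ["google"] * 85 + ["yahoo"] * 10 + ["bing"] * 5
after  = ["google"] * 70 + ["yahoo"] * 8 + ["bing"] * 22

def usage_share(log):
    """Fraction of searches going to each engine."""
    counts = Counter(log)
    total = len(log)
    return {engine: counts[engine] / total for engine in counts}

# Positive values mean the engine gained ground when nobody was watching.
shift = {e: usage_share(after).get(e, 0.0) - usage_share(before).get(e, 0.0)
         for e in set(before) | set(after)}
print(shift)
```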
To those of you who want to take time off from mindless blogging and do some real experiments: we would love to see you publish your results.9
1. If you are not one of the crowd, Dear Blogger, do not be offended. It’s not you, it’s all those other bloggers. ↩
2. Granted, we have seen a few—very few—exceptions to this. You know who you are. We won’t bother you. ↩
3. Google the company, as opposed to Google the search engine, has been known to stretch the truth. Recently, when asked to make HTTPS mandatory for access to mail, Google responded that defaulting to HTTPS would slow down users. In the opinion of Finding Fault, Google should have more accurately said that a high volume of the relatively expensive HTTPS might overload Google’s SSL hardware. That would slow down users, it’s true, but the slowing down would really come from overloaded equipment, not from HTTPS itself. We still don’t think this would skew Google’s search results. See article “HTTPS security for web applications” in blog “Google Online Security Blog” dated 2009-06-16 by Alma Whitten http://googleonlinesecurity.blogspot.com/2009/06/https-security-for-web-applications.html visited 2009-07-14. They’re writing letters to one another in PDF, by the way. So that’s the secret to getting a reply out of Google. ↩
4. But Yahoo keeps rejecting Opera. Just saying. ↩
5. Bing lets you view its old background images only if you install Microsoft’s Silverlight software. Expect more persuasion in the long run in a politically correct direction. Before reading on, try quick web searches for each of the following phrases, without typing the quotes: “microsoft sabotaged quicktime”, “microsoft sabotaged java”, “microsoft sabotaged opera”, “microsoft sabotaged beos”, “microsoft sabotaged drdos”, “microsoft sabotaged opendocument”, “microsoft sabotaged firefox”, “microsoft sabotaged hosts-file”, and “microsoft wga deception”. The “microsoft sabotaged hosts-file” search will tell you that Microsoft routes lookups for some of its own domains differently than the rest, and suddenly the following speculation no longer sounds so remote: “What Google’s chief executive, Eric Schmidt, has to fear more than anything else is that he’ll awake one day to learn that the Google search engine suddenly doesn’t work on any Windows computers: something happened overnight and what worked yesterday doesn’t work today. It would have to be an act of deliberate sabotage on Microsoft’s part and blatantly illegal, but that doesn’t mean it couldn’t happen. Microsoft would claim ignorance and innocence and take days, weeks or months to reverse the effect, during which time Google would have lost billions.” Article “Chrome vs. Bing vs. You and Me” dated 2009-07-12 by Robert X. Cringely in online periodical “The New York Times” http://www.nytimes.com/2009/07/13/opinion/13cringely.html?_r=1 visited 2009-07-14. ↩
6. That’s not the only reason why it might not mean very much. Users can’t always reliably rate search results, because, paradoxically, they see only what the search engine shows them. “It came as a great surprise to me that Google relies on a small panel of raters rather than harness their massive usage data. … The second, more interesting, factor is that users don’t know what they’re missing. … [In the case of informational queries] there is no single right answer. Suppose there’s a really fantastic result on page 4, that provides better information [than] any of the results on the first three pages. Most users will not even know this result exists! Therefore, their usage behavior does not actually provide the best feedback on the rankings.” Article “How Google Measures Search Quality” dated 2008-06-11 by Anand Rajaraman, based on his conversation with Peter Norvig, previously Director of Search Quality at Google. http://anand.typepad.com/datawocky/2008/06/how-google-measures-search-quality.html visited 2009-07-14. ↩
7. Though it’s much maligned, most authorities agree that the Hawthorne Effect is real. See article and references therein in “Hawthorne effect” in online encyclopedia “Wikipedia, the free encyclopedia” http://en.wikipedia.org/wiki/Hawthorne_effect visited 2009-07-13. ↩
8. Actually, we are exaggerating here for effect. These tests will provide valuable usability feedback. ↩
9. We are stocking up with a seven-year supply of food and water. ↩