
I have a story today that comes from my predilection for "self-syndication", meaning that I post my stories far and wide, in the same way a newspaper columnist is syndicated nationally, or beyond.

After I post, I know others will also post my stories to their sites, a topic that was itself the subject of a recent conversation.

To keep track of it all, I use the Google...but I recently wondered if that’s actually the most effective tool for the job, so as an experiment I challenged several search engines to go out and seek the same search term.

We find out today...and the results are, indeed, interesting.

So here are the rules of the game: on the afternoon and evening of November 29th, I posted my story "On Stimulating The Future, Or, "It's The Ytterbium, Stupid!"" on 27 sites. The next morning I conducted the searches you’ll see referenced in this discussion using as a search term the exact words of the title, in quotes, just as it appears above. During the course of writing this story, we’ll revisit the same sites to see if the results have changed.

So the first search was conducted on Google, which found 849 results.

Why so many results from only 27 postings? The tags associated with (or the proper nouns that appear in) a story often trigger websites to place that material on pages with other stories with matching tags or names, as you can see from this example at RootsWire. (The story appears twice because it was updated after it was posted.)

This creates lots of iterations of the same title on the same site under different categories, a situation other search providers seek to reduce; this is one of the points behind all those recent ads for Microsoft’s Bing search engine.

A quick note about "search consistency": searching for the same term at Google on multiple occasions will yield different results each time, even if the searches are conducted immediately after one another. For example, my search this morning found 852 results, and then, just a few minutes later, 653. (By the way, if you click on these links now, some other number of results will appear, which is its own comment on consistency.)

We next visit Bing, where 16 results were initially found. Interestingly, some of the links were the ones I placed, but 6 of the 16 were multiple iterations of the same story on three sites.

As with Google, visiting Bing today might yield 57 links—or 2530, or 152, or 26—and despite Bing’s advertising claims that they make searching simpler by eliminating Internet "clutter", a huge number of the links I’m seeing here are links to the weather in virtually every city in Maine; all of these linked back to "Weather Underground" weather reporting pages...and all of those pages were from the same basic address: insert name here.wunderground.com.
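One rough way to see how much of a result list is this kind of repetition is to group the links by site. What follows is only a minimal sketch in Python, not a description of how any search engine handles duplicates. The names `site_of` and `count_by_site` are made up for illustration, the sample links are hypothetical placeholders, and treating the last two labels of a hostname as "the site" is a simplifying assumption (it works for wunderground.com but would misfire on addresses ending in something like .co.uk).

```python
from collections import Counter
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Return a rough 'site' key: the last two labels of the hostname.

    Assumption: two labels are enough to identify the site.
    """
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


def count_by_site(urls):
    """Count how many result links fall under each site key."""
    return Counter(site_of(u) for u in urls)


# Hypothetical result links, invented for illustration only.
sample = [
    "http://cityone.wunderground.com/page",
    "http://citytwo.wunderground.com/page",
    "http://www.example.org/story",
]
counts = count_by_site(sample)
unique_sites = len(counts)
print(counts.most_common())
print("distinct sites:", unique_sites)
```

Comparing the number of distinct sites with the raw number of links gives a quick feel for how "duplicative" a page of results is.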

Next was Yahoo!, reporting 887 results (and then, after clicking through a few pages, 1580). There was an interesting variation to the pattern of what they found, however: more results from the first 50 were links to the original 27 postings than appeared to be the case with either Google or Bing.

The search today found 860 results...four times in a row...which is by far the most consistent results reporting so far—even if the results from the other day were completely different.

Lycos found 67 iterations of the posting...or 50...and then 49...with roughly a dozen of the first 30 listings being "duplicative" entries, which is fairly consistent reporting. Returning to the site today, the search engine found 69 listings, and it was also able to do that four times in a row...which makes it at least the "consistency equal" of Yahoo!

Dogpile (a product of the fine folks at Infospace) aggregates results from Google, Yahoo!, Bing, and Ask.com into one set of results...and for some reason the first result on the second page was for a futures trading opportunity ("I’m shocked to discover there’s gambling here...!").

That said, Dogpile "sniffed out" 40 results, with many of those being "duplicate" instances of the same story from the same site. Conducting the same search today yields 6 additional results—all of which appear to be duplicates of the previous 40.

On four further attempts to search, the original 40 results were found.

WebCrawler, another Infospace property, located 38 results; again, the results are highly duplicative. It is not possible to enter the entire search term at this site; instead, the term...

"On Stimulating The Future, Or, "It's The Ytterbium,

...was used.

Four additional searches were conducted today, with 38 results found each time.

(Because the page-naming conventions of both Dogpile and WebCrawler insert an ! into the page names upon which results are presented, they can't be linked here, and you'll just have to visit the pages on your own.)

Altavista found 904 iterations of the story, then 17,800 on today’s search. There is an option to search either "Worldwide" or "USA". The Worldwide search, conducted immediately after today’s USA search, found, oddly enough, 2460 results, and for at least the first several pages, which was as far as I looked, the results were the same as for the USA search.

Four additional searches, conducted today, located the same 17,800 results.

One strange idiosyncrasy of the site is that it won’t actually display those 17,800 results: instead, it only displays the first several pages of results (in this case, 7 pages), and then just stops, with no additional pages made available beyond that point. There is an "advanced settings" page available, but it does not offer any solution for this problem.

Ask.com displayed 240 results on the first search, and it was the only site to report the listing on the Times of India site right there on the front page (which, if you return to the site, is no longer the case), but on the downside, 1/3 of the results on that first page were "sponsored results".

After the first page, ½ of all results are "sponsored", and the results are highly duplicative. By page 10 of the results, as few as 2 of the 13 results on the page are not sponsored.

Today’s searches located 452 links, then 449, then 452, and then (take a guess...) 449.

Ever heard of Duck Duck Go? Neither had I before this story. They feature an unusual format that displays some results, and then, when you click "more results", displays those below the first results on the same page; a pattern that continues until all results are displayed.

The Duck located 35 results on the first attempt, with no duplicates. The "wunderground" domain was represented—but only once.

Apparently recognizing that their searches are not going to give every result, the site encourages you to also search at YouTube, flickr, twitter, amazon, and Google.

Turning off the "safe search" feature yields 43 results, including BhamLinks.com (a news aggregator from Birmingham, Alabama), Pshcye’s Links ("Esoteric Subjects on the Web"), and the "Li-Ion" page from the Journalism that matters site. (Li-Ion, by the way, is the abbreviation used to describe lithium ion batteries.)

Conducting additional searches on the site today yields the same 43 results.

Finally, Cuil. I had never heard of this site before...and apparently, they’ve never heard of me, either, with zero results reported for my query. Searching the "127 billion web pages" they purport to scan today provided no results again during four additional checks—which makes this site the most consistent of the search engines I examined.

I conducted a test search for pizza (with no quotes). 809,000,000 results were found...but only two were displayed on the "All Results" tab: one for Pizza Hut, one for the Wikipedia entry for pizza (which featured the story of how pizza was introduced into Pakistan, of all things). Even more odd: on the same page you can look up "Pizza franchises" and other pizza related results "categories", and there’s a "Timeline for Pizza" with entries like: "2004 Melbourne, Australia" and "1993 Pizza was".

All of this appears to be at odds with the intent of the site’s operators:

"Popularity is useful, but has dominated search results so heavily that it gets harder and harder to find the page you want, especially if your search is a complex one. Cuil respects popular pages and recognizes that for many simple searches, popularity is an easy answer to your question. But for a deeper search, establishing relevancy is more than a numbers game. Cuil prefers to find all the pages with your keyword or phrase and then analyze the rest of the content on those pages..."

Those are the results: so, what about conclusions?

The first conclusion we can reach about all of this is that the number of results that any search engine locates on any particular visit is highly variable, so much so that the number of results presented appears to be virtually random (with the notable exception of Cuil, which seems to be consistently unable to find anything).
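To put a rough number on that variability, the counts reported in this story can be compared engine by engine. The snippet below is just a sketch in Python: the helper name `spread` is made up for illustration, the ratio of highest to lowest count is only one of many possible measures, and the figures are the ones quoted above (for Altavista, the Worldwide figure is left out because it came from a different search option). Cuil is also left out, since a ratio against zero results is undefined.

```python
def spread(counts):
    """Return (lowest, highest, highest / lowest) for a list of result counts.

    Assumes every count is greater than zero.
    """
    lo, hi = min(counts), max(counts)
    return lo, hi, hi / lo


# Result counts as reported in this story (first search plus later repeats).
reported = {
    "Google": [849, 852, 653],
    "Lycos": [67, 50, 49, 69],
    "Ask.com": [240, 452, 449, 452, 449],
    "Altavista": [904, 17800],
}

for engine, counts in reported.items():
    lo, hi, ratio = spread(counts)
    print(f"{engine}: {lo} to {hi} results, highest/lowest = {ratio:.2f}")
```

A ratio close to 1 would mean an engine reports nearly the same count every time; the further above 1, the less the headline number can be trusted.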

With that said, if I were quickly looking for this particular story, it appears that some of the odd search engines might be the best choices, including Lycos, Duck Duck Go, and Ask.com.

On the other hand, if the idea was to determine how far a story has been distributed, Google seems to be the winner.

There is another reason to use search engines, that being to find information about a topic that you currently don’t know enough about; this test is not well suited to answer the question of which search engine is best for that purpose...and it's a test that we'll save for another day.

So that's today's story: we visit quite a few search engines, we learn that the results you get are almost always entirely unpredictable, and, in what might be the most important lesson of the day, we're learning that deifying Tiger Woods can backfire on you, big time.    

Originally posted to fake consultant on Fri Dec 04, 2009 at 10:43 PM PST.

Poll

which is the real church?

17%4 votes
17%4 votes
43%10 votes
8%2 votes
13%3 votes

| 23 votes | Vote | Results


  •  while i expected each engine... (9+ / 0-)

    ...to have different results, i did not expect the extreme variability that each engine presents on multiple visits.

    "...this election has never been about me. it's about you."--barack obama

    by fake consultant on Fri Dec 04, 2009 at 10:43:01 PM PST

  •  I was a fan of MetaCrawler (2+ / 0-)
    Recommended by:
    dirigo, fake consultant

    (boolean type search) until Google came along. Since then, it's all that I use.

    "The human eye is a wonderful device. With a little effort, it can fail to see even the most glaring injustice." Richard K. Morgan

    by sceptical observer on Fri Dec 04, 2009 at 11:00:31 PM PST

  •  Except (3+ / 0-)

    Very few searches are ever conducted for full titles. About the only time I do it is for song lyrics, and even then I often have the title wrong or search on a piece of lyric I remember. A good search engine accommodates that, and also allows for typos in the search target.

    What's more important is how well a search engine performs when people who want information about the subject of your article search, which your stats don't measure at all. In essence what you tested with the full title (esp in quotes) is database indexing, not semantic ability or how well full-text is indexed and retrievable.

    Bitte sag mir wer das Märchen vom Erwachsen sein erfand

    by badger on Fri Dec 04, 2009 at 11:06:39 PM PST

  •  There are times I want a very specific result (4+ / 0-)

    and I can't figure out how to eliminate all of the possibilities the search engines come up with.

    Is there a way to get only the results with the exact words or phrases I want without the search engine deciding for me that I also want something else (which I can't always predict ahead of time)?

    Public option or corporate option...pick one.

    by Jane Lew on Fri Dec 04, 2009 at 11:21:46 PM PST

    •  on google... (4+ / 0-)

      ...(and for other searches that allow "boolean" searching), you can use the "minus" sing, like this:

      pizza, -hut

      it's not perfect, but that is the correct method.

      "...this election has never been about me. it's about you."--barack obama

      by fake consultant on Fri Dec 04, 2009 at 11:37:01 PM PST

      [ Parent ]

      •  sorry... (4+ / 0-)

        ...i meant to say "minus sign".

        an additional hint: don't put a space between the minus and the term you want to exclude.

        "...this election has never been about me. it's about you."--barack obama

        by fake consultant on Fri Dec 04, 2009 at 11:38:23 PM PST

        [ Parent ]

      •  I know how to do this and I find it really (2+ / 0-)

        tiresome. Why should I have to list all the things I don't want? That is really stupid. The list of all the things I don't want can be quite long and oftentimes quite unpredictable.

        I often cannot predict what Google will decide is a related word before I search. Once, Google, for some reason, decided that "Indiana" was a related word to "Indian." Never in a thousand years could I have predicted that one.

        What I want is a way to say,

        Please don't give me any results other than the exact search words or spellings I am entering.

        Let's say I want to search for a person named "Dan R. Hovey." I do not want to search for "Dan Hovey" or "Daniel Hovey." There seem to be hundreds of Daniel Hoveys and there is a guitarist named Dan Hovey whom I really don't want. I only want Dan R. Hovey. He is also Daniel Randall Hovey, but that is a separate search. If I do a search for

        "Dan R. Hovey"  -"Dan Hovey" -"Daniel Hovey"

        Google returns exactly one site. I know there are other sites out there with "Dan R. Hovey" because I have found them by searching other ways.

        Well, I just read the other comments, and  wondering if below has a great answer. I have tried it, and it is exactly what I want.

        WOW!

        also, for Google

        putting a '+' in front of a word forces an exact match : +homerun will match just "homerun" as one word and not include "home run" as two words.

        Public option or corporate option...pick one.

        by Jane Lew on Sat Dec 05, 2009 at 08:30:04 AM PST

        [ Parent ]

        •  i was also not aware... (1+ / 0-)
          Recommended by:
          Jane Lew

          ...of how the + could be used, and i have also been searching within quotes at times when a + would have done just the trick.

          why?

          i blame the tool itself, and the designers' choices.

          i do not know how to fix my fridge, nor my washing machine, and i would be hopelessly lost in life if i couldn't do virtually any repair necessary for my computer and its associated storage to remain functional.

          for most consumers, being able to repair a computer shouldn't be part of the ownership process, but it is uniquely so, and uniquely annoying.

          but a lot of that is because the fridge can't do word processing while running my home music, showing me a video, letting me respond to comments, and keeping up my incoming email.

          i can't change the graphics card on my fridge, either, to any of several dozen choices. in fact, that is part of the deal we don't often appreciate: the computer is the only consumer appliance that allows you to add components from several manufacturers as you choose.

          the only comparable situation is an automobile, where parts from many sources can be combined to customize the car--but even then, adding a gm engine to a subaru is not an easy task...as opposed to changing a motherboard so that your intel-based machine now runs an amd processor, which can be done in as little as 15 minutes.

          but even beyond that...computers are dumb.

          they are still struggling with "what do i do if i don't have instructions for that?", and that is both a sort of holy grail for designers and the cause of nearly everything we're talking about here today.

          "...this election has never been about me. it's about you."--barack obama

          by fake consultant on Sat Dec 05, 2009 at 11:12:58 AM PST

          [ Parent ]

          •  I too had been using quotes with variable results (1+ / 0-)
            Recommended by:
            fake consultant

            I am so glad I wasn't afraid to look stupid by posting the question.

            I am with you about fixing a computer. It is the reason I had children. They understand the thing in a way I never will. I was born before television; that is my excuse and...I am sticking to it.

            I think there are other examples of combining several components like a quality stereo system or perhaps a really good camera. If you put in surround sound with your TV you are mixing components too.

            Public option or corporate option...pick one.

            by Jane Lew on Sat Dec 05, 2009 at 02:49:36 PM PST

            [ Parent ]

            •  you are correct... (1+ / 0-)
              Recommended by:
              Jane Lew

              ...about the stereo, with one big exception:

              when i purchase a computer, i'm looking for a western digital hard drive, i'll specify how much ram i want, and if i played computer games with any serious intent i'd want a specific graphics card.

              this would be a discussion i'd have with the sales rep, after which the rep would give me a price quote, and we would either do a sale...or not.

              this process is not part of the purchase of a stereo, even though you might well combine "pre-built" components from a variety of manufacturers.

              "...this election has never been about me. it's about you."--barack obama

              by fake consultant on Sat Dec 05, 2009 at 08:37:58 PM PST

              [ Parent ]

              •  Yes, you can set up a stereo (1+ / 0-)
                Recommended by:
                fake consultant

                by just plugging parts together, but audiophile level  
                systems can require a physics major (which my husband has).

                It has been a long time since we had a real "sound system," but I seem to remember interminable discussions about which components were the best ones for our usage and which ones worked best with each other and why.  There were trade offs. There were even discussions about which were the best cables and why? It was all very technical and boring. I really could hear the difference between cables, but the effort that went into making a decision about them was mind numbing. It involved making our own experiment setting up a situation in which we tested both kinds with the only variable being the size of the cables.

                Public option or corporate option...pick one.

                by Jane Lew on Sat Dec 05, 2009 at 09:25:30 PM PST

                [ Parent ]

                •  i'm going to guess... (1+ / 0-)
                  Recommended by:
                  Jane Lew

                  ...that at one time the words "macintosh mono amp" were floating around the kitchen table--and i'd bet that even today, his first association when hearing the word "mac" might not be a computer.

                  "...this election has never been about me. it's about you."--barack obama

                  by fake consultant on Sun Dec 06, 2009 at 01:54:51 AM PST

                  [ Parent ]

    •  You might want to read this how-to on Google (6+ / 0-)

      "The human eye is a wonderful device. With a little effort, it can fail to see even the most glaring injustice." Richard K. Morgan

      by sceptical observer on Fri Dec 04, 2009 at 11:44:36 PM PST

      [ Parent ]

    •  also, for Google (4+ / 0-)

      putting a '+' in front of a word forces an exact match : +homerun will match just "homerun" as one word and not include "home run" as two words.

      Number ranges can be done as n..m, as in "movies 1920..1925". There are also the recently added date options, accessed with the Show options... link at the top of the search results, to filter by how recent the entry is.

      More tips at
      http://www.google.com/...
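The operators mentioned in this thread (quotes for an exact phrase, a minus sign with no space to exclude a term, a plus sign to force a word, and two dots for a number range) can be assembled mechanically. The sketch below is a hypothetical Python helper, `build_query`, written only to illustrate the syntax as the commenters describe it; it does not call any search engine, and operator support differs between engines and changes over time, so treat the output as something to try rather than a guarantee.

```python
def build_query(words=(), phrase=None, include=(), exclude=(), number_range=None):
    """Assemble a query string from the operators described in the comments.

    words        -> plain search words, left as they are
    phrase       -> wrapped in double quotes for an exact phrase
    include      -> each word prefixed with '+' (no space)
    exclude      -> each term prefixed with '-' (no space); multi-word terms are quoted
    number_range -> (low, high), written as low..high
    """
    parts = list(words)
    if phrase:
        parts.append(f'"{phrase}"')
    parts.extend(f"+{word}" for word in include)
    for term in exclude:
        quoted = f'"{term}"' if " " in term else term
        parts.append(f"-{quoted}")
    if number_range:
        low, high = number_range
        parts.append(f"{low}..{high}")
    return " ".join(parts)


# Examples taken from the comments above.
print(build_query(words=["pizza"], exclude=["hut"]))
print(build_query(phrase="Dan R. Hovey", exclude=["Dan Hovey", "Daniel Hovey"]))
print(build_query(include=["homerun"]))
print(build_query(words=["movies"], number_range=(1920, 1925)))
```

Keeping each operator as a separate argument makes it easy to add or drop one exclusion at a time when a search keeps returning things you did not ask for.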

  •  Re: your poll... the REAL Church... (2+ / 0-)

    I can do magic. If you want miracles, well... that's gonna take a little longer.

    by Clive all hat no horse Rodeo on Sat Dec 05, 2009 at 01:09:20 AM PST

  •  Explain the Weather Underground again? (2+ / 0-)

    Terrific review; I work on and wonder about "search engine optimization" all the time, and this is useful data.  
     
    But I'm puzzled about why all those Maine weather pages would show up as results for a phrase search, particularly a phrase search containing some distinctly non-weather-related terms.

    Any ideas?

    And remember: If you don't like the news, go out and make some of your own. - Scoop Nisker, the Last News Show

    by North Madison on Sat Dec 05, 2009 at 03:42:38 AM PST

  •  I use IXQUICK first, Google as last resort... (2+ / 0-)

    ...I want to get initial results different from most users, plus IXQUICK has most privacy (does not log IPs).

    •  couple comments... (0+ / 0-)

      ...i was not aware of ixquick or start page before this story, when y'all brought them to my attention--and for both i would note what i see as the same "fatal flaw": while 17,000+ results are reported, the pages only go two deep, which means 16,980 of those results are invisible to me.

      there is presumably a workaround that i missed to fix this, but for the first-time user it's a bit off-putting.

      an additional comment, entirely based on "style": because the allure of the search engine is (to paraphrase) "the power of searching ten search engines", i might have spelled the name iXQUICK.

      IXQUICK reminds me of the roman numeral nine, instead of the ten i think they were going for; i think the lower-case i would have been more stylistically appropriate.

      "...this election has never been about me. it's about you."--barack obama

      by fake consultant on Sat Dec 05, 2009 at 11:34:12 AM PST

      [ Parent ]

  •  Ok, I just had to click on the link (2+ / 0-)

    to the website for the Church of Tiger Woods.  Wow.  Just wow.

    I sure am glad my life is safe from the radar screen of the "pastor."  

    I am in no way defending anything that may or may not have happened in the Tiger Woods incident, but this guy is really a piece of work.  Deifies a mere mortal, then excoriates him when he turns out to be human.  Judgmental much?

    •  that is a risk... (2+ / 0-)
      Recommended by:
      sceptical observer, Jane Lew

      ...we take on this site as well.

      after all, there could have just as easily been a church of howard dean...or john edwards...or hillary...as a church of tiger woods.

      i am forever telling people that jesus ain't running for office, that no candidate is a deity, and that mere mortals tend to carry an amalgam of brilliant ideas and personal flaws in their makeups--usually both at the same time.

      i'm gonna go easy on this guy--'cause i screw up a lot myself--but it is a cautionary tale that we, around here, would do well to remember.

      "...this election has never been about me. it's about you."--barack obama

      by fake consultant on Sat Dec 05, 2009 at 09:47:47 AM PST

      [ Parent ]
