
The full list of files is below, and, as always, at this page. Soon I'll also include a file of tags sorted by frequency. Don't know why I didn't do it before; it's a very interesting list. As expected, "Iraq" and "George W. Bush" head it up. Here's the top couple of dozen or so...

      8970 Iraq
      8258 George W. Bush
      4491 bush
      3791 Recommended
      3661 2006
      3308 Democrats
      3128 Republicans
      2550 war
      2526 Senate
      2490 Congress
      2148 2006 Elections
      2120 Dick Cheney
      2061 Iran
      2060 media
      1956 torture
      1871 ELECTIONS
      1865 Israel
      1838 Joe Lieberman
      1711 NSA
      1627 Samuel Alito
      1588 Karl Rove
      1520 ned lamont
      1440 9-11
      1422 terrorism
      1413 Iraq War
      1366 immigration
      1317 Hurricane Katrina

This pretty much sums up the Daily Kos focus, doncha think? "George W. Bush" is in second place, but we also have nearly 4,500 plain "bush" tags. In fact, 403 tags match "bush" (case-insensitive). Here are the top few:

      8258 George W. Bush
      4491 bush
      1152 Bush Administration
       326 Bush Regime
       206 George Bush
       183 George W Bush
       144 President Bush
        77 Jeb Bush
        70 Laura Bush
        63 BushCo
        55 George H. W. Bush
        44 Impeach Bush
        42 John Ellis Bush
        34 Bush Doctrine
        32 Bush family
        28 Barbara Bush
        21 George H.W. Bush
        15 President  Bush
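
For anyone who wants to reproduce a list like this, here is a minimal sketch of a case-insensitive match. It assumes the tag counts are already loaded into a dictionary; the names `tags_matching` and `sample_counts` are made up for illustration, and the sample holds only a handful of the counts shown above, not the full data set.

```python
def tags_matching(counts, needle):
    """Return (tag, count) pairs whose tag contains needle, ignoring case,
    sorted by count from highest to lowest."""
    needle = needle.lower()
    hits = [(tag, n) for tag, n in counts.items() if needle in tag.lower()]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# A few counts copied from the lists above (illustrative sample only).
sample_counts = {
    "George W. Bush": 8258,
    "bush": 4491,
    "Iraq": 8970,
    "Jeb Bush": 77,
    "BushCo": 63,
}

# Print in the same "count tag" layout used in this diary.
for tag, n in tags_matching(sample_counts, "bush"):
    print(f"{n:10d} {tag}")
```

A plain substring test also picks up tags like "BushCo", which is what you want when hunting for variants.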

We can definitely clean up a few of those "bush" tags. Let's also look at the last entry on that first list, "Hurricane Katrina". How many other Katrina tags are there?

      1317 Hurricane Katrina
       319 Katrina
       105 Katrina Blog Project
        10 Katrinacrat
         7 Katrina vanden Heuvel
         3 Hurricane  Katrina
         2 Katrina response
         2 Katrina Vanden Heuval
         2 Katrina Evacuees
         2 Hurrican Katrina
         1 Telling all about Katrina's impact...Wildcard issue in 2006
         1 Right-wWng comments on Katrina
         1 Medicare Katrina
         1 Katrina- 9-11
         1 Katrina relief
         1 Katrina papers
         1 Katrina morgue
         1 Katrina donation
         1 Katrina cough
         1 Katrina aid
         1 Katrina Warnings
         1 Katrina Report
         1 Katrina Recovery
         1 Katrina Gate
         1 Katrina Blanco
         1 Katrina Anniversary
         1 Katrina 11
         1 KATRINA CORRUPTION
         1 Impeach katrina
         1 Hurricane Katrina. New Orleans
         1 Hurricaine Katrina
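
Some of these differ only in spacing or capitalization ("Hurricane  Katrina" with two spaces, for instance). A rough sketch of how those could be grouped, assuming the counts sit in a dictionary; `normalize` and `group_variants` are hypothetical helper names, not part of any existing tool.

```python
from collections import defaultdict


def normalize(tag):
    """Collapse runs of whitespace, trim the ends, and lowercase."""
    return " ".join(tag.split()).lower()


def group_variants(counts):
    """Map each normalized form to its (tag, count) variants,
    keeping only forms that have more than one variant."""
    groups = defaultdict(list)
    for tag, n in counts.items():
        groups[normalize(tag)].append((tag, n))
    return {key: variants for key, variants in groups.items() if len(variants) > 1}


# A few counts copied from the list above (illustrative sample only).
sample_counts = {
    "Hurricane Katrina": 1317,
    "Hurricane  Katrina": 3,
    "Katrina": 319,
    "Hurrican Katrina": 2,
}

for key, variants in group_variants(sample_counts).items():
    print(key, variants)
```

This only catches spacing and case differences. A misspelling like "Hurrican Katrina" slips through and needs some kind of fuzzy matching, which is what the soundex files in this diary are for.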

Forget the last entry; how about the first, "Iraq"? There are 362 tags that match "Iraq". Here are the first few:

      8970 Iraq
      1413 Iraq War
        96 Iraq War Grief Daily Witness
        67 Iraq Debacle
        62 War in Iraq
        49 White House Iraq Group
        37 Iraq civil war
        23 Iraq elections
        17 Iraq Occupation
        15 Iraq reconstruction
        15 Iraq Constitution
        14 Iraqi Elections
        12 iraq war grief daily witness photos
        12 Iraqi Army
        12 Iraq casualties
        11 Iraqi civil war
        10 Iraqis
        10 Iraqi Constitution
        10 Iraq Invasion
         7 Iraqi casualties
         7 Iraqi
         7 Iraq veteran
         6 War on Iraq
         6 Iraqi government
         6 Iraqi election
         6 Iraqi War

Ok, well, enough of this. Before I include today's file list, I have a question. I'm using Safari on Mac OS X and the http://meta.dkosopedia.com/... site, crude though it is, looks fine for me. But I submitted it to browsershots.com (great idea, btw) and it looks like IE on Windows doesn't draw the outside body div borders at all. This wouldn't bother me at all, of course, except that IE on Windows is the world's dominant combo, so I should probably do something about it. Other than saying "forget it, Windows sucks," what am I doing wrong CSS-wise?

Description                           File                      Count
all tags                              tags-all.txt.zip          47950
new tags (since 20061002)             tags-new.txt.zip            234
tags starting with whitespace         tags-whitespace.txt.zip       2
tags of 5 or more words               tags-multiword-5.txt.zip   1127
tags containing a period              tags-period.txt.zip        1439
tags containing a semicolon           tags-semicolon.txt.zip        0
tags used once                        tags-used-once.txt.zip    29488
tags from 20061002 not found now      tags-old-gone.txt.zip       251
soundex codes mapped by a single tag  tags-single.txt.zip         841
soundex codes starting with A         tags-soundex-a.txt.zip     1956
soundex codes starting with B         tags-soundex-b.txt.zip     2416
soundex codes starting with C         tags-soundex-c.txt.zip     3695
soundex codes starting with D         tags-soundex-d.txt.zip     2552
soundex codes starting with E         tags-soundex-e.txt.zip     1722
soundex codes starting with F         tags-soundex-f.txt.zip     2093
soundex codes starting with G         tags-soundex-g.txt.zip     1858
soundex codes starting with H         tags-soundex-h.txt.zip     1854
soundex codes starting with I         tags-soundex-i.txt.zip     1864
soundex codes starting with J         tags-soundex-j.txt.zip     1706
soundex codes starting with K         tags-soundex-k.txt.zip      828
soundex codes starting with L         tags-soundex-l.txt.zip     1773
soundex codes starting with M         tags-soundex-m.txt.zip     3360
soundex codes starting with N         tags-soundex-n.txt.zip     1986
soundex codes starting with O         tags-soundex-o.txt.zip     1010
soundex codes starting with P         tags-soundex-p.txt.zip     3623
soundex codes starting with Q         tags-soundex-q.txt.zip       78
soundex codes starting with R         tags-soundex-r.txt.zip     2773
soundex codes starting with S         tags-soundex-s.txt.zip     4427
soundex codes starting with T         tags-soundex-t.txt.zip     2908
soundex codes starting with U         tags-soundex-u.txt.zip      871
soundex codes starting with V         tags-soundex-v.txt.zip      683
soundex codes starting with W         tags-soundex-w.txt.zip     1620
soundex codes starting with X         tags-soundex-x.txt.zip       11
soundex codes starting with Y         tags-soundex-y.txt.zip      142
soundex codes starting with Z         tags-soundex-z.txt.zip      141
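
For readers who haven't met soundex: it reduces a word to a letter plus three digits so that similar-sounding spellings land on the same code. Below is a sketch of the standard American Soundex rules in Python. It is an assumption that the files above were built this way; in particular, coding the whole tag as one string with spaces and punctuation ignored is a guess, and `soundex` and `group_by_soundex` are illustrative names.

```python
from collections import defaultdict

# Standard American Soundex digit for each coded consonant.
_CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        _CODES[ch] = str(digit)


def soundex(text):
    """Return the 4-character Soundex code of text, ignoring non-letters.

    Assumption: the whole tag is coded as a single word.
    """
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        # Append a digit only when it differs from the previous coded letter.
        if digit and digit != prev:
            code += digit
        # 'h' and 'w' do not separate repeated codes; vowels do.
        if ch not in "hw":
            prev = digit
        if len(code) == 4:
            break
    return code.ljust(4, "0")


def group_by_soundex(tags):
    """Map each Soundex code to the list of tags that produce it."""
    groups = defaultdict(list)
    for tag in tags:
        groups[soundex(tag)].append(tag)
    return dict(groups)


print(soundex("Hurricane Katrina"))  # -> H625
print(group_by_soundex(["Hurricane Katrina", "Hurrican Katrina", "Hurricaine Katrina"]))
```

All three Katrina spellings come out as H625, so they fall into one group. The flip side is that only the first few consonants count, so long tags that merely start alike will collide too; the groups are candidates for a human to review, not automatic merges.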

Originally posted to dKosopedia on Tue Oct 03, 2006 at 07:53 PM PDT.

  •  again, (4+ / 0-)
    Recommended by:
    Rita in DC, ETinKC, Halcyon, johnsonwax

    any other tag files you might like to see?

    Forget the myths the media's created about the White House. The truth is, these are not very bright guys, and things got out of hand. -- Deep Throat

    by The Centerfielder on Tue Oct 03, 2006 at 07:52:41 PM PDT

    •  One of the things I'm going to look at soon (2+ / 0-)
      Recommended by:
      Halcyon, Buffy Orpington

      is seeing if the dKosopedia article titles can form a protected tag space. Can you produce a list of those?

      My reasoning is three-fold:

      1. It's a free mechanism for users to edit and manage the protected tag space. All the bits are already implemented, so the programmers don't need to do anything new.
      2. Tags could automatically generate links to the dKosopedia. Want more info on Patriot Act? Just click through. The reverse could also happen where dKosopedia links back to dKos for recent diaries (just pushing the RSS).
      3. It should make it easier for the community to add content to dKosopedia instead of just piling in the diaries - that's a good thing. By adding new protected tags, users should also add content to support that tag.

      -6.00, -7.03
      "I want my people to be the most intolerant people in the world." - Jerry Falwell

      by johnsonwax on Tue Oct 03, 2006 at 09:25:26 PM PDT


  •  Haven't seen earlier diaries (1+ / 0-)
    Recommended by:
    Rita in DC

    sorry, so this is probably a silly question:

    what are you proposing to do with all of this research? I assume the goal is more uniform tags to make searches easier.

    I can see how more standard formatting rules can be set up, but it's less easy to see how standardization would be enforced with so many thousands of tags.  Some sort of automation? A drop-down list from which to choose? Permission needed to add a new tag?

    It's a great idea, I'm just curious what you're thinking for implementation.

  •  Tool update. (4+ / 0-)

    Tool continues to come together.

    The lists you have above are exactly what I expect to automate. The ability to take 'Katrina' and get all tags containing 'Katrina' so that they can be condensed efficiently. Some should condense, some should split, some should remain.

    I wouldn't bother trying to condense the various 'George W. Bush' tags with that many entries. Wait for a tool to do that.

    I'd concentrate on the tags with small numbers - you can actually do some damage there, and automation isn't quite as effective for those.

    -6.00, -7.03
    "I want my people to be the most intolerant people in the world." - Jerry Falwell

    by johnsonwax on Tue Oct 03, 2006 at 08:43:19 PM PDT

  •  "A drop-down list from which to choose?" (2+ / 0-)
    Recommended by:
    Halcyon, Buffy Orpington

    as maryru asks above:

    Yes, yes, yes! Menus and submenus. That's the only way that the tags will stay clean, IMO. Too many users--yes, TUs--just can't seem to resist the cutesy, snarky, or impractical tags, and too many users are probably well-intentioned but just don't seem to "get" what makes for a useful tag. And then there are the typos.

    •  I'm forming a set of recommendations (3+ / 0-)
      Recommended by:
      SarahLee, Rita in DC, Halcyon

      for us to pore over. Taken as a whole, they should drastically cut down on the tag proliferation. There will still be some getting through, but a decent cleanup tool should make that easy to deal with as well.

      -6.00, -7.03
      "I want my people to be the most intolerant people in the world." - Jerry Falwell

      by johnsonwax on Tue Oct 03, 2006 at 08:58:18 PM PDT


  •  A fervent request re the tag cloud (2+ / 0-)
    Recommended by:
    SarahLee, Buffy Orpington

    Please have mercy on those of us who are still on dialup connections and allow the alphabetically sorted tag cloud to be accessed one letter at a time!

    I can't load the whole tag cloud, period.

    Thanks in advance.

    •  Perhaps that can already be done, but (1+ / 0-)
      Recommended by:
      Buffy Orpington

      I can't figure out how.

      •  Sorry (0+ / 0-)

        That's not within my power. But you can download one of the files listed above to get tags by first letter. Hmmm. Of course. I should zip up the alltags page and save that along with the above files. Then you can download, unzip, and load locally.

        Forget the myths the media's created about the White House. The truth is, these are not very bright guys, and things got out of hand. -- Deep Throat

        by The Centerfielder on Wed Oct 04, 2006 at 03:45:08 AM PDT


      •  Hi Rita, (0+ / 0-)

        I'm nominally one of the Tag Cleanup leaders. dKosopedia/The Centerfielder forgot to link to our wiki page in this diary. (I was out of town when it was posted, am doing catch-up now). Please feel free to post suggestions/comments on the 'discussion' page.

        Thanks for your input. If you click through on the meta.dKosopedia link in the diary, you bring up a small table that is updated daily, showing status of number of Tags, and a breakdown of frequency of various types of tags.

        I'm in complete agreement with you about the problem for people on dial-up, who would like to avail themselves of the 'AllTag' URL, but can't even load that page. There are 29,000 one-off Tags out of a total of ~48,000. We plan to delete as many of the one-offs as possible. I don't know if the remainder would load on dial-up. If not, then an alternative set of URLs needs to be created.

        johnsonwax (way over my head; I'm a software ignoramus) has offered to create the new Tag software, and will be conferencing with kos and some of the other software people soon, to work out the parameters. Any suggestions you may care to contribute are most welcome. I check the dKosopedia page daily, so please create a dKosopedia account and offer your input.
