I spent about two hours today cleaning up the tags section of this website.  Why?  Well, to begin with, I was looking for something and was unable to locate it.  Second, I guess I'm a sucker for punishment.  Third, I like techie stuff so I felt like playing around with it.  And finally, it needed to be done.  Now it's by no means complete, but that is because this is not a one-person job.

I'd like to take a moment to revisit the tagging subject with the community, because in less than two weeks the system has become an absolute freaking mess.  If a chunk of us would make the effort, it can be quickly turned into a very useful tool.

First, if you have not done so already, please go read the tagging tips.  When you're done, come back.  OK, now that that's out of the way, do this: load the tags section.  I'm on a cable modem and it takes me nearly fifteen seconds to load the page.  That may not sound like a lot, but take a moment to scroll - page by page - to the bottom of the page.  It takes 28 presses of my "Page Down" button to scroll through the entire thing.  That's ridiculous.

The tagging system was supposed to make locating information easier.  But as usual, when you have a user base this size, it's gotten out of control very quickly.  So can we have a discussion about common sense tagging?  I'm just going to go through a few things I noticed while doing the clean up today in the hopes that at least some of you will read this and take some initiative to help get it under control.

First, let's revisit the tagging tips, shall we?


  1. Use combinations of simple tags rather than inventing complex ones. For instance, use tags CIA, LEAK and INVESTIGATION, instead of CIA-LEAK-INVESTIGATION.

  2. Try to think of what tags people might use to search for something and use those. For example, PLAME, KARL ROVE, PATRICK FITZGERALD, BOB NOVAK, TREASON, OUTING, OPERATIVE might all be good tags for an entry on the Valerie Plame outing.

  3. Try to re-use existing tags.

  4. Keep it simple. Don't use tags that are redundant.

  5. For election blogging, add the year, state and office. So the Colorado governor's race in 2006 is tagged: "2006, governor, Colorado". Also add the dKos style abbreviation of the race (two-letter state abbreviation and race). So a governor's race would be "CA-Gov", a Senate race "CA-Sen", and a congressional race would be "CA-06".

  6. Stop with the "cutesy" tags. This is a tool to help organize content, not show how clever you are with keywords like "HUNTERRIFIC" to express how great Hunter's diary was.
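
For the techies, here is a rough Python sketch of what tip #5 asks for. It is not how the site does anything; the function name `election_tags`, its parameters, and the "house" office tag are my own assumptions for illustration. Only the "2006, governor, Colorado" pattern and the "CA-Gov" / "CA-Sen" / "CA-06" abbreviations come from the tips themselves.

```python
def election_tags(year, state_name, state_abbr, office, district=None):
    """Build tip #5 style tags: year, office, state, plus the race abbreviation.

    Hypothetical helper: the office names accepted here ("governor",
    "senate", "house") are assumptions, not an official list.
    """
    office_key = office.strip().lower()
    abbr = state_abbr.strip().upper()
    if office_key == "governor":
        race = f"{abbr}-Gov"
    elif office_key == "senate":
        race = f"{abbr}-Sen"
    elif office_key == "house":
        if district is None:
            raise ValueError("a House race needs a district number")
        # Zero-pad the district so district 6 becomes "06".
        race = f"{abbr}-{int(district):02d}"
    else:
        raise ValueError(f"unknown office: {office!r}")
    return [str(year), office_key, state_name, race]


if __name__ == "__main__":
    print(election_tags(2006, "Colorado", "CO", "governor"))
    print(election_tags(2006, "California", "CA", "house", district=6))
```

The point is only that these tags follow a fixed recipe, so there is no need to invent a new spelling of the race every time.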

I didn't find too many abuses of tip #1, so I'll leave it be and move on to tip #2.  Most folks are doing pretty well on this point, but some people are going a bit overboard. Does every diary about Valerie Plame need a "Bob Novak" tag?  No, especially if the article does not relate directly to her outing.  For example, say I post a diary about how Valerie Plame & Joe Wilson's lives have changed since the outing. I'm not going to tag it with Bob Novak, because the article doesn't mention him.  It's one of those gooey profile stories.  It needs tags like "Valerie Plame" and "Joe Wilson".  All I'm saying here is that your article usually only needs a few tags, not ten or fifteen.

Moving on to tips #3 and 4.  Those are where most people are not using any common sense at all.  #3 says "reuse existing tags".  That would imply that people should take a moment when tagging (especially because the handy "Tagging Tips" link is right there next to the tag box) to see if there's already a tag for the subject of your diary.  Regarding tip #4, come on people, use some common sense.  For those who've forgotten, the definition of redundant is "more than is needed, desired, or required".  Think about this: do we really need both a "sixties" AND a "60s" tag?  Do we really need both "KKK" and "Klan"? Do we really need both "DFA" AND "Democracy for America"?  No.  Remember, tagging is supposed to make things easier.  It's not supposed to create such bloat that it becomes dysfunctional.  Now since the list is so long right now, I know searching can be a PITA.  Just press "control-F" on your keyboard (or I think "apple-f" for mac users) and a little search box will either pop up in IE or load at the bottom left of mozilla.
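
To make the redundancy point concrete, here is a minimal sketch of collapsing duplicate tags onto one preferred spelling. Everything in it is an assumption for illustration: the `ALIASES` table, the function names, and especially which spelling counts as the "preferred" one (that is exactly the kind of thing the community would have to agree on).

```python
# Hypothetical alias table: lower-cased variant -> preferred tag.
# Which side is "preferred" is an assumption, not site policy.
ALIASES = {
    "60s": "sixties",
    "klan": "KKK",
    "democracy for america": "DFA",
}


def canonical_tag(tag):
    """Collapse extra whitespace, then map known variants to the preferred tag."""
    cleaned = " ".join(tag.split())
    return ALIASES.get(cleaned.lower(), cleaned)


def canonical_tags(tags):
    """Canonicalize a list of tags and drop case-insensitive duplicates, keeping order."""
    seen = set()
    result = []
    for tag in tags:
        canon = canonical_tag(tag)
        key = canon.lower()
        if key not in seen:
            seen.add(key)
            result.append(canon)
    return result


if __name__ == "__main__":
    print(canonical_tags(["60s", "sixties", "Klan", "KKK", "Democracy for America"]))
```

With a table like this, a diary tagged with both "60s" and "sixties" ends up with one tag instead of two.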

Tip #5 seems to be catching on, so I'll move along.

Regarding tip #6, we've still got a few problems there.  We're all grown ups, right?  Well, heh, most of us anyway.  Why tag stuff with "yankees suck"?  Even if they do suck, it's just a juvenile thing to do.

And one more thing. I am normally not the grammar police, but in this case it's imperative that you spell your tags correctly!  Right now I see "Amrstriong Williams" and earlier I cleaned up about five variations of the name John Shalikashvili.  Yeah, I know that's not an easy one, but realistically, if you're unsure of how to spell something, please go to Google and make sure you get it right.
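
Misspelled tags are also something a few lines of code can catch. This is only a sketch, assuming the list of existing tags is available as plain strings; `suggest_tag` and the 0.8 cutoff are my own hypothetical choices, and the matching is done with `difflib.get_close_matches` from the Python standard library.

```python
import difflib


def suggest_tag(new_tag, existing_tags, cutoff=0.8):
    """Return an existing tag that new_tag closely resembles, or None.

    Hypothetical helper: the cutoff of 0.8 is a guess, not a tuned value.
    """
    by_lower = {t.lower(): t for t in existing_tags}
    key = " ".join(new_tag.split()).lower()
    if key in by_lower:
        return None  # Already an existing tag (ignoring case), nothing to suggest.
    matches = difflib.get_close_matches(key, list(by_lower), n=1, cutoff=cutoff)
    return by_lower[matches[0]] if matches else None


if __name__ == "__main__":
    existing = ["Armstrong Williams", "Karl Rove", "Valerie Plame"]
    print(suggest_tag("Amrstriong Williams", existing))
```

A check like this will not catch everything, and it can only point at tags that already exist, so it is a helper for cleanup rather than a replacement for looking the name up.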

Now there's certainly room for some grey areas in all this, as metadiaries tend toward subjectivity.  I think that the system is fucked up already.  Others may disagree.  But I'm saying this from the perspective of a web and mail admin and a longtime user who'd like to see this feature work as it was intended.  So that's where I'm coming from.  And yeah, I think it would be great if this made the rec list so the entire community could engage in this discussion.

If we all practise just a bit of due diligence the tagging system can become a thing of utility and beauty. I'm simply asking that the community please pitch in and follow the guidelines that were laid out when tagging was implemented.  And of course, if you feel like cleaning up some useless tags, then by all means please do so.

And remember, bad tagging practice makes baby metajeebus weep.

Originally posted to anna on Fri Oct 21, 2005 at 03:43 PM PDT.


Do you promise to practise due diligence when tagging posts?

60% (12 votes)
10% (2 votes)
5% (1 vote)
20% (4 votes)
5% (1 vote)

Total: 20 votes

  •  heh (4.00)
    i made baby metajeebus weep.

    "Democrats: Always standing up for what they later realise they should have believed in." -Jon Stewart, the Daily Show

    by anna on Fri Oct 21, 2005 at 03:46:15 PM PDT

    •  You and me (4.00)
      both, sister.

      Though you did a much better job in your diary.  I've been spending (cough) an hour or more a day cleaning tags.  Just getting around to it tonight, but it sounds like maybe you've got it all.

      One point of disagreement:  I think both DFA and Democracy for America (to use your example) are fine.  For me, the ultimate goal is maximum searchability.  I can easily see people searching for DFA or Democracy for America but not necessarily both.

      This does not excuse people who tag something with both "rove" and "karl rove."

      blog | -6.13, -5.95 | ... random acts of whatnot.--Hank Hill

      by folkbum on Fri Oct 21, 2005 at 04:29:40 PM PDT

      •  crap (none)
        i completely missed your diary yesterday.  sorry for being... drum roll... redundant!

        you make a good point, that's why i said there are grey areas.  i dunno, IMO, having two categories like that is the definition of redundant. i'm trying to put myself in the shoes of a casual user or someone who stumbles here looking for information.  if i was new to the netroots, i might not know what DFA is.  i would probably search for "democracy for america" instead.

        i'm also looking at this from a user-friendliness standpoint because that's what i do all day at work: try to make stuff more user friendly.  simple (or, less kindly, dumbed-down) usually equals user friendly.

        by anna on Fri Oct 21, 2005 at 04:49:58 PM PDT

        •  asd (none)
          this is probably really hard to do, but it would be great if, to use this example, you tagged something with "Democracy for America" and, when you clicked to submit, it said "did you mean DFA?"  or did the same thing if you spelled it wrong or whatever.

          that would prevent a lot of bad tags and tend to get people all using the same tag for the same thing.

  •  You're "it." (none)

    Mother Nature bats last.

    by pigpaste on Fri Oct 21, 2005 at 04:12:42 PM PDT

  •  Redundancies & Spelling (none)
    I'd say that getting standardized is the biggest challenge.

    A Note About Finding and CTRL-F:
    if you use Firefox you can set the advanced option to "begin finding when you begin typing"

    this checkbox will save you the time of doing the CTRL-F thing

    "Infinite love is the only truth. Everything else is illusion." -David Icke
    (-6.25 ,-4.51)

    by Dr Seuss on Fri Oct 21, 2005 at 04:22:14 PM PDT

  •  Thank you for the hard work. (none)
    Umm... I'm not having anything like the performance problems you're experiencing.  Did you fix them or could it be something else?  (My ISP has been really flaky for the last week so I'm just asking.)

  •  Totally with you. (none)
    I have done more than one database cleanup in my life.  Free text is dangerous, dangerous, dangerous....

    In a tags thread recently I suggested that it would be very handy to have an option to automatically add the tag from the list to whatever you had been reading.  For example, a little + sign after the word, perhaps....saving some typos and carpal tunnel, both.

    I know that wouldn't solve everything, but it would make some tagging easier and less error prone.

    Thanks for cleaning up.  I appreciate it.  

  •  unintelligent design (none)
    dailykos is a self-disorganizing system

    thanks for trying though!

  •  tags are bs feature creep (none)

  •  LOC Subject headings (none)
    Complex, maybe crazy (street railway? why not trolley?) but truly needed. The Library of Congress subject headings occupy at least one fat volume if not two and that's not including geographic headings or personal names.

    What you're doing is the work that has to be done to maintain authority tables in cataloguing databases. Monumental work but once it's done and people use what's in place--amazing data access. Kudos!

  •  Needs to copy flickr more (none)
    People don't understand how to use comma-separated lists.  You are complaining about Karl Rove turning into "Karl" and "Rove" when I'll bet if you watch, those people are not being malicious, but entering the thing library style - "Rove, Karl"

    Get rid of the busted way of entering tags and the system will improve.

