DK Tech Team Update



Adorable dog, wearing tags, overlooking a river
She's a Character, and she's wearing Tags.
It's the perfect stock photo for an otherwise
dry bit of technical trivia!
Kos announced a while back that we're working on a new image uploading and library feature that should make it far easier and more fun to use images in diaries - and then to manage and share them with other kossacks. For images to be sharable, we need to be able to search for them, and tags are a powerful tool for organizing. But they'd be more powerful if the tags themselves were a bit better organized.

My current task is to reorganize and clean up DK's 200,000+ tags into a more usable form. Or more precisely, to lay the programming and logic groundwork to automate as much of that as possible, and then, when that is ready, to redeploy a group of Tag Librarians to guide it going forward with the shiny new tools.

Fun With Tags


The first part of this was to rethink the allowable characters for tags. Some characters in tags don't work properly within URLs, some are rarely used well, and some make it hard for other users to find or reuse a tag. Tags are most effective when they are useful for multiple diaries - and going forward, for multiple images.

The rules for new tags going forward are:

Allowed:
English letters and numbers - i.e., characters A-Z, a-z, 0-9

-    dash: election tags need it, and it's generally useful
'    apostrophe: possessives are useful and widely used
$    dollar: so you can talk about the $6M Man
.    dot/period: for something.com or Dr. Seuss
()   parentheses: as in 501(c)(4)
(space)  the space character is allowed in the middle of a tag
&    ampersand: as in Cheers & Jeers and S&P 500

Not allowed:
%    I know you'll miss this one, but it's the escape character in URLs, and it breaks.
:    reserved in URLs, and there are not many uses currently
"    not many uses that can't be met with single quotes
!    only used occasionally
?    breaks URLs
/    breaks URLs
,    our tag separator
;    current uses are all mistakes or HTML entities
<    no good uses
>    no good uses
+    the only tag that will miss this is Google+
=    reserved character in URLs
_    not many uses
*    only good use is M*A*S*H
^    no good uses
\    general escape character, no good use
|    no good uses
{}   no good uses
@    only a few rare uses now

No non-ASCII characters. In human speak, this means no special characters from other alphabets like ñ, ç, é, and none of the cute symbols like ♥.
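
For the programmers in the audience, the rules above can be sketched as a small check. This is a minimal illustration in Python, not the code running on the site: the names `ALLOWED_CHARS`, `disallowed_characters`, and `is_valid_new_tag` are made up for the example, and rejecting empty tags is an assumption the rules don't spell out.

```python
import string

# Allowed characters from the list above: A-Z, a-z, 0-9, dash, apostrophe,
# dollar, dot, parentheses, ampersand, and the space character.
ALLOWED_CHARS = frozenset(string.ascii_letters + string.digits + "-'$.()& ")


def disallowed_characters(tag: str) -> list[str]:
    """Return the distinct characters in `tag` that the rules do not allow."""
    return sorted({ch for ch in tag if ch not in ALLOWED_CHARS})


def is_valid_new_tag(tag: str) -> bool:
    """Check a proposed new tag against the allowed-character rules."""
    # Spaces are only allowed in the middle of a tag, so a tag may not
    # start or end with one. Rejecting the empty tag is an assumption.
    if not tag or tag != tag.strip(" "):
        return False
    return not disallowed_characters(tag)


if __name__ == "__main__":
    for candidate in ["Cheers & Jeers", "501(c)(4)", "M*A*S*H", "Google+"]:
        print(candidate, is_valid_new_tag(candidate), disallowed_characters(candidate))
```

Reporting the offending characters, rather than only saying yes or no, would let a tag form tell the writer exactly what to change.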

These are the rules for new tags. What about old tags?


Right this moment, old tags in violation of these guidelines will still be in the system. They can remain on current diaries and they can be assigned to new diaries. The only change is that new tags must follow the guidelines.
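
In code terms, that grandfathering might look something like the sketch below. It is only an illustration: `NEW_TAG_RULE` and `can_assign` are hypothetical names, the pattern is a compact restatement of the allowed-character rules, and matching existing tags by exact spelling is an assumption.

```python
import re

# Compact stand-in for the new-tag rules: allowed characters only, with
# spaces permitted in the middle of a tag but not at either end.
NEW_TAG_RULE = re.compile(r"[A-Za-z0-9$.()&'-]+(?: +[A-Za-z0-9$.()&'-]+)*")


def can_assign(tag: str, existing_tags: set[str]) -> bool:
    """Old tags are grandfathered in; brand-new tags must follow the rules."""
    if tag in existing_tags:  # assumption: exact-match lookup
        return True
    return NEW_TAG_RULE.fullmatch(tag) is not None


if __name__ == "__main__":
    existing = {"M*A*S*H", "Google+"}
    print(can_assign("M*A*S*H", existing))  # already in the system
    print(can_assign("S&P 500", existing))  # new, follows the rules
    print(can_assign("Q&A?", existing))     # new, contains a question mark
```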

The next step will be to clean up old tags that run afoul of these guidelines (starting first with the ones that break URLs). This will be done in the database, aliasing variant characters and spellings over to the appropriate tags. This will make searching easier, so you won't have to guess whether to look for #ows or Occupy Wall Street or OccupyWallStreet or @occupy or the like.
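
As a rough sketch of the idea (not the actual database work), aliasing is a lookup from variant spellings to one preferred tag, and the cleanup order puts URL-breaking tags first. Everything named here is hypothetical: the `ALIASES` table, the choice of "Occupy Wall Street" as the preferred form, the case-insensitive lookup, and treating only %, ? and / as URL-breaking.

```python
# Hypothetical alias table: lowercased variant -> preferred tag.
ALIASES = {
    "#ows": "Occupy Wall Street",
    "occupywallstreet": "Occupy Wall Street",
    "@occupy": "Occupy Wall Street",
}

# Characters the rules describe as breaking URLs outright.
URL_BREAKING_CHARS = set("%?/")


def canonical_tag(tag: str, aliases: dict[str, str]) -> str:
    """Map a variant spelling to its preferred tag, or return it unchanged."""
    return aliases.get(tag.lower(), tag)


def breaks_urls(tag: str) -> bool:
    """True if the tag contains a character that breaks tag URLs."""
    return any(ch in URL_BREAKING_CHARS for ch in tag)


def cleanup_order(tags: list[str]) -> list[str]:
    """Put URL-breaking tags first, otherwise keeping the original order."""
    # sorted() is stable and False sorts before True.
    return sorted(tags, key=lambda t: not breaks_urls(t))


if __name__ == "__main__":
    print(canonical_tag("#OWS", ALIASES))
    print(cleanup_order(["M*A*S*H", "50% off", "Google+", "yes/no"]))
```

With a table like this behind search, looking for any of the variants lands on the same tag.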

Once that's all tidied up, we'll be building some new and better tools to manage and edit tags. You may or may not have noticed that tag pages have a place for a description, a picture, and other metadata. Tags that are marked as people can have a last-name-first form. The tool will make it easier to mark one tag as a variation on another, so that "healthcare" always becomes "health care." And it will be easy to see what new tags are being created and used.
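
To make the metadata concrete, here is one way a tag record could be modeled. This is an illustrative sketch only, not the DK4 data structure: the `TagRecord` class and its field names are invented for the example, and the last-name-first logic naively treats the final word of a name as the last name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TagRecord:
    """Hypothetical tag record; field names are assumptions for illustration."""
    name: str
    description: str = ""
    picture_url: Optional[str] = None
    is_person: bool = False
    variation_of: Optional[str] = None  # e.g. "healthcare" -> "health care"

    def sort_name(self) -> str:
        """Last-name-first form for people; the plain name otherwise."""
        if not self.is_person or " " not in self.name:
            return self.name
        # Naive assumption: the final word is the last name.
        first, last = self.name.rsplit(" ", 1)
        return f"{last}, {first}"

    def resolved_name(self) -> str:
        """The tag this one stands for, if it is marked as a variation."""
        return self.variation_of or self.name


if __name__ == "__main__":
    print(TagRecord("Barack Obama", is_person=True).sort_name())
    print(TagRecord("healthcare", variation_of="health care").resolved_name())
```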

I plan to recruit and deputize a group to help with this, somewhat akin to the current Rescue Rangers. Tag Librarians once existed and helped manage tags with a dreadful paucity of tools - this next iteration should have a clearer mission and a much better interface to work with. (The original Tag Librarians were very helpful in seeding the current tag data and in helping me to design the DK4 tag data structure and underlying functionality, no small effort.) When I'm ready for help there, I'll put out the call. Maybe I'll even let you replace the picture of my cat gracing the Pooties tag with one of yours.

PS: DK Tech is Hiring


Thanks to the highly successful subscription drive, the DK technical team is hiring some additional Ruby on Rails developers. Check the new "Jobs" link in the footer below for details.

Originally posted to Tag Librarians on Thu Feb 09, 2012 at 10:42 PM PST.

Also republished by J Town, Cranky Users, and KosBusters!.
