Today's processing of tag data shows 49610 tags, of which almost 200 are new. We'll hit 50000 tags very soon now. Over 30000 tags are attached to only one diary, and there is an unknown number of orphaned tags. At the end of this diary I'll look at a few of the new tags and give a pointer to today's files in case anyone wants to look at the current tagset more closely.
First, though...
Discussion on my previous diary and on dKosopedia's dk{{DailyKos Tag Cleanup Project}} and dk{{Tag Editors Workspace}} pages leads to a pretty obvious conclusion. The unfettered ability to create tags just plain doesn't work. I want to suggest a new approach.
This suggestion requires a new group of users, separate from Trusted Users. Now, I firmly believe that one of the unique things about dKos is that for the most part the power belongs to the people. Anyone can post a diary, and it's the population that determines the worth of that diary, through the recommend button. The "worth" of users themselves is similarly determined, through comment recommendation. Everyone has an equal vote. But letting anyone create any tag, and as many tags as they like, while in keeping with the dKos spirit, is unworkable. Instead...
- There should exist a list of "Approved Tags". These are the only tags which you can attach to a diary without fail.
- Attached tags not on the "Approved Tags" list get put on a "Pending Tags" list. These "Pending Tags" can be approved in one of two ways:
  - Human intervention - there should exist a group of people, the "Tag Librarians", if you will. Any Tag Librarian will have the power to approve a tag. This group is not the same as the Trusted User group.
  - Programmatically - To handle rapidly breaking news -- let's say we're invaded by Venusians -- a pending tag can be approved automatically if a certain number of diarists start using it. Approval granted this way can be rescinded by the Tag Librarians.
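To make the proposal concrete, here is a minimal sketch of the Approved/Pending bookkeeping described above. Everything in it is hypothetical: the class and method names, the threshold of 5 diarists, and especially the rule that a rescinded tag stays pending until a Tag Librarian approves it by hand (the proposal doesn't say what happens after a rescind). It is an illustration, not a description of how dKos works.

```python
class TagRegistry:
    """Toy model of the Approved/Pending tag lists (all names are hypothetical)."""

    def __init__(self, approved=(), librarians=(), auto_threshold=5):
        self.approved = set(approved)
        self.librarians = set(librarians)
        self.auto_threshold = auto_threshold  # assumed value; the diary gives no number
        self.pending = {}            # pending tag -> set of diarists who have used it
        self.auto_approved = set()   # tags approved by usage count, not by a librarian
        self.rescinded = set()       # assumption: these never auto-approve again

    def attach(self, tag, diarist):
        """Record that a diarist used a tag; return 'approved' or 'pending'."""
        if tag in self.approved:
            return "approved"
        users = self.pending.setdefault(tag, set())
        users.add(diarist)  # a set, so each diarist counts only once
        if tag not in self.rescinded and len(users) >= self.auto_threshold:
            del self.pending[tag]
            self.approved.add(tag)
            self.auto_approved.add(tag)
            return "approved"
        return "pending"

    def _require_librarian(self, user):
        if user not in self.librarians:
            raise PermissionError(f"{user} is not a Tag Librarian")

    def approve(self, tag, librarian):
        """Human intervention: any Tag Librarian can approve a tag."""
        self._require_librarian(librarian)
        self.pending.pop(tag, None)
        self.rescinded.discard(tag)
        self.approved.add(tag)

    def rescind(self, tag, librarian):
        """Undo an automatic approval."""
        self._require_librarian(librarian)
        if tag not in self.auto_approved:
            raise ValueError(f"{tag!r} was not automatically approved")
        self.auto_approved.discard(tag)
        self.approved.discard(tag)
        self.rescinded.add(tag)
```

Counting distinct diarists rather than raw uses keeps one enthusiastic diarist from pushing a tag through alone.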
The tough part will be seamless integration of Approved Tags and diary creation. I can see some sort of dropdown list populated via an ajax query in response to typing in a desired tag. This list will have either
- suggested tags from the Approved Tags list which most closely match the desired tag (paging peeder and jotter), or
- a hierarchical menu system of Approved Tags. This will require some work by the Tag Librarians to discover a taxonomy.
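For the first option, the server side of that Ajax query could be as small as the sketch below. It uses `difflib.get_close_matches` from the Python standard library to rank Approved Tags by similarity to what was typed. The function name `suggest_tags`, the limit of 5 and the 0.6 cutoff are my assumptions, not anything dKos actually runs.

```python
import difflib


def suggest_tags(typed, approved_tags, limit=5, cutoff=0.6):
    """Return up to `limit` Approved Tags that most closely match `typed`.

    Matching is case-insensitive; the original spelling of each tag is returned.
    """
    by_lower = {tag.lower(): tag for tag in approved_tags}
    matches = difflib.get_close_matches(
        typed.lower(), list(by_lower), n=limit, cutoff=cutoff
    )
    return [by_lower[m] for m in matches]


# Hypothetical Approved Tags list, borrowed from tags already in use.
approved = ["anniversary", "addiction", "automobiles", "Chavez", "Hugo Chavez"]
print(suggest_tags("anniversery", approved))
print(suggest_tags("Cahvez", approved))
```

A real tag list of this size would probably want an index rather than a scan of every Approved Tag on each keystroke, but the idea is the same.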
This is a starting point.
What about the current mess of tags? Well, some cleanup is definitely required, but at this point I think it's futile until some restrictions, either the above or some other scheme, are placed on tag creation. After tag creation is restricted, then we'll talk. But as the title of this diary suggests, I see the deletion of many tags in our future.
One last thing. Some tag cleaners (I still like the moniker "The Dermatologists") have said that when they've cleaned tags, they get yelled at for restricting freedom and creativity. Well, with all respect, when it comes to tags, screw your freedom and creativity.
Ok, now let's look at some of yesterday's new tags. They pretty much confirm the symptomology. Some are new, possibly valid, tags. The rest are misspelled or malformed. Look at the first dozen or so. The new tag in each block is the first entry, marked with an asterisk. If I didn't find a good match, I've marked none under it.
D236 [1% Doctrine] (1) *
O516 [one percent doctrine] 7
P625 [1 Percent Doctrine] 1
T516 [The One Percent Doctrine] 1
[110th Congress] (1) *
none
[3rd Infantry Division] (1) *
none
[911smear] (1) *
none
[Aaron McGruder] (1) *
none
A152 [Abu Mazin] (1) *
A152 [Abu-Mazen] 1
[Acquifer] (1) *
N216 [N-aquifer] 1
[Addications] (1) *
A323 [Addict] 1
A323 [addiction] 50
A323 [addictions] 1
C253 [Cocaine Addiction] 1
D623 [drug addict] 1
D623 [drug addiction] 1
U232 [U.S. addiction.] 1
[Andrew Fastow] (1) *
F230 [fastow] 1
[anniversery] (1) *
A516 [anniversary] 19
[Appearences] (1) *
not only misspelled but confusing
A351 [Automobiles GM] (1) *
A300 [Auto] 1
A351 [Automobile] 3
A351 [automobiles] 34
A353 [Auto Industry] 17
A353 [auto-industry] 1
A353 [Automotive] 8
A353 [Automotive Industry] 7
A362 [auto workers] 1
A362 [autoworkers] 1
So, 3 potentially valid new tags, 2 that could be anything ("911smear" and "Appearences"), and some misspellings and such. These are all things that could be handled by my proposed scheme. (Is anyone else surprised that "Aaron McGruder" hasn't been tagged until now?)
Oh, and one last new one from today: "Cahvez".
C120 [Cahvez] (1)
C120 [Chavez] 65
H221 [Hugo Chavez] 134
This reveals 65 "Chavez" tags that should be "Hugo Chavez". That's a lot of diaries to clean up by hand.
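The letter-and-three-digit codes in front of the tags in these listings look like Soundex keys, which is why "Cahvez" and "Chavez" both land on C120. Here is a small sketch of standard American Soundex plus a grouping helper, assuming that is what the codes are. The function names are mine, and this is not necessarily the code that produced the listings.

```python
from collections import defaultdict

# Standard Soundex digit classes.
CODES = {}
for digit, letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(tag):
    """Return the 4-character Soundex key of a tag, ignoring non-letters."""
    letters = [c for c in tag.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w neither get a digit nor separate repeated digits
        code = CODES.get(c, "")  # vowels map to "" and so reset `prev`
        if code and code != prev:
            key += code
            if len(key) == 4:
                break
        prev = code
    return key.ljust(4, "0")


def group_by_soundex(tags):
    """Map each Soundex key to the list of tags that share it."""
    groups = defaultdict(list)
    for tag in tags:
        groups[soundex(tag)].append(tag)
    return dict(groups)


print(group_by_soundex(["Cahvez", "Chavez", "Hugo Chavez"]))
```

Note that Soundex always keeps the first letter, so it cannot by itself pair "Chavez" with "Hugo Chavez" or "Acquifer" with "N-aquifer". Matches like those presumably need some additional comparison.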
Another one last thing:
[va-1] (1)
[va-2] (1)
[va-3] (1)
[va-4] (1)
[va-5] (1)
[va-6] (1)
[va-7] (1)
[va-8] (1)
[va-9] (1)
Sigh.
Ok, enough. Today's run and, as always, the latest run can be found here. Bookmark it if you like.