View Diary: Election Race Diary Roundup (11/11 - Final 2006 Edition) (92 comments)

  •  Rectifying the count (0+ / 0-)

    Just waited a few days before asking about your final numbers.  How are you getting your count of 6827 diaries? I can't find that many from the 78 diaries of the Election Roundup series.  What am I missing?

    After adding in the latest diaries (since just before the election) and modifying my code to find at least some of the non-standardly formatted etag/diary pairs, I find 6818 etag-sid pairs that represent 6603 unique diaries; over 200 diaries occur more than once.

    I'm hoping to move this list of etag-sid pairs along to the tagging crew so that the etags assigned by the Roundup group can be added systematically.  In addition, I was wondering if you had or would recommend an additional tag indicating that a diary had been published in an election roundup.  If so, what would be your recommendation for that tag?

    It may be necessary to revisit a small number of diaries whose tag is not easily mined out using automated means.

    This is the final comparison of the Roundup and Kos DB tagging related to elections.

    Union of Roundup and KOSDB tags: 1011; diaries: 9124

    Roundup tags: 600, only in roundup: 296
    Roundup diaries: 6603, only in roundup: 2592

    KOSDB tags: 715, only in KOSDB: 411
    KOSDB diaries: 6532, only in KOSDB: 2519

    Tags found in both Roundup and KOSDB: 304
    Diaries found in both Roundup and KOSDB: 4013
    Tags found in one or the other but not both: 707
    Diaries found in one or the other but not both: 5111
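
    A minimal sketch of how counts like these could be produced with set operations, assuming each source is simply a collection of tag strings (or diary ids). The function name `compare` and the sample tags are made up for illustration; this is not the code actually used for the numbers above.

```python
def compare(roundup, kosdb):
    """Summarize the overlap between two collections of tags or diary ids."""
    a, b = set(roundup), set(kosdb)
    return {
        "roundup": len(a),
        "kosdb": len(b),
        "only_roundup": len(a - b),   # in Roundup but not in KOSDB
        "only_kosdb": len(b - a),     # in KOSDB but not in Roundup
        "both": len(a & b),           # intersection
        "not_both": len(a ^ b),       # symmetric difference
        "union": len(a | b),
    }

# Made-up tags, for illustration only.
roundup_tags = ["CA-11", "VA-Sen", "MT-Sen", "OH-Gov"]
kosdb_tags = ["VA-Sen", "MT-Sen", "NY-20"]

summary = compare(roundup_tags, kosdb_tags)
print(summary)
```

    The same function would work unchanged on diary ids, since it only relies on set membership.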

    # 38 days 'til the light starts to return

    by jotter on Mon Nov 13, 2006 at 07:29:37 PM PST

    [ Parent ]

    •  Wow, Jotter... you're way out of my depth here (1+ / 0-)
      Recommended by:
      Alma

      To answer the simpler questions, there were 81 diaries in the series total (including post-election); the number I came up with was basically a running tally I kept throughout the life of the series. My math is usually very good, but I couldn't swear that there wasn't an error somewhere along the line.

      Just curious, though - are you saying that over 200 diaries in our series appeared more than once? We were careful to try to remove duplicates within a day but never compared them between days, so I guess it's possible but I wouldn't have thought there would be nearly that many.

      At the risk of exposing my complete naivete, a lot of what you're writing above might as well be in a different language. I'd love to help but if I tried to answer your other questions I'd more than likely be sticking my foot in my mouth.

      If it's worth the time and you want to try to break it down for me I'd be happy to try to help.

      Make a difference today. Who better than you?

      by sidinny on Tue Nov 14, 2006 at 06:44:17 PM PST

      [ Parent ]

      •  thanks for the answers (1+ / 0-)
        Recommended by:
        Alma

        and sorry if I was being incomprehensible - been down in the mines too long I guess.

        So I'm three short on diaries - that's the first thing I need to fix up.

        After I find the 3 missing diaries I'll check in again.

        Thanks again!

        # 37 days 'til the light starts to return

        by jotter on Tue Nov 14, 2006 at 10:20:27 PM PST

        [ Parent ]

        •  I don't think you were being... (1+ / 0-)
          Recommended by:
          Alma

          ... incomprehensible at all. Well, not necessarily anyway. I just got such tunnel-vision on this thing that I got to pay attention to precious little elsewhere on the site during that period. So you've probably given perfectly understandable explanations that I've just missed.

          And yeah, the one thing I can say for sure is that it was 81 diaries. I started 8/22 and we missed one day thereafter until 11/11.

          Make a difference today. Who better than you?

          by sidinny on Wed Nov 15, 2006 at 04:51:44 AM PST

          [ Parent ]

          •  81 (1+ / 0-)
            Recommended by:
            Alma

            Thanks, it's always the last few that cause problems, and I needed to know they were there so I knew how hard to look.

            I have them all now - numbers to follow quickly.

            # 37 days 'til the light starts to return

            by jotter on Wed Nov 15, 2006 at 09:09:59 AM PST

            [ Parent ]

          •  once again (1+ / 0-)
            Recommended by:
            Alma

            From 81 Election Roundup diaries, I could recover 7511 links to Daily Kos stories/diaries.  

            Excluding any with the phrase "Election Race Roundup" or "Election Race Diary Rescue Roundup" in the title left 6873, which after removing duplicates left 6849 unique diaries, compared to your number of 6827.

            I don't think we can get closer without comparing final lists.  Probably a few extra non-roundup diaries were mentioned along the way that I counted and you didn't.  But at least I'm getting as many diaries as you, instead of fewer.

            There were 24 diaries mentioned more than once.  These were mentioned in 23 diaries. Two such diaries had 8 mentions of duplicates, 11 had two, and 10 had one.  
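
            A rough sketch of the filtering and de-duplication steps described above, assuming each recovered link is available as a (story id, title) pair. The function name `summarize_links`, the ids, and the titles are all hypothetical; the real link extraction is not shown here.

```python
from collections import Counter

# Titles containing either phrase are treated as roundup diaries and skipped.
ROUNDUP_PHRASES = ("Election Race Roundup", "Election Race Diary Rescue Roundup")

def summarize_links(links):
    """links: iterable of (story_id, title) pairs, one per mention."""
    kept = [sid for sid, title in links
            if not any(phrase in title for phrase in ROUNDUP_PHRASES)]
    counts = Counter(kept)                      # mentions per story id
    repeated = {sid: n for sid, n in counts.items() if n > 1}
    return len(kept), len(counts), repeated

# Made-up links, for illustration only.
links = [
    ("2006/9/1/100", "Made-up diary A"),
    ("2006/9/1/100", "Made-up diary A"),
    ("2006/9/2/200", "Election Race Roundup (9/2)"),
    ("2006/9/3/300", "Made-up diary B"),
]

mentions, unique, repeated = summarize_links(links)
print(mentions, unique, repeated)
```

            The difference between the first two values is the number of extra mentions removed by de-duplication.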

            Now the hard part.  The whole point of the exercise was to pull out the race assignments the Roundup team made to each diary, for use as tags.

            I can now parse out 6881 such assignments, involving 6782 diaries and 573 races.  That means I've missed assignments for 60-70 diaries.  I took a look at those, and it appears the misses are mostly due to formatting discrepancies.  Those caused my parsing to fail, but it also means the assignments can still be retrieved.  I haven't done that yet.

            Here are the numbers to date comparing roundup with database tagged election diaries.

            Roundup tags: 573, only in roundup: 264
            Roundup diaries: 6782, only in roundup: 2341

            KOSDB tags: 715, only in KOSDB: 406
            KOSDB diaries: 6532, only in KOSDB: 2091

            Tags found in both Roundup and KOSDB: 309
            Diaries found in both Roundup and KOSDB: 4441

            Tags found in one or the other but not both: 670
            Diaries found in one or the other but not both: 4432

            Union of Roundup and KOSDB tags: 979; diaries: 8873
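
            As a quick sanity check, the figures in this comparison can be tested against each other: each source's total should equal its "only" count plus the shared count, and the union should equal the two "only" counts plus the shared count. The helper name `consistent` is made up for this sketch; the numbers are the ones listed above.

```python
def consistent(total_a, only_a, total_b, only_b, both, not_both, union):
    """Return True if the overlap counts agree with each other."""
    return (
        only_a + both == total_a
        and only_b + both == total_b
        and only_a + only_b == not_both
        and only_a + only_b + both == union
    )

# Arguments: Roundup total, only in Roundup, KOSDB total, only in KOSDB,
# found in both, found in one but not both, union.
tags_ok = consistent(573, 264, 715, 406, 309, 670, 979)
diaries_ok = consistent(6782, 2341, 6532, 2091, 4441, 4432, 8873)
print(tags_ok, diaries_ok)
```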

            What I hope to do with this, once the final group are manually curated, is to hand off this list to a group of people interested in tagging, and make sure that all the diaries from the election roundup series get the tags you assigned as well as a tag indicating it was part of the election roundup series.

            Once that is done it will be possible to compare all diaries with a particular race tag to those that got rounded up.

            Do you see any problem with that?  

            # 36 days 'til the light starts to return

            by jotter on Wed Nov 15, 2006 at 01:21:01 PM PST

            [ Parent ]

            •  Okay, think I got the basics... (0+ / 0-)

              Looks like you're just adding another layer to the cross-referencing on these. I thought of doing something like that by adding our own tag designation to diaries in the early stages but it seemed way too labor-intensive. Makes sense, though, if you have the folks willing to do it.

              My thought is that adding something simple yet distinctive like 2006 (or '06) ERR (or ERDR - both have been used) would probably do the trick. And I certainly don't have a problem with anything; I think it's amazing that you're putting all the time into looking at this. I just wish we had been able to work with you on the front end. We probably could have organized things to make your life a lot easier.

              If there's anything else, just give a yell (or drop an email, it might be easier).

              Make a difference today. Who better than you?

              by sidinny on Thu Nov 16, 2006 at 11:18:24 AM PST

              [ Parent ]

              •  I don't have any "folks" (1+ / 0-)
                Recommended by:
                Alma

                but I'm hoping to facilitate an automated clean up.

                I can't find any existing ERR / ERDR tags in use - what am I doing wrong?  I tried 2006 or '06 before or after, with and without a space.

                You guys were too busy mining to pay attention to formatting - that's fine.  Besides it turns out it was really amazingly good as is.

                I'll let you know of any progress.

                # 35 days 'til the light starts to return

                by jotter on Thu Nov 16, 2006 at 02:01:21 PM PST

                [ Parent ]
