[Coco] sorting Delphi messages

Lothan lothan at newsguy.com
Tue Sep 15 00:52:35 EDT 2009


I think you're heading in the right direction. The only way to really 
preserve the threads is to load the whole batch of messages into either a 
database or a dictionary in memory. The database idea might work, but then 
it becomes a monstrous task to rebuild the threads back to a single parent, 
since the nesting level is potentially unlimited. Even though it's 
possible, it's likely not worth the effort unless you load all the Message 
IDs into something like a self-balancing tree (a red-black tree, for 
example).
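
To make the in-memory dictionary idea a bit more concrete, here's a rough 
sketch in Python (not the language your SortedList lives in, so treat it as 
runnable pseudocode). It assumes a dictionary mapping each message ID to 
its parent ID, with 0 meaning a starter post. The names parent_of, 
find_root and group_threads are made up for illustration, and the sample 
IDs are copied from the debug readout in the quoted message. Each message 
is walked up its parent chain until it reaches a starter or a parent that 
isn't in the archive, and the result is cached so a long chain is only 
walked once.

```python
# Message ID -> parent ID, 0 meaning "starter post" (assumed layout).
# Sample values are taken from the debug readout in the quoted message.
parent_of = {
    82350: 82331,  # parent is outside this slice of the archive
    82371: 0,
    82376: 82371,
    82378: 82376,
    82380: 82378,
    82381: 82371,
    82383: 82371,
    82384: 0,
    82386: 82384,
    82389: 82384,
}


def find_root(msg_id, parent_of, cache):
    """Return the thread starter that msg_id ultimately leads back to."""
    path = []        # messages visited on the way up
    on_path = set()  # same messages, for a cheap loop check
    current = msg_id
    while current not in cache:
        parent = parent_of[current]
        # A starter post, a missing parent, or a looping link all make
        # the current message the top of its thread.
        if parent == 0 or parent not in parent_of or parent in on_path:
            cache[current] = current
            break
        path.append(current)
        on_path.add(current)
        current = parent
    root = cache[current]
    for seen in path:  # remember the answer for everything we passed
        cache[seen] = root
    return root


def group_threads(parent_of):
    """Map each thread starter to a flat, ID-ordered list of its replies."""
    cache = {}
    threads = {}
    for msg_id in sorted(parent_of):
        root = find_root(msg_id, parent_of, cache)
        replies = threads.setdefault(root, [])
        if msg_id != root:
            replies.append(msg_id)
    return threads


for starter, replies in group_threads(parent_of).items():
    print(starter, replies)
```

With the sample above, 82350 ends up as its own thread starter because 
82331 isn't in the slice, and everything under 82371 comes out as one flat 
list of comments in ID order, with no nesting.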

If I remember those threads correctly (bearing in mind it's been something 
like 20 years since I've looked at them), most of the threads are 
relatively short, so it shouldn't be much of a problem to ignore the 
extreme cases.
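
If you'd rather measure how deep the threads actually go before deciding 
what counts as an extreme case, something along these lines would do it. 
This is only a sketch in Python, assuming a dictionary of message ID to 
parent ID with 0 meaning a starter post. reply_depth is a hypothetical 
helper name, and the IDs come from the debug readout in the quoted message.

```python
# Message ID -> parent ID, 0 meaning "starter post" (assumed layout).
parent_of = {
    82350: 82331,  # parent is outside this slice of the archive
    82371: 0,
    82376: 82371,
    82378: 82376,
    82380: 82378,
    82381: 82371,
}


def reply_depth(msg_id, parent_of):
    """Count the hops from msg_id up to the top of its thread."""
    depth = 0
    current = msg_id
    # Stop at a starter (parent 0), at a missing parent, or after as many
    # hops as there are messages, which would mean the links loop.
    while parent_of.get(current, 0) in parent_of and depth < len(parent_of):
        current = parent_of[current]
        depth += 1
    return depth


depths = {msg_id: reply_depth(msg_id, parent_of) for msg_id in parent_of}
deepest = max(depths, key=depths.get)
print(deepest, depths[deepest])
```

On this small sample the deepest message is 82380, three replies down from 
82371. Running the same thing over a whole month should show quickly 
whether the deep threads are rare enough to ignore.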

Let me know if you need any help with them and I'll do what I can.

--------------------------------------------------
From: "Roger Taylor" <operator at coco3.com>
Sent: Monday, September 14, 2009 11:56 PM
To: "CoCoList for Color Computer Enthusiasts" <coco at maltedmedia.com>
Subject: [Coco] sorting Delphi messages

>
> I'm at the point now where a bunch of messages have been loaded into a 
> SortedList which is a key-based dictionary/array that sorts automatically 
> while items are added.
> I'm tagging each message with its Delphi-generated ID (found in the 
> message header), and the message PARENT (optionally found in the subject 
> line of the message header).
>
> Both the ID and PARENT are stored in the list along with the body, 
> subject, date, etc.  I then enumerate through the list which returns all 
> the message IDs in ORDER from low to high.
> I can check whether the message is a POST (has no parent) or a REPLY 
> (contains a parent message ID).
>
> Here's where the work really comes.  I need a new sorted list that orders 
> the keys by Thread Starter, Thread Participants
>
>
> Original Post
>   Reply to original
>   Reply to original
>   Reply to some reply to original
>   Reply to reply #2 above
>   Reply to #4 above
> Original Post
>   same list of replies and replies to those replies (if any)
> Original Post
> ... etc.
>
> Any messages that eventually lead back to the original post ID just need 
> to be sorted that way in the new list so I can show them on the site to be 
> Comments of the Original Message, with no nesting used.  I feel like the 
> nesting levels may be too deep to make it worthwhile to recreate the 
> entire thread depth visually.
>
> Another issue is "the missing parent" syndrome since these archives are 
> taken out of a slice of time.  Any message that points to a nonexistent 
> parent should be re-marked as a Thread Starter and the whole sorting 
> process restarted.  I'm going to work on the missing parent thing tonight.
>
> Out of ~1800 messages just for December 1993 of Delphi, you can see that 
> no sorting can be done by hand.
> Does anyone have a formula that comes to mind for finding all messages 
> that ultimately lead back to the same parent?
>
> Here is just a small cut of the debug readout I'm using to get a better 
> idea of things.  The first message I have (by Delphi ID) for Dec 1993 is 
> of course, something from a thread but not the master of the thread.  Any 
> message shown to be "a reply to #####" where ##### is less than 82350 
> clearly leads to a missing message, so those have to be re-marked as 
> having a parent of 0, which makes each one a Thread Starter, just to give 
> the message a place in the WordPress blog scheme.
>
> The list below is sorted only by message ID, with no consideration for 
> threads.
>
> 82350: is a reply to 82331
> 82351: is a reply to 82309
> 82352: is a reply to 82342
> 82353: is a reply to 82345
> 82354: is a starter post
> 82355: is a reply to 82343
> 82356: is a reply to 82344
> 82357: is a reply to 82351
> 82358: is a reply to 82352
> 82359: is a reply to 82336
> 82360: is a reply to 82355
> 82361: is a reply to 82360
> 82362: is a starter post
> 82363: is a reply to 82359
> 82364: is a reply to 82326
> 82365: is a reply to 82306
> 82366: is a reply to 82306
> 82367: is a reply to 82362
> 82368: is a reply to 82363
> 82369: is a reply to 82315
> 82370: is a reply to 82342
> 82371: is a starter post
> 82372: is a reply to 82297
> 82373: is a starter post
> 82374: is a reply to 82310
> 82375: is a reply to 82312
> 82376: is a reply to 82371
> 82377: is a reply to 82340
> 82378: is a reply to 82376
> 82379: is a reply to 82342
> 82380: is a reply to 82378
> 82381: is a reply to 82371
> 82382: is a reply to 82375
> 82383: is a reply to 82371
> 82384: is a starter post
> 82385: is a reply to 82304
> 82386: is a reply to 82384
> 82387: is a starter post
> 82388: is a reply to 82382
> 82389: is a reply to 82384
> 82390: is a reply to 82373
> -- 
> ~ signature section for all e-mails-
> ~ While I have seen Many annoying taglines over the years, I've never 
> really complained, but here's mine:
> ~ Roger Taylor
> ~ http://www.americafedup.com
>
>
> --
> Coco mailing list
> Coco at maltedmedia.com
> http://five.pairlist.net/mailman/listinfo/coco
> 


