* A Filter to Speed up nmh Message Scans
From: Martin McCormick @ UTC (permalink / raw)
To: blinux-list
I just wrote a little C program that I use on large
message folders to speed up scanning large numbers of messages.
You have to set up a format file so that scan outputs only the
message number plus the subject. My filter ignores the message
number because it always changes, but if two or more messages in
a row have the same subject, you only hear the first scan line.
It silently skips all the rest of the lines with the same subject
and then wakes up when the subject changes.
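The skipping behaviour described above can be sketched roughly as follows. This is only an illustration, not the code in subjects.c: the names scan_state, subject_of, should_print and filter_scan, and the MAX_LINE limit, are made up for the example, and it assumes each input line is a message number followed by the subject. It compares subjects exactly as they appear.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE 1024   /* assumed upper bound on one scan line */

/* Remembers the subject of the previous line (hypothetical helper type). */
struct scan_state {
    int  have_prev;
    char prev[MAX_LINE];
};

/* Skip leading blanks, the message number, and the blanks after it. */
const char *subject_of(const char *line)
{
    while (isspace((unsigned char)*line))
        line++;
    while (isdigit((unsigned char)*line))
        line++;
    while (*line == ' ' || *line == '\t')
        line++;
    return line;
}

/* Return 1 if this line's subject differs from the previous line's
 * subject (and remember it), 0 if it is a repeat to be skipped. */
int should_print(struct scan_state *st, const char *line)
{
    char subj[MAX_LINE];

    /* snprintf truncates safely if the subject is too long */
    snprintf(subj, sizeof subj, "%s", subject_of(line));
    subj[strcspn(subj, "\r\n")] = '\0';     /* drop the line ending */

    if (st->have_prev && strcmp(subj, st->prev) == 0)
        return 0;

    strcpy(st->prev, subj);                 /* same size, always fits */
    st->have_prev = 1;
    return 1;
}

/* Copy in to out, dropping repeated subjects; returns lines written.
 * Lines longer than MAX_LINE - 1 are handled in pieces by fgets. */
int filter_scan(FILE *in, FILE *out)
{
    struct scan_state st = { 0, "" };
    char line[MAX_LINE];
    int printed = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        if (should_print(&st, line)) {
            fputs(line, out);
            printed++;
        }
    }
    return printed;
}
```

A main() that simply calls filter_scan(stdin, stdout) would turn this into a pipe filter for the output of scan.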
I have everything you need to make it run in a uuencoded
file that decodes to a file called subjects.tar.gz. When
you unpack that archive, you get a directory called subjects. In
there is a file called doc.txt and the source, called subjects.c.
In doc.txt, I tell you how to build it and what its limitations are.
The main thing that throws it off is when several people
post subjects that are essentially the same, but have
been re-spelled or otherwise reworked. My filter does a few
tricks to get around common variations, but it is not
sophisticated at all. All I do is force all words in the
subject to upper case and remove all whitespace and punctuation.
That still isn't enough, but anything more gets into the realm
of the very complex. This is quick and dirty.
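The upper-casing and stripping described above might look something like this. Again, this is a sketch under assumptions rather than the code from subjects.c: normalize_subject is a hypothetical name, and the caller is assumed to supply the output buffer. Two subjects are then treated as the same when their keys compare equal.

```c
#include <ctype.h>
#include <stddef.h>

/* Build a comparison key from a subject line: drop whitespace and
 * punctuation, fold everything else to upper case. Writes at most
 * key_size - 1 characters plus a terminating NUL and returns the
 * length of the key. */
size_t normalize_subject(const char *subject, char *key, size_t key_size)
{
    size_t n = 0;

    if (key_size == 0)
        return 0;

    for (; *subject != '\0' && n + 1 < key_size; subject++) {
        unsigned char c = (unsigned char)*subject;

        if (isspace(c) || ispunct(c))
            continue;                   /* removed from the key */
        key[n++] = (char)toupper(c);
    }
    key[n] = '\0';
    return n;
}
```

With this sketch, "Re: Hello, World!" and "re:hello world" both give the key REHELLOWORLD. A reply prefix such as "Re:" stays in the key as RE, so a reply still looks different from the original post.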
The uuencoded file is 60 lines long and I could post it
to the list, but that's not fair to those who don't care.
Martin McCormick WB5AGZ Stillwater, OK
OSU Center for Computing and Information Services Network Operations Group