Mail Processing Literate Program

John Hurst
`ajh@csse.monash.edu.au`

Version 1.4.8

20 Jun 2001

1. Abstract

This document describes files used for processing email.

1		Abstract
2		Maintenance History
3		Introduction
4		More Detailed Description
5		Description of The Environment
6		Program Requirements
	6.1	R1: uuencoded mail
	6.2	R2: mime encoded mail
7		Design Document
8		User Manual
9		Literate Data
10		Literate Definitions
11		Literate Code
	11.1	`.procmailrc`
	11.1.1	Null Filters
	11.1.2	Forwarding Filters
	11.2	XML Perl filtering
	11.3	R1: uuencoded mail
	11.3.1	The uudecoder filter `uudecoder`
	11.3.2	Manning the Microsoft Ramparts
	11.4	R2: mimencoded mail
	11.4.1	The mime filter `mimedecoder`
	11.5	Obsolete Components
12		Literate Build Scripts
13		Literate Tests
14		Bibliography
15		Glossary
16		Indices
	16.1	Files
	16.2	Chunk Names
	16.3	Identifiers

2. Maintenance History

09 Jul 1998	John Hurst	1.0	initial version
10 Jul 1998	John Hurst	1.1	develop literate program: requirement R1
17 Sep 1998	John Hurst	1.2	requirement R2; warning messages; revamped metamail processing
18 Sep 1998	John Hurst	1.2.1	fixed some bugs arising out of changes in version 1.2
21 Sep 1998	John Hurst	1.2.2	mimedecoder now handles upper and lower case triggers, and text versions of docs sent to self now identify the source file.
23 Sep 1998	John Hurst	1.2.3	fixed metamail path
25 Sep 1998	John Hurst	1.2.4	some literate program revisions, and fixed bug with file name matching in `mimedecoder`.
05 Oct 1998	John Hurst	1.2.5	add tests for file existence in uudecoder.
12 Oct 1998	John Hurst	1.2.6	include .dot as a word document trigger.
01 Dec 1998	John Hurst	1.2.7	expand documentation on the Microsoft issue.
29 Mar 1999	John Hurst	1.2.8	add rtf files to the list of damned, and revised warning message.
27 Apr 1999	John Hurst	1.2.9	turn auto-replies off at JohnR's request
27 Apr 1999	John Hurst	1.2.10	escape blanks within filenames
28 Aug 1999	John Hurst	1.3.0	revise for hawthorn
15 Dec 2000	John Hurst	1.4.0	revise for xlp; change warning
20 Dec 2000	John Hurst	1.4.0	revise copy circulation for ajh2, mh
08 Jan 2001	John Hurst	1.4.1	revise to reinclude attachment processing
18 Jan 2001	John Hurst	1.4.2	experimented with lock files on mime handling: discarded.
22 Feb 2001	John Hurst	1.4.3	add XML-Perl handling
28 Feb 2001	John Hurst	1.4.4	add mimedecode file name filtering
07 Mar 2001	John Hurst	1.4.5	add mimedecode file name filtering
29 Mar 2001	John Hurst	1.4.6	extract files to attachments directory
05 Apr 2001	John Hurst	1.4.7	forward .doc attachments to ajh2
19 Apr 2001	John Hurst	1.4.8	reinstate forwarding of ALL mail to ajh2 for trial of mozilla

3. Introduction

This document defines scripts and code that interface to the procmail program, for the purposes of managing mail.

4. More Detailed Description

This document started from a realization that I needed to keep changing the files I used to process incoming email, such as .procmailrc and the like. I found that I was continually rethinking much of what had been done before, so a literate program to handle it all seemed like a good idea. This has proved true in practice.

In handling electronic mail, a great deal of stuff arrives that could have some pre-processing performed upon it. For example, mail might be presorted into folders, mail from some addresses or containing certain patterns in its contents (such as "make money fast!") might be discarded, while other items might undergo some form of processing before being presented to the user.

There is a standard Unix tool to perform this, known as procmail. Unfortunately, its syntax is baroque, and it is not easy for novice users to get it right. Worse, wrong procmail scripts may discard mail never to be retrieved. What we describe here is a suite of programs to tackle this task, and document just what is happening so that the mail filtering is handled in as perspicuous a fashion as possible.

5. Description of The Environment

My work as Associate Dean (Teaching) brings a significant volume of email. Although not large in absolute terms, it does warrant some smart front-end tools to ease the task of reading and handling mail. The environment includes a university administration that is not computer literate, and consequently there is much inappropriate use of the technology. Keeping a lid on this is one of the aims of this suite of literate programs.

6. Program Requirements

6.1. R1: uuencoded mail

When an email arrives that contains a uuencoded document, the filter should perform the decode, and then forward mail stating the name of the file, and where the file has been placed.

It is also appropriate that if the document is a Word document, a polite message informing the sender of potential delays in accessing the document is returned.

6.2. R2: mime encoded mail

A similar scenario to uuencoded mail, as described above. The only tool I can find to perform mime decoding is metamail, which isn't very unix-friendly.

7. Design Document

CS: It should examine each aspect of the design in enough detail to convince the reader that you had reasons for your choices, and didn't do anything just because that was the first way that came into your head. I like to divide it into sections, with each section presenting one design point, discussing the pros and cons of different approaches to that point, and ending with the approach I decided to take, supported by explicit references to the requirements (e.g. "I chose this sort routine over that one because its performance scales better, and scalability to larger data sets is requirement 2.3.1"). If you find you have no justification for one or more points of the design, that should tell you something (something bad) about the program you've written. Pictures of your data structures and literature citations for any tricky algorithms can go in this section (with cross-references to the code that implements them). It's good if this section can stand somewhat on its own, so it can be used in design reviews, or distributed separately to people who need to understand the system but don't need to understand the code.

8. User Manual

There is not much to say from a user perspective. The programs are designed to be as transparent as possible.

To install this suite of programs, you will need to revise various paths and the like. The following table lists those things that will need changing in this document. They are distributed throughout the document to keep related components together, but this list allows us to cross-reference them. Isn't literate programming wonderful?

Most of these come from the section Literate Definitions.

<perl location 1>	absolute pathname for the perl interpreter
<logfile location 5>	absolute pathname for where logging messages are to be stored
<warning 6>	absolute pathname for the warning message which is sent to Word .doc senders. You may also want to change the file named "warning" as well (see "warning" 29).
<decoder binaries 14>	absolute pathname for procmail to find the binaries defined by this literate program suite. Usually your binaries directory, but it can be anywhere.

9. Literate Data

CS: The nice thing about a language-independent tool like noweb is that you can document anything with it, not just code. I like to have a section or chapter showing actual samples of input and output that the program is supposed to read and write. This gives me a good place to talk about range limits on data (and how we handle out-of-range conditions). Having a concrete example of typical data in mind helps the reader understand the code that processes the data when s/he gets to it. You put this stuff in "code" chunks rather than in tables or figures so you can extract it into files and use it in your tests (see below). (You may want to have only short samples here and put full-blown test data in an appendix, so as not to halt the momentum of the presentation before the reader gets to the code.)

10. Literate Definitions

In this section we define basic parameterizations of things. Anything which might reasonably be changed should go in this section.

Several programs use the perl system: define where to find the perl interpreter:

<perl location 1> =
<perl location for <$1 hawthorn 0> 4><>

Macro referenced in scraps <21>, <33>.

<perl location for indy03 2> =
/usr/monash/gnu/bin/perl<>

Macro never referenced.

<perl location for central 3> =
/usr/bin/perl<>

Macro never referenced.

<perl location for hawthorn 4> =
/usr/bin/perl<>

Macro referenced in scraps <1>, <1>.

We also record logging details on a logfile:

<logfile location 5> =
/home/ajh/Mail/wordlog<>

Macro referenced in scraps <21>, <33>.

And we have a polite warning message for Word users:

<warning 6> =
/home/ajh/literate/mail/warning<>

Macro referenced in scrap <28>.

11. Literate Code

11.1. `.procmailrc`

This is really the heart of this literate program: developing the .procmailrc file that drives procmail.

".procmailrc" 7 =
PATH=           /usr/bin:/usr/ucb:/bin:/usr/bin:<decoder binaries 14>
MAILDIR=        ${HOME}/Mail
DEFAULT=        ${MAILDIR}/inbox/.
LOGFILE=        ${MAILDIR}/procmail.log
VERBOSE=        on       # on logs diagnostics in LOGFILE
SHELL=          /bin/sh

#HOST=hawthorn
<>

File defined by scraps <7>, <8>, <9>, <10>, <13>.

We start with some variable definitions.

PATH: A list of directories to search for any executable encountered during procmail processing.
MAILDIR: The base directory for all mail files and folders.
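DEFAULT: The file/directory into which mail is placed by default. Defined here to be an mh folder inbox, by virtue of the trailing "." (a lovely little procmail "convention": would that the manual were less cryptic and arcane!)
LOGFILE: The file to which procmail writes its log.
VERBOSE: When on, diagnostics are also written to LOGFILE.
SHELL: The shell procmail uses to run commands.
HOST: Commented out at present.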

11.1.1. Null Filters

".procmailrc" 8 =
:1H
* ^Subject: *Cron *<ajh@junee> */home/ajh/bin/master-synch
/dev/null
<>

File defined by scraps <7>, <8>, <9>, <10>, <13>.

Discard all master-synch messages before proceeding.

11.1.2. Forwarding Filters

".procmailrc" 9 =
:1Bc
<procmail mimencode content pattern 31>
<procmail mimencode handling 32>
:0c
! ajh2
<>

File defined by scraps <7>, <8>, <9>, <10>, <13>.

Process any mime attachments. The 1 means there is one condition in the <procmail mimencode content pattern 31>, the B means to match the condition (egrep) over the body, and the c means to continue processing even if this matches.

The second part was originally an appended action (the A flag), signifying that mail matching the <procmail mimencode content pattern 31> is also forwarded to ajh2. As of version 1.4.8, this reverts to forwarding ALL messages, which is why the recipe shown carries no A flag.

".procmailrc" 10 =
:1Bc
<procmail xml-perl content pattern 17>
<procmail xml-perl handling 18>
<>

File defined by scraps <7>, <8>, <9>, <10>, <13>.

Process xml-perl mailing list.

The following is temporarily macroed out.

<old.procmailrc 11> =

:1Bc
<procmail uuencode content pattern 19>
<procmail uuencode handling 20>

:1Bc
<procmail pilot-link content pattern 15>
<procmail pilot-link handling 16>

<>

Macro never referenced.

"old.procmailrc" 12 =
:1H
^X-Submit-Course:.*
{
  ERRORS=/tmp/prm$$

  :0w              # process student assessment submissions
  |(/usr/bin/submiti > $ERRORS 2>&1)

  :0               # bounce unsuccessful submissions
  |/usr/bin/submitb $ERRORS

}
<>

".procmailrc" 13 =
#
# ---- Bounce anything from addresses listed in .killfile
:1
? fgrep -i -f $MAILDIR/.killfile
{
:0hfw # Note: with `h' flag, no -k option for formail
| (formail -rA"X-Loop: ajh@cs.monash.edu.au" ; cat $MAILDIR/.rejectuser)

:0
! -oi -t
}
<>

File defined by scraps <7>, <8>, <9>, <10>, <13>.

<decoder binaries 14> =
${HOME}/mail-handling/process<>

Macro referenced in scrap <7>.

<procmail pilot-link content pattern 15> =
^Subject: *\[Pilot-Unix\]<>

Macro referenced in scrap <11>.

<procmail pilot-link handling 16> =
/usr/lib/mh/rcvstore +MCs/Pilot<>

Macro referenced in scrap <11>.

11.2. XML Perl filtering

<procmail xml-perl content pattern 17> =
.*perl-xml@listserv<>

Macro referenced in scrap <10>.

<procmail xml-perl handling 18> =
Docs/XMLPerl<>

Macro referenced in scrap <10>.

11.3. R1: uuencoded mail

uuencoded mail is detected with a content pattern that recognizes lines of the form:

begin 777 filename.ext

where 777 are the permission bits.
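A minimal Perl sketch of recognizing such a line, assuming the conventional layout of the word begin, an octal mode of three or four digits, then the file name. The helper name parse_begin_line is hypothetical and is not one of the scraps of this program.

```perl
use strict;
use warnings;

# Hypothetical helper: split a uuencode header line into its octal
# permission mode and the encoded file name.  Returns an empty list
# when the line is not a "begin" line.
sub parse_begin_line {
    my ($line) = @_;
    return unless $line =~ /^begin ([0-7]{3,4}) (.+?)\s*$/;
    return ($1, $2);
}

for my $line ("begin 644 report.doc\n", "beginning of the end\n") {
    my ($mode, $name) = parse_begin_line($line);
    if (defined $name) {
        print "mode=$mode name=$name\n";
    } else {
        print "not a begin line\n";
    }
}
```

Requiring a space after begin and an octal mode keeps ordinary prose that happens to start with "begin" from being mistaken for an attachment.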

Here's the pattern for Microsoft Word documents, the only file types recognized at the moment.

<procmail uuencode content pattern 19> =
begin.*\.([dD][oO][cC]|rtf)<>

Macro referenced in scrap <11>.

<procmail uuencode handling 20> =
| uudecoder <>

Macro referenced in scrap <11>.

11.3.1. The uudecoder filter `uudecoder`

The filter for this task reads from standard input, and writes a file <uudecode mail file 22> containing a copy of the input. The encoded file name is extracted, and the mail file run through the decoder. The resultant decoded file is saved, and also passed through catdoc for remailing to me.

"uudecoder" 21 =
#!<perl location 1>
open(LOG,">><logfile location 5>") || die "cannot open <logfile location 5>";
$_=`/bin/date "+%Y %b %d %T"`; chop;
print LOG "$_ uudecoder started\n";
<uudecoder: read from standard input and store somewhere 23>
<uudecoder: pass constructed file through uudecoder 26>
<uudecoder: report appropriate information and log details 27>
$_=`/bin/date "+%Y %b %d %T"`; chop;
print LOG "$_ uudecoder ended\n\n";
close(LOG);
<>

<uudecode mail file 22> =
/home/ajh/Mail/ajhdoc<>

Macro referenced in scraps <23>, <26>.

<uudecoder: read from standard input and store somewhere 23> =
open(OFILE,"><uudecode mail file 22>");
while (<>) {
  print OFILE $_;
  <uudecoder: check and collect file name 24>
  <uudecoder: check and collect From field 25>
}
close OFILE;
<>

Macro referenced in scrap <21>.

Buried within the incoming mail is a line that triggered the execution of this script (see <procmail uuencode content pattern 19>), so we examine each line as it goes past to capture the file name details. The pattern is the word begin, starting in column 1, followed by a 3 octal digit permission attribute, then a file name.

A later version might use this recognition as a flag to turn on output to the OFILE, since lines to this point are ignored by uudecode, and might as well be discarded now.

We currently recognize only .doc files: maybe a later version will expand this.

<uudecoder: check and collect file name 24> =
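# A uuencode header line: "begin", a permission mode, then the file name.
if (/^begin [0-9]* (.*)$/) {
  $name=$1; $text=$name;
  # derive the text-version name: strip a trailing .doc (any case), add .txt
  if ($text=~s/\.[Dd][Oo][Cc]$//) {$text="$text.txt";}
  else {$text="NotADocFile";}
  print LOG "found file name=>$name<=, text version=>$text<=\n";
}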
<>

Macro referenced in scrap <23>.
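The derivation of the text-version name can be sketched on its own. This is only an illustration: text_name is a hypothetical helper, and it assumes that the extension should be matched with the dot escaped and the match anchored at the end of the name, so that a name such as mydocs.doc is not altered in the middle.

```perl
use strict;
use warnings;

# Hypothetical helper: map "name.doc" (any case) to "name.txt";
# anything else yields the marker string used by the filter.
sub text_name {
    my ($name) = @_;
    my $text = $name;
    return "NotADocFile" unless $text =~ s/\.doc$//i;
    return "$text.txt";
}

print text_name("mydocs.doc"), "\n";
print text_name("notes.rtf"), "\n";
```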

<uudecoder: check and collect From field 25> =
if (/^From: (.*)$/) {
  $from=$1;
  print LOG "msg received from $from\n";
}
<>

Macro referenced in scraps <23>, <39>.

When the mail file is decoded, the decoded file appears in the current directory with the file name $name. This value is extracted as we scan the standard input, in steps <uudecoder: read from standard input and store somewhere 23> and <uudecoder: check and collect file name 24>.

<uudecoder: pass constructed file through uudecoder 26> =
`uudecode <uudecode mail file 22>`;
<>

Macro referenced in scrap <21>.

<uudecoder: report appropriate information and log details 27> =
$fullpath="/home/ajh/attachments/$name";
if (-f $name) {
  `mv $name $fullpath`;
  print LOG "$name has been extracted to the attachments directory\n";
  `echo "This document extracted from $fullpath" >$text`;
  `echo >>$text`;
  `/home/ajh/mail-handling/process/word2x $fullpath`;
  $fullpathtxt=$fullpath; $fullpathtxt=~s/\.doc$/.txt/;
  `cat $fullpathtxt >>$text`;
  `formail -i "Reply-To: $from" <$text | mail ajh`;
  print LOG "mailed $text to ajh\n";
} else {
  print LOG "For some unknown reason, $name does not exist!\n";
}
<send sender a warning 28>
<>

Macro referenced in scrap <21>.

11.3.2. Manning the Microsoft Ramparts

I'm totally fed up with Microflabby software (or is it Microsoft flabware?). Annoy the user a la Lloyd.

<send sender a warning 28> =
$from =~ s/.*<//;
$from =~ s/>//;
#`mail $from <<warning 6>`;
print LOG "<warning 6> (potentially) sent to $from\n";
<>

Macro referenced in scraps <27>, <48>, <49>.

There is a growing (perhaps not rapidly enough!) community that views the increasing dominance of Microsoft as a serious threat to the "diversity of species" in the computing world. I am all for competition and survival of the fittest, but there does need to be an adequate gene pool (to carry the analogy perhaps as far as one might) to ensure that robust, adaptable, and accurate software remains available within the computing community.

Accordingly, the following warning message attempts to give an alternate perspective as to how to avoid propagating the Microsoft gene pool. See also my related documents on web pages.

As Jay Sekora (http://www.aq.org/~js/, js@aq.org) stated in the pilot-unix mailing list on Tue, 01 Dec 1998 23:58:16:

Actually, I think diversity is a pretty healthy thing in software projects as it is in genetic populations. Some of the projects may die out, some of them will interbreed and the best features will spread. Some of them may just hang on by a thread for a while, but then be resistant to some problem that kills off some of the others. Centralized planning is not a big strength of open-source software (the bazaar versus the cathedral), but I think that's a *good* thing.

"warning" 29 =
This is an automatically generated message.

Your mail to John Hurst contained an attachment, which I cannot easily
read on my desktop computer (it is a Linux machine).  The mail has
been automatically forwarded to my iBook, where I can run Microsoft
Office.

Please be aware that I do not read mail on that machine as regularly
as I do on my desktop machine, so if your mail is urgent, you are
invited to reply to this message with a plain text version of your
message.

--John Hurst
--  Associate Dean (Teaching), Faculty of Information Technology
--  Associate Professor, School of Computer Science and Software Engineering
--rm G23, Building 63
--Monash University, Clayton, VIC 3168                       ~ ~~~&#:
--ajh@cs.monash.edu.au  +61 3 990 55192          _..___  ---____@___H__
--(mob 0407 569 041) (fax +61 3 990 55146)       |_____[_|_________[__]_
--http://www.csse.monash.edu.au/~ajh              oo oo  oo O--O--O o=o


<>

For the Makefile, we need to change the permission bits on the constructed uudecoder file.

<uudecode installation 30> =
install-uudecoder: process.tangle
        chmod 755 uudecoder
        touch install-uudecoder
<>

Macro referenced in scrap <52>.

11.4. R2: mimencoded mail

<procmail mimencode content pattern 31> =
Content-[tT]ransfer-[eE]ncoding:.*[bB][aA][sS][eE]64<>

Macro referenced in scrap <9>.

<procmail mimencode handling 32> =
| mimedecoder <>

Macro referenced in scrap <9>.

11.4.1. The mime filter `mimedecoder`

"mimedecoder" 33 =
#!<perl location 1>
#$lockfile = "/home/ajh/Mail/mimedecode-lock";
#while ( -f $lockfile ) { sleep 5 };
#`touch $lockfile`;
open(LOG,">><logfile location 5>") || die "cannot open <logfile location 5>";
<mimedecoder: get date and time 34>
print LOG "\n$_ mime decoder started\n";
<mimedecoder: read from standard input and store somewhere 37>
<mimedecoder: pass constructed file through metamail 38>
<mimedecoder: report appropriate information and log details 40>
<mimedecoder: get date and time 34>
print LOG "$_ mime decoder ended\n";
close(LOG);
#`unlink $lockfile`;
<>

The mime filter is called by procmail when we recognize the pattern <procmail mimencode content pattern 31> in the incoming mail. It writes some logging messages, reads standard input (the incoming mail text), and saves that to an intermediate file. This is necessary, as we want to pass the mail through the metamail filter, which does most of the hard work for us. We then look at the output of metamail, and perform some processing based upon what we find there.

Because the script did not seem to be working properly, in a way that looked very asynchronous, I added a lock around the whole thing (now commented out; note that it has a race condition between the test for the lock file and its creation). I'm not convinced it is at all useful.
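For comparison, a lock without that race can be sketched with Perl's built-in flock. This is only a sketch under assumptions: it presumes advisory locking is acceptable on the mail host, the helper with_lock is hypothetical, and the lock file lives in a temporary directory here rather than at a fixed path.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

# Hypothetical lock file; the real filter would use a fixed path.
my $dir      = tempdir(CLEANUP => 1);
my $lockfile = "$dir/mimedecode-lock";

# Run $work while holding an exclusive advisory lock.  flock blocks
# until the lock is free, so testing and taking the lock is one step.
sub with_lock {
    my ($path, $work) = @_;
    open(my $fh, '>>', $path) or die "cannot open $path: $!";
    flock($fh, LOCK_EX)       or die "cannot lock $path: $!";
    my $result = $work->();
    close($fh);               # closing the handle releases the lock
    return $result;
}

print with_lock($lockfile, sub { "decoded under lock" }), "\n";
```

The lock file is never removed, so there is no window in which one process deletes it while another is about to create it.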

<mimedecoder: get date and time 34> =
$_=`/bin/date "+%Y %b %d %T"`; chop;
<>

Macro referenced in scrap <33>.

This stuff just computes the date and time for use in the log file.
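The same stamp can be produced without starting a /bin/date process. A sketch, assuming the standard POSIX module is available; log_stamp is a hypothetical name, and %H:%M:%S is spelled out in place of %T.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: same layout as `date "+%Y %b %d %T"`.
# With no argument it stamps the current time.
sub log_stamp {
    my ($time) = @_;
    $time = time unless defined $time;
    return strftime("%Y %b %d %H:%M:%S", localtime($time));
}

print log_stamp(), " mime decoder started\n";
```

No chop is needed, because strftime does not append a newline.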

<mimedecode mail file 35> =
/home/ajh/Mail/ajhmdoc<>

Macro referenced in scraps <37>, <38>.

Define where the mime document is saved.

<mimedecode list file 36> =
/tmp/metamail.lis<>

Macro referenced in scraps <38>, <40>.

Define where the temporary file used in processing the mime document is kept.

<mimedecoder: read from standard input and store somewhere 37> =
open(OFILE,"><mimedecode mail file 35>");
while (<>) {
  print OFILE $_;
  <mimedecoder: check and collect From field 39>
}
close OFILE;
<>

Macro referenced in scrap <33>.

<mimedecoder: pass constructed file through metamail 38> =
print LOG "recognized mime encoded word document\n";
unlink "<mimedecode list file 36>";
`/usr/bin/metamail -d -w <<mimedecode mail file 35> ><mimedecode list file 36>`;
<>

Macro referenced in scrap <33>.

Remove any previous file, then invoke metamail.

<mimedecoder: check and collect From field 39> =
<uudecoder: check and collect From field 25>
<>

Macro referenced in scrap <37>.

<mimedecoder: report appropriate information and log details 40> =
open(METAMAIL,"<mimedecode list file 36>");
$foundfile=0; $founddescr=0;
while (<METAMAIL>) { # </METAMAIL> XML canceller
  <mimedecoder: look at Content Description 41>
  <mimedecoder: recognize mime encoded documents 42>
  <mimedecoder: handle wrotefile line from metamail 43>
}
close METAMAIL; 
<>

Macro referenced in scrap <33>.

Sometimes a filename comes through in the Content-Description field. Extract it if there is one. I don't believe this is necessary though, since metamail will pick up the real filename as necessary. I've left this here just in case: it won't do any harm, since we will see a filename under the <mimedecoder: handle wrotefile line from metamail 43>.

<mimedecoder: look at Content Description 41> =
if (m#Content-Description: (.*)$#) {
  $filename=$1; 
  if ($filename) {
    <escape non-filename characters in {$filename} 44>
    $founddescr=1;
    print LOG "mimedecodes: filename=>$filename<=,description: $_";
  }
}
<>

Macro referenced in scrap <40>.

<mimedecoder: recognize mime encoded documents 42> =
$lower=lc($_);
if ($lower=~m#application/(msword|octet-stream)#) { # test the lower-cased copy for both types
  $foundfile=1;
  print LOG "mimedecodes: $_";
}
<>

Macro referenced in scrap <40>.

I had an && $foundfile appended to the following condition, but it was missing some relevant files, and I couldn't see its purpose, so I've taken it out. I'll probably have to put it back again later, hence this comment.

<mimedecoder: handle wrotefile line from metamail 43> =
if (/^Wrote file (.*)$/) {
  $tmp=$1; 
  <discard bad filenames and skip to next 46>
  <escape non-filename characters in {$tmp} 44>
  print LOG "mimedecodes Wrotefile: $_";
  if ($founddescr == 0) {
    $filename=$tmp; $filename =~ s#/tmp/##;
    print LOG "mimedecodes: revised filename=>$filename<=\n";
  }
  <check for bad filenames and filter 45>
  $fullpath="/home/ajh/attachments/$filename";
  `mv $tmp $fullpath`;
  print LOG "extracted file $fullpath from $tmp\n";
  $foundfile=0; $founddescr=0;
  if ($filename =~ /\.([Dd][Oo][CcTt])$/) {
    <mimedecoder: handle word document 48>
  }
  if ($filename =~ /\.([Rr][Tt][Ff])$/) {
    <mimedecoder: handle rtf document 49>
  }
}
<>

Macro referenced in scrap <40>.

<escape non-filename characters in <X#1> 44> =
<X#1 0> =~ s/ /\\ /g;
<>

Macro referenced in scraps <41>, <43>.

Other systems use characters in filenames (blanks, for instance) that the Unix shell treats specially, so we escape them as necessary. At present only blanks are escaped.
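Escaping can be sidestepped altogether when no shell is involved. A sketch, assuming the standard File::Copy module in place of a backquoted mv; the file name and the temporary directories are made up for the illustration.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $src = "$dir/annual report (final).doc";   # awkward for a shell
my $dst = "$dir/attachments";
mkdir $dst or die "cannot create $dst: $!";

# Create a stand-in for the file that metamail would have written.
open(my $fh, '>', $src) or die "cannot create $src: $!";
print $fh "dummy contents\n";
close($fh);

# move() passes the names straight to the system, so blanks and
# parentheses need no escaping.
move($src, "$dst/annual report (final).doc") or die "move failed: $!";
print "moved\n" if -f "$dst/annual report (final).doc";
```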

<check for bad filenames and filter 45> =
$ext="";
if ($filename=~s/(\.[^.]*)$//) {$ext=$1;} # remove (last) extension, if any
$_=$filename;
<munge very long file names 47>
$filename=$_.$ext; # replace extension
<>

Macro referenced in scrap <43>.

Two main things concerning filenames:

Some files we just want to discard. Match these filenames and exit the loop.
Some filenames are too long. Apply a munging algorithm and continue on.

<discard bad filenames and skip to next 46> =
if (/^Card for .*$/) {next};
if (/^mm\..*$/) {next};
<>

Macro referenced in scrap <43>.

<munge very long file names 47> =
s/ //g; # remove blanks
s/\\//g; # remove escapes
s/&//g; # remove ampersands
s/;//g; # remove semicolons
s/\(|\)//g; # remove parentheses
if (length($_.$ext) > 31) {
  my $oldname=$_;
  while (length($_.$ext) > 31) {
    if (!s/(a|e|i|o|u)([^aeiou]*)$/$2/) {last}; # remove vowels from end
  }
  while (length($_.$ext) > 31) {
    if (!s/[^aeiou]$//) {last}; # remove consonants from end
  }
  print LOG "Bad file name \"$oldname\" munged to \"$_\"\n";
}
<>

Macro referenced in scrap <45>.
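The munging steps can be restated as a function for experimenting with particular names. This is a sketch, not the scrap itself: munge_name and its parameters are hypothetical, and the sample name is invented.

```perl
use strict;
use warnings;

# Hypothetical restatement of the munging steps as a function.
# $stem is the name without extension, $ext the extension (with dot),
# $limit the maximum total length (31 in the scrap above).
sub munge_name {
    my ($stem, $ext, $limit) = @_;
    $stem =~ s/[ \\&;()]//g;                 # drop blanks, escapes, &, ; and parentheses
    while (length($stem . $ext) > $limit) {  # drop lower-case vowels, last one first
        last unless $stem =~ s/[aeiou]([^aeiou]*)$/$1/;
    }
    while (length($stem . $ext) > $limit) {  # then drop trailing characters
        last unless $stem =~ s/[^aeiou]$//;
    }
    return $stem . $ext;
}

print munge_name("Minutes of the Faculty Board meeting", ".doc", 31), "\n";
```

Vowels go first because a name stripped of its later vowels usually stays recognizable, whereas simple truncation would lose the final words entirely.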

We have sussed out that this is a word document, given that it has been mime-encoded as an 'application/msword' (strong evidence), or as an 'application/octet-stream' (weaker evidence, but then, what would you expect from brain damaged software anyway?).

Extract a filename for the text version (.txt extension in place of .doc), feed it through catdoc and mail the output back to me, so that I get a readable version. Send a dialectic message back to the sender, as well.

Note that at the moment I don't recognise .rtf files. I might have to change this in future.

<mimedecoder: handle word document 48> =
$text=$filename; $text=~s/\.[Dd][Oo][CcTt]$/.txt/;
`echo "This document extracted from $fullpath" >$text`;
`echo >>$text`;
`/home/ajh/mail-handling/process/word2x $fullpath`;
$fullpathtxt=$fullpath; $fullpathtxt=~s/\.doc$/.txt/;
print LOG "adding $fullpathtxt to mailing\n";
`cat $fullpathtxt >>$text`;
`formail -i "Reply-To: $from" <$text | mail ajh`;
print LOG "mailed $text to ajh\n";
<send sender a warning 28>
<>

Macro referenced in scrap <43>.

rtf files are neither here nor there. I can't use catdoc to handle them, so just send the dialectic message.

<mimedecoder: handle rtf document 49> =
<send sender a warning 28>
<>

Macro referenced in scrap <43>.

For the Makefile, we need to change the permission bits on the constructed mimedecoder file.

<mimedecoder installation 50> =
mimedecoder: process.tangle
        chmod 755 mimedecoder
install-mimedecoder: mimedecoder
        touch install-mimedecoder
<>

Macro referenced in scrap <52>.

11.5. Obsolete Components

"never-used" 51 =
<>

12. Literate Build Scripts

The makefile relies upon an additional file MakefileImplicit which defines standard nutweb build operations. default defines the name of the literate program (here process), while flags defines any parameters to the nutweb tangle and weave operations ("-1 ${HOSTNAME}" is assumed by MakefileImplicit).

"Makefile" 52 =
default=process
flags=
include ${HOME}/etc/MakeXMLLiterate
install: install-procmailrc install-uudecoder install-mimedecoder
install-procmailrc: process.tangle
        cp .procmailrc ${HOME}
        touch install-procmailrc
<uudecode installation 30>
<mimedecoder installation 50>
all:    install process.dvi
Makefile: ${default}.tangle
clean: 
        rm process.tangle
        rm install-procmailrc install-uudecoder install-mimedecoder
<>

13. Literate Tests

Here are some test files to use against the processing filter above. Note that in the following test files, @ signs have been converted to double ats, to avoid swallowing by the literate processor. The first test file, test-mail, checks the uudecoder part. It should generate a file availrsv.doc. (Code has been omitted from listing.)

14. Bibliography

CS: The bibliography should include references not only to books and journal articles (which I actually hardly ever need, unless I've implemented an especially tricky data structure or algorithm), but more importantly to internal memos, system design documents, software manuals, standards, file format specifications, and other documentation a programmer would find useful. This is not a place to exercise restraint -- anything that might be useful should be listed here, because it probably won't ever have been officially noted anywhere else. In many cases the only way your successor will even know that a certain helpful reference exists is if s/he sees it listed here.

15. Glossary

CS: Whether I include one or both, or a combined list, or break out abbreviations separately, depends on how many entries of each type I need and who I expect will be reading the document.

16. Indices

CS: I prefer to have two: one for "code stuff", such as the names of variables, subroutines, data types, etc., that a programmer would want, and a separate one for the "text stuff", which might be read by a non-coder who is skimming the documentation.

Three sets of indices can be created automatically by nutweb: an index of file names, an index of macro names, and an index of user-specified identifiers. An index entry includes the name of the entry, where it was defined, and where it was referenced.

16.1. Files

".procmailrc" <7>, <8>, <9>, <10>, <13>
"Makefile" <52>
"mime-test" <54>
"mimedecoder" <33>
"never-used" <51>
"old.procmailrc" <12>
"test-mail" <53>
"uudecoder" <21>
"warning" <29>

16.2. Chunk Names

<X ?> Referenced in scrap <44>.
<check for bad filenames and filter <45>> Referenced in scrap <43>.
<decoder binaries <14>> Referenced in scrap <7>.
<discard bad filenames and skip to next <46>> Referenced in scrap <43>.
<escape non-filename characters in #1 <44>> Referenced in scraps <41>, <43>.
<logfile location <5>> Referenced in scraps <21>, <33>.
<mimedecode list file <36>> Referenced in scraps <38>, <40>.
<mimedecode mail file <35>> Referenced in scraps <37>, <38>.
<mimedecoder installation <50>> Referenced in scrap <52>.
<mimedecoder: check and collect From field <39>> Referenced in scrap <37>.
<mimedecoder: get date and time <34>> Referenced in scrap <33>.
<mimedecoder: handle rtf document <49>> Referenced in scrap <43>.
<mimedecoder: handle word document <48>> Referenced in scrap <43>.
<mimedecoder: handle wrotefile line from metamail <43>> Referenced in scrap <40>.
<mimedecoder: look at Content Description <41>> Referenced in scrap <40>.
<mimedecoder: pass constructed file through metamail <38>> Referenced in scrap <33>.
<mimedecoder: read from standard input and store somewhere <37>> Referenced in scrap <33>.
<mimedecoder: recognize mime encoded documents <42>> Referenced in scrap <40>.
<mimedecoder: report appropriate information and log details <40>> Referenced in scrap <33>.
<munge very long file names <47>> Referenced in scrap <45>.
<old.procmailrc <11>> Not referenced.
<perl location for #1 <4>, <4>> Referenced in scrap <1>.
<perl location for central <3>> Not referenced.
<perl location for hawthorn <4>> Referenced in scraps <1>, <1>.
<perl location for indy03 <2>> Not referenced.
<perl location <1>> Referenced in scraps <21>, <33>.
<procmail mimencode content pattern <31>> Referenced in scrap <9>.
<procmail mimencode handling <32>> Referenced in scrap <9>.
<procmail pilot-link content pattern <15>> Referenced in scrap <11>.
<procmail pilot-link handling <16>> Referenced in scrap <11>.
<procmail uuencode content pattern <19>> Referenced in scrap <11>.
<procmail uuencode handling <20>> Referenced in scrap <11>.
<procmail xml-perl content pattern <17>> Referenced in scrap <10>.
<procmail xml-perl handling <18>> Referenced in scrap <10>.
<send sender a warning <28>> Referenced in scraps <27>, <48>, <49>.
<uudecode installation <30>> Referenced in scrap <52>.
<uudecode mail file <22>> Referenced in scraps <23>, <26>.
<uudecoder: check and collect From field <25>> Referenced in scraps <23>, <39>.
<uudecoder: check and collect file name <24>> Referenced in scrap <23>.
<uudecoder: pass constructed file through uudecoder <26>> Referenced in scrap <21>.
<uudecoder: read from standard input and store somewhere <23>> Referenced in scrap <21>.
<uudecoder: report appropriate information and log details <27>> Referenced in scrap <21>.
<warning <6>> Referenced in scrap <28>.

16.3. Identifiers

Knuth prints his index of identifiers in a two-column format. This requires modification of the TeX output routine, and significantly increases the size of the nutmacs.tex file. Therefore, it seems better to leave this up to the user.