Author
Jason R. Mastaler
 
© 2001-2002
  

TMDA Filter Specification

TMDA filter files are used to control mail both coming in to and going out of TMDA. For incoming, the filter controls how the message is disposed of. For outgoing, it controls how the message is tagged. The incoming filter file default is ~/.tmda/filters/incoming, which can be changed by setting FILTER_INCOMING in your tmdarc. The outgoing filter file default is ~/.tmda/filters/outgoing, which can be changed by setting FILTER_OUTGOING.

Format of Filter Files:

A filter file is composed of filters, blank lines and comments. Each filter in a filter file is expected to be a string containing three fields separated by whitespace. Each filter may be placed on a single line or may be spread over several lines, to increase readability. You must follow a few rules when formatting your filters.
  • Each filter must start at the beginning of a line.
  • Subsequent lines that are part of the same filter must be indented by one or more spaces or tabs.
  • A blank line (either empty or containing only whitespace) or a line beginning a new filter signifies the end of the previous filter.
  • A line containing only a comment does not end a filter. You may intersperse comment lines within your filters (see the examples).
  • Everything after # on a line is considered a comment and ignored.
  • Filters with invalid syntax are logged with a specific error message into the debug log and the mail is deferred. This is to allow you the chance to fix your filter before processing the mail.
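The line-grouping rules above can be sketched in Python as follows. This is only an illustration and not TMDA's actual parser: the name `split_filters` is hypothetical, and the sketch naively treats every # as the start of a comment, so it would mishandle a # inside a quoted expression.

```python
def split_filters(text):
    """Group the lines of a filter file into logical filters (illustrative only)."""
    filters = []     # one list of line fragments per filter
    current = None   # fragments of the filter being read, or None between filters
    for raw in text.splitlines():
        if raw.strip().startswith("#"):
            continue                          # comment-only line: does not end a filter
        line = raw.split("#", 1)[0].rstrip()  # drop any trailing comment
        if not line:
            current = None                    # blank line ends the previous filter
        elif line[0] in " \t":
            if current is None:
                raise ValueError("continuation line without a filter: %r" % raw)
            current.append(line.strip())      # indented line continues the filter
        else:
            current = [line]                  # a line starting in column 0 begins a filter
            filters.append(current)
    return [" ".join(parts) for parts in filters]
```

Each returned string is one complete filter with its continuation lines joined by single spaces, ready to be split into the three fields.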
The filter file is read sequentially from top to bottom, and the first match wins. The three fields in a filter are:

source match action

  • source: specifies the source of the match.

    Some sources take optional arguments. An argument begins with a dash (-). Some arguments take options. If an argument takes an option, the argument is followed immediately by an equals sign (=) and the option. No whitespace is allowed on either side of the equals sign.

    Single arguments look like this:

        -argument
    
    and arguments with options look like this:
        -argument=option
    
    Arguments are listed below. Square brackets ([]) indicate that the argument is optional. Words in chevrons (<>) should be replaced by the appropriate option, without the chevrons.
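As a rough sketch of the argument syntax just described, the following Python function peels the leading -argument and -argument=option tokens off an already-split filter. The function name `parse_source_args` is hypothetical, and the sketch assumes that anything starting with a dash directly after the source is an argument; it is not taken from TMDA's code.

```python
def parse_source_args(tokens):
    """Split the tokens that follow a source into (arguments, remaining tokens)."""
    args = {}
    rest = list(tokens)
    while rest and rest[0].startswith("-"):
        # "-attr=members" -> name "attr", option "members"; "-case" -> option None
        name, sep, option = rest.pop(0)[1:].partition("=")
        args[name] = option if sep else None
    return args, rest
```

For example, the tokens after `from-mailman` in `from-mailman -attr=members ~mailman/lists/viewnet-news ok` yield the arguments `{"attr": "members"}` and leave the match and action fields in the remaining list.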

    Sources can be one of:

    (for both incoming and outgoing):

    from (sender address)
    from-file [ -autocdb | -autodbm ]
    from-cdb
    from-dbm
    from-ezmlm
    from-mailman -attr=<attribute>
    to (recipient address)
    to-file [ -autocdb | -autodbm ]
    to-cdb
    to-dbm
    to-ezmlm
    to-mailman -attr=<attribute>
    
    (for incoming only):
    body [ -case ] (message body)
    body-file [ -case ]
    headers [ -case ] (message headers)
    headers-file [ -case ]
    size (message size)
    
    
  • match: should be an expression, or the full path to a textfile, CDB database, or DBM database containing more expressions if source was suffixed with `-file', `-cdb', or `-dbm'.

    The second field within a textfile, CDB or DBM is optional, but overrides action if present, e.g.,
    foo@mastaler.com
    bar@mastaler.com bounce
    
    
    In a CDB or DBM, the keys should be the e-mail addresses to match, and their corresponding values (or records) should be empty unless you want to override the action specified in the filter file. DBM support comes with your Python interpreter, but CDB support currently requires that you first install the python-cdb extension module.
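The override behaviour can be pictured with a small Python sketch. The names `parse_list` and `action_for` are hypothetical, only exact address matches are handled (no wildcards), and a plain dict stands in for the textfile, CDB or DBM lookup.

```python
def parse_list(text):
    """Map each address in a list to its override action, or None if it has none."""
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue                      # skip blank lines
        entries[fields[0]] = fields[1] if len(fields) > 1 else None
    return entries

def action_for(entries, address, filter_action):
    """Return the action for address, or None if the address is not listed."""
    if address not in entries:
        return None                       # no match: this filter does not apply
    # A second field in the list overrides the action from the filter file.
    return entries[address] or filter_action
```

With the two-line list shown above and a filter whose action is `ok`, foo@mastaler.com would get `ok` while bar@mastaler.com would get `bounce`.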

    The '-autocdb' and '-autodbm' arguments are intended to ease the use of CDB/DBM lists in TMDA by automatically rebuilding the CDB or DBM file as necessary. This gives you the performance advantages of hashed databases without the hassle of having to manually maintain them. Both 'from-file' and 'to-file' take the -autocdb/-autodbm flags. The match field of 'from-file' and 'to-file' is always the name of the textfile. You do not need to add '.cdb' or '.db' when you use the -auto* flags; the extension is appended to the filename automatically. With the -auto* arguments, TMDA will rebuild the database if it doesn't exist or if its timestamp is older than that of its source file. If the rebuild fails for some reason, TMDA will fall back to matching from the source file instead. Before you try the CDB version of this feature, make sure you have the python-cdb extension module installed. The printcdb/printdbm scripts in the contrib directory can be used to print the contents of CDB/DBM files in TMDA-list format.
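The rebuild condition described above (database missing, or older than its source file) might be checked along these lines. This is a minimal sketch using file modification times; the function name `needs_rebuild` is hypothetical and TMDA's own check may differ in detail.

```python
import os

def needs_rebuild(source_path, db_path):
    """Return True if the database is missing or older than its source textfile."""
    if not os.path.exists(db_path):
        return True
    return os.path.getmtime(db_path) < os.path.getmtime(source_path)
```

For a filter such as `from-file -autocdb ~/.tmda/lists/confirmed ok`, the source path would be the textfile and the database path the same name with '.cdb' appended.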

    The '-ezmlm' and '-mailman' sources can be used to match the subscribers of an ezmlm or Mailman mailing list.

    '-ezmlm' sources match addresses contained in ezmlm `subscribers' directories. The match field of an -ezmlm line should be the full path to the parent directory.

    '-mailman' sources match addresses contained in a Mailman configuration database. The match field should be the full path to the list directory. Both Mailman 2.0 and 2.1-style configuration databases are supported. The '-mailman' sources require you to specify an `attribute' to search. Use the '-attr=' flag to specify the name of an attribute contained in the database. For example, `members' (subscriber addresses), `digest_members' (digest subscriber addresses), or `owner' (list owner's address).

    The following table shows what the match expression should contain for a given source:

    source:		match:
    -------		------
    to*		recipient e-mail address or wildcard expression.
    from*		sender e-mail address or wildcard expression.
    body*		regular expression matching message body content.
    headers*	regular expression matching message header content.
    size		comparison operator and number of bytes to compare to the
    		size of the message.  Only `<' and `>' are supported.
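A size expression such as `<10000` can be evaluated with a few lines of Python. The sketch below is illustrative only; `size_matches` is a hypothetical name and the message size is assumed to be given in bytes.

```python
def size_matches(expression, message_size):
    """Evaluate a size match such as '<10000' against a message size in bytes."""
    operator, number = expression[0], int(expression[1:])
    if operator == "<":
        return message_size < number
    if operator == ">":
        return message_size > number
    raise ValueError("only '<' and '>' are supported: %r" % expression)
```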
    
    
    NOTE: To match the empty envelope sender (the sender that bounce messages are sent with), use <> as the expression.

    In addition to explicit e-mail addresses, you can use expressions based on UNIX shell-style wildcard characters in either the match field of a line, or within the textfile in the match field. Wildcard characters are not recognized in a CDB or DBM file. The special characters are:
    
    Character(s)     Description
    -------------    -----------
    *                Matches everything.
    ?                Matches any single character.
    [seq]            Matches any character in seq.
    [!seq]           Matches any character not in seq.
    
    
    In addition, `@=' (a custom rule) will expand to match both `@' and `@*.', so that *@=domain.dom matches addresses at domain.dom as well as at any of its subdomains.

    Here are some common examples:
       
    # match only jdoe@domain.dom
    jdoe@domain.dom
    # match anyone@domain.dom, but not anyone@sub.domain.dom
    *@domain.dom 
    # match anyone@sub.domain.dom, but not anyone@domain.dom
    *@*.domain.dom
    # match both anyone@domain.dom, and anyone@sub.domain.dom
    *@=domain.dom   
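The four wildcard characters in the table are the same ones understood by Python's fnmatch module, so the examples above can be reproduced with a short sketch. The helper names are hypothetical, and lowercasing both sides to get a case-insensitive comparison is an assumption, not something stated in this document.

```python
from fnmatch import fnmatchcase

def expand_pattern(pattern):
    """Expand the custom `@=' rule into its `@' and `@*.' variants."""
    if "@=" in pattern:
        return [pattern.replace("@=", "@"), pattern.replace("@=", "@*.")]
    return [pattern]

def address_matches(pattern, address):
    """Return True if address matches the shell-style wildcard pattern."""
    address = address.lower()   # assumption: addresses compare case-insensitively
    return any(fnmatchcase(address, p.lower()) for p in expand_pattern(pattern))
```

Here `address_matches("*@=domain.dom", ...)` accepts both anyone@domain.dom and anyone@sub.domain.dom, while `*@domain.dom` accepts only the former.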
    
    
    The body* and headers* sources, on the other hand, take regular expressions as defined in Python's re module. Because regular expressions may include spaces, you must surround the regular expressions with quotation marks. You may use either single quotes (') or double quotes (") as long as you use the same one at both the beginning and the end.

    The regular expression sources use case-insensitive matching by default. If you want a case-sensitive match, use the '-case' flag.

    If you need to match a quote in your regular expression, simply use the other style of quotes to surround the expression or escape the embedded quote with a backslash (\).
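The effect of the '-case' flag can be illustrated with Python's re module. This is a sketch only: `content_matches` is a hypothetical name, and the use of re.search with no flags other than re.IGNORECASE is an assumption about how the expression is applied.

```python
import re

def content_matches(expression, text, case=False):
    """Search text for expression; case-insensitive unless case is True."""
    flags = 0 if case else re.IGNORECASE
    return re.search(expression, text, flags) is not None
```

With the default, `viagra|ginseng` matches "GINSENG"; with `case=True` (the '-case' flag), `MAKE MONEY FAST` no longer matches "make money fast".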

  • action: specifies what action to take on the message. An optional equals sign (=) separates the action from the action's option. Possible values differ based on whether the message is incoming or outgoing.

    (for incoming, action can be one of):
    bounce,reject (bounce the message)
    drop,exit,stop (silently drop the message)
    ok,accept,deliver (deliver the message)
    ok,accept,deliver=instruction (deliver to a Maildir, mbox, program, or address)
    confirm (request confirmation for the message)
    
    
    The instruction option to ok,accept,deliver can be used to short-circuit the default delivery method by delivering the message to a specific location. Delivery to qmail-style Maildirs, mboxrd-format mboxes, programs (pipe), and different e-mail addresses are supported.

    instruction:  program (pipe)
    format:       A program instruction begins with a vertical bar. The rest of the line will be passed to /bin/sh. Whitespace and shell variables (e.g., $HOME and ~) are allowed.
    examples:     deliver=|/usr/ucb/vacation jason
                  deliver=| /usr/bin/maildrop $HOME/.mailfilter

    instruction:  forward
    format:       A forward instruction begins with an ampersand. If the address begins with a letter or number, you may leave out the ampersand.
    examples:     deliver=&johndoe@new.job.com
                  deliver=janedoe@new.isp.net

    instruction:  mbox
    format:       An mbox instruction begins with a slash or tilde, and does not end with a slash. Please note the restrictions below.
    examples:     deliver=/home/jason/Mailbox
                  deliver=~/Mailbox

    instruction:  maildir
    format:       A maildir instruction begins with a slash or tilde and ends with a slash. Please note the restrictions below.
    examples:     deliver=/home/jason/Maildir/
                  deliver=~/Maildir/
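The four instruction formats can be told apart by their first and last characters. The following Python sketch shows one way to do that; `classify_instruction` is a hypothetical name and the code is not TMDA's own.

```python
def classify_instruction(instruction):
    """Classify the option of ok/accept/deliver as program, forward, mbox or maildir."""
    instruction = instruction.strip()
    if not instruction:
        raise ValueError("empty instruction")
    first = instruction[0]
    if first == "|":
        return "program"                  # rest of the line goes to /bin/sh
    if first == "&" or first.isalnum():
        return "forward"                  # the ampersand is optional before a letter or number
    if first in "/~":
        # A trailing slash distinguishes a Maildir from an mbox file.
        return "maildir" if instruction.endswith("/") else "mbox"
    raise ValueError("unrecognized instruction: %r" % instruction)
```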

    Please note the following restrictions to mbox and Maildir delivery:
    • TMDA will not create Maildirs or mbox files if they do not exist. You must create them prior to having TMDA deliver mail to them with commands like maildirmake and touch.

    • TMDA requires write access to any Maildir or mbox file you wish to deliver mail to.

    • TMDA will not deliver to an mbox symlink. Specify the path to the actual mbox file instead.

    • Do not deliver to mbox files located on an NFS filesystem. This is unsafe and can corrupt your mbox file -- this applies to all MDAs, not just TMDA. Use Maildir instead as it is immune to such problems.
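Before pointing a deliver= instruction at an mbox or Maildir, you could check the first three restrictions yourself with a short script like the one below. It is a sketch with a hypothetical function name; it assumes the path has already had any leading ~ expanded, it does not detect NFS filesystems, and it does not verify the internal layout of a Maildir.

```python
import os

def delivery_problems(path):
    """Return a list of reasons why delivery to path would fail (empty if none found)."""
    problems = []
    if path.endswith("/"):
        # Trailing slash: a maildir instruction.
        if not os.path.isdir(path):
            problems.append("Maildir does not exist")
    elif os.path.islink(path):
        problems.append("mbox is a symlink")
    elif not os.path.isfile(path):
        problems.append("mbox file does not exist")
    if os.path.exists(path) and not os.access(path, os.W_OK):
        problems.append("no write access")
    return problems
```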

    (for outgoing, action can be one of):
    bare (don't tag)
    bare=append (don't tag, and also add recipient to your BARE_APPEND file)
    sender (tag with a sender address based on recipient)
    sender=address (tag with a sender address based on address instead)
    dated (tag with a dated address)
    dated=timeout_interval (tag with a dated address using a non-default timeout interval)
    exp,explicit,as=full_address (use an explicit address)
    ext,extension=address_extension (add an extension to the address)
    kw,keyword=keyword (tag with a keyword address)
    default (take the default action specified by ACTION_OUTGOING)
    tag (tag one or more specific headers with the above actions)
    
    
    For all of the outgoing actions except tag, the tag applies to both the From: (or Resent-From:) header and the envelope sender. Both will be set to the same address.

    The tag action is a little more flexible, in that it can be used to tag more than one header and each header can be tagged with a different address. The tag syntax is as follows:

    to* <email_address> tag <header action> [header action] ...
    Any of the to* sources may be used. The email_address is explained in the match section above. The email address is followed by the keyword tag which is followed by a list of header/action pairs.

    The header is the name of the RFC822 header field you want to tag, usually From: or Reply-To:. When you specify the header name, do not include the :. Use envelope to tag the envelope sender.

    The action is any one of the outgoing actions from the list above (except tag, of course). That action will be used to tag the specified header only. If the specified action is not one from the list above, the action is inserted verbatim as a string. This can be used to add arbitrary headers to your outgoing messages based on destination. Quotes are required for text that includes whitespace and are unnecessary for single-word strings.

    Why would you want to do this? Some mailing lists restrict posting to subscribers only. The way they accomplish this is by comparing the email address of the sender to their subscriber list. Some lists compare the envelope sender while others compare the From: header. If the particular list you are subscribed to compares against the From: header, that means your From: header must contain a valid email address, which can be harvested by spammers.

    The solution to this problem is to subscribe with a sender address and use that same address in your From: header. Then only the list can successfully send mail to you using that address. See the Client Configuration page for more information about sender addresses.

    The final issue now is that legitimate list members can't respond to you, since the sender address works only for the list itself. To get around that problem, set the Reply-To: header to a dated address. Then list members will be able to reply for a period of time but by the time the spammers harvest that address, it will have expired.

    Here's how to set all that up in your outgoing filter:

    to closed-list@lists.com tag
       from      sender=closed-list-admin@lists.com
       envelope  sender=closed-list-admin@lists.com
       reply-to  dated

    As you can see, the ability to place a filter on several lines helps with the readability of these longer tag actions.

    There is one last wrinkle. A common case is where you wish to tag the From: header and the envelope sender with the same address but tag Reply-To: differently. If you specify the From: header but not the envelope sender, TMDA will use the address in the From: header as the envelope sender.

    Thus, the above example can be shortened to this:

    to closed-list@lists.com tag
       from      sender=closed-list-admin@lists.com
       reply-to  dated

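The header/action pairs of a tag action, and the rule that an unspecified envelope sender follows the From: header, can be sketched in Python as follows. The function name `parse_tag_pairs` is hypothetical and shlex.split is used here only as a convenient stand-in for the quoting rules; TMDA's own parsing may differ.

```python
import shlex

def parse_tag_pairs(text):
    """Parse the header/action pairs that follow the `tag' keyword."""
    tokens = shlex.split(text)            # keeps quoted text such as "Disney Land" together
    if len(tokens) % 2:
        raise ValueError("each header needs an action: %r" % text)
    pairs = {}
    for header, action in zip(tokens[0::2], tokens[1::2]):
        pairs[header.lower()] = action
    # If From: is tagged but the envelope sender is not, the envelope follows From:.
    if "from" in pairs and "envelope" not in pairs:
        pairs["envelope"] = pairs["from"]
    return pairs
```

Applied to the shortened example, the result contains the same `sender=closed-list-admin@lists.com` action for both `from` and `envelope`, plus `dated` for `reply-to`.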
    Example Incoming Filter:
    
    ### ~/.tmda/filters/incoming (first match wins) ###
    
    # Accept all bounces (messages with an empty envelope sender)
    from <> ok
    
    # Accept all messages to postmistress
    to postmistress@* accept
    
    # Bounce all messages from badboy.dom
    from *@=badboy.dom bounce
    
    # Accept all messages from mycorp.dom
    from *@=mycorp.dom ok
    
    # Include my blacklist and whitelist
    from-dbm ~/.tmda/lists/blacklist.db drop
    from-cdb ~/.tmda/lists/whitelist.cdb accept
    from-file -autodbm ~/.tmda/lists/nastygrams bounce
    from-file -autocdb ~/.tmda/lists/confirmed ok
    from-file ~/.tmda/lists/whitelist_wildcards accept
    
    # Mailman mailing list subscribers and digest subscribers
    from-mailman -attr=members ~mailman/lists/viewnet-news ok
    from-mailman -attr=digest_members ~mailman/lists/viewnet-news ok
    
    # ezmlm mailing list subscribers and digest subscribers
    from-ezmlm ~alias/all-acl-users ok
    from-ezmlm ~alias/all-acl-users/digest ok
    
    # Revoked addresses
    to jason-stupid-promo.289076@mastaler.com bounce
    to jason-jcrew.832234@mastaler.com confirm
    
    # Examine the message content
    body "viagra|ginseng" confirm
    headers 'Precedence:.*junk' reject
    headers -case 'MAKE MONEY FAST' drop
    
    # Accept all messages smaller than 10KB, but drop messages larger than 1MB
    size <10000 deliver
    size >1000000 exit
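To see how "first match wins" plays out, here is a minimal Python sketch that evaluates only the plain from and to sources against a list of (source, match, action) tuples. The function name and the tuple representation are hypothetical, case-insensitive comparison is an assumption, and returning None when nothing matches simply leaves the fallback to the caller.

```python
from fnmatch import fnmatchcase

def first_match(rules, sender, recipient):
    """Return the action of the first rule that matches, or None if none does."""
    for source, pattern, action in rules:
        if source not in ("from", "to"):
            raise ValueError("this sketch only handles 'from' and 'to': %r" % source)
        address = sender if source == "from" else recipient
        patterns = [pattern]
        if "@=" in pattern:
            # `@=' expands to match both `@' and `@*.'
            patterns = [pattern.replace("@=", "@"), pattern.replace("@=", "@*.")]
        if any(fnmatchcase(address.lower(), p.lower()) for p in patterns):
            return action                 # stop at the first matching filter
    return None
```

Because evaluation stops at the first hit, a message from badboy.dom addressed to postmistress is accepted by the earlier `to postmistress@* accept` filter and never reaches the later bounce filter.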
    
    
    Example Outgoing Filter:
    
    ### ~/.tmda/filters/outgoing (first match wins) ###
    
    # All mail from postmaster is sent bare
    from postmaster@* bare
    
    # And so is all mail to the tmda-users mailing list owner
    to-mailman -attr=owner ~mailman/lists/tmda-users bare
    
    # All whitelisted contacts receive untagged messages
    to-cdb ~/.tmda/lists/whitelist.cdb bare
    to-file ~/.tmda/lists/whitelist_wildcards bare
    
    # Keyword Addresses
    to *@myisp.net kw=myisp
    to king@grassland.com keyword=elvis_parsley
    
    # Dated addresses (some with a non-default timeout interval)
    to bobby@peru.com dated
    to-dbm /var/dbm/slowpokes.db dated=6M
    
    # Majordomo and Mailman check the From: header for membership
    to mutt-users@mutt.org tag
       from      sender=owner-mutt-users@mutt.org
       reply-to  dated
    
    # ezmlm checks the envelope sender address for membership
    to tmda-admin@samstech.net tag
       # Set envelope sender to the extension address I subscribed with
       # and From: to 'default'.  TMDA comes with the default action
       # (ACTION_OUTGOING) set to 'dated'.  Reply-To: isn't necessary in
       # this case.
       envelope  extension=mlists-tmda-admin
       from      default
    
    # Add some arbitrary headers.
    to foo@bar.com tag organization "Whatwerks.com, Inc."
    
    to foo@bar.com tag
       from             sender
       reply-to         dated
       organization     "Disney Land"
       x-favorite-dwarf Hungry
    
    # Use a different username and/or domain
    to *@gnus.org exp=jason@gnus.org
    to xemacs-binary-kits* explicit=binkit-manager@XEmacs.ORG
    to *mail*@=xemacs.org as=postmaster@XEmacs.ORG
    to *@=xemacs.org as=jasonrm@xemacs.org