mail2sms - README

                               _ _ ____                    
               _ __ ___   __ _(_) |___ \ ___ _ __ ___  ___ 
              | '_ ` _ \ / _` | | | __) / __| '_ ` _ \/ __|
              | | | | | | (_| | | |/ __/\__ \ | | | | \__ \
              |_| |_| |_|\__,_|_|_|_____|___/_| |_| |_|___/

Project: mail2sms
Version: 1.3.x
Date:    February 12, 2001
Author:  Daniel Stenberg <>


  mail2sms reads a (MIME) mail and converts it to a short message. It offers
search and replace, conditional rules, conditional search and replace, etc., to
create a custom output. It can optionally pipe its output into a specified
program.

  mail2sms is entirely FREE. See the LEGAL file for details.

			      Table Of Contents

	1. Usage

	2. Config File Format
	  In General
	  2.1 General Keywords
	  2.2 Search/Replace, Conditional Search/Replace
	  2.3 Conditional Config File Sections
	  2.4 Stop/Allow Message Forwarding
	  2.5 Conditional Actions and Variables
	  2.6 Logfile and Include

	3. mail2sms internals

				   1. Usage

	mail2sms [options] < mail

 mail2sms reads the config file /usr/local/mail2sms/config first and then
 $HOME/.mail2sms by default.

 Available options:

 -c [file]  specifies what config file to read. It can be used repeatedly.
 -d         switch on debug messages in the log file
 -I [dir]   adds a directory to the include path.
 -l [file]  log everything to the specified file; this overrides 'logfile'
            entries in the config file
 -n         prevents reading the default config files
 -o         makes mail2sms write the sms message to stdout when completed (and
            not invoke any sub-command).
 -p         sets the $phone variable (see the run command)
 -q         shuts off all logging
 -v         prints version number and quits

			    2. Config File Format

Each line should be in the format:

	<keyword> [ : <value> ]

Values are either written plainly, in which case whitespace to the left and
right of the words is cut off, or within quotes ("). If the value is quoted,
you must escape quotes to be able to have them in the string, as in: " \" ".

Lines beginning with '#' are treated as comments.
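
The line format above can be sketched in Python as follows. This is only an
illustration of the rules as described here, not the parser mail2sms itself
uses; the function name parse_config_line is hypothetical, and treating
indented '#' lines as comments is an assumption based on the indented
comments in the examples later in this file.

```python
def parse_config_line(line):
    """Return (keyword, value), or None for blank lines and comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    # Split on the first colon only, so values may contain colons.
    keyword, sep, value = stripped.partition(":")
    keyword = keyword.strip()
    if not sep:
        # Keywords without a value part.
        return keyword, None
    value = value.strip()
    if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
        # Quoted value: keep inner whitespace, unescape \" to ".
        value = value[1:-1].replace('\\"', '"')
    return keyword, value
```

A plain value loses its surrounding whitespace, while a quoted value keeps
whatever is between the quotes.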

There are two kinds of keywords. The first kind is dealt with immediately,
as it is read from the config file; such keywords may also build strings that
define the output format, specify the command to run, etc. The second kind
builds a tree of regular expressions with accompanying actions. Those actions
are performed only when the regexes match contents of the input mail. It is
important to understand the difference.

  2.1 General Keywords

	options: <options>

	(may be specified as o:)

	Options are single words separated with spaces or commas. The options
	control the following search/replace or if operation. Available
	options are:

	 1perline (previously: "noloop")
	   - only one replace per line
	   - only one replace per mail
	 subject (*)
	   - replace in subject
	 from (*)
	   - replace in from (the name or if not available, the email address)
         fromaddress (*)
           - replace in from address (the email address)
	 to (*)
	   - replace in to (the name or if not available, the email address)
         toaddress (*)
           - replace in the to field (the email address)
	 body (*)
	   - replace in body
	 fullbody (*)
	   - replace in the full body, as one large buffer without newlines
	 header (*)
	   - search/replace in header! Must be explicitly specified; if not
	     specified, no searching will be done in headers.
	 nocase
	   - case insensitive search. Only the letters A-Z will be treated
	     case insensitively.
	 prio <1-5>
	   - lower prio value makes the regex be done before higher values.
	     Default prio is 3.

	(*) = when one or more of these are used, the search/replace or if
	      is only valid for the specified parts.

	log: <message>

	This is a message that will be added to the log file when the action
	specified after this is performed. 'log' is used by search, if,
	not, begin and many other keywords.

  2.2 Search/Replace, Conditional Search/Replace

	search: <search regex>

	(may be specified as s:) The search regex is a full POSIX egrep-style
	regex. Must be used BEFORE the replace command.

        This regex may in fact match many times, not just one.

	replace: <replacement>

	(may be specified as r:) Replace is the replacement for the
	search. \<num> can be used to insert "registers" from the search. Must
	be used AFTER a search command. This must also be the last line in a
	search/replace action.

        One search/replace pair may be used many times if the search pattern
        is general enough.
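
        As a rough illustration of how \<num> registers work, here is a
        sketch using Python's re module. Python's regex dialect is not
        identical to POSIX egrep syntax, so this only approximates what
        mail2sms does; the function name and the one_per_line parameter
        (standing in for the '1perline' option) are hypothetical.

```python
import re

def search_replace(text, search, replacement, one_per_line=False):
    """Replace every match of 'search' in one line of text.

    \\1, \\2, ... in 'replacement' insert the parenthesised groups
    ("registers") captured by the search regex.
    """
    count = 1 if one_per_line else 0  # 0 means "replace all matches"
    return re.sub(search, replacement, text, count=count)

line = "Re: [bugs] crash in [parser]"
# Turn every "[word]" into "word:" by re-inserting register 1.
print(search_replace(line, r"\[([a-z]+)\]", r"\1:"))
```

        With one_per_line=True only the first "[word]" would be rewritten.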

	if: <triggering regex>

	Starts a conditional sub-section. This sub-section, which MUST be
	ended with an 'endif' keyword, is an ordinary sequence of config items
	that won't be considered until the if-regex has first matched once.
	Once the if-regex has matched, all the sub-section's expressions (like
	search/replace within this sub-section) are moved to the "normal"
	list and are then treated just as normal items.

	You can control the if-regex with the 'options' keyword. The 'log'
	keyword is also used.

	You can "append" actions to this keyword by using one or more of the
	following keywords within the if/endif block:

                abort, outsize, create, delete, system, config, run, program,
                progargs, output, phone, server, port, multipart, maxparts

        When you use these keywords within an if/endif block, they will all
        take effect the first time the IF regex matches (and only then).

        Do not confuse the IF keyword with BEGIN. The IF/ENDIF is for
        conditions against mail content like if you want different behaviour
	for different kinds of mails. BEGIN/END is used for making parts of
	the config file conditional.

	You MUST end this sub-section with 'endif'.

	endif

	Closes a conditional sub-section previously started with the 'if'
	keyword.

	not: <regex>

	This keyword can only be used within an if-endif section, or after a
	'replace' keyword. Each time this keyword is used, it adds a regex to
	the list on the preceding regex keyword (i.e. search/replace, or if)
	that must NOT match for that regex to match. 'not' can be used any
	number of times.

	You may specify a separate 'log' line for the NOT expression. The
	'not' expression will always be tried on the exact same context that
	the previous regex-expression just matched.

	Example: if the subject includes Daniel but NOT Stenberg, add a
	special search/replace rule:

		options: subject
		if: Daniel
		not: Stenberg
			search: Daniel
			replace: Fake-Danman
		endif

	Make a search/replace like the above without the if:

		options: subject
		search: Daniel
		replace: Fake-Danman
		not: Stenberg

  2.3 Conditional Config File Sections

 These keywords control which parts of a config file are read.

	when: <expression>
	whennot: <expression>

	If the given when expression matches current conditions, it sets the
	condition flag. 

        If a whennot expression matches current conditions it clears the
        condition flag.

	You can also reverse the expression by prefixing it with !.

	NOTE: a 'when' command only defines what sections to read from the
	config file.

	The condition flag controls whether a following sub-section shall be
	parsed or skipped. If the condition flag is set, the section is
	parsed, otherwise it is skipped!

	A sub section is specified within 'begin', 'else' and 'end'
	keywords. The condition flag is always undefined after a begin, end
	or else.

	Default state of the condition flag is undefined, and the first when
	command will set it.

	    The EXPRESSION may involve:

	     day <sequence 1 - 31>
		Day of the month.
	     hour <sequence 0 - 23 or 0000 - 2359>
		Note that the size of the numbers is used to figure out which
		of these formats to use. 0-23 means full hours; 0-26 would
		mean from 00:00 to 00:26.
	     min <sequence 0000 - 2359>
		Note that this is hour+minutes and not plain minutes
	     wday <sequence 1 - 7>
		Day of the week. 1 is Monday, 7 is Sunday.
	     month <sequence 1 - 12>
		Month of the year. 1 is January, 12 is December.
	     year <sequence 1998 - 2038>
		Year number. Only full 4-digit years are supported.
	     file <filename>
		TRUE if the file is present
		Always set TRUE
		Always set FALSE

	    Sequences can be specified as:

		1-4,2,9-4,3 (no whitespace)

	    Filename is a filename without whitespace.

	    You can combine the different parts of an expression to a combined
	    expression as in:

		day 1-3 hour 8-14 wday 1,3,6
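
	    A combined expression can be thought of as a list of (field,
	    sequence) pairs. The Python sketch below splits an expression into
	    such pairs and tests a number against a sequence. It is only an
	    illustration: the function names are hypothetical, the '!' prefix
	    is not handled, and reversed ranges such as 9-4 never match here
	    because this document does not say how they are interpreted.

```python
def parse_when(expression):
    """Split 'day 1-3 hour 8-14' into {'day': '1-3', 'hour': '8-14'}."""
    tokens = expression.split()
    if len(tokens) % 2:
        raise ValueError("expected <field> <sequence> pairs")
    return dict(zip(tokens[0::2], tokens[1::2]))

def in_sequence(value, sequence):
    """True if value is listed in a sequence like '1-4,7,9'.

    Only single numbers and ascending ranges are understood;
    a reversed range (e.g. 9-4) simply never matches in this sketch.
    """
    for item in sequence.split(","):
        low, dash, high = item.partition("-")
        if not dash:
            if value == int(low):
                return True
        elif int(low) <= value <= int(high):
            return True
    return False

parts = parse_when("day 1-3 hour 8-14 wday 1,3,6")
print(in_sequence(3, parts["wday"]))
```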

	    The file keyword makes the operation dependent on the presence of
	    a named file:

		when: file /etc/passwd

	    ... requires that named file to be there for the program to

	    Expressions are read top to bottom.

	    You can mix when and whennot keywords to make complex expressions.

	Example: make a special search expression for Friday the 13th, 1999:

		# place condition
		when: year 1999
		when: day 13 wday 5
		# start sub section
		begin
		  options: nocase
		  search: .*
		  replace: friday the 13th!
		end

	begin

	Starts a config file sub section. If the condition flag is set or
	undefined, this section will be parsed. If the condition flag is
	cleared, this section will be skipped all the way to the following end
	(or else) keyword (not counting sub sections within this section). You
	MUST end a sub section with a corresponding 'end'.

        It is important to remember that begin/end keywords only change what
        to read in the config file. The section's "condition flag" is set by
        conditions found when reading the config. The begin/end section can
        *NEVER* depend on particular mail contents, since the config file is
        scanned long before any mail content is known. To make conditional
        actions depending on mail content, see the IF keyword.

	If 'log' was used for this keyword, it will be logged together with
	the condition flag status.

	You can use '{' instead of 'begin'.


	end

	Ends a sub section. See 'begin' and 'when' for more details. This
	shall not be used without a preceding 'begin' or 'else' keyword.

	You can NOT have the 'end' keyword of a section appear in a different
	file from the one with the initiating 'begin'.

	You can use '}' instead of 'end'.


	else

	This must be preceded by a 'begin' keyword, and an 'end' keyword
	should end this section. This keyword ends a section, and if the
	previous one was parsed, the one following this won't be, and vice
	versa.

	You can NOT have the 'else' keyword of a section appear in a different
	file from the one with the initiating 'begin'.

	'else' shall not be used without a preceding 'begin' keyword.

  2.4 Stop/Allow Message Forwarding

 These commands are taken care of when the config file is read, and the
decision whether to continue or not will be taken before the mail is read at
all.

	break

	Sets the BREAK flag. If all config files have been read and the BREAK
	flag is set, no SMS will be sent.

        To abort the SMS sending depending on some mail content, use the abort
	keyword within an if/endif section.


	Clears the BREAK flag. See 'break'.

	exit: <exit code>

	Sets the BREAK flag and sets what return code mail2sms should return
        to the invoking process/shell. If all config files have been read and
        the BREAK flag is set, no SMS will be sent.

        Note that this keyword can also be used within IF/ENDIF sections and
        if so, it will be treated as an ABORT keyword with an added exitcode

  2.5 Conditional Actions and Variables

 These actions can be put on root-level, within begin-end sections or within
if-endif sections. When set within if/endif sections, they take effect when
that IF expression first matches.

 All these keywords take advantage of the log string in case there is one set
before the actual keyword. The logging will take place when the action is
performed.

	abort: <message>

	The abort keyword aborts the sms creation immediately. This keyword
	can't be used on root-level, abort MUST be put within an if/endif
	section. The message will simply be logged.

	config: <file name>

	Reads the specified config file if the previous 'if' regex matches.

	create: <filename>

	Creates the file if it isn't already created.

	delete: <filename>

	Deletes the specified file.

	exit: <exit code>

	(See section 2.4 for full details.)

	maxparts: <num>

        This specifies the maximum number of output parts that mail2sms is
	allowed to generate. Default is 1. Each part is maximum 'outsize'
	bytes long.

        multipart: <format>

        Specifies how to deal with multipart prefixes or suffixes. If the
	message is sent in more than one part, this string is used to define
	the prefix or suffix to include in every single part. If this format
	starts with a dash (-) the rest of the string will be used as a
	suffix, otherwise it will be a prefix.

        The format string has two "variables" that will be replaced with the
	numbers that apply to each part and message:

                $index     starts at 1 and is increased for every part
                $numparts  the number of parts that this message uses

        Other characters that can be used in the string are:

                \t         Tab character
                \n         Newline character
                \r         Carriage return character
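
        The prefix/suffix rule can be sketched in Python like this. It is
        an illustration of the description above rather than mail2sms's own
        code; decorate_part is a hypothetical name, and whether the prefix
        or suffix counts against 'outsize' is not covered here.

```python
ESCAPES = {"\\t": "\t", "\\n": "\n", "\\r": "\r"}

def decorate_part(fmt, text, index, numparts):
    """Attach the multipart label to one part of a split message."""
    # A leading dash selects suffix mode; the dash itself is dropped.
    is_suffix = fmt.startswith("-")
    if is_suffix:
        fmt = fmt[1:]
    # Turn the two-character sequences \t, \n, \r into real characters.
    for escaped, char in ESCAPES.items():
        fmt = fmt.replace(escaped, char)
    label = fmt.replace("$index", str(index)).replace("$numparts", str(numparts))
    return text + label if is_suffix else label + text

print(decorate_part("($index/$numparts) ", "hello", 1, 3))
print(decorate_part("- [$index of $numparts]", "hello", 2, 3))
```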

	output: <output>

	Specifies what to output and in what order. Available variables are:

          $from         The name or the email of the sender
          $fromaddress  The email address of the sender
          $to           The name or the email of the receiver
          $toaddress    The email address of the receiver
          $subject      The subject field
          $body         The body of the mail

        Other characters that can be used in the string are:

        \t              Tab character
        \n              Newline character
        \r              Carriage return character

	Default output string is "F: $from S: $subject B: $body". Note that
        as much as possible of each variable will be output, in the order
        given. I.e. if the first variable used is very big, no other output
        will be shown.

	outsize: <number of bytes>

	Specifies the maximum size of the output message. Default is 160.
	At most 'maxparts' parts of this size are sent.
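
	How 'output' and 'outsize' interact can be sketched in Python as
	below. This is a simplified, single-part illustration: the function
	name render_output and the 'fields' dictionary are hypothetical, the
	\t, \n and \r escapes are left out, and the sketch counts characters
	where mail2sms counts bytes.

```python
import re

DEFAULT_OUTPUT = "F: $from S: $subject B: $body"

def render_output(fields, fmt=DEFAULT_OUTPUT, outsize=160):
    """Expand the output variables, then cut the result to outsize."""
    def expand(match):
        # Unknown or missing fields expand to an empty string here.
        return fields.get(match.group(1), "")
    # Longer names come first so $fromaddress is not read as $from.
    pattern = r"\$(fromaddress|from|toaddress|to|subject|body)"
    return re.sub(pattern, expand, fmt)[:outsize]

mail = {"from": "Ann", "subject": "Lunch?", "body": "x" * 500}
print(len(render_output(mail)))
```

	Because the cut happens after expansion, a very long early variable
	pushes the later ones out of the message, as noted for 'output'.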

	system: <command line>

	Runs the specified shell command line.

	<keyword> : <value>

	These keywords (phone, port, progargs, program and server) each set
	the variable with the same name as the keyword. The variables can be
	used in the 'run' string to specify what command line to use when
	passing the SMS to a client program.

	run : <command line to run>

	This command line is a string with the possibility to insert different
	variables. All non-variable characters are inserted as specified. The
	'output' string will be passed to the program on its stdin.

	There is no default run string. If no string is specified, no program
	will be run by mail2sms.

	Available variables are:

	Name		Contains
	----		--------
	$message	The entire SMS message.
	$phone		What has been specified with the phone keyword.
	$port		What has been specified with the port keyword.
	$progargs	What has been specified with the progargs keyword.
	$program	What has been specified with the program keyword.
	$server		What has been specified with the server keyword.
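
	The variable substitution in a 'run' string can be sketched in Python
	with string.Template, which happens to use the same $name syntax.
	This is an illustration only: the function names and the example
	program name are hypothetical, and running the command through a
	shell is an assumption, not something this document states.

```python
import subprocess
from string import Template

def build_command(run_string, variables):
    """Insert $phone, $server, etc. into the run string."""
    # safe_substitute leaves unknown $names untouched instead of failing.
    return Template(run_string).safe_substitute(variables)

def deliver(run_string, variables, message):
    """Run the command and pass the SMS text on its stdin."""
    command = build_command(run_string, dict(variables, message=message))
    # Assumption: the command line is handed to a shell.
    return subprocess.run(command, shell=True, input=message.encode(), check=False)

settings = {"program": "smsclient", "progargs": "-q",
            "server": "gw.example", "port": "7000", "phone": "12345"}
print(build_command("$program $progargs -h $server -p $port $phone", settings))
```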

        filter : <filter instruction>

        The filter can only be used conditionally. It cannot be set on the
        root-level. A filter is set on a section on a line-by-line basis. When
        the filter is switched on, it is in use until it is again switched
        off. The filter instruction that is written on the right side should
        be in this format:

                <name> <status> [lines <num>] [exclude/include]

        'name' is the filter name. Currently 'ignore' is the only supported
        filter. 'ignore' effectively cuts off the lines in the filtered
        section. Note that the idea is that you should be able to switch
        different filters on and off independently, once more filters are
        available.

        'status' is either "on" or "off". Setting a filter to "on" means it
        takes effect. Switching a filter "off" means that the filter is
        turned off and things go back to normal (non-filtered).

        'lines <num>' (for 'on' entries only) sets the filter to remain valid
        for a certain number of lines, the matching one counted. This makes
        the filter switch off automatically after this number of lines has
        been sent through the filter.

        'exclude' in the line means that the matching line in itself should
        not be considered as part of the filtered section. Exclude it from the

        'include' is the opposite of 'exclude'. It is the default behaviour
        and it makes the matching line of a filter expression be included
        in the filter section.
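
        The effect of an 'ignore on lines <num>' filter with the default
        'include' behaviour can be sketched in Python as follows. This is a
        simplified model, not mail2sms's implementation: ignore_filter is a
        hypothetical name, only the 'lines' form of switching off is
        modelled, and the filter triggers on the first matching line only,
        assuming it behaves like other actions inside an if/endif block.

```python
import re

def ignore_filter(lines, on_regex, num_lines):
    """Drop num_lines lines, starting with the first one that matches."""
    kept = []
    remaining = 0
    triggered = False
    for line in lines:
        if not triggered and re.search(on_regex, line):
            # The matching line itself is counted ('include' behaviour).
            triggered = True
            remaining = num_lines
        if remaining > 0:
            remaining -= 1
            continue  # this line is cut by the filter
        kept.append(line)
    return kept

body = ["Hi there", "-- ", "Ann", "ann@example.org", "PS: call me"]
print(ignore_filter(body, r"^-- $", 3))
```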

  2.6 Logfile and Include

 These can only be made conditional within begin/end sections.

	logfile: <filename>

	Logs all messages to the specified logfile instead of stderr.

	showlog: <what to include>

	Sets what kinds of log messages to include in the logfile. The format
	for the control string is:

		<[+/-]LOGTYPE1>, <[+/-]LOGTYPE2>

	Available log types are: ALL, INFO, BREAK, ERROR, DEBUG, WARNING,

	A few examples explain this best:

	To see all log entries:

		showlog: +all

	To see all except the DELETE ones:

		showlog: +all, -delete

	To see the default set but not the WARNING ones:

		showlog: -warning

	To disable if, search and not:

		showlog: -if,search,not

	Switch off everything:

		showlog: -all

	NOTE: the -d command line flag enables "ALL" (including DEBUG). Normal
	defaults are "+ALL,-DEBUG".
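
	One way to picture the showlog control string is as a base setting
	(ALL on or off) plus per-type overrides. The Python sketch below
	follows that idea; it is not how mail2sms stores the setting. The
	class name is hypothetical, and letting a name without its own sign
	inherit the sign before it (as in "-if,search,not") is an assumption
	drawn from the example above.

```python
class LogSelection:
    """Tracks which log types are shown; starts at '+ALL,-DEBUG'."""

    def __init__(self):
        self.base = True                  # state of ALL
        self.overrides = {"DEBUG": False}

    def apply(self, control):
        sign = True
        for token in control.split(","):
            token = token.strip()
            if not token:
                continue
            if token[0] in "+-":
                sign = token[0] == "+"
                token = token[1:]
            name = token.strip().upper()
            if name == "ALL":
                # ALL resets every individual setting.
                self.base = sign
                self.overrides.clear()
            else:
                self.overrides[name] = sign

    def enabled(self, logtype):
        return self.overrides.get(logtype.upper(), self.base)

selection = LogSelection()
selection.apply("+all, -delete")
print(selection.enabled("DELETE"), selection.enabled("INFO"))
```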

	path: <directory name without trailing slash>

	Adds the specified directory to the include path. The include path is
	a list of directories that mail2sms will search through for the
	specified file when include is used. By default, no directory is in
	the include path.

	include: <file>

	Reads the specified config file.

			    3. mail2sms Internals

 The Regex Loop
  When traversing the config files, mail2sms creates a linked list of regex
nodes. One node is added for each IF or SEARCH/REPLACE.

 Each regex node may have a sub-list (whose entries are themselves actual
regex nodes) that, if the main node matches, is moved into the main list. The
sub-list nodes are not taken into account until they are in the main list.

 Each sub-list node may itself have sub-nodes of course, which can make it
pretty advanced.

 Each regex node may also have a list of not-matches. If any of the entries in
the not-list matches, the main node is considered not a match. An attempt to
draw this situation looks like:

 Search/Replace #1
     If #2 ----- Subsearch/replace #1 - Subsearch/replace #2 - ...
 Search/Replace #3
        | \__                             
        |    \__                          
        |        Not #1  - Not #2 - ...
 Search/Replace #4

 The list is built when the config files are read. mail2sms then goes through
the input mail and for each line of the mail it does the following:

We start at the beginning with the highest prio; number 1.

  1. If the node isn't of this prio, get the next. If we reach the end of the
     list, increase the prio to check for and restart the list. If the prio
     to check for reaches max, end the loop.

  2. Check if the condition is dependent on what kind of input (header, from,
     subject, etc). If it isn't supposed to replace/match the current type,
     loop.

  3. Make sure that we haven't looped this too many times.

  4. Check if the regex matches. Otherwise, loop.

  5. Make sure we haven't found exactly this match too many times.

  6. Check the type of the regex.

    6.1 If it is an 'IF':

	o Check that it hasn't already been "done".

	o Check the not-list. If any of them matches, treat this as a
	  non-match.

        o Perform all the actions that were specified within this if/endif
          section.

	o Move the sublist to the "main list".

    6.2 If it is a 'SEARCH/REPLACE', perform the replace.

  7. Reset to start-of-list at prio 1.

  8. Loop
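
 The steps above can be sketched in Python as follows. This is a simplified
model for one line of input, not the actual mail2sms code: the Node class,
the MAX_LOOPS cap and the function name are hypothetical, the per-part
options (step 2) and if-actions are left out, and Python's re module stands
in for POSIX regexes.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

MAX_PRIO = 5
MAX_LOOPS = 100  # hypothetical cap standing in for steps 3 and 5

@dataclass
class Node:
    regex: str
    prio: int = 3
    replace: Optional[str] = None              # None marks an 'if' node
    nots: List[str] = field(default_factory=list)
    sub: List["Node"] = field(default_factory=list)
    done: bool = False

def process_line(nodes, line):
    prio, i, loops = 1, 0, 0
    while prio <= MAX_PRIO and loops < MAX_LOOPS:
        if i >= len(nodes):                    # step 1: next prio level
            prio, i = prio + 1, 0
            continue
        node = nodes[i]
        i += 1
        if node.prio != prio or not re.search(node.regex, line):
            continue                           # steps 1 and 4
        if any(re.search(n, line) for n in node.nots):
            continue                           # a not-entry matched
        if node.replace is None:               # step 6.1: 'if' node
            if node.done:
                continue
            node.done = True
            nodes.extend(node.sub)             # sub-list joins main list
        else:                                  # step 6.2: search/replace
            changed = re.sub(node.regex, node.replace, line)
            if changed == line:
                continue
            line = changed
        loops += 1
        prio, i = 1, 0                         # step 7: restart at prio 1
    return line

rules = [Node("Daniel", replace="Fake-Danman", nots=["Stenberg"])]
print(process_line(rules, "Mail from Daniel"))
```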

 Order Of Tests

 First all headers are read.

 Each header is first tested as a 'header' and then secondly, if it is a known
 header (like To:, From: etc) it is tested as that kind of header.

 When all headers are done the body gets read.

 Each line of the body is tested as 'body', one line at a time. This goes on
 until mail2sms has collected a body-"buffer" that is ten times the size of
 the output text (by default that means 10*160 bytes). It then tests the whole
 body-buffer as 'fullbody'.

 In each of these many tests, the low prio tests are performed before the high
 prio tests.  But this order described here is important to understand since
 the prio system only changes importance within the same test.

Modified April 19, 2011