MXP design notes

Author: Nick Gammon http://www.gammon.com.au/support/ .

This document is in the public domain. It is not copyright.

Introduction
What is MXP?
Basic parsing rules
- Escape sequences
- Newlines
Elements and entities
Parsing of elements and entities
Comments
Elements
- Syntax of elements
- Treatment of malformed elements
Quoting
Entities
- Syntax of Entities
- Treatment of malformed entities
Arguments
Open, secure, locked sequences
Modal and command elements
Entities
- Nesting of entities
Elements
- Atomic elements
- User-defined elements
Unknown elements

Introduction

These notes are intended to complement the MXP specifications at http://www.zuggsoft.com/zmud/mxp.htm.

Comments or questions about this document should be forwarded to http://www.gammon.com.au/support/ or zugg@zuggsoft.com .

The purpose of this document is to help MUD client developers implement MXP in a standard way, so that using an MXP-enabled MUD will result in the same gaming experience, regardless of client.

What is MXP?

MXP (MUD eXtension Protocol) is a method for MUD servers to specify formatting and content information which goes beyond the scope of what is available with ANSI escape codes. For instance, using MXP a server can designate a line as a room description, exit, hyperlink, and so on.

For example:

 
This word is <b>bold</b>.
<RoomDesc>This is a room description</RoomDesc>

Basic parsing rules

Escape sequences

MXP is a stream protocol, and is designed to be interpreted on-the-fly without waiting for newlines or other record boundaries.
MXP tags are only parsed within the escape sequences:
- \e[0z -- open line
- \e[1z -- secure line
- \e[10z through to \e[99z -- special sequences, described later
where \e represents the <ESC> character.
Note - strictly speaking, you could include leading zeroes, eg.
- \e[0000z
Outside those escape sequences any MXP tags are displayed "as is". eg. <shrug>
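
As a minimal sketch of recognising these sequences, the following Python splits a received buffer into mode changes and plain text. The names `MXP_MODE_RE` and `split_mode_sequences` are hypothetical, and the sketch assumes the whole buffer is available at once; a real client would also need to cope with a sequence that is split across two network reads.

```python
import re

# ESC [ <digits> z. Leading zeroes are accepted, so \e[0000z equals \e[0z.
MXP_MODE_RE = re.compile(r"\x1b\[(\d+)z")


def split_mode_sequences(data):
    """Split data into ("mode", number) and ("text", string) items, in order."""
    items = []
    pos = 0
    for match in MXP_MODE_RE.finditer(data):
        if match.start() > pos:
            items.append(("text", data[pos:match.start()]))
        items.append(("mode", int(match.group(1))))
        pos = match.end()
    if pos < len(data):
        items.append(("text", data[pos:]))
    return items
```

Converting the digits with `int()` is what makes the leading-zero forms equivalent to the short ones.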

Newlines

Where reference is made to a newline (in C, the \n character) it is intended to encompass the various line endings that a MUD server might send:

`\n`	Unix
`\n\r`	Should not appear
`\r\n`	DOS/Windows
`\r`	Macintosh

In each case this document assumes that whatever line ending occurs, it is pre-processed by the client into a straight "newline" character (\n).
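
The pre-processing step could be sketched as below. This is only an illustration: `normalize_newlines` is a hypothetical name, and the sketch assumes a complete block of text, so a two-character line ending split across two reads would need extra handling.

```python
import re

# The two-character endings are listed first so that "\r\n" and "\n\r"
# each become one newline rather than two.
LINE_ENDING_RE = re.compile(r"\r\n|\n\r|\r|\n")


def normalize_newlines(text):
    """Replace every supported line ending with a single "\n"."""
    return LINE_ENDING_RE.sub("\n", text)
```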

Elements and entities

An element is a sequence inside the "<" and ">" characters. eg. <b> indicating bold text.
An entity is a substitution string, inside the "&" and ";" characters. eg. &amp; indicating an ampersand.

Parsing of elements and entities

Collection of elements - when the character "<" is encountered, the client enters a "collection" phase which terminates when the corresponding ">" character is parsed, however excluding any ">" characters found inside a valid quoted string.

eg.
<!ELEMENT red '<color red>'>
Collection of entities - when the character "&" is encountered, the client enters a "collection" phase which terminates when the corresponding ";" character is parsed. eg. &lt;
Abnormal termination of collection - collection of an element or entity is terminated upon receipt of:
- newline \n
- escape \e
- link loss (loss of connection to server)
If an element is abnormally terminated, eg. "<FONT \n" then it is discarded, possibly with a debug message being displayed somewhere.
Note - the client may offer the option to display abnormally terminated elements, although this is not recommended.
If an entity is abnormally terminated, eg. "&lt\n" then it is displayed "as is".
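
The element collection rules above might be sketched like this. The function name `collect_element` and its return convention are assumptions made for the example, and it works on a complete string rather than a live stream.

```python
def collect_element(stream, start):
    """Collect the element that begins at stream[start], which must be "<".

    Returns (element_text, next_index). element_text is None when collection
    is abnormally terminated by a newline, an escape, or the end of input.
    """
    quote = None  # the quote character we are currently inside, if any
    i = start + 1
    while i < len(stream):
        ch = stream[i]
        if ch in "\n\x1b":
            return None, i  # abnormal termination: discard what was collected
        if quote:
            if ch == quote:
                quote = None  # the matching quote ends the quoted string
        elif ch in "'\"":
            quote = ch
        elif ch == ">":
            return stream[start:i + 1], i + 1
        i += 1
    return None, i  # input ran out, for example on link loss
```

On abnormal termination the returned index points at the newline or escape itself, so the caller can process that character normally.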

Comments

Comments may be placed inside the sequence:

<!-- comment text -->

Comments commence with "<!--" and terminate with "-->".
Special characters such as "&" and ">" are permitted inside comments, and have no special significance.
The character sequence "-->" is explicitly disallowed inside a comment, as it is the comment terminator.
As described in the previous section, comments are also (abnormally) terminated upon receipt of a \n or \e character.

Elements

Syntax of elements

Elements consist of:

  <keyword [args]>
  </keyword>
  <!keyword [args]>

  eg. <font color=red>This is in red.</font>
      <b>This is in bold.</b>
      <!ELEMENT boldred '<font color=red><b>'>

Keywords must start with a letter (A-Z) and then consist of letters, numbers or the underscore character.
Element names are not case-sensitive.
Keywords are separated from the (optional) arguments by one or more spaces. Arguments are separated from each other by one or more spaces. There does not need to be a space preceding the final ">".
Keywords may not be quoted. eg. <'BR'> is invalid.
Argument names may not be quoted, eg. <font 'color'=red> is invalid.
Closing elements (eg. </font>) do not take arguments. eg. </font color=red> is invalid.

Treatment of malformed elements

Elements not conforming to the syntax rules above are discarded, possibly with a debug message.
Note - the client may offer the option to display abnormally formed elements, although this is not recommended.
In any case, malformed elements are not evaluated or expanded.
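
A rough sketch of checking a collected element against these syntax rules follows. `ELEMENT_RE` and `parse_element` are hypothetical names; the function returns None for a malformed element so that the caller can discard it. It does not look inside the argument text, so quoted argument names are not detected here.

```python
import re

# Optional "/" or "!", then a keyword (a letter followed by letters, digits
# or underscores), then optional arguments separated from it by whitespace.
ELEMENT_RE = re.compile(r"<(/|!)?([A-Za-z][A-Za-z0-9_]*)(?:\s+(.*?))?\s*>")


def parse_element(tag):
    """Return (kind, keyword, args) for a well-formed element, else None."""
    match = ELEMENT_RE.fullmatch(tag)
    if match is None:
        return None
    prefix, keyword, args = match.groups()
    args = args or ""
    if prefix == "/" and args:
        return None  # closing elements do not take arguments
    kind = {None: "open", "/": "close", "!": "definition"}[prefix]
    return kind, keyword.lower(), args  # element names are not case-sensitive
```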

Quoting

Arguments (but not argument names) may be quoted, and must be quoted if they contain embedded spaces or the ">" or "=" symbols. eg. <SEND href='buy fruit'>
Either single or double quotes may be used. Whichever quote starts the quoted string must terminate the quoted string. The other quote may be used inside the string.
Note - possibly allow doubled quotes to represent a single quote? eg. <SEND 'goto Nick''s house'>
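
One way to split an argument string while honouring these quoting rules is sketched below. `split_arguments` is a hypothetical helper; it returns (name, value) pairs with name set to None for arguments that have no "name=" prefix. It does not reject quoted argument names, and it does not implement the doubled-quote idea mentioned in the note above.

```python
def split_arguments(args):
    """Split an argument string into (name, value) pairs.

    Spaces and "=" only act as separators outside quotes. Either quote
    character may be used, and the other one may appear inside the string.
    """
    pairs = []
    name, current, quote, in_token = None, [], None, False

    def finish():
        nonlocal name, current, in_token
        if in_token:
            pairs.append((name, "".join(current)))
        name, current, in_token = None, [], False

    for ch in args:
        if quote:
            if ch == quote:
                quote = None
            else:
                current.append(ch)
        elif ch in "'\"":
            quote, in_token = ch, True
        elif ch == " ":
            finish()
        elif ch == "=" and name is None:
            name, current, in_token = "".join(current), [], True
        else:
            current.append(ch)
            in_token = True
    finish()
    return pairs
```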

Entities

Syntax of entities

Entities consist of:
```
 &keyword;

eg. &amp;
```
Keywords must start with a letter (A-Z) and then consist of letters, numbers or the underscore character. No embedded spaces or other characters are permitted.
Entity names are case-sensitive.

Treatment of malformed entities

Entities not conforming to the syntax rules above are displayed "as is". eg. a line containing "John & Judy" would display correctly.
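
A small sketch of entity substitution with the "as is" fallback follows. The names `ENTITY_RE`, `ENTITIES` and `expand_entities` are hypothetical, and the four entries in the table are listed only for illustration. Leaving a well-formed but undefined entity unchanged is an assumption of this sketch, not something these notes specify.

```python
import re

# "&", a keyword (letter, then letters, digits or underscores), then ";".
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9_]*);")

# Illustrative table only; names are looked up case-sensitively.
ENTITIES = {"lt": "<", "gt": ">", "amp": "&", "quot": '"'}


def expand_entities(text, entities=ENTITIES):
    """Replace known entities; anything else is left exactly as received."""
    def replace(match):
        return entities.get(match.group(1), match.group(0))
    return ENTITY_RE.sub(replace, text)
```

Because the substitution is a single pass, replacement text is not scanned again, so "&amp;lt;" becomes "&lt;" and stops there.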

Arguments

Arguments may be by keyword or positional. If by keyword the syntax is:
```
   argument_name=argument_value
```
If no argument name is provided, the argument is assumed to be the next one by position after the previous argument, or the first argument if there is no previous one. This means that after a keyword argument, the next argument without a keyword is taken to be the one that follows the keyword argument in sequence. Thus you could use a single keyword argument to "jump" to the middle of an argument list.
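
The positional rule can be illustrated with the sketch below. It assumes the arguments have already been split into (name, value) pairs, with name set to None for positional arguments. The function name `resolve_arguments`, the example argument names, the case-insensitive matching and the decision to ignore unknown names are all assumptions of this sketch.

```python
def resolve_arguments(names, pairs):
    """Map (name, value) pairs onto an element's ordered argument names."""
    lookup = [n.lower() for n in names]
    resolved = {}
    position = 0  # index of the next positional argument
    for name, value in pairs:
        if name is not None:
            name = name.lower()
            if name not in lookup:
                continue  # unknown argument name: ignored in this sketch
            position = lookup.index(name)  # "jump" to the keyword's position
        if position >= len(lookup):
            break  # more positional arguments than the element accepts
        resolved[lookup[position]] = value
        position += 1
    return resolved
```

For example, with argument names face, size, color, back, the pairs (None, "Arial"), ("color", "red"), (None, "blue") assign "blue" to back, because it follows color in the list.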

Open, secure, locked sequences

Escape sequences used to change mode

By using escape sequences, streams of output can be designated as open, secure or locked.

Sequence	Type	Meaning
`\e[0z`	Open	Only open commands (eg. <B>) will be interpreted.
`\e[1z`	Secure	All MXP tags are interpreted.
`\e[2z`	Locked	No MXP tags or entities will be interpreted.

Newline switches to open mode

Following a newline (\n) the output stream is considered open, until another escape sequence arrives which changes the mode.
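
A client might track the current mode along these lines. `ModeTracker` and its method names are hypothetical, and starting a connection in open mode is an assumption of this sketch.

```python
OPEN, SECURE, LOCKED = 0, 1, 2


class ModeTracker:
    """Tracks whether the output stream is open, secure or locked."""

    def __init__(self):
        self.mode = OPEN  # assumption: a connection starts in open mode

    def on_escape(self, number):
        """Handle the sequence ESC [ number z; other numbers are ignored here."""
        if number in (OPEN, SECURE, LOCKED):
            self.mode = number

    def on_newline(self):
        """A newline always returns the stream to open mode."""
        self.mode = OPEN

    def may_interpret(self, open_tag):
        """Return True if a tag may be interpreted in the current mode."""
        if self.mode == LOCKED:
            return False
        if self.mode == SECURE:
            return True
        return open_tag  # open mode: only open tags such as <B>
```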

Purpose of these modes

Open mode

This is intended for output from the MUD that players may be able to influence, eg. chat, say, tell, whisper, player descriptions, and so on. Tags in this category are limited to those that are "harmless" like setting text to bold, italic and so on.

Secure mode

This is for the server to send the full range of tags, such as room descriptions, exits, player hit points, and so on.

Locked mode

This is to allow the server to send out "verbatim" text, to save having to "escape" out characters like "<" and ">".

Modal and command elements

Elements (such as <B>) can either be modal or command.

Modal elements

Modal elements temporarily change the mode of the text being displayed until the closing element.

eg.

 
This is plain text. <B>This is bold text.</B> This is plain text.

Command elements

Command elements have an immediate effect. There is no corresponding closing tag.

eg.

 
<p>  This is a new paragraph.

Automatic closure of modal elements

To guard against room designers (or people chatting) inadvertently forgetting to close their modal elements the client will automatically close outstanding open tags as follows:

Open mode

All outstanding tags are closed at any of the following:

- newline \n
- escape \e
- mode change to secure mode
- link loss (loss of connection to server)

Secure mode

All outstanding tags are closed at:

- mode change to non-secure mode
- link loss (loss of connection to server)
Note that a newline \n directly followed by a change back to secure mode does not close outstanding tags.

Closure of tags out of sequence

It is possible, indeed likely, that closing tags will be received out of sequence.

eg.

 
<B>This is bold text. <I>This is bold italic text. </B></I>

The </I> should have been sent first. For simplicity of client implementation, the client should close all outstanding tags, up to and including the one requested. In the example above, the </B> would close both the outstanding <I> and the <B>.
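
This rule might be sketched as follows, with the outstanding tags kept in a Python list used as a stack. `close_tag` is a hypothetical name, and ignoring a closing tag that has no outstanding opening tag is an assumption of this sketch.

```python
def close_tag(open_tags, name):
    """Close `name` and every tag that was opened after it.

    open_tags is a list of lower-case tag names, most recent last. It is
    modified in place. Returns the closed tags, most recent first.
    """
    name = name.lower()
    if name not in open_tags:
        return []  # no matching outstanding tag: nothing is closed
    closed = []
    while open_tags:
        tag = open_tags.pop()
        closed.append(tag)
        if tag == name:
            break
    return closed
```

With the outstanding tags ["b", "i"], closing "B" returns ["i", "b"] and leaves the list empty.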

Suggested implementation

To allow for nested font and colour changes, we suggest that clients "push" onto a temporary stack the tag name, and current text attributes (colour, bold, italic, font etc.).

Then when the closing tag is reached the outstanding tag is "popped" from the stack, with the saved text attributes being restored.

Also, in the case of tags like <SEND href=&text;> this would be an appropriate time to send the collected text.
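
The push and pop of text attributes could look something like this. `StyleStack` and the three attribute names are hypothetical; a real client would track whatever attributes it supports.

```python
class StyleStack:
    """Saves the text attributes on each opening tag and restores them on close."""

    def __init__(self):
        self.current = {"bold": False, "italic": False, "colour": None}
        self._stack = []

    def open(self, tag, **changes):
        """Push the tag name with a copy of the current attributes, then apply changes."""
        self._stack.append((tag, dict(self.current)))
        self.current.update(changes)

    def close(self):
        """Pop the most recent tag and restore the attributes saved with it."""
        tag, saved = self._stack.pop()
        self.current = saved
        return tag
```

Saving a copy of all attributes, rather than only the one being changed, means a close always returns the display to exactly the state before the tag was opened.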

Entities

Entities, such as "&amp;", provide simple text substitution. For example, "&amp;" is replaced by the "&" character.

You can define your own entities like this:

 
<!ENTITY version "version 5.5" >

Nesting of entities

You cannot nest entities, nor can you place elements inside entities. The text defined as the entity replacement text is simply displayed "as is".

Elements

Elements, such as "<B>", are one of:

- Atomic elements that have a meaning known to the client, such as <B> for bold text
- User-defined macro-expansion elements, that allow server designers to define their own groups of atomic elements. For example: <bolditalic> might expand to <B><I>
- Unknown elements. These are tags that are neither known to the client nor user-defined.

Atomic elements

These are predefined elements that have a well-known meaning to client programs, such as <B> for bold, <COLOR> for colour changes, and so on.

These cannot be defined by the server, because to add new atomic elements requires changes to the client program.

User-defined elements

These are defined by the server by using the syntax:

 <!ELEMENT element_name definition [ATT=attribute_list] [TAG=tag] 
    [FLAG=flag] [OPEN] [DELETE] [EMPTY]>

Effectively these act as a 'macro expansion' much as a #define does in C. It lets server implementors combine elements, to simplify various situations.

eg.

 <!ELEMENT bolditalic '<B><I>'> 

This is plain text <bolditalic> this is bold italic </bolditalic> this 
    is plain again.

Processing of user-defined elements

The 'definition' referred to above should consist of atomic elements, which the client can then build into a list.
The elements in the definition list should be simple atomic elements, not closing elements or definition elements.
- <!ELEMENT bold '<B>'> is OK
- <!ELEMENT notbold "</B>"> is not OK
- <!ELEMENT defineit "<!ELEMENT italic '<I>'>"> is not OK
When the macro element is encountered in the text stream, the client expands it into each atomic element, in sequence.
eg. <!ELEMENT bolditalic '<B><I>'>

<bolditalic> now expands to <B><I>
When the corresponding element is closed, either explicitly or implicitly, then the client expands it into the corresponding closing elements, in reverse sequence.
eg. <!ELEMENT bolditalic '<B><I>'>

</bolditalic> now expands to </I></B>
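
A minimal sketch of building the definition list and the reversed closing list follows. `TAG_RE`, `opening_tags` and `closing_tags` are hypothetical names, and the simple pattern assumes no ">" appears inside a quoted argument within the definition.

```python
import re

# One atomic element: "<", a keyword, optional arguments, ">".
TAG_RE = re.compile(r"<([A-Za-z][A-Za-z0-9_]*)[^>]*>")


def opening_tags(definition):
    """Return the atomic elements of a definition, in sequence."""
    return [m.group(0) for m in TAG_RE.finditer(definition)]


def closing_tags(definition):
    """Return the matching closing elements, in reverse sequence."""
    names = [m.group(1) for m in TAG_RE.finditer(definition)]
    return ["</%s>" % name for name in reversed(names)]
```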

Arguments to elements

Macro expansion elements can contain an attribute list, which effectively becomes "arguments" to the expanded elements.

eg.

 
<!ELEMENT colorbold '<COLOR &col;><B>' ATT='col=red'>

<colorbold>This text is in bold red</colorbold>
<colorbold col=blue>This text is in bold blue</colorbold>
<colorbold blue>This text is in bold blue</colorbold>

Processing of arguments

We suggest that for each macro-expansion element the client stores:

name of element, eg. <colorbold>
OPEN/SECURE flag
definition list, eg.
- <COLOR &col;>
- <B>
attribute list, consisting of:
- name of attribute
- default value (the empty string, if not specified)
  eg. COL=red

Then when the element is encountered in the text stream:

- A copy is made of the macro, so that initially the attributes take their default values
- Process any supplied attributes, by name or position
- Substitute any arguments with the supplied value (eg. &col; might become 'red')
- Walk the definition list, processing each atomic item as applicable
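
The first three steps above can be sketched as below. The names `parse_att` and `expand_macro` are hypothetical. The sketch assumes the supplied arguments have already been split into (name, value) pairs (name is None for positional arguments), that attribute names are matched without regard to case, and that ATT defaults contain no spaces.

```python
import re

ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9_]*);")


def parse_att(att):
    """Parse an ATT string such as 'col=red' into ordered (name, default) pairs."""
    pairs = []
    for item in att.split():
        name, _, default = item.partition("=")
        pairs.append((name.lower(), default))  # default is "" if not specified
    return pairs


def expand_macro(definition, att, supplied):
    """Return the definition with attribute references replaced by their values."""
    attributes = parse_att(att)
    names = [name for name, _ in attributes]
    values = dict(attributes)  # a fresh copy holding the default values
    position = 0
    for name, value in supplied:  # apply supplied values by name or position
        if name is not None:
            if name.lower() not in names:
                continue  # unknown attribute name: ignored in this sketch
            position = names.index(name.lower())
        if position >= len(names):
            break
        values[names[position]] = value
        position += 1

    def substitute(match):
        # References that are not attributes of this macro are left unchanged.
        return values.get(match.group(1).lower(), match.group(0))

    return ENTITY_RE.sub(substitute, definition)
```

The returned string can then be walked as a list of atomic elements.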

Processing the closing tag

The closing tag (eg. </colorbold>) for an element will cause the element to be searched for in the 'outstanding elements' list, and if found, removed. Also at this time:

- Any &text; arguments will be acted upon
- Any 'set variable' elements will be acted upon

In the case of macro-expansion elements, the definition list will be walked in reverse order, and each definition in it will be closed. eg. </colorbold> will be processed as </B></COLOR>.

Unknown elements

An element that is neither recognised as an atomic element, nor a macro, is silently discarded, with an optional debug message.

Note - the client may offer the option to display unknown elements, although this is not recommended: an older client connected to a MUD that uses newer tags would then display irritating unknown tags.

Comments to http://www.gammon.com.au/support/

Page updated on Sunday, 13th January 2001