This chapter describes how the various UUCP protocols work, and discusses some other internal UUCP issues.
This chapter is quite technical. You do not need to understand it, or even read it, in order to use Taylor UUCP. It is intended for people who are interested in how the UUCP code works.
The information in this chapter is posted monthly to the Usenet newsgroups `comp.mail.uucp', `news.answers', and `comp.answers'. The posting is available from any `news.answers' archive site, such as `rtfm.mit.edu'. If you plan to use this information to write a UUCP program, please make sure you get the most recent version of the posting, in case there have been any corrections.
"Unix-to-Unix Copy Program," said PDP-1. "You will never find a more wretched hive of bugs and flamers. We must be cautious."---DECWars
I took a lot of the information from Jamie E. Hanrahan's paper in the Fall 1990 DECUS Symposium, and from Managing UUCP and Usenet by Tim O'Reilly and Grace Todino (with contributions by several other people). The latter includes most of the former, and is published by
O'Reilly & Associates, Inc.
103 Morris Street, Suite A
Sebastopol, CA 95472
It is currently in its tenth edition. The ISBN number is `0-937175-93-5'.
Some information is originally due to a Usenet article by Chuck Wegrzyn. The information on execution files comes partially from Peter Honeyman. The information on the `g' protocol comes partially from a paper by G.L. Chesson of Bell Laboratories, partially from Jamie E. Hanrahan's paper, and partially from source code by John Gilmore. The information on the `f' protocol comes from the source code by Piet Berteema. The information on the `t' protocol comes from the source code by Rick Adams. The information on the `e' protocol comes from a Usenet article by Matthias Urlichs. The information on the `d' protocol comes from Jonathan Clark, who also supplied information about QFT. The UUPlus information comes straight from Christopher J. Ambler, of UUPlus Development; it applies to version 1.52 and up of the shareware version of UUPlus Utilities, called FSUUCP 1.52, but referred to in this article as UUPlus.
Although there are few books about UUCP, there are many about networks and protocols in general. I recommend two non-technical books which describe the sorts of things that are available on the network: The Whole Internet, by Ed Krol, and Zen and the Art of the Internet, by Brendan P. Kehoe. Good technical discussions of networking issues can be found in Internetworking with TCP/IP, by Douglas E. Comer and David L. Stevens and in Design and Validation of Computer Protocols by Gerard J. Holzmann.
Modern UUCP packages support a priority grade for each command. The grades generally range from A (the highest) to Z followed by a to z. Some UUCP packages (including Taylor UUCP) also support 0 to 9 before A. Some UUCP packages may permit any ASCII character as a grade.
On Unix, these grades are encoded in the name of the command file created by uucp or uux. A command file name generally has the form `C.nnnngssss' where `nnnn' is the remote system name for which the command is queued, `g' is a single character grade, and `ssss' is a four character sequence number. For example, a command file created for the system `airs' at grade `Z' might be named `C.airsZ2551'.
The remote system name will be truncated to seven characters, to ensure that the command file name will fit in the 14 character file name limit of the traditional Unix file system. UUCP packages which have no other means of distinguishing which command files are intended for which systems thus require all systems they connect to to have names that are unique in the first seven characters. Some UUCP packages use a variant of this format which truncates the system name to six characters. HDB and Taylor UUCP use a different spool directory format, which allows up to fourteen characters to be used for each system name.
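The naming rule above can be sketched in C. This is only an illustration: the function name `command_file_name' is hypothetical, and it assumes the seven character truncation and a four character sequence string, which not every UUCP package uses.

```c
#include <stdio.h>
#include <string.h>

/* Build a command file name of the form C.nnnngssss.  The system name
   is truncated to seven characters, so the result is at most 14
   characters long.  SEQ is assumed to be a four character sequence
   string.  Returns 0 on success, -1 if the arguments do not fit.  */
int
command_file_name (char *buf, size_t buflen, const char *system,
                   char grade, const char *seq)
{
  if (strlen (seq) != 4 || buflen < 15)
    return -1;
  /* "%.7s" copies at most seven characters of the system name.  */
  snprintf (buf, buflen, "C.%.7s%c%s", system, grade, seq);
  return 0;
}
```

Calling it with the system `airs', grade `Z' and sequence `2551' yields the name used in the example above.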
The sequence number in the command file name may be a decimal integer, or it may be a hexadecimal integer, or it may contain any alphanumeric character. The format varies between UUCP packages. Taylor UUCP uses any alphanumeric character.
UUPlus Utilities (as FSUUCP, a shareware DOS based UUCP and news package) uses up to 8 characters for file names in the spool (this is a DOS file system limitation; actually, with the extension, 11 characters are available, but FSUUCP reserves that for future use). FSUUCP defaults mail to grade `D', and news to grade `N', except that when the grade of incoming mail can be determined, that grade is preserved if the mail is forwarded to another system. The default grades may be changed by editing the `LIB/MAILRC' file for mail, or the `UUPLUS.CFG' file for news.
UUPC/extended for DOS, OS/2 and Windows NT handles mail at grade `C', news at grade `d', and file transfers at grade `n'. The UUPC/extended UUCP and RMAIL commands accept grades to override the default; the others do not.
I do not know how command grades are handled in other non-Unix UUCP packages.
Modern UUCP packages allow you to restrict file transfer by grade depending on the time of day. Typically this is done with a line in the `Systems' (or `L.sys') file like this:
airs Any/Z,Any2305-0855 ...
This allows grades `Z' and above to be transferred at any time. Lower grades may only be transferred at night. I believe that this grade restriction applies to local commands as well as to remote commands, but I am not sure. It may only apply if the UUCP package places the call, not if it is called by the remote system.
Taylor UUCP can use the timegrade and call-timegrade commands to achieve the same effect. See section When to Call. It supports the above format when reading `Systems' or `L.sys'.
UUPC/extended provides the symmetricgrades option to announce the current grade in effect when calling the remote system.
UUPlus allows specification of the highest grade accepted on a per-call basis with the `-g' option in UUCICO.
This sort of grade restriction is most useful if you know what grades are being used at the remote site. The default grades used depend on the UUCP package. Generally uucp and uux have different defaults. A particular grade can be specified with the `-g' option to uucp or uux. For example, to request execution of `rnews' on `airs' with grade `d', you might use something like
uux -gd - airs!rnews < article
Uunet queues up mail at grade `C', but increases the grade based on the size. News is queued at grade `d', and file transfers at grade `n'. The example above would allow mail (below some large size) to be received at any time, but would only permit news to be transferred at night.
This discussion applies only to Unix. I have no idea how UUCP locks ports on other systems.
UUCP creates files to lock serial ports and systems. On most, if not all, systems, these same lock files are also used by cu to coordinate access to serial ports. On some systems getty also uses these lock files, often under the name uugetty.
The lock file normally contains the process ID of the locking process. This makes it easy to determine whether a lock is still valid. The algorithm is to create a temporary file and then link it to the name that must be locked. If the link fails because a file with that name already exists, the existing file is read to get the process ID. If the process still exists, the lock attempt fails. Otherwise the lock file is deleted and the locking algorithm is retried.
Older UUCP packages put the lock files in the main UUCP spool directory, `/usr/spool/uucp'. HDB UUCP generally puts the lock files in a directory of their own, usually `/usr/spool/locks' or `/etc/locks'.
The original UUCP lock file format encodes the process ID as a four byte binary number. The order of the bytes is host-dependent. HDB UUCP stores the process ID as a ten byte ASCII decimal number, with a trailing newline. For example, if process 1570 holds a lock file, it would contain the eleven characters space, space, space, space, space, space, one, five, seven, zero, newline. Some versions of UUCP add a second line indicating which program created the lock (uucp, cu, or getty/uugetty). I have also seen a third type of UUCP lock file which does not contain the process ID at all.
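The locking algorithm and the HDB lock file format described above might be sketched as follows. This is a minimal illustration, not the code of any particular UUCP package: the function name `lock_acquire', the temporary file name (the lock name with the process ID appended), and the limit of five retries are all assumptions made for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to obtain the lock file LOCKNAME.  Returns 1 if the lock was
   obtained, 0 if a live process already holds it, -1 on error.  */
int
lock_acquire (const char *lockname)
{
  char tmpname[1024];
  char buf[32];
  int fd, tries;
  ssize_t n;
  long pid;

  /* Hypothetical temporary name: the lock name plus our process ID.  */
  if (snprintf (tmpname, sizeof tmpname, "%s.%ld", lockname,
                (long) getpid ()) >= (int) sizeof tmpname)
    return -1;

  /* Write the process ID in HDB format: ten ASCII bytes, space
     padded on the left, followed by a newline.  */
  fd = open (tmpname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd < 0)
    return -1;
  snprintf (buf, sizeof buf, "%10ld\n", (long) getpid ());
  n = write (fd, buf, 11);
  close (fd);
  if (n != 11)
    {
      unlink (tmpname);
      return -1;
    }

  for (tries = 0; tries < 5; tries++)
    {
      /* link fails with EEXIST if the lock file already exists.  */
      if (link (tmpname, lockname) == 0)
        {
          unlink (tmpname);
          return 1;
        }
      if (errno != EEXIST)
        break;

      /* Read the process ID of the current holder.  */
      fd = open (lockname, O_RDONLY);
      if (fd < 0)
        {
          if (errno == ENOENT)
            continue;           /* Lock vanished; try again.  */
          break;
        }
      n = read (fd, buf, sizeof buf - 1);
      close (fd);
      if (n <= 0)
        break;
      buf[n] = '\0';
      pid = strtol (buf, NULL, 10);
      if (pid <= 0)
        break;

      /* Signal 0 only checks whether the process exists.  */
      if (kill ((pid_t) pid, 0) == 0 || errno == EPERM)
        {
          unlink (tmpname);
          return 0;
        }

      /* The holder is gone: delete the stale lock and retry.  */
      unlink (lockname);
    }

  unlink (tmpname);
  return -1;
}
```

A caller would release the lock by removing the lock file. Note that deleting a stale lock and relinking is not atomic, so two processes may still race; this sketch does not attempt to close that window.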
The name of the lock file is traditionally `LCK..' followed by the base name of the device. For example, to lock `/dev/ttyd0' the file `LCK..ttyd0' would be created. On SCO Unix, the lock file name is always forced to lower case even if the device name has upper case letters.
System V Release 4 UUCP names the lock file using the major and minor device numbers rather than the device name. The file is named `LK.XXX.YYY.ZZZ', where XXX, YYY and ZZZ are all three digit decimal numbers. XXX is the major device number of the device holding the directory holding the device file (e.g., `/dev'). YYY is the major device number of the device file itself. ZZZ is the minor device number of the device file itself. If s holds the result of passing the device to the stat system call (e.g., stat ("/dev/ttyd0", &s)), the following line of C code will print out the corresponding lock file name:
printf ("LK.%03d.%03d.%03d", major (s.st_dev), major (s.st_rdev), minor (s.st_rdev));
The advantage of this system is that even if there are several links to the same device, they will all use the same lock file name.
When two or more instances of uuxqt are executing, some sort of locking is needed to ensure that a single execution job is only started once. I don't know how most UUCP packages deal with this. Taylor UUCP uses a lock file for each execution job. The name of the lock file is the same as the name of the `X.*' file, except that the initial `X' is changed to an `L'. The lock file holds the process ID as described above.
UUCP `X.*' files control program execution. They are created by uux. They are transferred between systems just like any other file. The uuxqt daemon reads them to figure out how to execute the job requested by uux.
An `X.*' file is simply a text file. The first character of each line is a command, and the remainder of the line supplies arguments. The following commands are defined:
uux was executed, but it can also be a file from the local system or some other system. If the file is not from the local system, then the command will usually name a file in the spool directory. If the optional second argument appears, then the file should be copied to the execution directory under that name. This is necessary for any file other than the standard input file. If the standard input file is not from the local system, it will appear in both an `F' command and an `I' command.
execve system call. For some packages this is the default anyhow.
Here is an example. Given the following command executed on system test1
uux - test2!cat - test2!~ian/bar !qux '>~/gorp'
(this is only an example, as most UUCP systems will not permit the cat command to be executed) Taylor UUCP will produce something like the following `X.' file:
U ian test1
F D.test1N003r qux
O /usr/spool/uucppublic test1
F D.test1N003s
I D.test1N003s
C cat - ~ian/bar qux
The standard input will be read into a file and then transferred to the file `D.test1N003s' on system `test2'. The file `qux' will be transferred to `D.test1N003r' on system `test2'. When the command is executed, the latter file will be copied to the execution directory under the name `qux'. Note that since the file `~ian/bar' is already on the execution system, no action need be taken for it. The standard output will be collected in a file, then copied to the directory `/usr/spool/uucppublic' on the system `test1'.
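Since each line of an `X.*' file is a one character command followed by its arguments, reading one is mostly a matter of splitting lines. The following is a small sketch of that step; the names `struct xline' and `xfile_parse_line' are invented for this example, and skipping blanks between the command character and the arguments is an assumption based on the example file above.

```c
#include <stdio.h>
#include <string.h>

/* One parsed line of an X.* file.  */
struct xline
{
  char cmd;          /* The command character, e.g. 'F' or 'C'.  */
  const char *args;  /* The rest of the line, leading blanks skipped.  */
};

/* Split LINE in place into a command character and its arguments.
   Any trailing newline is removed.  Returns 0 on success, -1 if the
   line is empty.  */
int
xfile_parse_line (char *line, struct xline *out)
{
  size_t n = strlen (line);

  while (n > 0 && (line[n - 1] == '\n' || line[n - 1] == '\r'))
    line[--n] = '\0';
  if (n == 0)
    return -1;

  out->cmd = line[0];
  out->args = line + 1;
  while (*out->args == ' ' || *out->args == '\t')
    out->args++;
  return 0;
}
```

A reader such as uuxqt would call this on each line obtained from fgets and then act on the command character.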
The UUCP protocol is a conversation between two UUCP packages. A UUCP conversation consists of three parts: an initial handshake, a series of file transfer requests, and a final handshake.
Before the initial handshake, the caller will usually have logged in to the called machine and somehow started the UUCP package there. On Unix this is normally done by setting the shell of the login name used to `/usr/lib/uucp/uucico'.
All messages in the initial handshake begin with a ^P (a byte with the octal value `\020') and end with a null byte (`\000'). A few systems end these messages with a line feed character (`\012') instead of a null byte; the examples below assume a null byte is being used.
Some options below are supported by QFT, which stands for Queued File Transfer, and is (or was) an internal Bell Labs version of UUCP.
Taylor UUCP size negotiation was introduced by Taylor UUCP, and is also supported by DOS based UUPlus and Amiga based wUUCP and UUCP-1.17.
The initial handshake goes as follows. It is begun by the called machine.
Most UUCP packages will consider each locally supported protocol in turn and select the first one supported by the called UUCP. With some versions of HDB UUCP, this can be modified by giving a list of protocols after the device name in the `Devices' file or the `Systems' file. For example, to select the `e' protocol in `Systems',
airs Any ACU,e ...
or in `Devices',
ACU,e ttyXX ...
Taylor UUCP provides the protocol command, which may be used either for a system (see section Protocol Selection) or a port (see section The Port Configuration File).
UUPlus allows specification of the protocol string on a per-system basis in the `SYSTEMS' file.
The optional number following a `-N' sent by the calling system, or an `ROKN' sent by the called system, is a bitmask of features supported by the UUCP package. The optional number was introduced in Taylor UUCP version 1.04. The number is sent as an octal number with a leading zero. The following bits are currently defined. A missing number should be taken as `011'.
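Reading the optional feature number is straightforward. The sketch below assumes the caller has already isolated the text that follows `-N' or `ROKN'; the function name `parse_features' and the macro `DEFAULT_FEATURES' are hypothetical.

```c
#include <stdlib.h>

/* Value to assume when no number follows `-N' or `ROKN'.  */
#define DEFAULT_FEATURES 011

/* DIGITS is the text following `-N' or `ROKN' (possibly empty).
   Returns the feature bitmask, read as an octal number.  */
unsigned long
parse_features (const char *digits)
{
  char *end;
  unsigned long mask;

  if (digits == NULL || *digits == '\0')
    return DEFAULT_FEATURES;

  mask = strtoul (digits, &end, 8);
  if (end == digits)
    return DEFAULT_FEATURES;    /* No digits were present.  */
  return mask;
}
```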
After the protocol has been selected and the initial handshake has been completed, both sides turn on the selected protocol. For some protocols (notably `g') a further handshake is done at this point.
Each protocol supports a method for sending a command to the remote system. This method is used to transmit a series of commands between the two UUCP packages. At all times, one package is the master and the other is the slave. Initially, the calling UUCP is the master.
If a protocol error occurs during the exchange of commands, both sides move immediately to the final handshake.
The master will send one of five commands: `S', `R', `X', `E', or `H'.
Any file name referred to below is either an absolute file name beginning with `/', a public directory file name beginning with `~/', a file name relative to a user's home directory beginning with `~USER/', or a spool directory file name. File names in the spool directory are not absolute, but instead are converted to file names within the spool directory by UUCP. They always begin with `C.' (for a command file created by uucp or uux), `D.' (for a data file created by uucp, uux or by an execution, or received from another system for an execution), or `X.' (for an execution file created by uux or received from another system).
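The four kinds of file name can be told apart by their first few characters. Here is a sketch of such a classifier; the enumeration and the function name `classify_name' are made up for illustration, and a real UUCP package would apply further checks (permissions, path components) that are not shown.

```c
#include <string.h>

enum uucp_name_kind
{
  NAME_ABSOLUTE,   /* Begins with `/'.  */
  NAME_PUBLIC,     /* Begins with `~/'.  */
  NAME_HOME,       /* Begins with `~USER/'.  */
  NAME_SPOOL,      /* Begins with `C.', `D.' or `X.'.  */
  NAME_INVALID     /* Anything else.  */
};

enum uucp_name_kind
classify_name (const char *name)
{
  const char *slash;

  if (name[0] == '/')
    return NAME_ABSOLUTE;

  if (name[0] == '~')
    {
      if (name[1] == '/')
        return NAME_PUBLIC;
      /* `~USER/' needs a non-empty user name before the slash.  */
      slash = strchr (name + 1, '/');
      if (slash != NULL && slash != name + 1)
        return NAME_HOME;
      return NAME_INVALID;
    }

  if ((name[0] == 'C' || name[0] == 'D' || name[0] == 'X')
      && name[1] == '.')
    return NAME_SPOOL;

  return NAME_INVALID;
}
```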
After the `C' command response has been received (in the `SY' case) or immediately (in an `SN' case) the master will send another command.
After the protocol has been shut down, the final handshake is performed. This handshake has no real purpose, and some UUCP packages simply drop the connection rather than do it (in fact, some will drop the connection immediately after both sides agree to hang up, without even closing down the protocol).
That is, the calling UUCP sends six `O' characters and the called UUCP replies with seven `O' characters. Some UUCP packages always send six `O' characters.
The `g' protocol is a packet based flow controlled error correcting protocol that requires an eight bit clear connection. It is the original UUCP protocol, and is supported by all UUCP implementations. Many implementations of it are only able to support small window and packet sizes, specifically a window size of 3 and a packet size of 64 bytes, but the protocol itself can support up to a window size of 7 and a packet size of 4096 bytes. Complaints about the inefficiency of the `g' protocol generally refer to specific implementations, rather than to the correctly implemented protocol.
The `g' protocol was originally designed for general packet drivers, and thus contains some features that are not used by UUCP, including an alternate data channel and the ability to renegotiate packet and window sizes during the communication session.
The `g' protocol is spoofed by many Telebit modems. When spoofing is in effect, each Telebit modem uses the `g' protocol to communicate with the attached computer, but the data between the modems is sent using a Telebit proprietary error correcting protocol. This allows for very high throughput over the Telebit connection, which, because it is half-duplex, would not normally be able to handle the `g' protocol very well at all. When a Telebit is spoofing the `g' protocol, it forces the packet size to be 64 bytes and the window size to be 3.
This discussion of the `g' protocol explains how it works, but does not discuss useful error handling techniques. Some discussion of this can be found in Jamie E. Hanrahan's paper, cited above (see section UUCP Protocol Sources).
All `g' protocol communication is done with packets. Each packet begins with a six byte header. Control packets consist only of the header. Data packets contain additional data.
The header is as follows:
The control byte in the header is composed of three bit fields, referred to here as tt (two bits), xxx (three bits) and yyy (three bits). The control is ttxxxyyy, or (tt << 6) + (xxx << 3) + yyy.
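The layout of the control byte can be expressed as a pair of helper functions. The names `g_control_pack' and `g_control_unpack' are invented for this sketch.

```c
/* Combine the tt (two bit), xxx (three bit) and yyy (three bit)
   fields into a `g' protocol control byte.  */
unsigned char
g_control_pack (unsigned int tt, unsigned int xxx, unsigned int yyy)
{
  return (unsigned char) (((tt & 3) << 6) + ((xxx & 7) << 3) + (yyy & 7));
}

/* Split a control byte back into its three fields.  */
void
g_control_unpack (unsigned char control, unsigned int *tt,
                  unsigned int *xxx, unsigned int *yyy)
{
  *tt = (control >> 6) & 3;
  *xxx = (control >> 3) & 7;
  *yyy = control & 7;
}
```

For instance, tt = 2, xxx = 1 and yyy = 1 give (2 << 6) + (1 << 3) + 1 = 0x89.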
The tt field takes on the following values:
l - b1 valid bytes of data in the data field, beginning with the second byte. If b1 >= 128, let b2 be the second byte in the data field. Then there are l - ((b1 & 0x7f) + (b2 << 7)) valid bytes of data in the data field, beginning with the third byte. In all cases l bytes of data are sent (and all data bytes participate in the checksum calculation) but some of the trailing bytes may be dropped by the receiver. The xxx and yyy fields are described below.
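The length arithmetic for a short data packet is easy to get wrong, so here is a sketch of how a receiver might compute it. It assumes, as the formulas above imply, that the one byte form applies when b1 is less than 128; the function name `g_short_valid' and its error convention are hypothetical.

```c
/* DATA points to the data field of a short data packet and L is the
   physical length of that field.  On success, returns the number of
   valid data bytes and sets *OFFSET to the index of the first valid
   byte.  Returns -1 if the length bytes are inconsistent with L.  */
long
g_short_valid (const unsigned char *data, long l, long *offset)
{
  long diff;

  if (l < 1)
    return -1;

  if (data[0] < 128)
    {
      /* One length byte: l - b1 valid bytes from the second byte.  */
      diff = data[0];
      *offset = 1;
    }
  else
    {
      /* Two length bytes: valid data starts at the third byte.  */
      if (l < 2)
        return -1;
      diff = (data[0] & 0x7f) + ((long) data[1] << 7);
      *offset = 2;
    }

  /* The difference must cover the length bytes and fit in L.  */
  if (diff < *offset || diff > l)
    return -1;
  return l - diff;
}
```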
In a data packet (short or not) the xxx field gives the sequence number of the packet. Thus sequence numbers can range from 0 to 7, inclusive. The yyy field gives the sequence number of the last correctly received packet.
Each communication direction uses a window which indicates how many unacknowledged packets may be transmitted before waiting for an acknowledgement. The window may range from 1 to 7, and may be different in each direction. For example, if the window is 3 and the last packet acknowledged was packet number 6, packet numbers 7, 0 and 1 may be sent but the sender must wait for an acknowledgement before sending packet number 2. This acknowledgement could come as the yyy field of a data packet, or as the yyy field of a `RJ' or `RR' control packet (described below).
Each packet must be transmitted in order (the sender may not skip sequence numbers). Each packet must be acknowledged, and each packet must be acknowledged in order.
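Because sequence numbers wrap around modulo 8, the window test is best done with modular arithmetic. The following sketch (the function name `g_may_send' is an assumption) checks whether a given sequence number may be sent, given the last acknowledged packet and the window size.

```c
/* Return non-zero if packet number SEQ may be transmitted when the
   last acknowledged packet was LAST_ACKED and the window is WINDOW
   (1 to 7).  Sequence numbers are taken modulo 8.  */
int
g_may_send (unsigned int seq, unsigned int last_acked, unsigned int window)
{
  /* How far SEQ lies ahead of the last acknowledged packet.  */
  unsigned int ahead = (seq - last_acked) & 7;

  return ahead >= 1 && ahead <= window;
}
```

With a window of 3 and packet 6 acknowledged, this permits packets 7, 0 and 1 but not packet 2, matching the example above.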
In a control packet, the xxx field takes on the following values:
To compute the checksum, call the control byte (the fifth byte in the header) c.
The checksum of a control packet is simply 0xaaaa - c.
The checksum of a data packet is 0xaaaa - (check ^ c), where ^ denotes exclusive or, and check is the result of the following routine as run on the contents of the data field (every byte in the data field participates in the checksum, even for a short data packet). Below is the routine used by an early version of Taylor UUCP; it is a slightly modified version of a routine which John Gilmore patched from G.L. Chesson's original paper. The z argument points to the data and the c argument indicates how much data there is.
int
igchecksum (z, c)
     register const char *z;
     register int c;
{
  register unsigned int ichk1, ichk2;

  ichk1 = 0xffff;
  ichk2 = 0;

  do
    {
      register unsigned int b;

      /* Rotate ichk1 left. */
      if ((ichk1 & 0x8000) == 0)
        ichk1 <<= 1;
      else
        {
          ichk1 <<= 1;
          ++ichk1;
        }

      /* Add the next character to ichk1. */
      b = *z++ & 0xff;
      ichk1 += b;

      /* Add ichk1 xor the character position in the buffer counting from
         the back to ichk2. */
      ichk2 += ichk1 ^ c;

      /* If the character was zero, or adding it to ichk1 caused an
         overflow, xor ichk2 to ichk1. */
      if (b == 0 || (ichk1 & 0xffff) < b)
        ichk1 ^= ichk2;
    }
  while (--c > 0);

  return ichk1 & 0xffff;
}
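The two header checksum formulas can be wrapped in small functions. In this sketch the data checksum takes the result of the routine above as its `check' argument; the function names are hypothetical, and masking the result to 16 bits is an assumption about how the subtraction is meant to wrap.

```c
/* Checksum of a control packet whose control byte is C.  */
unsigned int
g_control_checksum (unsigned int c)
{
  return (0xaaaa - (c & 0xff)) & 0xffff;
}

/* Checksum of a data packet.  CHECK is the 16 bit result of running
   the data field checksum routine over the whole data field, and C
   is the control byte.  */
unsigned int
g_data_checksum (unsigned int check, unsigned int c)
{
  return (0xaaaa - ((check & 0xffff) ^ (c & 0xff))) & 0xffff;
}
```

For a one byte data field containing `A', the routine above returns 1, so with a control byte of 0x89 the header checksum would be 0xaaaa - (1 ^ 0x89) = 0xaa22.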
When the `g' protocol is started, the calling UUCP sends an `INITA' control packet with the window size it wishes the called UUCP to use. The called UUCP responds with an `INITA' packet with the window size it wishes the calling UUCP to use. Pairs of `INITB' and `INITC' packets are then similarly exchanged. When these exchanges are completed, the protocol is considered to have been started.
Note that the window and packet sizes are not a negotiation. Each system announces the window and packet size which the other system should use. It is possible that different window and packet sizes will be used in each direction. The protocol works this way on the theory that each system knows how much data it can accept without getting overrun. Therefore, each system tells the other how much data to send before waiting for an acknowledgement.
When a UUCP package transmits a command, it sends one or more data packets. All the data packets will normally be complete, although some UUCP packages may send the last one as a short packet. The command string is sent with a trailing null byte, to let the receiving package know when the command is finished. Some UUCP packages require the last byte of the last packet sent to be null, even if the command ends earlier in the packet. Some packages may require all the trailing bytes in the last packet to be null, but I have not confirmed this.
When a UUCP package sends a file, it will send a sequence of data packets. The end of the file is signalled by a short data packet containing zero valid bytes (it will normally be preceded by a short data packet containing the last few bytes in the file).
Note that the sequence numbers cover the entire communication session, including both command and file data.
When the protocol is shut down, each UUCP package sends a `CLOSE' control packet.
The `f' protocol is a seven bit protocol which checksums an entire file at a time. It only uses the characters between `\040' and `\176' (ASCII space and ~) inclusive, as well as the carriage return character. It can be very efficient for transferring text only data, but it is very inefficient at transferring eight bit data (such as compressed news). It is not flow controlled, and the checksum is fairly insecure over large files, so using it over a serial connection requires handshaking (XON/XOFF can be used) and error correcting modems. Some people think it should not be used even under those circumstances.
I believe that the `f' protocol originated in BSD versions of UUCP. It was originally intended for transmission over X.25 PAD links.
The `f' protocol has no startup or finish protocol. However, both sides typically sleep for a couple of seconds before starting up, because they switch the terminal into XON/XOFF mode and want to allow the changes to settle before beginning transmission.
When a UUCP package transmits a command, it simply sends a string terminated by a carriage return.
When a UUCP package transmits a file, each byte b of the file is translated according to the following table:
       0 <= b <=  037: 0172, b + 0100 (0100 to 0137)
     040 <= b <= 0171:       b        ( 040 to 0171)
    0172 <= b <= 0177: 0173, b - 0100 ( 072 to  077)
    0200 <= b <= 0237: 0174, b - 0100 (0100 to 0137)
    0240 <= b <= 0371: 0175, b - 0200 ( 040 to 0171)
    0372 <= b <= 0377: 0176, b - 0300 ( 072 to  077)
That is, a byte between `\040' and `\171' inclusive is transmitted as is, and all other bytes are prefixed and modified as shown.
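The translation table maps directly onto a chain of range tests. The following sketch encodes a single byte; the function name `f_encode_byte' is an assumption made for the example.

```c
#include <stddef.h>

/* Encode byte B for the `f' protocol.  Stores one or two bytes in
   OUT and returns how many were stored.  */
size_t
f_encode_byte (unsigned char b, unsigned char out[2])
{
  if (b <= 037)
    {
      out[0] = 0172;
      out[1] = (unsigned char) (b + 0100);
      return 2;
    }
  if (b <= 0171)
    {
      /* Transmitted as is.  */
      out[0] = b;
      return 1;
    }
  if (b <= 0177)
    {
      out[0] = 0173;
      out[1] = (unsigned char) (b - 0100);
      return 2;
    }
  if (b <= 0237)
    {
      out[0] = 0174;
      out[1] = (unsigned char) (b - 0100);
      return 2;
    }
  if (b <= 0371)
    {
      out[0] = 0175;
      out[1] = (unsigned char) (b - 0200);
      return 2;
    }
  out[0] = 0176;
  out[1] = (unsigned char) (b - 0300);
  return 2;
}
```

Every output byte falls between `\040' and `\176', which is what makes the protocol safe on a seven bit link.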
When all the file data is sent, a seven byte sequence is sent: two bytes of `\176' followed by four ASCII bytes of the checksum as printed in base 16 followed by a carriage return. For example, if the checksum was 0x1234, this would be sent: `\176\1761234\r'.
The checksum is initialized to 0xffff. For each byte that is sent it is modified as follows (where b is the byte before it has been transformed as described above):
/* Rotate the checksum left. */
if ((ichk & 0x8000) == 0)
  ichk <<= 1;
else
  {
    ichk <<= 1;
    ++ichk;
  }

/* Add the next byte into the checksum. */
ichk += b;
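Put together, the checksum and the seven byte end-of-file sequence might look like this. The function names are hypothetical. Keeping the checksum in a 16 bit variable and printing the hexadecimal digits in lower case are both assumptions; the text above does not say which case is used.

```c
#include <stdint.h>
#include <stdio.h>

/* Compute the `f' protocol checksum over LEN untransformed bytes.  */
uint16_t
f_checksum (const unsigned char *data, size_t len)
{
  uint16_t ichk = 0xffff;
  size_t i;

  for (i = 0; i < len; i++)
    {
      /* Rotate the checksum left.  */
      if ((ichk & 0x8000) == 0)
        ichk <<= 1;
      else
        {
          ichk <<= 1;
          ++ichk;
        }
      /* Add the next byte into the checksum.  */
      ichk += data[i];
    }
  return ichk;
}

/* Format the seven byte sequence sent after the file data: two
   `\176' bytes, four hexadecimal digits and a carriage return.
   OUT must have room for the trailing null byte as well.  */
void
f_trailer (uint16_t ichk, char out[8])
{
  snprintf (out, 8, "\176\176%04x\r", (unsigned int) ichk);
}
```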
When the receiving UUCP sees the checksum, it compares it against its own calculated checksum and replies with a single character followed by a carriage return.
The sending UUCP checks the returned character and acts accordingly.
The `t' protocol is intended for use on links which provide reliable end-to-end connections, such as TCP. It does no error checking or flow control, and requires an eight bit clear channel.
I believe the `t' protocol originated in BSD versions of UUCP.
When a UUCP package transmits a command, it first gets the length of the command string, c. It then sends ((c / 512) + 1) * 512 bytes (the smallest multiple of 512 which can hold c bytes plus a null byte) consisting of the command string itself followed by trailing null bytes.
When a UUCP package sends a file, it sends it in blocks. Each block contains at most 1024 bytes of data. Each block consists of four bytes containing the amount of data in binary (most significant byte first, the same format as used by the Unix function htonl) followed by that amount of data. The end of the file is signalled by a block containing zero bytes of data.
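The command padding and the block header described above amount to very little code. This sketch uses made-up function names and writes the header bytes by hand rather than relying on htonl.

```c
#include <stddef.h>

/* Number of bytes sent for a command string of length C: the
   smallest multiple of 512 that holds C bytes plus a null byte.  */
size_t
t_command_length (size_t c)
{
  return ((c / 512) + 1) * 512;
}

/* Store the four byte count that precedes each file block, most
   significant byte first.  N is at most 1024; zero marks the end
   of the file.  */
void
t_block_header (unsigned long n, unsigned char hdr[4])
{
  hdr[0] = (unsigned char) ((n >> 24) & 0xff);
  hdr[1] = (unsigned char) ((n >> 16) & 0xff);
  hdr[2] = (unsigned char) ((n >> 8) & 0xff);
  hdr[3] = (unsigned char) (n & 0xff);
}
```

Note that a command of exactly 512 bytes takes 1024 bytes on the wire, because room is always left for the null byte.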
The `e' protocol is similar to the `t' protocol. It does no flow control or error checking and is intended for use over networks providing reliable end-to-end connections, such as TCP.
The `e' protocol originated in versions of HDB UUCP.
When a UUCP package transmits a command, it simply sends the command as an ASCII string terminated by a null byte.
When a UUCP package transmits a file, it sends the complete size of the file as an ASCII decimal number. The ASCII string is padded out to 20 bytes with null bytes (i.e. if the file is 1000 bytes long, it sends `1000\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0'). It then sends the entire file.
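The 20 byte size field can be produced as follows. The function name `e_size_header' is hypothetical, and rejecting a size that would need all 20 bytes for digits (leaving no null byte) is an assumption of this sketch.

```c
#include <stdio.h>
#include <string.h>

#define E_SIZE_FIELD 20

/* Fill OUT with the file size as an ASCII decimal number padded to
   20 bytes with null bytes.  Returns 0 on success, -1 if the number
   does not fit.  */
int
e_size_header (unsigned long size, unsigned char out[E_SIZE_FIELD])
{
  char digits[32];
  int n = snprintf (digits, sizeof digits, "%lu", size);

  if (n < 0 || n >= E_SIZE_FIELD)
    return -1;
  memset (out, 0, E_SIZE_FIELD);
  memcpy (out, digits, (size_t) n);
  return 0;
}
```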
The `G' protocol is used by SVR4 UUCP. It is identical to the `g' protocol, except that it is possible to modify the window and packet sizes. The SVR4 implementation of the `g' protocol reportedly is fixed at a packet size of 64 and a window size of 7. Supposedly SVR4 chose to implement a new protocol using a new letter to avoid any potential incompatibilities when using different packet or window sizes.
Most implementations of the `g' protocol that accept packets larger than 64 bytes will also accept packets smaller than whatever they requested in the `INITB' packet. The SVR4 `G' implementation is an exception; it will only accept packets of precisely the size it requests in the INITB packet.
The `i' protocol was written by Ian Lance Taylor (who also wrote this manual). It was first used by Taylor UUCP version 1.04.
It is a sliding window packet protocol, like the `g' protocol, but it supports bidirectional transfers (i.e., file transfers in both directions simultaneously). It requires an eight bit clear connection.
Several ideas for the protocol were taken from the paper A High-Throughput Message Transport System by P. Lauder. I don't know where the paper was published, but the author's e-mail address is piers@cs.su.oz.au. The `i' protocol does not adopt his main idea, which is to dispense with windows entirely. This is because some links still do require flow control and, more importantly, because using windows sets a limit to the amount of data which the protocol must be able to resend upon request. To reduce the costs of window acknowledgements, the protocol uses a large window and only requires an ack at the halfway point.
Each packet starts with a six byte header, optionally followed by data bytes with a four byte checksum. There are currently six defined packet types (`DATA', `SYNC', `ACK', `NAK', `SPOS', `CLOSE') which are described below. Although any packet type may include data, any data provided with an `ACK', `NAK' or `CLOSE' packet is ignored.
Every `DATA', `SPOS' and `CLOSE' packet has a sequence number. The sequence numbers are independent for each side. The first packet sent by each side is always number 1. Each packet is numbered one greater than the previous packet, modulo 32.
Every packet has a local channel number and a remote channel number. For all packets at least one channel number is zero. When a UUCP command is sent to the remote system, it is assigned a non-zero local channel number. All packets associated with that UUCP command sent by the local system are given the selected local channel number. All associated packets sent by the remote system are given the selected number as the remote channel number. This permits each UUCP command to be uniquely identified by the channel number on the originating system, and therefore each UUCP package can associate all file data and UUCP command responses with the appropriate command. This is a requirement for bidirectional UUCP transfers.
The protocol maintains a single global file position, which starts at 0. For each incoming packet, any associated data is considered to occur at the current file position, and the file position is incremented by the amount of data contained. The exception is a packet of type `SPOS', which is used to change the file position. The reason for keeping track of the file position is described below.
The header is as follows:
(packet << 3) + locchan
(ack << 3) + remchan
(type << 5) + (caller << 4) + len1
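The three expressions above pack several small fields into single header bytes. A sketch of that packing follows. The function names are invented, and the field widths (five bits for packet and ack numbers, three for channel numbers and the type, one for the caller flag, four for len1) are inferred from the shift counts rather than stated in the text.

```c
/* Header byte holding the packet sequence number and the local
   channel number.  */
unsigned char
i_byte_seq (unsigned int packet, unsigned int locchan)
{
  return (unsigned char) (((packet & 31) << 3) + (locchan & 7));
}

/* Header byte holding the acknowledged packet number and the remote
   channel number.  */
unsigned char
i_byte_ack (unsigned int ack, unsigned int remchan)
{
  return (unsigned char) (((ack & 31) << 3) + (remchan & 7));
}

/* Header byte holding the packet type, the caller flag and the len1
   field.  */
unsigned char
i_byte_type (unsigned int type, unsigned int caller, unsigned int len1)
{
  return (unsigned char) (((type & 7) << 5) + ((caller & 1) << 4)
                          + (len1 & 15));
}
```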
If the data length is non-zero, the packet is immediately followed by the specified number of data bytes. The data bytes are followed by a four byte CRC 32 checksum, with the most significant byte first. The CRC is calculated over the contents of the data field.
The defined packet types are as follows:
When the protocol starts up, both systems send a `SYNC' packet. The `SYNC' packet includes at least three bytes of data. The first two bytes are the maximum packet size the remote system should send, most significant byte first. The third byte is the window size the remote system should use. The remote system may send packets of any size up to the maximum. If there is a fourth byte, it is the number of channels the remote system may use (this must be between 1 and 7, inclusive). Additional data bytes may be defined in the future.
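Decoding the data of a `SYNC' packet could be done along these lines. The structure and function names are hypothetical, and using zero to mean that no channel count was sent is a convention chosen for this sketch.

```c
#include <stddef.h>

struct i_sync
{
  unsigned int max_packet;  /* Largest packet the remote may send.  */
  unsigned int window;      /* Window size the remote should use.  */
  unsigned int channels;    /* Channels the remote may use; 0 if absent.  */
};

/* Decode the LEN data bytes of a SYNC packet.  Returns 0 on success,
   -1 if the data is too short or the channel count is out of range.
   Bytes beyond the fourth are ignored.  */
int
i_sync_parse (const unsigned char *data, size_t len, struct i_sync *out)
{
  if (len < 3)
    return -1;

  /* Maximum packet size, most significant byte first.  */
  out->max_packet = ((unsigned int) data[0] << 8) | data[1];
  out->window = data[2];
  out->channels = 0;

  if (len >= 4)
    {
      if (data[3] < 1 || data[3] > 7)
        return -1;
      out->channels = data[3];
    }
  return 0;
}
```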
The window size is the number of packets that may be sent before a packet is acknowledged. There is no requirement that every packet be acknowledged; any acknowledgement is considered to acknowledge all packets through the number given. In the current implementation, if one side has no data to send, it sends an `ACK' when half the window is received.
Note that the `NAK' packet corresponds to the unused `g' protocol `SRJ' packet type, rather than to the `RJ' packet type. When a `NAK' is received, only the named packet should be resent, not any subsequent packets.
Note that if both sides have data to send, but a packet is lost, it is perfectly reasonable for one side to continue sending packets, all of which will acknowledge the last packet correctly received, while the system whose packet was lost will be unable to send a new packet because the send window will be full. In this circumstance, neither side will time out and one side of the communication will be effectively shut down for a while. Therefore, any system with outstanding unacknowledged packets should arrange to time out and resend a packet even if data is being received.
Commands are sent as a sequence of data packets with a non-zero local channel number. The last data packet for a command includes a trailing null byte (normally a command will fit in a single data packet). Files are sent as a sequence of data packets ending with one of length zero.
The channel numbers permit a more efficient implementation of the UUCP file send command. Rather than send the command and then wait for the `SY' response before sending the file, the file data is sent beginning immediately after the `S' command is sent. If an `SN' response is received, the file send is aborted, and a final data packet of length zero is sent to indicate that the channel number may be reused. If an `SY' response with a file position indicator is received, the file send adjusts to the file position; this is why the protocol maintains a global file position.
Note that the use of channel numbers means that each UUCP system may send commands and file data simultaneously. Moreover, each UUCP system may send multiple files at the same time, using the channel number to disambiguate the data. Sending a file before receiving an acknowledgement for the previous file helps to eliminate the round trip delays inherent in other UUCP protocols.
The `j' protocol is a variant of the `i' protocol. It was also written by Ian Lance Taylor, and first appeared in Taylor UUCP version 1.04.
The `j' protocol is a version of the `i' protocol designed for communication links which intercept a few characters, such as XON or XOFF. It is not efficient to use it on a link which intercepts many characters, such as a seven bit link. The `j' protocol performs no error correction or detection; that is presumed to be the responsibility of the `i' protocol.
When the `j' protocol starts up, each system sends a printable ASCII string indicating which characters it wants to avoid using. The string begins with the ASCII character ^ (octal 136) and ends with the ASCII character ~ (octal 176). After sending this string, each system looks for the corresponding string from the remote system. The strings are composed of escape sequences: `\ooo', where `o' is an octal digit. For example, sending the string `^\021\023~' means that the ASCII XON and XOFF characters should be avoided. The union of the characters described in both strings (the string which is sent and the string which is received) is the set of characters which must be avoided in this conversation. Avoiding a printable ASCII character (octal 040 to octal 176, inclusive) is not permitted.
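The start up string can be built and parsed with a few lines of C. The sketch below is only an illustration of the format described above; the function names `j_format_avoid' and `j_parse_avoid' and the 256 entry flag array are assumptions made for the example. Calling the parser on both the sent and the received string with the same array yields the union of the two sets.

```c
#include <stddef.h>
#include <stdio.h>

/* Build "^\ooo...~" from a table of flags (avoid[c] != 0 means avoid c).
   Returns 0 on success, -1 if the output buffer is too small. */
static int j_format_avoid(const unsigned char avoid[256],
                          char *out, size_t outsz)
{
    size_t used = 0;
    int c;

    if (outsz < 3)
        return -1;
    out[used++] = '^';
    for (c = 0; c < 256; c++) {
        if (!avoid[c])
            continue;
        if (used + 4 + 2 > outsz)   /* room for \ooo, '~' and NUL */
            return -1;
        snprintf(out + used, outsz - used, "\\%03o", (unsigned)c);
        used += 4;
    }
    out[used++] = '~';
    out[used] = '\0';
    return 0;
}

/* Parse "^\ooo...~", setting avoid[c] for each character named.
   Returns 0 on success, -1 if the string is malformed or names a
   printable ASCII character (octal 040 to 176), which is not permitted. */
static int j_parse_avoid(const char *s, unsigned char avoid[256])
{
    if (*s++ != '^')
        return -1;
    while (*s == '\\') {
        unsigned v = 0;
        int i;

        for (i = 1; i <= 3; i++) {
            if (s[i] < '0' || s[i] > '7')
                return -1;
            v = v * 8 + (unsigned)(s[i] - '0');
        }
        if (v > 0377 || (v >= 040 && v <= 0176))
            return -1;
        avoid[v] = 1;
        s += 4;
    }
    return *s == '~' ? 0 : -1;
}
```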
After the exchange of characters to avoid, the normal `i' protocol start up is done, and the rest of the conversation uses the normal `i' protocol. However, each `i' protocol packet is wrapped to become a `j' protocol packet.
Each `j' protocol packet consists of a seven byte header, followed by data bytes, followed by index bytes, followed by a one byte trailer. The packet header looks like this:
(high - 040) * 0100 + (low - 040), where 040 <= high < 0177 and 040 <= low < 0140. This permits a length of 6079 bytes, but there is a further restriction on packet size described below.
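The two byte printable length encoding given by this formula can be sketched as follows. The function names `j_encode_len' and `j_decode_len' are hypothetical; the sketch assumes the formula applies to a byte count carried in a pair of header characters such as data-high and data-low.

```c
/* Decode a two character printable length.
   Returns the length, or -1 if either character is out of range. */
static int j_decode_len(unsigned char high, unsigned char low)
{
    if (high < 040 || high >= 0177 || low < 040 || low >= 0140)
        return -1;
    return (high - 040) * 0100 + (low - 040);
}

/* Encode a length as two printable characters.
   Returns 0 on success, -1 if the length cannot be represented. */
static int j_encode_len(int len, unsigned char *high, unsigned char *low)
{
    if (len < 0 || len > 6079)
        return -1;
    *high = (unsigned char)(len / 0100 + 040);
    *low = (unsigned char)(len % 0100 + 040);
    return 0;
}
```

The largest value, 6079, encodes as the characters 0176 and 0137, which are the upper limits permitted by the ranges above.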
The header is followed by the number of data bytes given in data-high and data-low. These data bytes are the `i' protocol packet which is being wrapped in the `j' protocol packet. However, each character in the `i' protocol packet which the `j' protocol must avoid is transformed into a printable ASCII character (recall that avoiding a printable ASCII character is not permitted). Two index bytes are used for each character which must be transformed.
The index bytes immediately follow the data bytes. The index bytes are created in pairs. Each pair of index bytes encodes the location of a character in the `i' protocol packet which was transformed to become a printable ASCII character. Each pair of index bytes also encodes the precise transformation which was performed.
When the sender finds a character which must be avoided, it will transform it using one or two operations. If the character is 0200 or greater, it will subtract 0200. If the resulting character is less than 020, or is equal to 0177, it will xor by 020. The result is a printable ASCII character.
The zero based byte index of the character within the `i' protocol packet is determined. This index is turned into a two byte printable ASCII index, index-high and index-low, such that the index is (index-high - 040) * 040 + (index-low - 040). index-low is restricted such that 040 <= index-low < 0100. index-high is not permitted to be 0176, so 040 <= index-high < 0176. index-low is then modified to encode the transformation:
The receiver decodes the index bytes as follows (this is the reverse of the operations performed by the sender, presented here for additional clarity):
If 040 <= index-high < 0176, the index refers to the data byte at position (index-high - 040) * 040 + index-low % 040.
If 040 <= index-low < 0100, then 0200 must be added to the indexed byte.
If 0100 <= index-low < 0140, then 020 must be xor'ed to the indexed byte.
If 0140 <= index-low < 0177, then 0200 must be added to the indexed byte, and 020 must be xor'ed to the indexed byte.
If index-high == 0176, the index refers to the data byte at position (index-low - 040) * 040 + 037. 0200 must be added to the indexed byte, and 020 must be xor'ed to the indexed byte.
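The receiver rules above translate fairly directly into code. The following is a sketch only, written from the rules as stated here rather than from the Taylor UUCP source; the function name `j_apply_index' and its error handling are assumptions made for the example.

```c
#include <stddef.h>

/* Apply one pair of index bytes to the data bytes of a `j' packet,
   restoring the original `i' protocol byte in place.
   Returns 0 on success, -1 if the pair is invalid or the position
   lies outside the data. */
static int j_apply_index(unsigned char *data, size_t len,
                         unsigned char index_high, unsigned char index_low)
{
    size_t pos;
    int add = 0, xor = 0;

    if (index_low < 040 || index_low >= 0177)
        return -1;
    if (index_high >= 040 && index_high < 0176) {
        pos = (size_t)(index_high - 040) * 040 + index_low % 040;
        if (index_low < 0100)
            add = 1;                /* only 0200 was subtracted */
        else if (index_low < 0140)
            xor = 1;                /* only the xor by 020 was done */
        else
            add = xor = 1;          /* both operations were done */
    } else if (index_high == 0176) {
        pos = (size_t)(index_low - 040) * 040 + 037;
        add = xor = 1;
    } else {
        return -1;
    }
    if (pos >= len)
        return -1;
    if (xor)
        data[pos] ^= 020;
    if (add)
        data[pos] = (unsigned char)(data[pos] + 0200);
    return 0;
}
```

The receiver would call this once for each pair of index bytes and then hand the data bytes to the `i' protocol routine without copying them.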
This means the largest `i' protocol packet which may be wrapped inside a `j' protocol packet is (0175 - 040) * 040 + (077 - 040) == 3007 bytes.
The final character in a `j' protocol packet, following the index bytes, is the ASCII character ~ (octal 176).
The motivation behind using an indexing scheme, rather than escape characters, is to avoid data movement. The sender may simply add a header and a trailer to the `i' protocol packet. Once the receiver has loaded the `j' protocol packet, it may scan the index bytes, transforming the data bytes, and then pass the data bytes directly on to the `i' protocol routine.
The `x' protocol is used in Europe (and probably elsewhere) with machines that contain a built-in X.25 card and can send eight bit data transparently across X.25 circuits, without interference from the X.28 or X.29 layers. The protocol sends packets of 512 bytes, and relies on a write of zero bytes being read as zero bytes without stopping communication. It first appeared in the original System V UUCP implementation.
The `y' protocol was developed by Jorge Cwik for use in FX UUCICO, a PC uucico program. It is designed for communication lines which handle error correction and flow control. It requires an eight bit clean connection. It performs error detection, but not error correction: when an error is detected, the line is dropped. It is a streaming protocol, like the `f' protocol; there are no packet acknowledgements, so the protocol is efficient over a half-duplex communication line such as PEP.
Every packet contains a six byte header:
When the protocol starts up, each side must send a sync packet. This is a packet with a normal six byte header followed by data. The sequence number of the sync packet should be 0. Currently at least four bytes of data must be sent with the sync packet. Additional bytes should be ignored. They are defined as follows:
A length field with the high bit set is a control packet. The following control packet types are defined:
If a control packet other than `YPKT_ACK' is received, the connection is dropped. If a checksum error is detected for a received packet, a `YPKT_ERR' control packet is sent, and the connection is dropped. If a packet is received out of sequence, a `YPKT_BAD' control packet is sent, and the connection is dropped.
The checksum is initialized to 0xffff. For each data byte in a packet it is modified as follows (where b is the byte before it has been transformed as described above):
    /* Rotate the checksum left. */
    if ((ichk & 0x8000) == 0)
      ichk <<= 1;
    else
      {
        ichk <<= 1;
        ++ichk;
      }

    /* Add the next byte into the checksum. */
    ichk += b;
This is the same algorithm as that used by the `f' protocol.
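For reference, the per byte step can be wrapped in a function which computes the checksum of a whole buffer. This is a sketch: the name `y_checksum' is hypothetical, and it assumes the checksum is a 16 bit quantity, so the rotation and the addition both wrap at 16 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Compute the checksum over len bytes: start from 0xffff, then for
   each byte rotate the 16 bit checksum left by one and add the byte. */
static uint16_t y_checksum(const unsigned char *data, size_t len)
{
    uint16_t ichk = 0xffff;
    size_t i;

    for (i = 0; i < len; i++) {
        /* Rotate left by one bit within 16 bits. */
        ichk = (uint16_t)((ichk << 1) | (ichk >> 15));
        /* Add the next byte, discarding any carry out of 16 bits. */
        ichk = (uint16_t)(ichk + data[i]);
    }
    return ichk;
}
```

With no data the result is the initial value 0xffff; for the three bytes 01 02 03 the result is 0x0007.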
A command is sent as a sequence of data packets followed by a null byte. In the normal case, a command will fit into a single packet. The packet should be exactly the length of the command plus a null byte. If the command is too long, more packets are sent as required.
A file is sent as a sequence of data packets, ending with a zero length packet. The data packets may be of any length greater than zero and less than or equal to the maximum permitted packet size specified in the initial sync packet.
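The packetization rules for commands and files can be sketched by computing the data length of each packet. The functions below are illustrative only; their names, and the use of (size_t)-1 as an error value, are assumptions made for the example. The parameter max_packet stands for the maximum packet size taken from the sync packet.

```c
#include <stddef.h>
#include <string.h>

/* Split total bytes into packet lengths of at most max_packet each.
   Returns the number of packets, or (size_t)-1 if lens[] is too small. */
static size_t split_lengths(size_t total, size_t max_packet,
                            size_t *lens, size_t cap)
{
    size_t n = 0;

    while (total > 0 && n < cap) {
        size_t take = total < max_packet ? total : max_packet;
        lens[n++] = take;
        total -= take;
    }
    return total == 0 ? n : (size_t)-1;
}

/* A command is its text plus the trailing null byte. */
static size_t y_command_packets(const char *cmd, size_t max_packet,
                                size_t *lens, size_t cap)
{
    return split_lengths(strlen(cmd) + 1, max_packet, lens, cap);
}

/* A file is its data followed by one zero length packet. */
static size_t y_file_packets(size_t file_size, size_t max_packet,
                             size_t *lens, size_t cap)
{
    size_t n = split_lengths(file_size, max_packet, lens, cap);

    if (n == (size_t)-1 || n >= cap)
        return (size_t)-1;
    lens[n++] = 0;
    return n;
}
```

A short command therefore occupies exactly one packet whose length is the command length plus one, and an empty file is sent as a single zero length packet.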
After the zero length packet ending a file transfer has been received, the receiving system sends a `YPKT_ACK' control packet. The sending system waits for the `YPKT_ACK' control packet before continuing; this wait should be done with a large timeout, since there may be a considerable amount of data buffered on the communication path.
The `d' protocol is apparently used for DataKit muxhost (not RS-232) connections. No file size is sent. When a file has been completely transferred, a write of zero bytes is done; this must be read as zero bytes on the other end.
The `h' protocol is apparently used in some places with HST modems. It does no error checking, and is not that different from the `t' protocol. I don't know the details.
The `v' protocol is used by UUPC/extended, a PC UUCP program. It is simply a version of the `g' protocol which supports packets of any size, and also supports sending packets of different sizes during the same conversation. There are many `g' protocol implementations which support both, but there are also many which do not. Using `v' ensures that everything is supported.