scanf, fscanf, sscanf, or wsscanf Subroutine

Purpose

Converts formatted input.

Library

Standard C Library (libc.a) or (libc128.a)

Syntax

#include <stdio.h>

int scanf ( Format [, Pointer, ... ]) const char *Format;

int fscanf (Stream, Format [, Pointer, ... ]) FILE * Stream; const char *Format;

int sscanf (String, Format [, Pointer, ... ]) const char * String, *Format;

int wsscanf (wcs, Format [, Pointer, ... ]) const wchar_t * wcs; const char *Format;

Description

The scanf, fscanf, sscanf, and wsscanf subroutines read character data, interpret it according to a format, and store the converted results into specified memory locations. If the subroutine receives insufficient arguments for the format, the results are unreliable. If the format is exhausted while arguments remain, the subroutine evaluates the excess arguments but otherwise ignores them.

These subroutines read their input from the following sources:

Item	Description
scanf	Reads from standard input (stdin).
fscanf	Reads from the Stream parameter.
sscanf	Reads from the character string specified by the String parameter.
wsscanf	Reads from the wide character string specified by the wcs parameter.

The scanf, fscanf, sscanf, and wsscanf subroutines can detect a language-dependent radix character, defined in the program's locale (LC_NUMERIC), in the input string. In the C locale, or in a locale that does not define the radix character, the default radix character is a . (period).

Parameters

Item	Description
wcs	Specifies the wide-character string to be read.
Stream	Specifies the input stream.
String	Specifies input to be read.
Pointer	Specifies where to store the interpreted data.
Format	Contains conversion specifications used to interpret the input. If there are insufficient arguments for the Format parameter, the results are unreliable. If the Format parameter is exhausted while arguments remain, the excess arguments are evaluated as always but are otherwise ignored.

The Format parameter can contain the following:

Space characters (blank, tab, new-line, vertical-tab, or form-feed characters) that, except in the following two cases, read the input up to the next nonwhite space character. Unless a match in the control string exists, trailing white space (including a new-line character) is not read.
Any character except a % (percent sign), which must match the next character of the input stream.
A conversion specification that directs the conversion of the next input field.

The conversion specification consists of the following:

The % (percent sign) or the character sequence %n$.
The optional assignment-suppression character * (asterisk).
An optional decimal integer that specifies the maximum field width.
An optional character that sets the size of the receiving variable for some flags.

Note: The %n$ character sequence is an X/Open numbered argument specifier. Guidelines for use of the %n$ specifier are:

The value of n in %n$ must be a decimal number without leading 0's and must be in the range from 1 to the NL_ARGMAX value, inclusive. See the limits.h file for more information about the NL_ARGMAX value. Using leading 0's (octal numbers) or a larger n value can have unpredictable results.
Mixing numbered and unnumbered argument specifications in a format string can have unpredictable results. The only exceptions are %% (two percent signs) and %* (percent sign, asterisk), which can be mixed with the %n$ form.
Referencing numbered arguments in the argument list from the format string more than once can have unpredictable results.

Use the following optional characters to set the size of the receiving variable:

l

Long integer rather than an integer when preceding the d, i, or n conversion codes; unsigned long integer rather than unsigned integer when preceding the o, u, or x conversion codes; double rather than float when preceding the e, f, or g conversion codes.

ll

Long long integer rather than an integer when preceding the d, i, or n conversion codes; unsigned long long integer rather than unsigned integer when preceding the o, u, or x conversion codes.

L

A long double rather than a float, when preceding the e, f, or g conversion codes; long integer rather than an integer when preceding the d, i, or n conversion codes; unsigned long integer rather than unsigned integer when preceding the o, u, or x conversion codes.

h

A short integer rather than an integer when preceding the d, i, and n conversion codes; an unsigned short integer (half integer) rather than an unsigned integer when preceding the o, u, or x conversion codes.

H

_Decimal32 rather than a float, when preceding the e, E, f, F, g, or G conversion codes.

D

_Decimal64 rather than a float, when preceding the e, E, f, F, g, or G conversion codes.

DD

_Decimal128 rather than a float, when preceding the e, E, f, F, g, or G conversion codes.
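As an illustration of the size modifiers, the following minimal sketch reads one value of each of several widths with sscanf. The struct sizes type and the parse_sizes function are hypothetical names introduced only for this example. It uses only the modifiers defined by standard C (h, l, ll, and L before f); the decimal floating-point modifiers (H, D, DD) are platform-specific and are not shown.

```c
#include <stdio.h>

/* Hypothetical container for values of different widths. */
struct sizes {
    short s;
    long l;
    long long ll;
    double d;
    long double ld;
};

/* Reads five whitespace-separated numbers from text.
   Each size modifier must match the type of the receiving variable.
   Returns the number of fields assigned (5 when every field converts). */
int parse_sizes(const char *text, struct sizes *out)
{
    return sscanf(text, "%hd %ld %lld %lf %Lf",
                  &out->s, &out->l, &out->ll, &out->d, &out->ld);
}
```

A mismatch between the modifier and the pointer type (for example, %d with a pointer to a short) is not diagnosed at run time, so the format string and the argument list need to be kept in step by hand.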

Format (cont.)

An optional character that sets the size of the receiving variable for vector data types. Use the following optional characters:
v

vector float (four 4-byte float components) when preceding the e, E, f, g, G, a, or A conversion codes; vector signed char (sixteen 1-byte char components) when preceding the c, d, or i conversion codes; vector unsigned char when preceding the o, u, x, or X conversion codes.

vl or lv

vector signed integer (four 4-byte integer components) when preceding the d or i conversion codes; vector unsigned integer when preceding the o, u, x, or X conversion codes.

vh or hv

vector signed short (eight 2-byte integer components) when preceding the d or i conversion codes; vector unsigned short when preceding the o, u, x, or X conversion codes.

For any of the preceding specifiers, an optional separator character can be specified immediately preceding the vector size specifier. If no separator is specified, the default separator is a space unless the conversion is c, in which case the default separator is null. The set of supported optional separators are , (comma), ; (semicolon), : (colon), and _ (underscore).
A conversion code that specifies the type of conversion to be applied.
The conversion specification takes the form:
```
%[*][width][size]convcode
```

The results from the conversion are placed in the memory location designated by the Pointer parameter unless you specify assignment suppression with an * (asterisk). Assignment suppression provides a way to describe an input field to be skipped. The input field is a string of nonwhite space characters. It extends to the next inappropriate character or until the field width, if specified, is exhausted.
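Assignment suppression and the field width can be combined, as in the following sketch. The parse_year_day function and the date layout (YYYY-MM-DD) are assumptions made for the example. The month field is described in the format so that it is skipped, but it has no corresponding Pointer parameter.

```c
#include <stdio.h>

/* Extracts the year and day from a date such as "2024-06-15".
   %4d and %2d limit the field width; %*2d reads the month but
   discards it, so only two pointers are passed.
   Returns the number of assigned fields (2 on success). */
int parse_year_day(const char *date, int *year, int *day)
{
    return sscanf(date, "%4d-%*2d-%2d", year, day);
}
```

The suppressed field does not count toward the return value, which is why a successful call reports 2 rather than 3.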

The conversion code indicates how to interpret the input field. The corresponding Pointer parameter must be a restricted type. Do not specify the Pointer parameter for a suppressed field. You can use the following conversion codes:

%

Accepts a single % (percent sign) input at this point; no assignment or conversion is done. The complete conversion specification should be %% (two percent signs).

d

Accepts an optionally signed decimal integer with the same format as that expected for the subject sequence of the strtol subroutine with a value of 10 for the base parameter. If no size modifier is specified, the Pointer parameter should be a pointer to an integer.

i

Accepts an optionally signed integer with the same format as that expected for the subject sequence of the strtol subroutine with a value of 0 for the base parameter. If no size modifier is specified, the Pointer parameter should be a pointer to an integer.

u

Accepts an optionally signed decimal integer with the same format as that expected for the subject sequence of the strtoul subroutine with a value of 10 for the base parameter. If no size modifier is specified, the Pointer parameter should be a pointer to an unsigned integer.

o

Accepts an optionally signed octal integer with the same format as that expected for the subject sequence of the strtoul subroutine with a value of 8 for the base parameter. If no size modifier is specified, the Pointer parameter should be a pointer to an unsigned integer.

x

Accepts an optionally signed hexadecimal integer with the same format as that expected for the subject sequence of the strtoul subroutine with a value of 16 for the base parameter. If no size modifier is specified, the Pointer parameter should be a pointer to an unsigned integer.

e, f, or g

Accepts an optionally signed floating-point number with the same format as that expected for the subject sequence of the strtod subroutine. The next field is converted accordingly and stored through the corresponding parameter; if no size modifier is specified, this parameter should be a pointer to a float. The input format for floating-point numbers is a string of digits, with some optional characteristics:

It can be a signed value.
It can be an exponential value, containing a decimal rational number followed by an exponent field, which consists of an E or an e followed by an (optionally signed) integer.
It can be one of the special values INF, NaNQ, or NaNS. This value is translated into the IEEE-754 value for infinity, quiet NaN, or signaling NaN, respectively.

p

Matches an unsigned hexadecimal integer, the same as the %p conversion of the printf subroutine. The corresponding parameter is a pointer to a void pointer. If the input item is a value converted earlier during the same program execution, the resulting pointer compares equal to that value; otherwise, the results of the %p conversion are unpredictable.

n

Consumes no input. The corresponding parameter is a pointer to an integer into which the scanf, fscanf, sscanf, or wsscanf subroutine writes the number of characters (including wide characters) read from the input stream. The assignment count returned at the completion of this function is not incremented.

s

Accepts a sequence of nonwhite space characters (scanf, fscanf, and sscanf subroutines). The wsscanf subroutine accepts a sequence of nonwhite-space wide-character codes; this sequence is converted to a sequence of characters in the same manner as the wcstombs subroutine. The Pointer parameter should be a pointer to the initial byte of a char, signed char, or unsigned char array large enough to hold the sequence and a terminating null-character code, which is automatically added.

S

Accepts a sequence of nonwhite space characters (scanf, fscanf, and sscanf subroutines). This sequence is converted to a sequence of wide-character codes in the same manner as the mbstowcs subroutine. The wsscanf subroutine accepts a sequence of nonwhite-space wide character codes. The Pointer parameter should be a pointer to the initial wide character code of an array large enough to accept the sequence and a terminating null wide character code, which is automatically added. If the field width is specified, it denotes the maximum number of characters to accept.

c

Accepts a sequence of bytes of the number specified by the field width (scanf, fscanf, and sscanf subroutines); if no field width is specified, 1 is the default. The wsscanf subroutine accepts a sequence of wide-character codes of the number specified by the field width; if no field width is specified, 1 is the default. The sequence is converted to a sequence of characters in the same manner as the wcstombs subroutine. The Pointer parameter should be a pointer to the initial byte of an array large enough to hold the sequence; no null byte is added. The normal skip over white space does not occur.

C

Accepts a sequence of characters of the number specified by the field width (scanf, fscanf, and sscanf subroutines); if no field width is specified, 1 is the default. The sequence is converted to a sequence of wide character codes in the same manner as the mbstowcs subroutine. The wsscanf subroutine accepts a sequence of wide-character codes of the number specified by the field width; if no field width is specified, 1 is the default. The Pointer parameter should be a pointer to the initial wide character code of an array large enough to hold the sequence; no null wide-character code is added.

[scanset]

Accepts a nonempty sequence of bytes from a set of expected bytes specified by the scanset variable (scanf, fscanf, and sscanf subroutines). The wsscanf subroutine accepts a nonempty sequence of wide-character codes from a set of expected wide-character codes specified by the scanset variable. The sequence is converted to a sequence of characters in the same manner as the wcstombs subroutine. The Pointer parameter should be a pointer to the initial character of a char, signed char, or unsigned char array large enough to hold the sequence and a terminating null byte, which is automatically added.

In the scanf, fscanf, and sscanf subroutines, the conversion specification includes all subsequent bytes in the string specified by the Format parameter, up to and including the ] (right bracket). The bytes between the brackets comprise the scanset variable, unless the byte after the [ (left bracket) is a ^ (circumflex). In this case, the scanset variable contains all bytes that do not appear in the scanlist between the ^ (circumflex) and the ] (right bracket).

In the wsscanf subroutine, the characters between the brackets are first converted to wide character codes in the same manner as the mbtowc subroutine. These wide character codes are then used as described above in place of the bytes in the scanlist.

If the conversion specification begins with [] or [^], the right bracket is included in the scanlist and the next right bracket is the matching right bracket that ends the conversion specification. You can also:

Represent a range of characters by the construct First-Last. Thus, you can express [0123456789] as [0-9]. The First parameter must be lexically less than or equal to the Last parameter or else the - (dash) stands for itself. The - also stands for itself whenever it is the first or the last character in the scanset variable.
Include the ] (right bracket) as an element of the scanset variable if it is the first character of the scanset. In this case it is not interpreted as the bracket that closes the scanset variable. If the scanset variable is an exclusive scanset variable, the ] is preceded by the ^ (circumflex) to make the ] an element of the scanset. The corresponding Pointer parameter should point to a character array large enough to hold the data field and that ends with a null character (\0). The \0 is added automatically.
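The scanset rules can be illustrated with a short sketch that splits a key=value pair. The parse_key_value function, the 16-byte buffers, and the input layout are assumptions made for the example. The first scanset uses ranges, and the second is an exclusive scanset that stops at the first space. Each conversion carries a field width of 15 so that the stored string and its terminating null character fit in the buffer.

```c
#include <stdio.h>

/* Splits "key=value" into two null-terminated strings.
   %15[a-z0-9] accepts lowercase letters and digits (two ranges).
   %15[^ ]     accepts everything up to, but not including, a space.
   Returns the number of assigned fields (2 on success). */
int parse_key_value(const char *line, char key[16], char value[16])
{
    return sscanf(line, "%15[a-z0-9]=%15[^ ]", key, value);
}
```

Because a scanset must match a nonempty sequence, an input that starts with = fails at the first conversion and nothing is assigned.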

A scanf conversion ends at the end-of-file (EOF character), the end of the control string, or when an input character conflicts with the control string. If it ends with an input character conflict, the conflicting character is not read from the input stream.
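The fact that the conflicting character is left unread can be observed with the n conversion code, which reports how much input was consumed. The following is a minimal sketch; the parse_leading_int function is a hypothetical helper, and it uses sscanf on a string so that the unread remainder can be returned as a pointer.

```c
#include <stdio.h>

/* Parses an integer at the start of text.
   On success, stores the integer in *value and returns a pointer to
   the first character that was not consumed by the conversion.
   Returns NULL if the input does not start with an integer. */
const char *parse_leading_int(const char *text, int *value)
{
    int consumed = 0;

    if (sscanf(text, "%d%n", value, &consumed) != 1)
        return NULL;
    return text + consumed;
}
```

For the input 12abc the conversion stops at the a, so the returned pointer addresses abc and a later call can continue from there. On a stream, the same behavior means that a loop that retries a failed %d conversion without discarding the offending character sees that character again on every pass.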

Unless a match in the control string exists, trailing white space (including a new-line character) is not read.

The success of literal matches and suppressed assignments is not directly determinable.
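One common workaround, sketched here under the assumption that the input is available as a string, is to place an n conversion code after the literal text. If the literal fails to match, processing stops before the n code is reached and the variable keeps its initial value. The parse_kilograms function and the "kg" suffix are hypothetical and exist only for this example.

```c
#include <stdio.h>

/* Accepts input of the exact form "<integer> kg".
   The return value of sscanf only shows that the integer converted;
   the %n after the literal shows whether "kg" also matched.
   Returns 1 on a full match, 0 otherwise. */
int parse_kilograms(const char *text, int *kg)
{
    int end = -1;   /* stays -1 if %n is never reached */

    if (sscanf(text, "%d kg%n", kg, &end) != 1)
        return 0;
    return end >= 0 && text[end] == '\0';
}
```

The final test on text[end] additionally rejects trailing characters, such as the s in "5 kgs".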

The National Language Support (NLS) extensions to the scanf subroutines can handle a format string that enables the system to process elements of the argument list in variable order. The normal conversion character % is replaced by %n$, where n is a decimal number. Conversions are then applied to the specified argument (that is, the nth argument), rather than to the next unused argument.
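As a sketch of variable-order processing, the following function reads a day and a month that appear in the input in the opposite order from the argument list. The parse_day_month function is hypothetical, and the example assumes a C library that implements the X/Open %n$ form; it is not part of ISO C, so some compilers or libraries may reject or ignore it.

```c
#include <stdio.h>

/* Input order is "<day> <month>", but the arguments are passed as
   (month, day). %2$d stores into the second argument and %1$9s into
   the first. Every specification is numbered, and no argument is
   referenced more than once.
   Returns the number of assigned fields (2 on success). */
int parse_day_month(const char *text, char month[10], int *day)
{
    return sscanf(text, "%2$d %1$9s", month, day);
}
```

This is mainly useful when the format string comes from a message catalog, where a translation may need the fields in a different order while the calling code stays the same.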

The first successful run of the fgetc, fgets, fread, getc, getchar, gets, scanf, or fscanf subroutine using a stream that returns data not supplied by a prior call to the ungetc subroutine (see the ungetc or ungetwc Subroutine) marks the st_atime field for update.

Return Values

These subroutines return the number of successfully matched and assigned input items. This number can be 0 if an early conflict existed between an input character and the control string. If the input ends before the first conflict or conversion, only EOF is returned. If a read error occurs, the error indicator for the stream is set, EOF is returned, and the errno global variable is set to indicate the error.
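The three kinds of return value can be seen with a small sketch. The count_pair function is a hypothetical wrapper used only to make the outcomes easy to compare; it applies the same two-integer format to whatever string it is given.

```c
#include <stdio.h>

/* Tries to read two integers from text and returns the sscanf result:
     2   both integers were assigned
     1   the first was assigned, then a conflict occurred
     0   a conflict occurred before any assignment
     EOF the input ended before the first conversion or conflict */
int count_pair(const char *text, int *a, int *b)
{
    return sscanf(text, "%d %d", a, b);
}
```

Comparing the result with the expected count (here, 2) is generally safer than testing only for EOF, because a partial match returns a positive number and leaves the remaining variables unassigned.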

Error Codes

The scanf, fscanf, sscanf, and wsscanf subroutines are unsuccessful if either the file specified by the Stream, String, or wcs parameter is unbuffered or data needs to be read into the file's buffer and one or more of the following conditions is true:

Item	Description
EAGAIN	The O_NONBLOCK flag is set for the file descriptor underlying the file specified by the Stream, String, or wcs parameter, and the process would be delayed in the scanf, fscanf, sscanf, or wsscanf operation.
EBADF	The file descriptor underlying the file specified by the Stream, String, or wcs parameter is not a valid file descriptor open for reading.
EINTR	The read operation was terminated due to receipt of a signal, and either no data was transferred or a partial transfer was not reported.

Note: Depending upon which library routine the application binds to, this subroutine may return EINTR. Refer to the signal subroutine (see the sigaction, sigvec, or signal Subroutine) regarding SA_RESTART.

Item	Description
EIO	The process is a member of a background process group attempting to perform a read from its controlling terminal, and either the process is ignoring or blocking the SIGTTIN signal or the process group has no parent process.
EINVAL	The subroutine received insufficient arguments for the Format parameter.
EILSEQ	A character sequence that is not valid was detected, or a wide-character code does not correspond to a valid character.
ENOMEM	Insufficient storage space is available.