tuning Kernel Service

Purpose

Provides access to the kernel tunable variables through an easily accessible interface.

Syntax

typedef enum {
    TH_MORE,
    TH_EOF
} tmode_t;

#define TH_ABORT TH_EOF

typedef int (*tuning_read_t)(tmode_t mode, long *size, char **buf, void *context);
typedef int (*tuning_write_t)(tmode_t mode, long *size, char *buf, void *context);
tinode_t *tuning_register_handler (path, mode, readfunc, writefunc, context)
const char *path;
mode_t mode;
tuning_read_t readfunc;
tuning_write_t writefunc;
void * context;
tinode_t *tuning_register_bint32 (path, mode, variable, low, high)
const char *path;
mode_t mode;
int32 *variable;
int32 low;
int32 high;
tinode_t *tuning_register_bint32x (path, rfunc, wfunc, mode, low, high)
const char *path;
mode_t mode;
int32 (*rfunc)(void *);
int (*wfunc)(int32, void *);
void *context;
int32 low;
int32 high;
tinode_t *tuning_register_buint32 (path, mode, variable, low, high)
const char *path;
mode_t mode;
uint32 *variable;
uint32 low;
uint32 high;
tinode_t *tuning_register_buint32x (path, rfunc, wfunc, mode, low, high)
const char *path;
mode_t mode;
uint32 (*rfunc)(void *);
int (*wfunc)(uint32, void *);
void *context;
uint32 low;
uint32 high;
tinode_t *tuning_register_bint64 (path, mode, variable, low, high)
const char *path;
mode_t mode;
int64 *variable;
int64 low;
int64 high;
tinode_t *tuning_register_bint64x (path, rfunc, wfunc, mode, low, high)
const char *path;
mode_t mode;
int64 (*rfunc)(void *);
int (*wfunc)(int64, void *);
void *context;
int64 low;
int64 high;
tinode_t *tuning_register_buint64 (path, mode, variable, low, high)
const char *path;
mode_t mode;
uint64 *variable;
uint64 low;
uint64 high;
tinode_t *tuning_register_buint64x (path, rfunc, wfunc, mode, low, high)
const char *path;
mode_t mode;
uint64 (*rfunc)(void *);
int (*wfunc)(uint64, void *);
void *context;
uint64 low;
uint64 high;
void tuning_deregister (t)
tinode_t * t;

Description

The tuning_register_handler kernel service is used to add a file at the location specified by the path parameter. When this file is read from or written to, one of the two callbacks passed as parameters to the function is invoked.

Accesses to the file are viewed in terms of streams. A single stream is created by a sequence of one open, one or more reads, and one close on the file. While the file is open by one process, attempts to open the same file by other processes will be blocked unless O_NONBLOCK is passed in the flags to the open subroutine.

The readfunc callback behaves like a producer function. The function is called when the user attempts to read from the file. The mode parameter is equal to TH_MORE unless the user closes the file prematurely. On entry, the size parameter is an integer containing the size of the buffer. The context parameter is the context pointer passed to the registration function. Upon return, size should contain either the actual amount of data returned, or a zero if an end-of-file condition should be returned to the user. The return value of the function can also be used to signal end-of-file, as described below.
Note: It is expected that the readfunc callback has already done any necessary end-of-file cleanup when it returns the end-of-file signal.

If the amount of data returned is nonzero, the buf parameter may be modified to point to a new buffer. If this is done, the callback is responsible for freeing the new buffer.

If the buffer provided by the caller is too small, the callback may instead set buf to NULL. In this case, the callback should modify the size parameter to indicate the size of the buffer needed. The caller will then re-invoke the callback with a buffer of at least the requested size.

If the user closes the file before the callback indicates end-of-file, the callback will be invoked one last time with mode equal to TH_ABORT. In this case, the size parameter is equal to 0 on entry, and any data returned is discarded. The callback must reset its state because no further callbacks will be made for this stream.
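
As an illustration of the readfunc protocol described above, the following is a minimal sketch of a producer callback, not a definitive implementation. It is a user-space model: the tmode_t definitions are copied from the Syntax section so that the fragment compiles on its own, and the demo_stream structure and the demo_read name are hypothetical. The sketch returns the whole value in a single call, asks for a larger buffer by setting buf to NULL, signals end-of-file with a size of zero, and resets its state on TH_ABORT.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins copied from the Syntax section so the sketch is self-contained. */
typedef enum { TH_MORE, TH_EOF } tmode_t;
#define TH_ABORT TH_EOF

/* Hypothetical per-file state, handed to the callback as the context pointer. */
struct demo_stream {
    const char *text;  /* NUL-terminated data to produce */
    int         sent;  /* nonzero once the data has been returned on this stream */
};

/* Producer callback with the shape of tuning_read_t. */
int demo_read(tmode_t mode, long *size, char **buf, void *context)
{
    struct demo_stream *s = context;
    size_t len;

    assert(s != NULL && size != NULL && buf != NULL);

    if (mode == TH_ABORT) {          /* reader closed early: reset, produce nothing */
        s->sent = 0;
        *size = 0;
        return 0;
    }

    len = strlen(s->text);
    if (s->sent || len == 0) {       /* clean up first, then signal end-of-file */
        s->sent = 0;
        *size = 0;
        return 0;
    }

    if (*size < (long)len) {         /* buffer too small: request a larger one */
        *buf = NULL;
        *size = (long)len;
        return 0;
    }

    memcpy(*buf, s->text, len);      /* fill the caller's buffer */
    *size = (long)len;               /* report the amount of data returned */
    s->sent = 1;
    return 0;
}
```

Because the sketch always copies into the buffer supplied by the caller, it never has to free a replacement buffer.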

The writefunc callback behaves as a consumer function and is used when the user attempts to write to the file. The mode parameter is set to TH_EOF if no further data can be expected on this stream (for example, the user called the close subroutine on the file). Otherwise, mode is set to TH_MORE. The size parameter contains the size of the data passed in the buffer. The buf parameter is the pointer to the buffer.
Note: There will be zero or more calls with the mode parameter set to TH_MORE and one call with the mode parameter set to TH_EOF for every stream.
The buf parameter may change between invocations. Upon return from the callback, the size parameter must be modified to reflect the amount of data consumed from the buffer, and the buffer must not be freed even if all data is consumed. The function is expected to consume data in a linear (first in, first out) fashion. Unconsumed data is present at the beginning of the buffer at the next invocation of the callback. The size parameter will include the size of the unconsumed data.
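
The consumer behavior described above can be sketched as follows. This is a user-space model under stated assumptions, not the implementation of any real tunable: the tmode_t definition is copied from the Syntax section, and the line_counter structure and the demo_write name are hypothetical. The callback consumes only complete, newline-terminated lines, leaves a trailing partial line unconsumed so that it is presented again on the next invocation, and accepts the remainder when mode is TH_EOF.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in copied from the Syntax section so the sketch is self-contained. */
typedef enum { TH_MORE, TH_EOF } tmode_t;

/* Hypothetical context: counts the lines written to the file. */
struct line_counter {
    long lines;
};

/* Consumer callback with the shape of tuning_write_t. */
int demo_write(tmode_t mode, long *size, char *buf, void *context)
{
    struct line_counter *c = context;
    long used = 0;
    char *nl;

    assert(c != NULL && size != NULL);

    /* Consume complete lines from the front of the buffer, in order. */
    while (used < *size &&
           (nl = memchr(buf + used, '\n', (size_t)(*size - used))) != NULL) {
        c->lines++;
        used = (long)(nl - buf) + 1;
    }

    /* Last call for this stream: a trailing partial line is accepted too. */
    if (mode == TH_EOF && used < *size) {
        c->lines++;
        used = *size;
    }

    *size = used;  /* bytes consumed; the buffer itself is never freed here */
    return 0;
}
```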

Both callbacks are expected to return zero on success. If a callback is unsuccessful, it returns a positive value, which is placed into the errno global variable (with the accompanying indication of an error return from the kernel service). If the return value of a callback is less than 0, end-of-file is signaled to the user, and the return value is treated as its unary negation (for example, -1 is treated like 0). In this case, no further callbacks are made for this stream.

The tuning_register_bint32, tuning_register_buint32, tuning_register_bint64, and tuning_register_buint64 kernel services are used to add a file at the location specified by the path parameter that, when read from, returns the ASCII representation of the integer variable pointed to by the variable parameter. When written to, this file sets the integer variable to the value whose ASCII representation was written, unless that value does not satisfy the relation low <= value < high. In this case, the integer variable is not modified, and an error is returned to the user through an error return of the kernel service during which the invalid attempt is detected (probably either write or close).
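
The range check described above can be modeled in user space as follows. This is only a sketch: the bounded_store_int32 name is hypothetical, decimal input is an assumption, and the specific error codes (EINVAL and ERANGE) are assumptions as well, because this page does not state which error is returned. The variable is updated only when the relation low <= value < high holds.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical model of a write to a bounded 32-bit tunable.
 * Returns 0 and updates *variable when low <= value < high;
 * otherwise returns an error code and leaves *variable untouched.
 */
int bounded_store_int32(const char *text, int32_t *variable,
                        int32_t low, int32_t high)
{
    char *end;
    long long v;

    assert(text != NULL && variable != NULL);

    errno = 0;
    v = strtoll(text, &end, 10);     /* assumption: decimal ASCII input */
    if (errno != 0 || end == text)
        return EINVAL;               /* overflow or no digits at all */
    if (*end == '\n')
        end++;                       /* tolerate one trailing newline */
    if (*end != '\0')
        return EINVAL;               /* trailing garbage */

    if (v < low || v >= high)
        return ERANGE;               /* outside low <= value < high */

    *variable = (int32_t)v;
    return 0;
}
```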

The tuning_register_b*x functions operate similarly to their non-x variants, but they use a pair of callbacks to retrieve (rfunc) and set (wfunc) the variable. The callback is passed the value (if setting) and the context parameter. This permits more complex operations on read/write, such as serialization and memory allocation and deallocation.

The tuning_get_context kernel service returns the context pointer that was passed to the registration function used to create the tinode_t structure given as its argument.

The tuning_register kernel service is the basic interface by which a file can be added to the /proc/sys directory hierarchy. This function is not exported to kernel extensions, and its direct use in the kernel is strongly discouraged. The path parameter contains the path relative to the /proc/sys root at which the file should appear. Intermediate path components are automatically created. The mode parameter contains the UNIX permissions and the type of the file to be created (as per the st_mode field of the stat struct). If the file type is not specified, it is assumed to be S_IFREG. In most cases this parameter will be 0644 or 0600. The vnops parameter is used to dispatch all operations on the file.

The tuning_deregister kernel service is used to remove a file from the /proc/sys directory hierarchy. It is exported to kernel extensions. It should only be used when a specific file's implementation is no longer available. The t parameter is a tinode_t structure as returned by tuning_register. If the file is currently open, any further access to it after this call returns ESTALE.

Parameters

Item        Description
mode        Is set to either TH_EOF if no further data is expected from the user for this change, or TH_MORE if further data is expected.
size        Contains the size of the data passed in the buffer.
buf         Points to the buffer.
context     Points to the context passed to the registration function.
path        Specifies the location of the file to be added.
readfunc    Behaves as a producer function.
rfunc       Retrieves the variable.
wfunc       Sets the variable.
writefunc   Behaves as a consumer function.
variable    Specifies the variable.
high        Specifies the maximum value that the variable parameter can contain.
low         Specifies the minimum value that the variable parameter can contain.
t           A tinode_t structure as returned by tuning_register.

Return Values

Upon successful completion, the tuning_register kernel service returns the newly created tinode_t structure. If unsuccessful, a NULL value is returned.

Examples

A user of this interface might include the following line in their initialization routine:
tuning_var = tuning_register_buint64
("fs/jfs2/max_readahead", 0644, &j2_max_readahead, 0, 1024);

In this example tuning_var is a global variable of type tinode_t *. This causes the fs and fs/jfs2 directories to be created, and a file (pipe) to be created as fs/jfs2/max_readahead. The file returns the value of j2_max_readahead in ASCII when read. The variable is read at the time of the first read. A write would set the value of the variable, but only at the time of either the first newline being written or a close function being performed. In order to write the variable after reading it, one must close the file and reopen it for write. This file is not seekable.
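
From user space, the access pattern described above (one open, the reads or the write, one close, and a reopen to switch from reading to writing) can be sketched with ordinary POSIX calls. The read_tunable and write_tunable helper names are hypothetical, and the full path /proc/sys/fs/jfs2/max_readahead is an assumption derived from the statement that paths are relative to the /proc/sys root. Because an out-of-range value may be reported by either write or close, the sketch checks both.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* One read stream: open, read until end-of-file, close.
 * Returns the number of bytes read, or -1 on error. */
long read_tunable(const char *path, char *out, size_t cap)
{
    size_t total = 0;
    ssize_t n = 0;
    int fd;

    assert(path != NULL && out != NULL && cap > 0);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    while (total < cap - 1 &&
           (n = read(fd, out + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fd);
    return n < 0 ? -1 : (long)total;
}

/* One write stream: open, write, close. Returns 0 on success, -1 on error. */
int write_tunable(const char *path, const char *text)
{
    size_t len;
    ssize_t n;
    int fd, rc;

    assert(path != NULL && text != NULL);

    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    len = strlen(text);
    n = write(fd, text, len);
    rc = close(fd);              /* an invalid value may only surface here */
    return (n == (ssize_t)len && rc == 0) ? 0 : -1;
}
```

A caller would read the current value with read_tunable("/proc/sys/fs/jfs2/max_readahead", buf, sizeof buf) and then, as a separate stream, set a new one with write_tunable("/proc/sys/fs/jfs2/max_readahead", "512\n").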