Copyright © tutorialspoint.com
Stack entities require timer services from the Operating System. These services are needed for message retransmission, thread/process synchronization and various other time-related tasks. The goal of this paper is to design an Operating System Interface, in the form of an Operating System Wrapper (OSW) between the Stack entity and the Operating System, that can provide the required timer-related services. The OSW hides the details of the underlying Operating System from the calling Stack entity. The Stack entity uses the APIs provided by the OSW to access the timer facilities of the underlying system.
This document covers the requirements of the timer module and the constraints imposed by the Stack entity. It covers the design of the Linux implementation, and the appendix discusses how time is maintained on the Linux platform. It also contains a user manual, which shows a sample implementation that uses the timer daemon that was developed.
The following are general requirements for OS interface:
The timer management module is designed to provide timer services to stack entities. To provide this service, the high-level architecture shown in Figure 1.1 is adopted. The stack entity has an OS Wrapper that provides APIs for communication with the timer management module. This module runs as a separate process/thread and is referred to as the Timer daemon.
Figure 1.1: High-level architecture for Timer Services
Based on the interactions between timer daemon and protocol stack entities, there are three alternatives for availing timer services. These alternatives are as follows:
Stack-based architecture: In this architecture, the stack entity requests timer services and the timer daemon responds with a handle. The stack entity then uses this handle to avail of the timer services. The onus of providing a mapping between the timers started by the user process and the timer handle rests with the stack; thus, this architecture is referred to as stack-based. The drawback of this scheme is that the stack has to wait for the handle before it can use the timer.
Timer-daemon based architecture: In this architecture, the stack does not wait for timer-handle. Rather, it passes its own handle and the onus of maintaining a mapping between stack-provided handle and the handle generated by timer service rests with timer-daemon.
Distributed architecture: The drawback of the previous scheme is that the assumption that the system's timer services maintain the stack's handle is too strong. To do away with this assumption, a mid-way approach is used. The concept of a timer client is introduced, which keeps track of the handles supplied by the stack. Internally, it posts a request to the timer daemon. When the latter returns a handle, the timer client maps it to the handle provided by the stack.
Figure 1.2: Timer-service Architectures. (a) Stack-based architecture. (b) Timer-daemon based architecture. (c) Distributed architecture
After detailed discussion and analysis, from the perspective of both the stack developer and the OS Wrapper developer, the second approach, the timer-daemon based architecture, is adopted.
In general, a large number of timers are needed for network protocol implementation and real time applications. These timers are used by the protocol for error recovery and rate control. With the growth in the demand for a large number of timers, there is a pressing need for a very efficient and fast timer implementation.
General-purpose operating systems typically provide only one timer per process. To run multiple timers on Linux, a process has to fork additional child processes, which is a complicated affair. It is therefore desirable to implement multiple timers internally on top of a single timer facility. This can be done using a linked list of pending timers, the details of which are provided in this paper.
As mentioned above, a stack maintains a large number of timers, which are started and stopped on a continuous basis. Given this, each timer must be maintained uniquely. In order to identify a timer, an identifier or handle is required. The handle refers to the key/value used to uniquely identify a particular timer. One of the key design issues, as apparent from the discussion on architectural alternatives, is the allocation of handle (i.e. who allocates the handle, the stack entity or the timer daemon).
To implement multiple timers, the general approach is to maintain a sorted array of pending timers. However, a much better approach is to use callwheels. The callwheel is a circular array of unsorted linked lists callwheel[callwheelsize] (refer Figure 1.3).
All timers that are to expire after t ticks are inserted in the list at callwheel[t % callwheelsize]. The c_time (expiry time) member of each such timer is set to t / callwheelsize. For example, if callwheelsize is 10 and t is 23, then 23 % 10 is 3 and 23 / 10 is 2 (integer division).
Therefore, the c_time member is 2 and the timer is kept in the list at array position 3. This implies that the timer will expire after 2 complete rounds of ticks and 3 additional ticks.
The value of callwheelsize is kept comparable to the maximum number of outstanding callouts, so that the number of timers in each list is small and each list is quick to traverse. On every clock tick, each timer node in the list at the current position is examined: if its c_time (expiry time) member is already zero, the timer has expired and the node is removed; otherwise c_time is decremented by one. To remove an arbitrary timer from the data structure, an identifier (handle) for the timer node to be removed is needed.
Figure 1.3: Call Wheel
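The insert and per-tick steps described above can be sketched in C as follows. This is only a minimal illustration, not the TD's code: the names `cw_timer`, `callwheel`, `cw_insert` and `cw_tick` and the wheel size are assumptions, and singly linked lists are used for brevity. The sketch also stores `(ticks - 1) / CW_SIZE` as the round count rather than `t / callwheelsize`; that is an adjustment made here so that a timeout that is an exact multiple of the wheel size does not wait one extra revolution. For the example values (t = 23, size 10) both expressions give 2.

```c
#include <assert.h>
#include <stddef.h>

#define CW_SIZE 10  /* hypothetical wheel size, matching the example in the text */

/* One pending timer; field names are illustrative, not the TD's own. */
struct cw_timer {
    struct cw_timer *next;
    unsigned rounds;   /* full revolutions still to wait (the c_time idea) */
    int expired;       /* set to 1 by cw_tick() when the timer fires */
};

struct callwheel {
    struct cw_timer *slot[CW_SIZE]; /* one unsorted list per slot */
    unsigned pos;                   /* slot visited by the latest tick */
};

/* Schedule t to fire `ticks` ticks from now (ticks must be at least 1). */
static void cw_insert(struct callwheel *w, struct cw_timer *t, unsigned ticks)
{
    assert(ticks >= 1);
    unsigned idx = (w->pos + ticks) % CW_SIZE;
    t->rounds = (ticks - 1) / CW_SIZE;  /* adjusted round count, see lead-in */
    t->expired = 0;
    t->next = w->slot[idx];
    w->slot[idx] = t;
}

/* Advance the wheel by one tick; returns the number of timers that expired. */
static int cw_tick(struct callwheel *w)
{
    int fired = 0;
    w->pos = (w->pos + 1) % CW_SIZE;
    struct cw_timer **pp = &w->slot[w->pos];
    while (*pp != NULL) {
        struct cw_timer *t = *pp;
        if (t->rounds == 0) {
            *pp = t->next;      /* unlink the expired node */
            t->next = NULL;
            t->expired = 1;
            fired++;
        } else {
            t->rounds--;        /* one more revolution has passed */
            pp = &t->next;
        }
    }
    return fired;
}
```

Each tick touches only the list in one slot, which is why keeping the wheel size close to the number of outstanding timers keeps the per-tick work small.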
The Timer Daemon (TD) process maintains its own ticks. The tick interval is configurable at compile time and can be expressed in seconds, milliseconds or microseconds. The Timer Daemon uses the system function setitimer() to receive a signal (SIGALRM) at each tick.
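As a rough sketch of how a periodic SIGALRM tick can be obtained with setitimer(), consider the following. The function names `td_start_ticks`, `td_wait_ticks` and `td_stop_ticks` and the counter `g_ticks` are assumptions made for illustration and are not part of the TD. The sketch installs the handler with sigaction(), which keeps the handler installed across signals; the TD itself may do this differently.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

static volatile sig_atomic_t g_ticks = 0;  /* hypothetical tick counter */

/* Signal handler: only counts the tick. */
static void on_tick(int signo)
{
    (void)signo;
    g_ticks++;
}

/* Arm a periodic SIGALRM every usec microseconds (0 < usec < 1000000).
 * Returns 0 on success, -1 on failure. */
static int td_start_ticks(long usec)
{
    assert(usec > 0 && usec < 1000000);
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGALRM, &sa, NULL) == -1)
        return -1;

    struct itimerval itv;
    itv.it_interval.tv_sec = 0;
    itv.it_interval.tv_usec = usec;
    itv.it_value = itv.it_interval;   /* first tick after one interval */
    return setitimer(ITIMER_REAL, &itv, NULL);
}

/* Block until at least n ticks have been counted; returns the tick count.
 * Assumes SIGALRM is not blocked in the caller's signal mask. */
static int td_wait_ticks(int n)
{
    sigset_t block, old;
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    sigprocmask(SIG_BLOCK, &block, &old);
    while (g_ticks < n)
        sigsuspend(&old);             /* atomically unblock and wait */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return (int)g_ticks;
}

/* Disarm the interval timer. */
static int td_stop_ticks(void)
{
    struct itimerval off;
    memset(&off, 0, sizeof off);
    return setitimer(ITIMER_REAL, &off, NULL);
}
```

A daemon built this way would typically do its callwheel processing in the main loop after each wake-up rather than inside the signal handler, since few functions are safe to call from a handler.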
In order to use the services of TD, the Stack-entity/User-process must register with TD at the beginning. Registration with TD is required for its internal operations as more than one stack entity could be accessing the timer services. The TD maintains a map table (a table that provides a mapping between user-handle and TD-handle, as discussed later) for each module. Map table corresponding to a new module is initialized at the time of registration of module.
To start a timer, the stack entity uses the TD API osw_start_timer(args), as described in Table 2. After receiving the start message from the stack entity, the TD inserts a node at the appropriate call wheel index. The TD also generates its own handle for the inserted node, so that a stop timer request does not require a search of the call wheel. Because the TD generates its own handle, a map table between the user handle and the TD handle is required.
To stop a timer, the stack entity calls the TD API osw_stop_timer(args), as described in Table 3. The TD locates the map table for the module id given by the user process, looks up the user handle in it and extracts the TD handle. From this handle it determines the call wheel index and node pointer, indexes into the callwheel array and removes the node. When a timer expires, the buffer supplied by the stack entity is sent back to the user process. The interface between the stack and the TD is shown in Figure 1.4.
Figure 1.4: Stack Entity/User Process to TD Interface
Three APIs are available to the stack entities for accessing the timer-related services: start_timer, stop_timer and register_mod. Refer to Tables 1, 2 and 3.
osw_init_tim_intf: This function initializes the stack entity's interface with the TD. It gets the queue id on which the TD expects messages from stack entities. Refer to Table 5.
osw_start_timer: The osw_start_timer function prepares a tSTRT_TIM type message from the parameters (as given in table 2) passed to the osw_start_timer function and delivers this message to the message queue at OSW_TD_IN_MSGQ queue. The TD receives requests/messages for the start and stop timer from various stack entities on this queue id. The return value for this function is as given in table 2. A sample implementation is given in the text box below.
osw_stop_timer: The osw_stop_timer function prepares a tSTOP_TIM type message filling the user supplied handle and the module-id and delivers this message to the message queue at the OSW_TD_IN_MSGQ queue.
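/* This implementation shows a test case to use TD (Timer Daemon) */
#include <stdio.h>    /* printf */
#include <stdlib.h>   /* exit */
#include <string.h>   /* memcpy */
#include <sys/ipc.h>  /* IPC_CREAT */
#include <sys/msg.h>  /* msgget, msgrcv */
#include <oswapitim.h>
#define STACK_ID 1
/*-------------------------------------------------------------
* Description: This macro is used to start a timer. It takes the
* timer block and expiry buffer objects themselves (it applies &
* internally) and expects a variable named timer_qid to be in
* scope at the point of use.
*--------------------------------------------------------------*/
#define START_TIMER(time_val_m,timer_m, \
cyclic_m,timer_exp_m,u4ret_val_m) \
{\
u4ret_val_m = osw_start_timer( \
time_val_m, \
timer_qid, \
STACK_ID, \
&timer_m, \
sizeof(tTIMER_EXP), \
&timer_exp_m,cyclic_m);\
}
/*-------------------------------------------------------------
* Description: This macro is used to stop a timer. It takes a
* pointer to the timer block. There must be no space between the
* macro name and its parameter list.
*--------------------------------------------------------------*/
#define STOP_TIMER(timer_m) \
{\
osw_stop_timer (timer_m, STACK_ID); \
}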
The TD library is developed in the C language. It has a message-based interface. User processes / stack entities send messages on the queue created by the TD and receive messages on a queue that they create themselves. As shown in Figure 1.5, a stack entity sends messages on the TD IN queue and receives messages from the TD on the STACK IN queue.
Figure 1.5: Message-based Interface between Timer Daemon and Stack Entity
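The message-based exchange can be illustrated with System V message queues, the mechanism that msgget() and msgrcv() in the sample code belong to. The sketch below is a simplified stand-in, not the TD's wire format: `demo_start_msg`, `demo_send_start`, `demo_recv_start`, the message type value and the buffer size are all assumptions chosen for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define DEMO_MSG_START 1L   /* hypothetical message type */
#define DEMO_BUF_SIZE  32   /* hypothetical payload size */

/* Simplified start-timer request; the first member must be a long type. */
struct demo_start_msg {
    long mtype;
    unsigned short mod_id;
    unsigned int seconds;
    char buf[DEMO_BUF_SIZE];   /* data to hand back on expiry */
};

/* Build a request and place it on the queue. Returns 0 on success. */
static int demo_send_start(int qid, unsigned short mod_id,
                           unsigned int seconds, const char *payload)
{
    assert(payload != NULL);
    struct demo_start_msg m;
    memset(&m, 0, sizeof m);
    m.mtype = DEMO_MSG_START;
    m.mod_id = mod_id;
    m.seconds = seconds;
    strncpy(m.buf, payload, sizeof m.buf - 1);  /* stays NUL-terminated */
    /* The size passed to msgsnd excludes the leading mtype field. */
    return msgsnd(qid, &m, sizeof m - sizeof m.mtype, 0);
}

/* Receive one request of the start type. Returns 0 on success. */
static int demo_recv_start(int qid, struct demo_start_msg *out)
{
    ssize_t n = msgrcv(qid, out, sizeof *out - sizeof out->mtype,
                       DEMO_MSG_START, 0);
    return n == (ssize_t)(sizeof *out - sizeof out->mtype) ? 0 : -1;
}
```

In a two-queue arrangement like the one in Figure 1.5, the stack entity would call the send side on the daemon's input queue and the receive side on its own queue.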
Macros to fill messages to the TD are defined in "oswapitimm.h". Other functions to start a timer, stop a timer and register a module are defined in "oswapitim.c". The TD uses a doubly linked list library and a tree library for its internal operation. These libraries comprise the files "dlist.c", "dlist.h", "btree.c" and "btree.h". Variable naming follows the DASA standard; the types are defined in the file "dasatypes.h". The core functions of the TD are defined in oswc_wheel.c.
/*------------------------------------------------------------
* Function name: main
*
* Description: This is a sample function showing how a stack
* entity can use the TD service.
*
* Parameters: none
*
* Return Value : 0 on success; the process exits with -1 if the
* message queue cannot be created.
*
*-----------------------------------------------------------*/
INT4 main (void)
{
tTIMER_BLOCK timer_blk;
tTIMVAL timval;
INT4 timer_qid;
UINT4 u4ret_val;
tTIMER_EXP tim_exp;
timer_qid = msgget (STK_QID,IPC_CREAT|0666);
if ( -1 == timer_qid )
{
printf("\n error creating message queue.");
exit(-1);
}
/* INITIALISE INTERFACE WITH TD */
osw_init_tim_intf();
/*REGISTER WITH TD */
osw_reg_mod(A2S_ID);
/* All module id's are defined in osw_module.h . System
* Administrator will be owner of this file and ensures
* that all module id's are unique.
*/
timer_blk.u1type = 1; /* Arbitrary value taken */
timval.u4time = 10; /* Setting Timer for 10 seconds.*/
timval.u1unit = SEC;
tim_exp.u1api_id = A2S_ID;
tim_exp.req_id = 5; /* User defined data */
/* START_TIMER applies & to timer_blk and tim_exp itself */
START_TIMER (timval, timer_blk, 0, tim_exp, u4ret_val)
if ( DASA_SUCCESS != u4ret_val )
{
printf("\n error starting timer.");
}
/* To receive timer expiry message */
{
UINT1 u1msgbuf[MAX_BUFF_SIZE];
INT4 i4nbyte_rcvd; /* signed: msgrcv returns -1 on error */
UINT4 u4ncopy;
/* The size argument excludes the leading long message type */
i4nbyte_rcvd = msgrcv(timer_qid,u1msgbuf,
sizeof(u1msgbuf) - sizeof (long),0,0);
if ( i4nbyte_rcvd > 0 )
{
/* Never copy more than tim_exp can hold */
u4ncopy = (UINT4) i4nbyte_rcvd;
if ( u4ncopy > sizeof(tim_exp) )
{
u4ncopy = sizeof(tim_exp);
}
memcpy(&tim_exp
, u1msgbuf+sizeof (long), u4ncopy);
}
}
/* To stop timer */
STOP_TIMER(&timer_blk);
return 0;
}
osw_reg_mod: The osw_reg_mod function prepares a tREG_MOD type message, filling in the module id, and delivers this message to the message queue at the OSW_TD_IN_MSGQ queue. All module ids are defined in "osw_module.h"; the system administrator owns this file and ensures that the ids are unique.
To understand the functioning of the timer daemon, the key functions are described below:
osw_process_queue: The TD calls the function process_queue. The process_queue function receives the message sent by the stack entity and checks the field of the header part of the message. The u2mtype field is used to decide whether the message is for start_timer, for stop_timer or for register_mod.
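A dispatch on the header's message type could look roughly like the sketch below. The numeric type codes and the names `demo_hdr`, `demo_action` and `demo_dispatch` are hypothetical; the text names the u2mtype field but does not give the values the TD actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message type codes; the real values are not given in the text. */
enum { DEMO_MT_REG_MOD = 1, DEMO_MT_START_TIM = 2, DEMO_MT_STOP_TIM = 3 };

/* Stand-in for the common message header (type and length). */
struct demo_hdr {
    unsigned short u2mtype;
    unsigned int   u4mlen;
};

enum demo_action { DEMO_REGISTER, DEMO_START, DEMO_STOP, DEMO_IGNORE };

/* Decide which handler a received message should be routed to. */
static enum demo_action demo_dispatch(const struct demo_hdr *hdr)
{
    assert(hdr != NULL);
    switch (hdr->u2mtype) {
    case DEMO_MT_REG_MOD:   return DEMO_REGISTER;
    case DEMO_MT_START_TIM: return DEMO_START;
    case DEMO_MT_STOP_TIM:  return DEMO_STOP;
    default:                return DEMO_IGNORE;  /* unknown type: drop it */
    }
}
```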
osw_istart_timer: The osw_istart_timer function appends a callwheel node at the appropriate callwheel index position and also appends a node to the handle map table in the map table list (see Figure 1.6). The index of the callwheel (where the callwheel node is to be inserted) and the u4ticks value are calculated using one of three macros (GET_IDX_AND_TICKS_IN_MSEC, GET_IDX_AND_TICKS_IN_MUSEC or GET_IDX_AND_TICKS_IN_SEC), depending on whether the unit field of the tTIMVAL data structure is milliseconds, microseconds or seconds.
If the unit is seconds, the procedure is as follows. The time field value of the tTIMVAL data structure is added to the current value of the global variable gu4currpos. The remainder obtained when this sum is divided by the call wheel size gives the index position at which the callwheel node is to be inserted. For example, if the current value of gu4currpos is 2 and the time field is 10, the sum is 12. If the callwheel size is 16, then 12 % 16 gives 12, which is the index at which the callwheel node is to be inserted. The u4ticks value is 10 / 16, that is, zero. If the unit is milliseconds, the time field value of the tTIMVAL structure is first multiplied by 1000, and the index and u4ticks are then calculated as above. If the unit is microseconds, the time field value is first multiplied by 10 to the power 6, and the index and u4ticks are then calculated as above.
A tCW_NODE type of data structure is prepared and this node is inserted directly to the index calculated from above. The tMODULE data structure is filled with the module id. The map table for this module id is searched (see Figure 1.7).
Figure 1.6: Start Timer
Figure 1.7: Start Timer (Updating Map Table)
osw_istop_timer: This function is used to stop a pending timer. Its parameter is a pointer to a tSTOP_TIM type data structure containing the handle generated by the stack entity and the module id (which identifies the stack entity using the timer services). The function uses the module id to search the module list. Once the module has been identified, it uses the user-supplied handle to search the corresponding map table and extracts the call wheel handle. (The handle determines the call wheel index and the node's address.) It then removes the call wheel node from the call wheel array. The entry in the handle map table is also removed.
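The lookup-and-remove step on a module's map table might be sketched as follows. This is a deliberately simplified stand-in: it uses a small fixed array with a linear search, whereas the TD is described as using a tree library, and the names `hndl_map`, `map_put` and `map_take` and the capacity are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAP_CAPACITY 8   /* hypothetical; the TD uses a tree, not a fixed array */

struct hndl_entry {
    void *user_hndl;   /* handle chosen by the stack entity */
    void *cw_hndl;     /* handle generated by the daemon for the wheel node */
};

/* Map table of one registered module. */
struct hndl_map {
    struct hndl_entry entry[MAP_CAPACITY];
    size_t count;
};

/* Record user_hndl -> cw_hndl. Returns 0 on success, -1 if the table is full. */
static int map_put(struct hndl_map *m, void *user_hndl, void *cw_hndl)
{
    assert(m != NULL);
    if (m->count == MAP_CAPACITY)
        return -1;
    m->entry[m->count].user_hndl = user_hndl;
    m->entry[m->count].cw_hndl = cw_hndl;
    m->count++;
    return 0;
}

/* Find the entry for user_hndl, remove it, and return the call wheel
 * handle, or NULL if the user handle is unknown. */
static void *map_take(struct hndl_map *m, void *user_hndl)
{
    assert(m != NULL);
    for (size_t i = 0; i < m->count; i++) {
        if (m->entry[i].user_hndl == user_hndl) {
            void *cw = m->entry[i].cw_hndl;
            m->entry[i] = m->entry[m->count - 1];  /* fill the gap */
            m->count--;
            return cw;
        }
    }
    return NULL;
}
```

With the call wheel handle in hand, the daemon can unlink the node directly instead of scanning the wheel, which is the reason given in the text for generating a separate handle.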
osw_softclock: After each tick the function softclock is called. This function increments a global variable (which maintains the current position index of the circular callwheel array) by one. Then it checks whether a list exists at that index position. If a list exists, softclock traverses it using the function dl_traverse_list. After this, the softclock function re-registers itself as the signal handler so that it is called when the next signal is received. Re-registering the handler is required in a Unix environment: if the function does not re-register itself, it will not be called when the signal is received a second or subsequent time.
dl_traverse_list: This is a generic function for traversing a doubly linked list; it takes a generic tDLIST_PTR type as an argument. While traversing the list, the ticks field of each callwheel node is checked. If it is not zero, it is decremented by 1. If the ticks field is zero, the u1c_opt field is checked. If the u1c_opt field is zero (no restart), the timer expiry buffer (sent by the stack when the timer was started) is sent to the stack entity (on the queue id specified by the stack when the timer was started), and the callwheel node is then removed. The corresponding entry in the map table is also removed. If the u1c_opt field is not zero, the timer is to be restarted that many times. In this case, the function set_cyclic_timer is called.
set_cyclic_timer: As mentioned above, the function set_cyclic_timer is called when the u1c_opt field of the call wheel node is not zero. A new call wheel node is created and inserted at a new index position. If the u1c_opt field is not equal to 0xff (the infinite restart option), it is decremented by one. The map table entry that pairs the stack handle with the call wheel generated handle is also updated.
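The restart-count rule can be written as a small helper. The sketch below covers only the handling of the u1c_opt value; the name `next_restart_count` and the constant `RESTART_FOREVER` are assumptions, and re-inserting the node and updating the map table are left out.

```c
#include <assert.h>

#define RESTART_FOREVER 0xffu   /* the infinite restart option from the text */

/* Given the restart option of a timer that has just expired (non-zero),
 * return the option to store in the re-inserted node. */
static unsigned char next_restart_count(unsigned char opt)
{
    assert(opt != 0);   /* zero means no restart; this path is not taken */
    if (opt == RESTART_FOREVER)
        return opt;                    /* never counts down */
    return (unsigned char)(opt - 1);   /* one restart has been used up */
}
```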
The following gives the list of data structures used in the implementation.
tHDR: The tHDR data structure consists of two fields: u2mtype, of unsigned short integer type, and u4mlen, of unsigned integer type. This is a common header structure; all messages contain this header. The u2mtype field defines the type of message (i.e. whether the message is to start a timer, stop a timer or register the module id). The u4mlen field gives the length of the message sent by the user process.
typedef struct sHDR
{
UINT2 u2mtype;
UINT4 u4mlen;
}tHDR;
tREG_MOD: The tREG_MOD data structure consists of two fields the hdr of tHDR type and the u2mod_id that is of unsigned short integer type. The hdr consists of the tHDR data type while the u2mod_id consists of the module id (unique number which identifies the stack entity). This number is a global variable and visible to the stack entity.
typedef struct sREG_MOD
{
tHDR hdr;
UINT2 u2mod_id;
}tREG_MOD;
tSTRT_TIM: The tSTRT_TIM data structure consists of eight fields. This structure/message is used for a start timer request.
typedef struct sSTRT_TIM
{
tHDR hdr; /*The header for the message */
tTIMVAL timval; /* The expiry time and the unit */
INT4 i4qid; /* The queue id on which the start
timer request is to be sent */
UINT2 u2mod_id; /* The module id which identifies
the stack entity */
void *pu_hndl; /* The handle supplied by the stack Entity */
UINT1 pbuf [MAX_BUFF_SIZE]; /* Explained in APIs */
UINT4 u4buflen; /* Explained in APIs */
UINT1 u1restart_flag; /* Here zero means no restart,
more than zero
means that many restarts */
}tSTRT_TIM;
tCW_NODE: The tCW_NODE data structure consists of nine fields.
typedef struct sCW_NODE
{
/* It consists of the already
existing generic doubly link list */
tDLIST_NODE lnode;
/* the qid on which the message
is to be sent to the user process */
INT4 i4qid;
/* Time interval and the unit */
tTIMVAL timval;
UINT4 u4ticks;
void *pu_hndl;
UINT1 pbuf [MAX_BUFF_SIZE];
UINT4 u4buflen;
UINT2 u2mod_id;
UINT1 u1c_opt;
}tCW_NODE;
tSTOP_TIM: tSTOP_TIM data structure/message is used for sending a stop timer request to stop a pending timer. It consists of three fields.
typedef struct sSTOP_TIM
{
tHDR hdr; /* Message Header */
unsigned short int u2mod_id;
void *pu_hndl; /* user supplied handle */
}tSTOP_TIM;
tHNDL_NODE: The tHNDL_NODE data structure is used to maintain a mapping table between the user supplied handle and the timer daemon generated handle.
typedef struct sHNDL_NODE
{
tTREE_NODE node; /*Required to maintain a tree structure */
void *pu_hndl; /* User supplied handle */
LONG *cw_hndl; /* Call wheel supplied handle */
}tHNDL_NODE;
tTIMVAL: The tTIMVAL data structure consists of two fields: u4time, of unsigned integer type, and u1unit, of unsigned char type. The u4time field holds the expiry time of the timer, while the u1unit field holds the multiplication factor.
typedef struct sTIMVAL
{
UINT4 u4time;
UINT1 u1unit;
}tTIMVAL;
The following section provides the APIs for using timer services.
| API Name: | Start Timer |
| Description: | To start a timer |
| Syntax: | rt = osw_start_timer(args_list) |
| Parameters: | timer_val_struct: This structure has two members. The first member represents the time-unit and the second the multiplication factor (the timer value is denoted by the expression x × 10^y). The two members, x and y, are of int data type. Although no strict upper bound on these integers exists, y should satisfy -6 < y < 4 (a few microseconds to thousands of seconds) and x should satisfy 0 < x < 32,767. The latter value is the maximum portable value of an int. u4out_qid: Identifies the channel through which the module will receive TD messages. u2mod_id: Identifies the particular module that requested to start the timer. It is assumed that the module id is unique for each module (stack entity) using TD services. A file defining the module ids of all modules is used; stack entities / user processes all use this file, and the system administrator is its owner. u1restart_flag: 0 means no restart, 1 to n gives the number of times the timer is to be restarted, and 0xff means infinite restart. pbuff: Identifies the buffer that will be returned to the user process / stack entity after timer expiry. phndl: Handle supplied by the user to identify the timer. |
| Return type | DASA_SUCCESS: Function completed successfully. DASA_FAILURE: Function failed to complete successfully. |
| API Name: | Stop Timer |
| Description: | To stop a timer |
| Syntax: | rt = osw_stop_timer(args_list) |
| Parameters: | Phndl: Timer handle supplied by the user process (stack entity) u2mod_id: It identifies the particular module that requested to start the timer. |
| Return type | DASA_SUCCESS: Function completed successfully. DASA_FAILURE: Function failed to complete successfully. |
| API Name: | Register Module |
| Description: | To register the module id |
| Syntax: | rt = osw_reg_mod(args_list) |
| Parameters: | u2mod_id: It identifies the particular module (stack entity) that is being registered. |
| Return type | DASA_SUCCESS: Function completed successfully. DASA_FAILURE: Function failed to complete successfully. |
| API Name: | Initialization |
| Description: | To initialize the stack entity interface to the TD and to get a queue id on which the stack entity is to receive messages from the TD |
| Syntax: | rt = osw_init_tim_intf() |
| Parameters: | None |
| Return type | DASA_SUCCESS: Function completed successfully. DASA_FAILURE: Function failed to complete successfully. |
Most Linux systems available today use January 1, 1970, 00:00:00 as the epoch, or point of reference for time counters. In other words, the clock counter maintains the number of seconds since the epoch and, with a reference date and interval, the kernel can do the necessary arithmetic to convert a counter value to the correct date and time. The year 2038 problem in Unix systems arises from this use of January 1, 1970 as the reference date: a signed 32-bit count of seconds since the epoch overflows on January 19, 2038. This problem goes away on a 64-bit kernel, where a 64-bit data type counts the seconds since the epoch, which is good for roughly 292 billion years. Figure 1.8 shows how time is maintained on Unix.
Figure 1.8: Linux Time Maintenance
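The overflow date can be checked with a few lines of standard C. The helper name `last_32bit_second` is an assumption made for this illustration; it converts the largest value a signed 32-bit counter can hold into a calendar date in UTC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Fill *out with the UTC date and time of the last second that a signed
 * 32-bit count of seconds since the epoch can represent.
 * Returns 0 on success, -1 if the conversion fails. */
static int last_32bit_second(struct tm *out)
{
    assert(out != NULL);
    time_t t = (time_t)INT32_MAX;   /* 2147483647 seconds after the epoch */
    struct tm *utc = gmtime(&t);
    if (utc == NULL)
        return -1;
    *out = *utc;                    /* copy out of gmtime's static buffer */
    return 0;
}
```

The resulting structure holds 03:14:07 on January 19, 2038 (tm_year is 138 because it counts years since 1900); one second later a signed 32-bit counter wraps around.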
All computer systems -- from desktop PCs to high-end multiprocessor systems -- have a clock circuit of some sort. It can exist as a clock circuitry or a clock chip. This time-of-day (TOD) clock chip is addressable by the kernel as part of the firmware address space and has a hardware register specification. A kernel interface to the TOD hardware is implemented as a TOD device driver. The chip itself implements several registers, readable and writable by the kernel through the device driver. Each component of a day and time value is stored as a separate counter value in the clock chip, and each counter increments the next logical value when it reaches its top value. For example, seconds count values 0 to 59, then increment minutes and restart at 0. Executing the date command to set the date calls the stime system call, which in turn calls the tod_set device driver interface that sets the values in the TOD clock hardware. To comply with industry-standard interfaces (system calls and library routines), the kernel provides functions for converting the date values read from the clock hardware to the Unix convention of the number of seconds since the epoch, and vice versa.
In addition to the system clock used for time keeping, an interval clock is initialized at boot time to generate clock interrupts at regular intervals. The interrupt level, interrupt vector, and handler are set through the clkstart function, which is executed at system boot time. The clock interrupt level is set at 10. Nothing else on the system generates interrupts at this level. By default, an interval of 10 milliseconds (100 interrupts per second) is used. A kernel parameter, hires_tick, can be set in the /etc/system file to increase the rate to 1,000 interrupts per second. The following entry in the /etc/system file would increase the clock interrupt frequency to once per millisecond: set hires_tick = 1. Great care should be exercised when setting high-resolution ticks, as they can alter system performance dramatically. Some of the clock interrupt processing is not done on every clock interrupt (100 times a second), but rather at one-second intervals. For the services indicated, one-second granularity is sufficient.
Redesigning the BSD Callout and Timer Facilities by: Adam M. Costello and George Varghese.
Design of Unix operating system by: Maurice J Bach
DASA framework document by: DSQ Software Ltd.