GEMS : Gossip-Enabled Monitoring Service
Table Of Contents
1. Prerequisites
2. Introduction
3. Files
4. (a) Starting GEMS
(b) Restarting GEMS in a dead node
(c) Inserting a new node into the service
5. Stopping GEMS
6. Disseminating and retrieving the resource monitoring data
7. Configuring GEMS
8. Additional features
9. Comments
1. Prerequisites
A thorough understanding of the gossip-style failure detection protocol [1-2] is
required to understand the instructions
provided in this manual. Knowledge of the various configurations of Gossip when
run in both the flat and layered scheme is
also required. For a detailed description of the gossip failure detection
protocol refer to [3-5] while for a more
functional description of the implementation consult the GOSSIP-README in the
DOCS directory provided.
2. Introduction
Gossip-based failure detection exchanges liveness information between nodes to
detect failures and to reach consensus on the status of a particular node. GEMS
is an extension of the Gossip-style failure detection
service to support resource monitoring for
heterogeneous clusters. GEMS uses sensors to monitor resources and the monitored
data are piggybacked on Gossip messages.
In GEMS, the entire system is divided into groups and the distribution of
monitored data takes place at two levels. The
system parameters are measured at the lowest layer and distributed within the
group while they are aggregated at upper
layers and exchanged between groups. This file describes the various features
provided by GEMS. A detailed discussion of the architecture and features of GEMS
is provided in [6].
3. Files
This package contains the following GEMS files:
MONITOR-README
C-code files:
monitor.c
load_Script.c
insert-monitor.c
rma_api.c
api.c
C-include files:
header.h
api_header.h
consts.h
Sample API files:
testfn.c
pseudo-code.c
4. (a) Starting GEMS
GEMS can be started on any node in two ways: either together with the
gossip-style failure detection service, or at any arbitrary time when GEMS
services are needed.
In the first method, GEMS is started by providing the command-line option '-m'
to the gossip executable. For a detailed account of how to start the gossip
service itself, please consult the GOSSIP-README. This method is preferred when
GEMS is used for system administration or when GEMS is to be started on a large
number of nodes.
In the second method, GEMS is started through the GEMS API, which requires the
gossip service to be running already on the nodes concerned. Scheduling and
load-balancing services use this method to start GEMS on a limited number of
nodes that also run gossip. It is done with the gems_init function, which sends
a message to the RMA on the corresponding nodes to monitor the resources.
4. (b) Restarting GEMS in a dead node
The same two methods detailed above can be used to restart GEMS on a node that
comes back alive. However, care should be taken to start the gossip service as
instructed in the GOSSIP-README.
4. (c) Inserting a new node into GEMS
A new node is inserted in the same way as specified in the GOSSIP-README, except
that the '-m' command-line option must be provided to enable the monitoring
service. Issues such as choosing the destination group of the node and
performing the insertion are handled by the gossip service. Once the insertion
is complete, the monitoring service reconfigures itself to transparently
accommodate the new node.
5. Stopping GEMS
The GEMS service can be stopped either by stopping the entire gossip service or
by using the GEMS API. When the gossip service is stopped, GEMS stops
automatically because there is no longer a carrier for the monitored data. GEMS
can also be stopped through the API without affecting the Gossip service, using
the API control functions gems_end and gems_stopall. The former stops the
dissemination of both the built-in sensor data and the userdata, while the
latter stops only the dissemination of the userdata. (For a description of the
built-in sensor data and the userdata, please see the attached rough draft
titled GEMS: Gossip-Enabled Monitoring Service for heterogeneous distributed
systems [1].)
6. Disseminating and retrieving the resource monitoring data
The application programming interface (API) provided with GEMS is used for the
dissemination and retrieval of data. The implementation of the API is provided
in the file api.c and also as a dynamically loadable library named
libgems_api.so in the GEMS package. When a client requests monitored data, the
API contacts GEMS and stores the retrieved data in global data structures.
Applications can access these global data structures by including the header
file api_header.h in their source files. The global data structures for the
different types of data are detailed below.
GEMS monitored data is principally composed of the built-in sensor data and the
user data. The built-in sensor data, as the name implies, is measured by sensors
built into GEMS. Applications can only read these values and cannot change
them. The API function gems_update is used to retrieve the built-in sensor
data: it communicates with GEMS and stores the received monitored data in the
data structure agg_loadinfo.
The structure agg_loadinfo is composed of an array of load_info structures. The
load_info structure is shown below:
struct load_info
{
    u_int8_t loadavg;
    u_int8_t memfree;
    u_int8_t swapfree;
    u_int8_t netactivity;
    u_int8_t diskactivity;
    u_int8_t pageactivity;
};
struct agg_loadinfo_tag
{
    struct load_info loadinfo[num_hosts];
} agg_loadinfo[num_layers];
The num_hosts refers to the number of hosts/groups in each layer and the
num_layers refers to the number of layers. These
structures are dynamically allocated and the values num_hosts and num_layers are
retrieved from GEMS at runtime. To access
the monitored data of a particular group (group j) of hosts in a specific layer
(layer i) the format to be followed is
agg_loadinfo[i].loadinfo[j]
To access the monitored data of an individual node within the same group as the
node being contacted, the format is
agg_loadinfo[0].loadinfo[node_number]
Any layer index greater than 0 refers to a higher layer, and in higher layers
each loadinfo entry in turn refers to a group of hosts instead of a single host.
The load_info structure holds values that have been compressed to reduce
resource utilization and to ensure the portability of the service. The
decompressed values, which are of integer length, can be accessed through the
data structure
int_agg_loadinfo[num_layers].int_load_info[num_host]
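The indexing scheme described above can be sketched in C as follows. This is
only an illustration, not the code in api.c: it assumes that each layer holds a
heap-allocated array of load_info entries, it uses uint8_t in place of u_int8_t,
and the helper functions alloc_agg_loadinfo, free_agg_loadinfo and
min_loadavg_index are hypothetical names introduced for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Field layout copied from this README; uint8_t stands in for u_int8_t. */
struct load_info {
    uint8_t loadavg;
    uint8_t memfree;
    uint8_t swapfree;
    uint8_t netactivity;
    uint8_t diskactivity;
    uint8_t pageactivity;
};

/* ASSUMPTION: one heap-allocated array of entries per layer. */
struct agg_loadinfo_tag {
    struct load_info *loadinfo;
};

/* Hypothetical helper: allocate a zeroed table with num_layers layers and
 * num_hosts entries in each layer. Returns NULL on allocation failure. */
struct agg_loadinfo_tag *alloc_agg_loadinfo(size_t num_layers, size_t num_hosts)
{
    struct agg_loadinfo_tag *agg = calloc(num_layers, sizeof *agg);
    if (agg == NULL)
        return NULL;
    for (size_t i = 0; i < num_layers; i++) {
        agg[i].loadinfo = calloc(num_hosts, sizeof *agg[i].loadinfo);
        if (agg[i].loadinfo == NULL) {
            while (i > 0)               /* undo the layers already allocated */
                free(agg[--i].loadinfo);
            free(agg);
            return NULL;
        }
    }
    return agg;
}

/* Hypothetical helper: release a table created by alloc_agg_loadinfo. */
void free_agg_loadinfo(struct agg_loadinfo_tag *agg, size_t num_layers)
{
    if (agg == NULL)
        return;
    for (size_t i = 0; i < num_layers; i++)
        free(agg[i].loadinfo);
    free(agg);
}

/* Hypothetical helper: index j of the entry in `layer` whose compressed
 * loadavg value, agg[layer].loadinfo[j].loadavg, is smallest. */
size_t min_loadavg_index(const struct agg_loadinfo_tag *agg, size_t layer,
                         size_t num_hosts)
{
    size_t best = 0;
    for (size_t j = 1; j < num_hosts; j++)
        if (agg[layer].loadinfo[j].loadavg < agg[layer].loadinfo[best].loadavg)
            best = j;
    return best;
}
```

With layer 0 the index returned identifies an individual node in the local
group; with a higher layer it identifies a group of hosts.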
The other type of data that is disseminated by GEMS is the user data. User data
is data provided by applications for dissemination through GEMS. GEMS does not
change these values; it simply receives the data from the applications and
disseminates it through the system. The userdata is sent and received through
the API functions gems_senduserdata and gems_recvuserdata. The pseudo-code for
dissemination with the various API functions is given in the file pseudo-code.c
and is listed below:
1. gems_aggfn_init( &aggfn_id, filename, strlen(filename));
2. gems_userdata_init(&userdata_id, size, aggfn_id );
3. while (1)
4. {
5.     data = sensor ( );
6.     gems_senduserdata (userdata_id, data, size);
7.     gems_recvuserdata (userdata_id);
8.     sleep (1);
9. }
10. gems_userdata_stop (userdata_id);
The dissemination of user data involves the following steps: procuring an ID for
the user data, procuring an ID for the
aggregation function if the data employs a user-defined aggregation function,
and the selection of an aggregation function.
1. gems_aggfn_init( &aggfn_id, FILE, strlen(FILE));
If a new user-defined aggregation function is required, a request for a unique
ID is sent to the RMA, along with the name of the file containing the
aggregation function. When the call returns, the variable aggfn_id holds the ID
assigned to the aggregation function. FILE is the name of the file containing
the aggregation function. All user-defined aggregation functions should be of
the format
char * aggfn(char * data);
because GEMS uses the keyword aggfn to dynamically load the function. Here data
refers to the monitored data supplied by GEMS, which is aggregated by the newly
loaded function and returned as a character array. An example file containing an
aggregation function is provided as testfn.c. The full path of the file should
be given; otherwise GEMS assumes that the file is present in the GEMS source
directory. After dynamically loading the new function into the service, the RMA
assigns it the unique ID and propagates it throughout the system.
2. gems_userdata_init(&userdata_id, datasize, aggfn_id);
The application then requests an ID for the new data to be disseminated. This is
done on line 2 of the pseudo-code. The size of the data to be disseminated is
given in the datasize variable, and the aggregation function to be used for
this data is given in the aggfn_id variable. When the call returns, the userdata
ID is held in the variable userdata_id.
These two functions should be called by only one of the processes of a
distributed application. That process should then broadcast the ID obtained to
all the other processes.
5. data = sensor ( );
6. gems_senduserdata (userdata_id, data, datasize);
Line 5 shows an example application function that returns the data to be
disseminated through GEMS. In line 6 this data is sent to GEMS through the API
function gems_senduserdata. The ID for the data is provided in userdata_id, and
the data is supplied in the character array data, whose size is specified by
datasize.
7. gems_recvuserdata (userdata_id);
Line 7 shows the function call for receiving the user data. The variable
userdata_id gives GEMS the ID of the userdata whose current disseminated values
are required. The user data is returned in the data structure
layernum[num_layer].hostnum[num_host].datanum[num_data].data
The declaration of each structure is shown below:
struct userdata
{
    u_int8_t data[MAX_SIZE];
};
struct all_userdata
{
    struct userdata datanum[NUM_DATA];
};
struct l_userdata
{
    struct all_userdata hostnum[num_host];
}* layernum[num_layer];
Here, as explained already, num_layer and num_host refer to the number of layers
and the number of hosts/groups respectively. These structures are dynamically
allocated using values obtained at run time. NUM_DATA is the maximum number of
user data items that can be disseminated. To retrieve the userdata of an
individual node in the same group as the contacted node, the format to be
followed is
layernum[0].hostnum[num_host].datanum[userdata_id].data
Here 0 refers to the lowest layer and hence to an individual node. To retrieve
the userdata (userdata_id k) of a particular group (group j) of nodes in a
specific layer (layer i), the format to be followed is
layernum[i].hostnum[j].datanum[k].data
The gems_recvuserdata() function retrieves the disseminated information of only
the single user data item identified by userdata_id. These values are read by
giving zero as the datanum index in the data structures. The format is shown
below:
layernum[i].hostnum[j].datanum[0].data
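A bounds-checked accessor for this layout could be sketched as follows. It is
not taken from api.c: the values given to MAX_SIZE and NUM_DATA are
placeholders, the per-layer host array is assumed to be heap-allocated, and
struct userdata_table together with the three functions are hypothetical names
introduced for this example.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_SIZE 64 /* ASSUMPTION: placeholder, not the value used by GEMS */
#define NUM_DATA 8  /* ASSUMPTION: placeholder, not the value used by GEMS */

struct userdata     { uint8_t data[MAX_SIZE]; };
struct all_userdata { struct userdata datanum[NUM_DATA]; };
/* ASSUMPTION: the host array of each layer is allocated at run time. */
struct l_userdata   { struct all_userdata *hostnum; };

/* Hypothetical container keeping the run-time dimensions with the table. */
struct userdata_table {
    size_t num_layer;
    size_t num_host;
    struct l_userdata *layernum;
};

/* Release everything owned by the table and reset it. */
void userdata_table_free(struct userdata_table *t)
{
    for (size_t i = 0; i < t->num_layer; i++)
        free(t->layernum[i].hostnum);
    free(t->layernum);
    t->layernum = NULL;
    t->num_layer = 0;
}

/* Allocate a zeroed table. Returns 0 on success and -1 on failure. */
int userdata_table_init(struct userdata_table *t, size_t num_layer,
                        size_t num_host)
{
    t->num_layer = 0;
    t->num_host = num_host;
    t->layernum = NULL;
    if (num_layer == 0 || num_host == 0)
        return -1;
    t->layernum = calloc(num_layer, sizeof *t->layernum);
    if (t->layernum == NULL)
        return -1;
    for (size_t i = 0; i < num_layer; i++) {
        t->layernum[i].hostnum = calloc(num_host, sizeof *t->layernum[i].hostnum);
        if (t->layernum[i].hostnum == NULL) {
            userdata_table_free(t);
            return -1;
        }
        t->num_layer = i + 1;   /* layers 0..i are now valid */
    }
    return 0;
}

/* Pointer to layernum[i].hostnum[j].datanum[k].data, or NULL if any index
 * is out of range. */
uint8_t *userdata_at(struct userdata_table *t, size_t i, size_t j, size_t k)
{
    if (i >= t->num_layer || j >= t->num_host || k >= NUM_DATA)
        return NULL;
    return t->layernum[i].hostnum[j].datanum[k].data;
}
```

After a call to gems_recvuserdata, the item of interest would be read with
k set to 0, as described above.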
Finally, to retrieve all the userdata that is currently disseminated by GEMS,
the API functions gems_update_w_userdata and gems_update_w_nuserdata are used.
The gems_update_w_userdata function takes no arguments; it stores all the
retrieved sensor data in the agg_loadinfo data structure and the userdata in the
layernum data structure.
int gems_update_w_nuserdata(u_int8_t num_data, u_int8_t *ID_array)
The gems_update_w_nuserdata function is used when only specific userdata items,
identified by their IDs, are required. The num_data variable specifies the
number of data items required, and their IDs are given in the ID_array
variable.
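A request for a chosen set of IDs can be prepared as in the sketch below. To
keep the example runnable without GEMS, the update function is passed in as a
pointer whose type mirrors the prototype above (with uint8_t standing in for
u_int8_t). The wrapper request_userdata and the stand-in stub_update are
hypothetical and exist only for this illustration; the meaning of the return
value of gems_update_w_nuserdata is not documented here and is simply passed
through.

```c
#include <stddef.h>
#include <stdint.h>

/* Same shape as the prototype above; uint8_t stands in for u_int8_t. */
typedef int (*nuserdata_update_fn)(uint8_t num_data, uint8_t *ID_array);

/* Hypothetical wrapper (not part of the GEMS API): checks that the ID list
 * fits the 8-bit count argument, then forwards it to the update function.
 * Returns -1 for a rejected request, otherwise whatever `update` returns. */
int request_userdata(nuserdata_update_fn update, uint8_t *ids, size_t count)
{
    if (update == NULL || ids == NULL || count == 0 || count > UINT8_MAX)
        return -1;
    return update((uint8_t)count, ids);
}

/* Stand-in for gems_update_w_nuserdata so that the sketch runs without GEMS.
 * It only records how many IDs were requested. */
static uint8_t last_num_data;

int stub_update(uint8_t num_data, uint8_t *ID_array)
{
    (void)ID_array;
    last_num_data = num_data;
    return 0;
}
```

For example, with uint8_t ids[] = { 2, 5, 7 }, the call
request_userdata(stub_update, ids, 3) forwards a count of 3. With the real
library, gems_update_w_nuserdata would be passed in place of stub_update,
assuming its prototype matches the one quoted above.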
To use the API functions provided in the file api.c, the client process should
either be compiled together with this file or use the dynamically loadable
library libgems_api.so.
7. Configuring GEMS
There is no separate configuration for GEMS; it follows the same configuration
as Gossip. Thus, to monitor a specific group of nodes or to group nodes in a
specific fashion, follow the configuration instructions given in the
GOSSIP-README file.
Currently the load update interval for the built-in sensors is fixed to the same
value as Tgossip. Since the proc file system is opened and read every Tgossip,
it is good practice to use a Tgossip of at least 1 second to minimize resource
utilization.
Tgossip
- Failure detection time is proportional to Tgossip.
- A small value of Tgossip increases bandwidth and CPU utilization.
- For efficient CPU usage, the minimum value of Tgossip should be higher than
  the OS time slice (10ms); 1sec is recommended for GEMS.
8. Additional features
Gossip Interval (Tgossip): The minimum Tgossip interval has been set to 10ms
(the OS time slice). Provision has also been made for a Tgossip of less than
10ms using busy waiting. However, since busy waiting increases CPU utilization,
it is advisable to keep the default minimum value of 10ms. Also note that when
using GEMS the recommended minimum Tgossip value is 1sec.
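The trade-off between sleeping and busy waiting can be illustrated with the
sketch below. It is not the code used by the gossip service: the function
wait_tgossip, the helper elapsed_ns and the constant OS_TIME_SLICE_NS are
hypothetical names, and the sketch assumes a POSIX system that provides
nanosleep and clock_gettime with CLOCK_MONOTONIC.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

#define OS_TIME_SLICE_NS 10000000LL /* 10 ms, the OS time slice quoted above */

/* Nanoseconds elapsed since `start` on the monotonic clock. */
static long long elapsed_ns(const struct timespec *start)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (long long)(now.tv_sec - start->tv_sec) * 1000000000LL
         + (now.tv_nsec - start->tv_nsec);
}

/* Hypothetical helper: wait for one gossip interval of interval_ns
 * nanoseconds. Returns 0 if the process slept and 1 if it busy-waited. */
int wait_tgossip(long long interval_ns)
{
    if (interval_ns >= OS_TIME_SLICE_NS) {
        /* Long interval: give the CPU back to the scheduler. */
        struct timespec req = {
            .tv_sec  = (time_t)(interval_ns / 1000000000LL),
            .tv_nsec = (long)(interval_ns % 1000000000LL)
        };
        /* Resume with the remaining time if a signal interrupts the sleep. */
        while (nanosleep(&req, &req) == -1 && errno == EINTR)
            ;
        return 0;
    }

    /* Interval shorter than the time slice: spin on the clock. This keeps
     * the timing tight but occupies the CPU for the whole interval. */
    struct timespec start;
    clock_gettime(CLOCK_MONOTONIC, &start);
    while (elapsed_ns(&start) < interval_ns)
        ;
    return 1;
}
```

The busy-waiting branch is what makes sub-10ms intervals costly: the process
never yields, so CPU utilization stays high for the entire wait.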
9. Comments
gcc is assumed to be the default compiler. C++ style comments are used in the
code.