2 Services

EVA provides two management-protocol-independent services: the basic Event and Alarm service and the Log Control service. The basic EVA service gives clients an API for registering and sending events and alarms. The Log Control service provides a mechanism for controlling generic logs. Also included is a specialization of the generic log function for logging events and alarms.

Each service provides client functions that can be used from applications in the system to, for example, send alarms. There is also an API that management applications can use to monitor and control the system. This API can be extended for specific management protocols, such as SNMP or CORBA.

2.1 Basic Event and Alarm Service

This service contains functions for the client API to EVA. EVA is a distributed global application, which means that clients can access the EVA functionality from any node.

Clients can register and send events and alarms. Management applications can subscribe to events and alarms, and control how they are treated.

An event is a notification sent from the network element (NE) to a management application. An event is uniquely identified by its name. An alarm is a special form of event that represents a fault in the system that needs to be reported to the manager. An example of an alarm could be equipment_on_fire.

When an alarm is sent, it becomes active and is stored in the active alarm list. When the application that sent the alarm notices that the fault that caused it is no longer present, it clears the alarm. When an alarm is cleared, it is deleted from the active alarm list, and a clear_alarm event is generated by EVA.

Each fault may give rise to several alarms, possibly with different severities. However, there can be only one active alarm for each fault at any time. For example, two alarms may be associated with disk space usage, disk_80_percent_filled and disk_90_percent_filled. These two alarms represent the same fault, so only one of them can be active at a time. An active alarm is identified by its fault_id. In contrast to alarms, ordinary events do not represent faults and are not stored in the active alarm list.
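
The alarm life cycle described above can be sketched as a small, self-contained model. This is not EVA code: the module name, the map representation, and the shape of the returned clear_alarm tuple are all hypothetical, and the sketch assumes that a later alarm for a fault replaces an earlier one, which the text above does not state explicitly.

```erlang
-module(active_alarms_sketch).
-export([new/0, send_alarm/3, clear_alarm/2, active/1]).

%% Hypothetical model, not EVA code: the active alarm list is a map
%% from FaultId to the name of the alarm currently active for that fault.

new() -> #{}.

%% Sending an alarm makes it the single active alarm for its fault.
%% Assumption: a second alarm for the same FaultId replaces the first.
send_alarm(Name, FaultId, Active) ->
    Active#{FaultId => Name}.

%% Clearing removes the fault from the list and returns a term standing
%% in for the generated clear_alarm event. Clearing an unknown fault
%% leaves the list unchanged.
clear_alarm(FaultId, Active) ->
    case maps:take(FaultId, Active) of
        {Name, Rest} -> {{clear_alarm, FaultId, Name}, Rest};
        error        -> {none, Active}
    end.

%% The active alarms as a list of {FaultId, Name} pairs.
active(Active) -> maps:to_list(Active).
```

With this model, sending disk_80_percent_filled and then disk_90_percent_filled for the same fault leaves a single entry in the list, which is the behaviour the disk space example describes.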

The basic EVA server is a global server to which all events and alarms are sent. The server updates its tables (e.g. the active alarm list), and sends the event or alarm to the alarm_handler process that runs on the same node as the global server. alarm_handler is a gen_event process defined in the SASL application.
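
Because alarm_handler is an ordinary gen_event process, one way for an application to observe what EVA forwards is to install its own handler module. The following is a minimal sketch under that assumption. The module name and the get_seen request are hypothetical, and since the exact term EVA passes to the handler is not described here, every notification is kept as an opaque term.

```erlang
-module(eva_subscriber_sketch).
-behaviour(gen_event).
-export([init/1, handle_event/2, handle_call/2]).

%% The state is the list of notifications seen so far, newest first.
init(_Args) -> {ok, []}.

%% Keep every notification unchanged; no assumption is made about
%% the shape of the terms that EVA sends.
handle_event(Notification, Seen) -> {ok, [Notification | Seen]}.

%% Hypothetical request: return the notifications, oldest first.
handle_call(get_seen, Seen) -> {ok, lists:reverse(Seen), Seen}.
```

On a node where SASL is running, the handler could be installed with gen_event:add_handler(alarm_handler, eva_subscriber_sketch, []) and queried with gen_event:call(alarm_handler, eva_subscriber_sketch, get_seen).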

Before a client can send an event or alarm, the name of the event must be registered in EVA. To register an event, a client calls register_event/2. The parameters of this function are the name of the event and whether the event should be logged by default or not. A manager can decide to change this value later. To register an alarm, a client calls register_alarm/4. The parameters of this function are the name and logging parameters as for events, and the class and default severity of the alarm.

EVA stores the definitions of events and alarms in the Mnesia tables eventTable and alarmTable respectively. Since an alarm is a special form of an event, each alarm is present in both of these tables. The active alarm list is stored in the Mnesia table alarm. The records for all these tables are defined in the header file eva.hrl, available in the include directory in the distribution.

2.1.1 Event Definition Table

All registered events are stored in the eventTable. It has the following attributes:

The event is uniquely identified by its name, which is an atom.

The log attribute is a boolean flag that tells whether this event should be stored in some log when it is generated or not. This attribute is writable.

The generated attribute is a counter that counts how many times the event has been generated.

2.1.2 Alarm Definition Table

The alarmTable extends the eventTable, and has the following attributes:

The alarm is uniquely identified by its name, which is an atom. Note that each alarm is present in the eventTable as well.

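The class attribute categorizes the alarm, and is defined when the alarm is registered. The classes follow X.733, ITU Alarm Reporting Function, which distinguishes communications, quality of service, processing error, equipment, and environmental alarms.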

The severity attribute defines five severity levels, which indicate how the capability of the managed object is perceived to have been affected. The levels that represent service-affecting conditions are, from most to least severe, critical, major, minor and warning. The fifth level, indeterminate, applies when the severity cannot be determined. The levels used are as defined in X.733, ITU Alarm Reporting Function.

When an alarm is cleared, a clear_alarm event is generated. This event clears the alarm with the fault_id contained in the event. The clearing of previously reported alarms is not required to be reported. Therefore, a managing system cannot assume that the absence of a clear_alarm event for a fault means that the condition that caused the previous alarms is still present. Managed object definers shall state whether, and under which conditions, the clear_alarm event is used.
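
A management application that receives several alarms often needs to compare their severities. The sketch below encodes the ordering of the four service-affecting levels named above. The module and function names are hypothetical, and the numeric ranks are an illustration only, not values used by EVA.

```erlang
-module(severity_sketch).
-export([rank/1, worst/1]).

%% Rank of the four service-affecting severities; higher is more severe.
%% The numbers are illustrative, not EVA or X.733 encodings.
rank(critical) -> 4;
rank(major)    -> 3;
rank(minor)    -> 2;
rank(warning)  -> 1.

%% The most severe level in a non-empty list of severities.
worst([First | Rest]) ->
    lists:foldl(fun(S, Acc) ->
                        case rank(S) > rank(Acc) of
                            true  -> S;
                            false -> Acc
                        end
                end, First, Rest).
```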

2.1.3 Active Alarm List

The active alarm list is stored in the ordered Mnesia table alarm. The corresponding record is sent to the alarm_handler when an alarm is sent. It has the following read-only attributes:

A row in the active alarm list is uniquely identified by its fault_id. However, to keep the table ordered, the integer index is used as the key into the table. For each new alarm, EVA allocates a new index that is greater than the indexes of all other active alarms.

The name is the name of the corresponding alarm type, defined in alarmTable.

sender is a term that uniquely identifies the resource that generated the alarm.

cause describes the probable cause of the alarm.

severity is the perceived severity of the alarm.

time is the UTC time the alarm was generated.

extra is any extra information describing the alarm.
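
The relationship between index and fault_id can be illustrated with a small sketch. It uses an ETS ordered_set as a stand-in for the ordered Mnesia table, and a hypothetical record with only three of the attributes above. The real record is the one defined in eva.hrl, and the module and function names here are made up for the illustration.

```erlang
-module(alarm_index_sketch).
-export([new/0, add/3, by_fault_id/2, in_order/1]).

%% Hypothetical record for illustration; the real one is in eva.hrl.
-record(active_alarm, {index, fault_id, name}).

%% An ordered table keyed on the integer index.
new() ->
    ets:new(active_alarms, [ordered_set, {keypos, #active_alarm.index}]).

%% Allocate an index greater than that of every alarm in the table.
add(Tab, FaultId, Name) ->
    Index = case ets:last(Tab) of
                '$end_of_table' -> 1;
                Last            -> Last + 1
            end,
    true = ets:insert(Tab, #active_alarm{index = Index,
                                         fault_id = FaultId,
                                         name = Name}),
    Index.

%% fault_id is not the key, so finding a row by it is a match over
%% the table. Returns a list of {Index, Name} pairs.
by_fault_id(Tab, FaultId) ->
    Rows = ets:match_object(Tab, #active_alarm{fault_id = FaultId, _ = '_'}),
    [{A#active_alarm.index, A#active_alarm.name} || A <- Rows].

%% Alarm names in index order, that is, oldest alarm first.
in_order(Tab) ->
    lists:reverse(
      ets:foldl(fun(A, Acc) -> [A#active_alarm.name | Acc] end, [], Tab)).
```

Keying on the index gives a table that can be traversed in the order the alarms were raised, at the cost of a search when a row is looked up by fault_id.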

2.1.4 Event

When an event is generated, the event record is sent to alarm_handler. It has the following attributes:

The name is the name of the corresponding event type, defined in eventTable.

sender is a term that uniquely identifies the resource that generated the event.

time is the UTC time the event was generated.

extra is any extra information describing the event.

2.1.5 Example

As an example of how to register and send events and alarms, consider the following code:

%%%-----------------------------------------------------------------
%%% Resource code 
%%%-----------------------------------------------------------------
reg() ->
    eva:register_event(boardRemoved, true),
    eva:register_event(boardInserted, false),
    eva:register_alarm(boardFailure, true, equipment, minor).

remove_board(No) ->
    eva:send_event(boardRemoved, {board, No}, []).

insert_board(No, BoardName, BoardType) ->
    eva:send_event(boardInserted, {board, No}, {BoardName, BoardType}).

board_on_fire(No) ->
    FaultId = eva:get_fault_id(),
    %% Cause = fire, ExtraParams = []
    eva:send_alarm(boardFailure, FaultId, {board, No}, fire, []),
    FaultId.

Two events and one alarm are defined. Board removal is an event that is logged by default, and board insertion is an event that is not logged by default. The alarm boardFailure is a minor alarm of class equipment that is logged by default.

When the application detects that board N is on fire, board_on_fire(N) is called. This function is responsible for sending the alarm. It gets a new fault identifier for the fault and calls eva:send_alarm/5, pointing out the faulty board (N) and suggesting that the probable cause of the equipment trouble is fire.

The board_on_fire function returns the fault identifier for the new alarm. This fault identifier can be used at a later time in a call to eva:clear_alarm(FaultId) to clear the alarm.

2.2 Log Control Service

The Log Control service contains functions for monitoring logs, and functions for transferring logs to remote hosts, e.g. management stations. The main purpose of the Log Control service is to provide one entity through which all logs in the system can be controlled by a management station. Regardless of the type of log, all logs are controlled in a similar fashion.

Clients can register their logs in the log server. Management applications can control the logs, and transfer the logs to a remote host.

2.2.1 Log Monitoring

This service uses a log server that monitors all logs in the system. Each log uses the standard module disk_log for the actual logging.

Each log has an administrative status and an operational status, each of which is either up or down. If the operational status is up, the log is working, and if it is down, the log does not work. The administrative status is writable, and reflects the desired operational status. Normally the two are the same: if the administrative status is set to up, the operational status will be up as well. However, if the log for some reason does not work, e.g. if the disk partition is full, the operational status will be down. When the operational status is down, no events are logged in the log.
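
The relation between the two statuses can be written down as a small pure function. This is a sketch of the rule stated above, not code from the log server: the module name, the function names, and the Health argument (ok or {error, Reason}) are hypothetical.

```erlang
-module(log_status_sketch).
-export([oper_status/2, should_log/2]).

%% AdminStatus is up | down, as set by the manager.
%% Health is ok | {error, Reason}, a hypothetical summary of whether
%% the log itself is able to work (for example, disk not full).
oper_status(down, _Health)        -> down;
oper_status(up, ok)               -> up;
oper_status(up, {error, _Reason}) -> down.

%% Items are stored only while the operational status is up.
should_log(AdminStatus, Health) ->
    oper_status(AdminStatus, Health) =:= up.
```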

2.2.1.1 Alarms

The Log Control service defines two EVA alarms: log_file_error and log_wrap_too_often.

2.2.1.2 Example

The following is an example of code that creates a log to be controlled by the generic Log Control function:

start() ->
    disk_log:open([{name, "ex_log"},
                   {file, "ex_log/ex_log.LOG"},
                   {type, wrap},
                   {size, {10000, 4}}]),
    log:open("ex_log", ex_log_type, 3600).

test() ->
    %% Log an item
    disk_log:log("ex_log", {1, "log this"}),

    %% Set the administrative status of the log to 'down'
    log:set_admin_status("ex_log", down),
    
    %% Try to log - this one won't be logged
    disk_log:log("ex_log", {2, "won't be logged"}),
    
    Logs1 = log:get_logs(),

    %% Set the administrative status of the log to 'up'
    log:set_admin_status("ex_log", up),
    
    %% Log an item
    disk_log:log("ex_log", {3, "log this"}),

    Logged = disk_log:chunk("ex_log", start),
    {Logs1, Logged}.

2.2.2 Log Transfer

It is possible to transfer a log to a remote host. When the log is transferred, the log may be filtered, and the log records may be formatted.

As the logs are implemented as disk_log logs, each log consists of several log files. When the log is transferred, it is written to one single file on the remote host. When disk_log is used, the log records are normally not formatted when they are stored in the log, in order to increase log performance. However, a manager will probably need the log in a human-readable format. Thus, when the log is being transferred, each log record may be formatted in a log-specific way. Of course, to further increase performance, the log can be transferred as is, leaving it to the manager to format the log off-line.
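
The idea of reading the records of a disk_log log and formatting them into one single file can be sketched with the standard disk_log:chunk/2 function. This is not the Log Control transfer function: the module name and transfer/3 are hypothetical, and writing to a local file stands in for the transfer to a remote host.

```erlang
-module(log_transfer_sketch).
-export([transfer/3]).

%% Read every record from the disk_log log Log, format each one with
%% FormatFun (a fun from a term to iodata), and write the result to the
%% single file Dest. The local file stands in for the remote host.
transfer(Log, Dest, FormatFun) ->
    ok = disk_log:sync(Log),
    {ok, Fd} = file:open(Dest, [write]),
    try
        copy(Log, start, Fd, FormatFun)
    after
        file:close(Fd)
    end.

%% Walk through the log one chunk at a time until eof.
copy(Log, Cont, Fd, FormatFun) ->
    case disk_log:chunk(Log, Cont) of
        eof ->
            ok;
        {error, Reason} ->
            {error, Reason};
        {Cont2, Terms} ->
            write(Fd, Terms, FormatFun),
            copy(Log, Cont2, Fd, FormatFun);
        {Cont2, Terms, _BadBytes} ->
            %% Returned for logs opened read-only; damaged bytes are skipped.
            write(Fd, Terms, FormatFun),
            copy(Log, Cont2, Fd, FormatFun)
    end.

write(Fd, Terms, FormatFun) ->
    lists:foreach(fun(T) -> ok = file:write(Fd, FormatFun(T)) end, Terms).
```

Passing fun(T) -> io_lib:format("~p~n", [T]) end as FormatFun gives one printed term per line. A fun that returns term_to_binary(T) would correspond to transferring the log as is and formatting it off-line.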

2.3 EVA Log Service

The EVA log service uses the generic Log Control service to implement log functionality for events and alarms defined in EVA.

In the rest of this description, the term event refers to both events and alarms as defined in EVA.

This log functionality supports logging of events from EVA. It uses the module disk_log for logging of events. There can be several event logs active at the same time. It is possible to create new event logs dynamically, either from within an application, or from a management system. Each log uses a filter function to decide whether an event should be stored in the log or not.

There is a concept of a default log. The default log is used to log any event that has the log flag in eventTable set to true but that no log is currently able to store (or for which no other log is defined). The usage of the default log is optional.

For example, suppose that we want to define an alarm log, that logs all alarms in the system. We can do this with the following code:

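-module(alarm_log).
-export([alarm_filter/1, make_alarm_log/0]).

%% eva.hrl defines the alarm record. The path assumes the header is in
%% the include directory of the EVA application.
-include_lib("eva/include/eva.hrl").

%% Accept alarms only; every other item is rejected.
alarm_filter(Item) when is_record(Item, alarm) -> true;
alarm_filter(_) -> false.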

make_alarm_log() ->
  disk_log:open([{name, "alarm_log"},
                 {format, internal},
                 {type, wrap},
                 {size, {10000, 10}}]),
  eva_log:open("alarm_log", {alarm_log, alarm_filter, []}, 36000).
    

If we set the administrative status of this log to down, and an alarm is sent that should be logged according to its definition in the eventTable, the alarm is stored in the default log instead of "alarm_log" (provided that no other log is defined to log the alarm).
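
The choice between the filtering logs and the default log can be modelled as a pure function. This sketch is not the EVA implementation: the module name, route/4, and the {Name, OperStatus, FilterFun} representation of a log are hypothetical. It also assumes that an event whose log flag is false is not stored anywhere, which is one possible reading of the description above.

```erlang
-module(log_routing_sketch).
-export([route/4]).

%% Logs is a list of {Name, OperStatus, FilterFun}. DefaultLog is a log
%% name, or undefined when no default log is used. The result is the
%% list of log names that would store Event.
%% Assumption: an event whose log flag is false is not stored at all.
route(_Event, false, _Logs, _DefaultLog) ->
    [];
route(Event, true, Logs, DefaultLog) ->
    %% Only logs that are up and whose filter accepts the event qualify.
    case [Name || {Name, up, Filter} <- Logs, Filter(Event)] of
        [] when DefaultLog =/= undefined -> [DefaultLog];
        Targets                          -> Targets
    end.
```

With a single alarm log whose status is down, an alarm with the log flag set is routed to the default log, which matches the case described above.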


Copyright © 1991-2003 Ericsson Utvecklings AB