Filesystem Modification Events
The modification event logging facility of BeeGFS uses the metadata servers to collect information
about modified files and directories in the file system. These messages are forwarded to external
applications through a UNIX socket. As an example of such an application, we provide
beegfs-event-listener as part of the beegfs-utils package. It collects the event information
from the metadata servers and prints it as JSON-formatted text to STDOUT.
When configured for modification event logging, each metadata server checks for a socket at the log target path specified in its configuration file and tries to deliver modification event packets there. Each metadata server only collects information about the files it manages itself, so one event listener is needed per metadata server. With metadata mirroring, events are only emitted by the primary server. The secondary should still be equipped with an event listener, since it becomes the primary after a failover.
The provided tool is just an example. You can develop your own tools on top of the event stream,
for example adapters to backup systems. When developing your own software using BeeGFS
modification event logging, you can use the beegfs_file_event_log.hpp header provided as part of
the beegfs-utils-devel package. It allows you to read the file modification event stream provided
by the metadata server. As a reference, the source code of beegfs-event-listener is also provided
as part of the package.
Configuration
Filesystem Modification Events need to be enabled on all clients and the metadata daemon(s) to work properly.
Client
To complete the modification event messages, the metadata server relies on the client to forward some information when it performs certain operations. The events of interest can be selected in the client configuration file:
sysFileEventLogMask = flush,trunc,setattr,close,link-op,read
For complete coverage of all possible events, enable every event type, as shown above. If you only need a subset of event types, the others can be removed from the list to reduce the performance overhead. This is usually not worthwhile, since the overhead is very small.
Metadata server
To enable the event stream, specify the path for the UNIX socket to use in the configuration file of the metadata server. For example:
sysFileEventLogTarget = unix:/tmp/beegfslog
If this variable is set, the server will try to write to this socket every time a filesystem event occurs that is related to this metadata server.
The receiving application has to open the socket at that path. It is recommended to start the receiving application before the metadata server since undeliverable event messages will be discarded. In this case the dropped events counter included in each message is increased to inform the receiver.
To capture all events of the file system and to get the full picture, the event output has to be activated on all metadata servers, each with their own local UNIX socket and receiving application instance. The merging of the multiple streams is left as a task for the receiving application.
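One way the merging step could look is sketched below. It assumes that the JSON lines printed by each listener instance have already been made available on one host as IO objects (for example through pipes); how they get there is outside the scope of the sketch. The method name merge_event_streams and the stream labels are hypothetical and not part of BeeGFS.

```ruby
require "json"

# Reads JSON event lines from several IO objects and yields each parsed
# message together with the label of the stream it came from.
# `streams` is a Hash of label => IO (a structure assumed for this sketch).
def merge_event_streams(streams)
  pending = streams.invert                 # IO => label
  until pending.empty?
    ready, = IO.select(pending.keys)       # wait until at least one stream is readable
    ready.each do |io|
      line = io.gets
      if line.nil?                         # this stream has reached EOF
        pending.delete(io)
        next
      end
      yield pending[io], JSON.parse(line)
    end
  end
end
```

Messages are yielded in arrival order, not in the order in which the events happened on the servers. Note that gets waits for a complete line, so a stream that delivers half a line will stall the loop until the rest arrives.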
beegfs-event-listener
The beegfs-event-listener program is included in the beegfs-utils package. It opens a UNIX
socket at the specified path and listens for incoming messages. For example:
$ /opt/beegfs/sbin/beegfs-event-listener /tmp/beegfslog
Every message is printed as one line of JSON formatted output.
Example:
$ mv /mnt/beegfs0/a /mnt/beegfs0/b
This produces output such as:
{ "VersionMajor": 1, "VersionMinor": 0, "PacketSize": 77, "Dropped": 0, "Missed": 0, "Event": { "Type": "Rename", "Path": "\/a", "EntryId": "0-5A9EB0A7-1", "ParentEntryId": "root", "TargetPath": "\/b", "TargetParentId": "root" } }
The output can easily be parsed by scripts. For example, this simple Ruby program prints the event type and the file path for each event:
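#!/usr/bin/env ruby

require "json"

# Print the type and path of one event; messages without an event are skipped.
def print_event(event)
  puts "Event: #{event['Type']} #{event['Path']}" if event
end

# Each line on STDIN is one JSON-formatted message from beegfs-event-listener.
while (line = gets)
  message = JSON.parse(line)
  print_event(message["Event"])
end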
Use it like this:
$ /opt/beegfs/sbin/beegfs-event-listener /tmp/beegfslog | ./read-event-log.rb
Messages
Every event message consists of the following fields:
Major Version (uint 16)
Minor Version (uint 16)
Size of the whole message (uint 32)
Dropped Messages counter (uint 64)
Missed Events counter (uint 64)
The actual event:
  Event Type (uint 32) (see below)
  Path of the file/directory (string)
  EntryID (string) (see below)
  Parent EntryID (string)
  Target Path (string)
  Parent EntryID of the target path (string)
For details, see /usr/include/beegfs/beegfs_file_event_log.hpp and the example code at
/usr/share/doc/beegfs-utils-devel/examples/beegfs-event-listener/, both included in the
beegfs-utils-devel package.
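As a rough illustration of the fixed-size part of a message, the following sketch unpacks the five leading fields from a raw packet. The field order and widths come from the list above. The byte order (little-endian) and the absence of padding between fields are assumptions made for this sketch; the header file named above is the authoritative description. The names HEADER_FORMAT, HEADER_SIZE and parse_header are hypothetical.

```ruby
# Fixed-size header fields in the order listed above:
# major (uint16), minor (uint16), size (uint32), dropped (uint64), missed (uint64).
# ASSUMPTION: little-endian byte order and no padding between the fields.
HEADER_FORMAT = "S< S< L< Q< Q<"
HEADER_SIZE = 2 + 2 + 4 + 8 + 8   # 24 bytes

# Returns the header fields of a raw packet (binary String) as a Hash.
def parse_header(packet)
  raise ArgumentError, "packet shorter than header" if packet.bytesize < HEADER_SIZE

  major, minor, size, dropped, missed = packet.unpack(HEADER_FORMAT)
  { major: major, minor: minor, size: size, dropped: dropped, missed: missed }
end
```

The variable-length event fields that follow the header are deliberately left out, since their encoding is not described here.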
Event Types
For most events the target path and target EntryID fields are empty. Path, EntryId, and ParentEntryId always contain information about the file/directory being worked on.
Path: Full path of the file/directory, relative to the mountpoint
EntryId: The EntryID (similar to an inode number on other systems) of the file/directory
ParentEntryId: The EntryID of the parent directory
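A small sketch of how a script might use these fields on the listener's JSON output: the helper below formats one event and appends the target fields only when they are filled in. The key names are taken from the sample output shown earlier; the method name describe_event is hypothetical, and whether empty target fields appear as empty strings or are omitted is not assumed either way.

```ruby
require "json"

# Formats one parsed "Event" object as a single line of text.
# The target fields are appended only when they are present and non-empty.
def describe_event(event)
  line = "#{event['Type']} #{event['Path']} " \
         "(#{event['EntryId']}, parent #{event['ParentEntryId']})"
  target = event["TargetPath"]
  unless target.nil? || target.empty?
    line += " -> #{target} (parent #{event['TargetParentId']})"
  end
  line
end

sample = '{ "VersionMajor": 1, "VersionMinor": 0, "PacketSize": 77, "Dropped": 0, ' \
         '"Missed": 0, "Event": { "Type": "Rename", "Path": "\/a", ' \
         '"EntryId": "0-5A9EB0A7-1", "ParentEntryId": "root", ' \
         '"TargetPath": "\/b", "TargetParentId": "root" } }'
puts describe_event(JSON.parse(sample)["Event"])
# prints: Rename /a (0-5A9EB0A7-1, parent root) -> /b (parent root)
```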
The following event types exist:
Event | Description |
---|---|
Flush | File contents were flushed. File size might have changed. |
Truncate | File was truncated. File size might have changed. |
SetAttr | File attributes changed. |
Close | File was closed and possibly modified. |
Create | New file was created. |
MKdir | New directory was created. |
MKnod | A block or character special file was created. |
RMdir | Directory was removed. |
Unlink | File removed. Note: Multiple paths can reference the same |
Symlink | A symbolic link was created. |
Hardlink | A hardlink was created. |
Rename | A file or directory was renamed or moved. |
Read | Create event log entries for open with O_RDONLY flag for the purpose of file access auditing. |
Each message contains a dropped and a missed counter. The dropped counter is incremented for
each message that could not be delivered. The missed counter counts events that can refer to
multiple paths at the same time, e.g. hardlinks. Decisions on when a full scan of the file system
is needed can be made based on the values of these counters.
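A receiver could track the counters along the following lines. The sketch works on parsed JSON messages from one metadata server and uses the Dropped and Missed keys from the sample output shown earlier. Treating any increase as a reason to rescan is a policy chosen for the example, not something BeeGFS prescribes, and the class name CounterWatch is hypothetical.

```ruby
# Tracks the "Dropped" and "Missed" counters across messages from ONE
# metadata server and reports when either of them has grown.
class CounterWatch
  def initialize
    @dropped = 0
    @missed = 0
  end

  # Returns true if this message reports more dropped or missed events than
  # the previous one. ASSUMPTION: any increase is treated as a rescan trigger.
  def rescan_needed?(message)
    dropped = message.fetch("Dropped")
    missed = message.fetch("Missed")
    grown = dropped > @dropped || missed > @missed
    @dropped = dropped
    @missed = missed
    grown
  end
end
```

Because the watcher starts from zero, a first message that already carries non-zero counters is also flagged. Use one instance per metadata server, since each server reports its own counters.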
EntryIDs
BeeGFS uses EntryIDs to identify files and directories, similar to inodes on other UNIX file systems. An EntryID is a string of the following form:
root|disposal|mdisposal|[0-9A-F]{1,8}-[0-9A-F]{1,8}-[0-9A-F]{1,8}
The three hex numbers can be represented as positive, non-zero integers. The special cases
root, disposal, and mdisposal do not appear for normal files and are for internal
bookkeeping only. They can be represented as integer triples as well, for example by including zeros.
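The pattern above translates directly into a small parser. The following sketch returns the three numbers of a regular EntryID as integers and passes the special names through as symbols. The constant and method names are hypothetical, and returning symbols for the special cases is a choice made for this example instead of mapping them to a particular integer triple.

```ruby
ENTRY_ID_SPECIAL = %w[root disposal mdisposal].freeze
ENTRY_ID_PATTERN = /\A([0-9A-F]{1,8})-([0-9A-F]{1,8})-([0-9A-F]{1,8})\z/

# Returns the special name as a Symbol, an Array of three Integers for a
# regular EntryID, or nil if the string matches neither form.
def parse_entry_id(id)
  return id.to_sym if ENTRY_ID_SPECIAL.include?(id)

  match = ENTRY_ID_PATTERN.match(id)
  match && match.captures.map { |hex| hex.to_i(16) }
end

p parse_entry_id("0-5A9EB0A7-1")   # three integers parsed from the hex parts
p parse_entry_id("root")           # :root
```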