Client Node Tuning
Parallel Network Requests
Each BeeGFS client establishes multiple network connections to the same server, which allows the
client to have multiple network requests in flight to this server.
The number of connections from a particular client to the same server can be configured by setting
the value of connMaxInternodeNum in /etc/beegfs/beegfs-client.conf.
Increasing the number of connections may improve performance and responsiveness for certain
workloads. When increasing the value, it is extremely important to keep the resulting RAM usage for
network buffers on the servers in mind, especially for InfiniBand and larger cluster setups. Make
sure to read the comments for connMaxInternodeNum and connRDMABufSize in beegfs-client.conf to
learn more about the server-side RAM usage. Note that connRDMABufSize needs to be an integer
multiple of 4 KiB.
On a compute node, it usually doesn’t make sense to set connMaxInternodeNum higher than the number
of CPU cores. On a cluster login node, setting this value higher than the number of CPU cores may
help to improve responsiveness when multiple users are active.
BeeGFS clients establish connections only when they are needed (and drop them after some idle time).
Use the command beegfs-net on a client to see the number of currently established connections to
each of the servers. beegfs-net is contained in the beegfs-utils package.
The total space used by the RDMA buffers (connRDMABufSize x connRDMABufNum) should be larger than
or equal to the data chunk size, so that the messages exchanged between client and storage servers
do not need to be split to fit into the available buffers. The default RDMA settings
(connRDMABufSize = 64 KB, connRDMABufNum = 12) are OK for the default chunk size of 512 KB. If you
set a chunk size of 1 MB and a buffer size of 64 KB, the number of buffers should be at least
1 MB / 64 KB = 16, plus 4 additional buffers for protocol, so 20 in this example.
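The buffer arithmetic above can be sketched as a small helper. This is only an illustration, not part of BeeGFS: the function name min_rdma_buf_num is hypothetical, sizes are assumed to be given in bytes with "KB" read as KiB, and rounding up for chunk sizes that are not an exact multiple of the buffer size is an assumption the text does not state.

```python
KIB = 1024

# Extra buffers reserved for protocol messages, as described in the text.
PROTOCOL_BUFFERS = 4


def min_rdma_buf_num(chunk_size: int, buf_size: int) -> int:
    """Return the smallest connRDMABufNum for a chunk and buffer size in bytes.

    Hypothetical helper for illustration; not a BeeGFS API.
    """
    if buf_size <= 0 or buf_size % (4 * KIB) != 0:
        raise ValueError("connRDMABufSize must be a positive multiple of 4 KiB")
    # Ceiling division (assumption): a chunk that is not an exact multiple
    # of the buffer size still needs one more buffer.
    data_buffers = -(-chunk_size // buf_size)
    return data_buffers + PROTOCOL_BUFFERS


print(min_rdma_buf_num(512 * KIB, 64 * KIB))   # default chunk size: prints 12
print(min_rdma_buf_num(1024 * KIB, 64 * KIB))  # 1 MB chunk size: prints 20
```

The first result matches the default connRDMABufNum of 12, and the second matches the 20 buffers derived in the example above.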
Remote fsync
BeeGFS clients have a configuration option to control behavior when a user application calls
fsync(). The option is called tuneRemoteFSync in /etc/beegfs/beegfs-client.conf. The client can
either enforce that data is committed to the server disks on fsync() (tuneRemoteFSync=true) or only
make sure that data is transferred to the server-side cache (tuneRemoteFSync=false). Disabling
remote fsync can significantly reduce disk seeks and thus improve performance for applications that
make a lot of fsync() calls.
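To make concrete which application call this option affects, here is a minimal sketch of a program that writes a file and calls fsync() before closing it. The function name write_durably is hypothetical, and nothing in it is specific to BeeGFS. On a BeeGFS mount, the os.fsync() call is the point where the tuneRemoteFSync setting described above applies.

```python
import os


def write_durably(path: str, data: bytes) -> int:
    """Write data to path and call fsync() before closing; return bytes written.

    Hypothetical helper for illustration only.
    """
    with open(path, "wb") as f:
        written = f.write(data)
        f.flush()             # push Python's userspace buffer to the kernel
        os.fsync(f.fileno())  # the call whose behavior tuneRemoteFSync controls
    return written
```

An application that calls such a function after every small write is the kind of workload for which the text above says disabling remote fsync can help.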
Disable locate/mlocate/updatedb
Some Linux distributions install a locate tool that scans all file systems once per day
to build a database of existing files. Such a tool can create unnecessary load on the file system
if it is active on the client nodes.
Either deactivate this service if you don’t need it or edit the file /etc/updatedb.conf to make
sure that the beegfs file system type is contained in the PRUNEFS list and your BeeGFS mount point
is contained in the PRUNEPATHS list.
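A check of this kind can be sketched as follows. This is a minimal illustration and not a BeeGFS tool: the function names and the mount point /mnt/beegfs are hypothetical, and the parser assumes the common updatedb.conf layout of NAME="space separated values" lines.

```python
def parse_updatedb_conf(text: str) -> dict[str, list[str]]:
    """Parse NAME="a b c" lines into a mapping of name to list of words."""
    settings: dict[str, list[str]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        # Drop the surrounding double quotes, then split on whitespace.
        settings[name.strip()] = value.strip().strip('"').split()
    return settings


def beegfs_is_pruned(text: str, mount_point: str) -> bool:
    """True if both the beegfs type and the mount point are excluded."""
    settings = parse_updatedb_conf(text)
    paths = [p.rstrip("/") for p in settings.get("PRUNEPATHS", [])]
    return ("beegfs" in settings.get("PRUNEFS", [])
            and mount_point.rstrip("/") in paths)


# Sample configuration text; /mnt/beegfs is an assumed mount point.
sample = '''
PRUNE_BIND_MOUNTS="yes"
PRUNEFS="nfs nfs4 tmpfs beegfs"
PRUNEPATHS="/tmp /var/spool /mnt/beegfs"
'''
print(beegfs_is_pruned(sample, "/mnt/beegfs"))  # prints True
```

To check a real client node, pass in the contents of /etc/updatedb.conf and your actual mount point in place of the sample text.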
Communication retries
BeeGFS clients support a time-limited retry mode for communications, which makes them keep trying
to complete their I/O operation in case a server is temporarily unreachable. The connCommRetrySecs
option in /etc/beegfs/beegfs-client.conf sets the time limit for communication retries. Retries can
be stopped by:
Interrupting the application waiting for the I/O operation, e.g. CTRL + c.
Disabling retries for the entire mountpoint:
echo false > /proc/fs/beegfs/<...>/conn_retries_enabled
Preferred servers
Clients can prefer certain storage servers for new files and directories, for example because of
the network topology. See the options tunePreferredMetaFile and tunePreferredStorageFile in
/etc/beegfs/beegfs-client.conf. Note: These options only affect the placement of new files. Clients
that have these options set can still read all files, regardless of which server they are located
on.