Xrd Report Monitoring

The xrd.report directive makes an xrootd instance send UDP packets (containing XML-formatted data) with summary information about that instance.

This page documents the code and configuration of the collector tool and the infrastructure used to collect, report and display this information.

 

Xrootd directive configuration

The following line should be added to the configuration of each xrootd instance that is to be monitored (adjust the reporting interval as needed). For instances that also run cmsd, this directive might need to be wrapped in an if exec xrootd ... fi block so that only xrootd picks it up.

xrd.report  172.16.105.48:9931 every 5m all

Python collector

GitHub - snafus/xrdreport: Collect and reformat the output from the XrootD xrd.report monitoring

A Python 3 program that listens on a given UDP port for packets from xrd.report, converts the format, and sends the data to one or more outputs (e.g. a file, InfluxDB).
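
To illustrate the listen-and-convert step described above, here is a minimal sketch in Python. It is not the xrdreport code itself: the function names (flatten, parse_packet, serve) and the SAMPLE packet are made up for illustration, and real xrd.report packets carry many more attributes and stats sections than shown here.

```python
import socket
import xml.etree.ElementTree as ET


def flatten(element, prefix=""):
    """Recursively flatten leaf elements into {dotted.path: text}."""
    out = {}
    for child in element:
        # e.g. <stats id="link"> contributes the path component "link";
        # elements without an id attribute contribute their tag name.
        name = child.get("id", child.tag)
        path = f"{prefix}.{name}" if prefix else name
        if len(child):  # element has children: recurse
            out.update(flatten(child, path))
        else:
            out[path] = (child.text or "").strip()
    return out


def parse_packet(data: bytes) -> dict:
    """Turn one XML datagram into a flat dict of strings."""
    # Stripping a trailing NUL is purely defensive (an assumption, not
    # something the source page states about the packet format).
    text = data.rstrip(b"\x00").decode("utf-8", errors="replace")
    root = ET.fromstring(text)
    # Root attributes are kept with an "@" prefix to separate them from stats.
    record = {f"@{key}": value for key, value in root.attrib.items()}
    record.update(flatten(root))
    return record


# Illustrative packet only; not captured from a real server.
SAMPLE = (
    b'<statistics tod="1700000000" src="host.example:1094" pgm="xrootd" ins="proxy">'
    b'<stats id="link"><num>3</num><in>1024</in><out>2048</out></stats>'
    b"</statistics>"
)


def serve(host="0.0.0.0", port=9931):
    """Blocking loop: print every parsed packet received on the UDP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            data, addr = sock.recvfrom(65535)
            print(addr, parse_packet(data))
```

Calling parse_packet(SAMPLE) returns a flat dictionary with keys such as "@pgm" and "link.num"; all values are left as strings, so any numeric conversion would be a separate step.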

Main configuration is done via an INI config file.

 

Cloud infrastructure

Two compute instances are configured in the Tier-1 Prod Internal project, behind a load balancer.
Each instance is running one copy of the xrdreport software.

The load balancer is configured to listen on UDP/9931, and each of the hosts also listens on 9931 (note: this should not be confused with the f-stream monitoring tools, which also listen on 9931).

 

InfluxDB

Data is sent to the xrootd database of the InfluxDB instance at the Tier-1.
Test data is sent to the devel_xrdreport measurement; production data is sent to the xrdreport measurement.
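
As an illustration of what a single point in one of these measurements could look like on the wire, the sketch below formats a record as InfluxDB line protocol using only the standard library. The to_line helper and the tag and field names (host, instance, link.in, link.num) are hypothetical; the actual collector may well use the influxdb client library and a different schema.

```python
def _esc(text, chars):
    """Backslash-escape the given special characters (line protocol rule)."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _field(value):
    """Render a field value: booleans, integers (i suffix), floats, strings."""
    if isinstance(value, bool):  # must be checked before int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    quoted = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{quoted}"'


def to_line(measurement, tags, fields, timestamp_s):
    """Build one line: measurement,tags fields timestamp(ns)."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = _esc(measurement, ", ")
    for key, value in sorted(tags.items()):
        head += f",{_esc(key, ',= ')}={_esc(str(value), ',= ')}"
    body = ",".join(
        f"{_esc(key, ',= ')}={_field(value)}" for key, value in sorted(fields.items())
    )
    # Timestamps default to nanosecond precision in line protocol.
    return f"{head} {body} {int(timestamp_s) * 1_000_000_000}"
```

Switching between test and production data would then only be a matter of passing devel_xrdreport or xrdreport as the measurement argument.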

Vande is (or will be) configured to display the relevant panels:

https://vande.gridpp.rl.ac.uk/next/d/U2pfv_g4k/echo-gateway-xrootd-report-monitoring?orgId=1&var-Instance=proxy&var-Instance=ceph&var-Instance=unified&var-Hosttype=gateway&var-Hostname=ceph-svc01.gridpp.rl.ac.uk&var-Hostname=ceph-svc02.gridpp.rl.ac.uk&var-bin=1m&var-rp=1_day&var-prefix=&from=now-12h&to=now

Aquilon and configuration

Archetype: ral-tier1
Personality: xrootd-xrdreporter
Sandbox: /jw-xrootd-reporter

Feature: grid/xrootd-xrdreport-collector (shared/features/grid/xrootd-xrdreport-collector)
config.cfg is the templated config file
config.pan is the usual definition of the feature

 

RPM and dependencies

The RPM of the package is built with the following command (run from the package git directory):

python3 setup.py bdist_rpm

This creates (e.g.) xrdreporter-0.0.3-1.noarch.rpm in the dist directory

Add the RPM to
repos-1.gridpp.rl.ac.uk:/srv/yum/local/sl7/
and build the cache

ssh root@repos-1.gridpp.rl.ac.uk
cd /srv/yum/local/sl7
createrepo .

The package depends on influxdb (or, for the v2 API, influxdb-client).

The current setup does not install these by default; they must be installed manually on each host.

pip3 install influxdb

When installing the rpm, the package will be placed in:

/usr/lib/python3.6/site-packages/xrdreporter

and the script placed as:
/usr/bin/xrdrep.py

 

Secrets file

An additional INI file is required, named /etc/xrdreport/secrets.cfg.
This file should contain any secrets that you are unwilling to put into the managed config.cfg file.
This file needs to be manually installed on any host which runs the software.
An example set of contents would be:

The file must exist, even if it is empty.
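
The relationship between the managed config file and the secrets file can be sketched with Python's configparser. This is an assumption about how the two files might be layered, not a description of what xrdrep.py actually does; load_config is a hypothetical helper, and any section or option names you use with it (for example an influxdb section with a password option) are placeholders.

```python
import configparser


def load_config(config_path, secrets_path):
    """Read the managed config, then overlay the secrets file on top.

    Both files must exist (open() raises FileNotFoundError otherwise),
    but the secrets file is allowed to be empty.
    """
    parser = configparser.ConfigParser()
    for path in (config_path, secrets_path):
        # read_file (unlike read) does not silently skip missing files.
        with open(path, encoding="utf-8") as handle:
            parser.read_file(handle)
    # Options read later override earlier ones, so secrets win.
    return parser
```

With this layering, an option that appears in both files takes its value from the secrets file, and a missing secrets file fails loudly at start-up rather than later when a credential is first needed.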

Latest version / changelog

  • xrdreporter-0.0.3-1.noarch.rpm ; first properly working version

  • xrdreporter-0.0.5-1.noarch.rpm ; better secrets handling, fix for field with dodgy character, hacks for tier-1 specific conventions …

  • xrdreporter-0.0.6-1.noarch.rpm ; add some protection when INFO_HOST returns a long nonsense int