Compliance tools

This section covers the general data storage tools and rules that Compliance uses. It explains how to use the product and how to make sure proper data retention and archiving processes are applied.

System requirements, CPU, data retention and storage

TimeKeeper Compliance operates by reviewing reported client data. Ideally this data comes from a TimeKeeper client, as this has the most complete data, including the client’s other source information if it was tracking multiple PTP and NTP sources. Compliance can also report on data from other clients.

This data must be present on the Compliance instance at the time of audit generation in order to be built into the generated audit. This generally means that client data should be retained on the Compliance server for several days before being archived, in order to handle potential audit regeneration.

Once Compliance has generated the requested audits, the client data can be archived off the host into long-term storage. For some customers, keeping 7 days of data on the Compliance server is sufficient; for others, 30 days is preferable. The number of days to keep is configurable in the TimeKeeper logrotate configuration files.

After the data is collected into the configured Compliance audits and taken off of the Compliance server, it can be stored as needed depending on any regulatory requirements. The data has been summarized by Compliance and is no longer needed in raw form unless a particular audit must be regenerated.

Disk space requirements

Disk space required for the raw data (.data files) that Compliance uses to generate reports varies with the type of source, the sync rate and similar factors. Assuming the default sync rate of 1 update per second and an NTP or PTP time source, each data file (once compressed, which is done automatically) is about 2.7MB per day of data for a single time source. That equates to about 985MB per year per source, or about 20TB per year for 20,000 clients. A higher sync rate or more than one time source per client will increase the disk storage needed.
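The arithmetic above can be turned into a quick sizing estimate. The following is a minimal sketch and not part of TimeKeeper: the function name estimate_raw_storage is hypothetical, and it assumes the figure of about 2.7MB per source per day quoted above, one time source per client unless stated otherwise, and decimal units (1TB = 1,000,000MB).

```shell
# Rough sizing for compressed raw client data (.data files).
# Assumption: ~2.7 MB per time source per day (default 1 update/second).
# Usage: estimate_raw_storage CLIENTS [DAYS] [SOURCES_PER_CLIENT] [MB_PER_DAY]
estimate_raw_storage() {
    awk -v clients="$1" -v days="${2:-365}" -v sources="${3:-1}" -v mb="${4:-2.7}" 'BEGIN {
        total_mb = clients * days * sources * mb
        # Decimal units: 1 TB = 1,000,000 MB.
        printf "%.1f TB\n", total_mb / 1000000
    }'
}
```

For example, estimate_raw_storage 20000 reproduces the yearly figure for 20,000 clients, while estimate_raw_storage 20000 30 estimates what a 30-day retention window would hold on the Compliance server.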

The size of the reports themselves, which are stored in a SQL-accessible database (noted below), will vary more. The more often hosts exceed thresholds, the more reporting data must be stored and the more disk space is required.

CPU requirements

While Compliance is processing data, it provides an estimate of how long it takes to generate a day of reports and how long remains until processing completes. This is shown in the Compliance GUI tab and also in the file:

${LOGDIR}/timekeepercompliance/compliance_auditor

It is best to test with your own data and hardware (or virtual system) and then use these values to estimate the resources required. Some estimates of likely performance are given below.

Compliance does not require a specific amount of CPU to operate properly, but CPU count and speed do affect how long it takes to generate reports. Most users in a steady-state environment generate reports every day, so reports are built incrementally and only a small amount of processing is needed each day to keep the records and reports updated. When generating reports from scratch for a very large amount of data (for example, a new Compliance system that has to process months or years of data), CPU resources become very important.

Predicting how much time is required to generate reports is difficult, as it varies significantly with the number of times clients exceed thresholds (more exceedances mean more work), disk speed, and the number and speed of CPUs. Assuming the configuration below (which is not intended as a minimal system configuration, simply a reference):

A single configured report (the default one) on the above system requires about 2.7 minutes of processing per day of data, or about 9 hours to process 6 months of data. Normally TimeKeeper builds a report every day, so only about 2.7 minutes are required each day. If a user creates a new Compliance instance that must process 6 months of data, or deletes the compliance directory and forces TimeKeeper to re-process all data, the full 9 hours would be required.

Client data storage

TimeKeeper stores the current client data in this directory:

${LOGDIR}/timekeeperclients/

The data format of these files is specified in more detail here. For Compliance to report on a client for a given period of time, that client’s data must be present in this directory at the time of audit generation. If it is not included in this directory it will not be included in the generated audit(s).

Once any audits are generated, the data does not need to be kept in this directory and may be stored elsewhere as needed.
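Moving older client data into an archive could be scripted along the following lines. This is only a sketch under stated assumptions: the function name archive_client_data and the archive destination are hypothetical, file age is judged by modification time, GNU find and tar are assumed, and it should only be run once the audits covering that period have been generated. It does not replace the retention settings in the TimeKeeper logrotate configuration.

```shell
# Archive client data files older than DAYS days into a tarball, then remove them.
# Usage: archive_client_data SRC_DIR ARCHIVE_FILE DAYS
archive_client_data() {
    src="$1"; archive="$2"; days="$3"
    list=$(mktemp) || return 1
    # Collect regular files whose modification time is more than DAYS days old.
    (cd "$src" && find . -type f -mtime +"$days" -print0) > "$list"
    if [ -s "$list" ]; then
        # Only delete the originals if the archive was written successfully.
        tar -czf "$archive" -C "$src" --null -T "$list" &&
            (cd "$src" && xargs -0 rm -f --) < "$list"
    fi
    status=$?
    rm -f "$list"
    return "$status"
}
```

A call such as archive_client_data "${LOGDIR}/timekeeperclients" /path/to/archive.tar.gz 30 would keep roughly the last 30 days in place; the archive path here is a placeholder.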

Database storage

Compliance stores the data in .db files in this directory:

${LOGDIR}/timekeepercompliance/db

One file is created per year. By default LOGDIR is set to /var/log, but it may be overridden. No matter where the storage is, FSMLabs recommends that regular backups are made of this content.

PDF storage

In addition to database files, Compliance generates PDF versions of each configured report for every audit period. These files are stored here:

${LOGDIR}/timekeepercompliance/pdf

with one subdirectory per year below the pdf/ path. It is also recommended that regular backups are made of this directory.
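One way to take such backups is a dated tar archive of both the db and pdf directories. The sketch below is illustrative only: the function name backup_compliance and the backup destination are hypothetical, and it assumes the directory layout under LOGDIR described above. Copying .db files while Compliance is writing to them may capture an inconsistent snapshot, so it may be safer to run this while no audit is being generated.

```shell
# Back up the Compliance database and PDF directories into one dated tarball.
# Usage: backup_compliance LOGDIR DEST_DIR
backup_compliance() {
    logdir="${1:-/var/log}"; dest="$2"
    out="$dest/compliance-backup-$(date +%Y%m%d).tar.gz"
    # -C makes the archive paths relative to the timekeepercompliance directory.
    tar -czf "$out" -C "$logdir/timekeepercompliance" db pdf && echo "$out"
}
```

On success the function prints the path of the archive it wrote, which makes it easy to hand the file to a separate copy or upload step.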

If PDFs are deleted from this directory, Compliance can regenerate them when compliance_cli --rescan is run, as detailed below; otherwise the missing PDFs are flagged for regeneration at the next TimeKeeper restart. Because the original client data has already been processed into an audit that is queryable from the database, the raw client data is not required to regenerate PDFs, provided the audit is still present in the database.

If there is an issue generating a particular PDF file, Compliance will note it by creating a file named for the intended PDF with a suffix of ‘.failed’. This is rare but may happen due to resource constraints on an under-resourced system. If this does fail and the issue is remedied (for example by adding memory to the constrained system) the .failed file can be removed. At that point Compliance will reattempt the PDF generation as described above.
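Leftover markers can be located with find. A minimal sketch, assuming the pdf directory layout described above; the function name list_failed_pdfs is hypothetical:

```shell
# List '.failed' markers left behind by unsuccessful PDF generation.
# Usage: list_failed_pdfs [LOGDIR]
list_failed_pdfs() {
    find "${1:-/var/log}/timekeepercompliance/pdf" -type f -name '*.failed' | sort
}
```

Once the underlying problem has been fixed, the listed files can be deleted so that Compliance retries those PDFs.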

Compliance command line tools

Nearly all Compliance interaction is via the web interface as shown, or by reviewing and distributing the generated PDF audits. However, there are a couple of operations that are handled by the compliance_cli tool provided in the TimeKeeper installation.

In particular, this tool can be used to trigger the deletion and regeneration of audits. This is generally for situations where the data present at the time of audit generation was incomplete.

Note that deleting audits is destructive: if the client data is no longer present, it will result in the loss or modification of regulatory reporting data. This tool should be used with care.

Please contact support@fsmlabs.com with any questions.

Given a particular audit number, type, start time, and year, the compliance_cli tool can remove an audit from the Compliance database. The tool can also cause Compliance to rescan the existing client data for this audit period, regenerating the audit with any corrected data.

A list of existing audits can be retrieved with compliance_query. This process may look like:

$ /opt/timekeeper/release64/compliance_query --list
(...lists all available audits...)
$ /opt/timekeeper/release64/compliance_cli --year 2017 --start 1509667200 \
     --report 0 --type daily --delete
*** Deleting audits is destructive (deletes data) and may not be recoverable.
*** Please type 'yes' to confirm.
yes
$ /opt/timekeeper/release64/compliance_cli --rescan

The last command with --rescan informs Compliance that any modifications are now complete and Compliance should scan/reprocess any missing audit periods. The audit(s) will get regenerated but only if client data is present. If the client data is gone or archived elsewhere, the audit may not be regenerated at all or may be incomplete. It is the responsibility of the user to use this tool safely.
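The --start value in the example above is a count of seconds since the Unix epoch (1509667200 is 2017-11-03 00:00:00 UTC). Before deleting anything, it can help to convert a start value back into a readable date to confirm it is the intended audit period. A small sketch, assuming GNU date; the function name start_to_date is hypothetical, and the authoritative list of start values remains the output of compliance_query --list:

```shell
# Convert an audit start value (Unix epoch seconds) to a UTC date.
# Usage: start_to_date EPOCH_SECONDS
start_to_date() {
    date -u -d "@$1" +%Y-%m-%d
}
```

For the example above, start_to_date 1509667200 prints 2017-11-03.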

compliance_query also allows the user to output data from the Compliance database in standard JSON or Excel spreadsheet (xlsx) formats. To use compliance_query this way, use the following syntax:

$ /opt/timekeeper/release64/compliance_query -r 0 -s 1518998400 -y 2018 -t daily \
	-l /app/timekeeperlog/ -f x -o /tmp/my.xlsx

Where:

-r is the report number
-s is the start date for the report (compliance_query --list provides available values)
-y is the year
-t is the audit type (daily in this example)
-l is the LOGDIR configuration variable value from timekeeper.conf
-f is the output format type (x for xlsx)
-o is the path where the output file (xlsx in this example) is written

Running /opt/timekeeper/release64/compliance_query --help will list all available command line options.
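To export several audit periods in one go, the export command shown earlier can be wrapped in a loop. The following is a sketch only: the function name print_export_cmds, the output directory and the output file naming are hypothetical, only the options documented above are used, and the start values must be ones reported by compliance_query --list. The function prints each command rather than running it, so the result can be reviewed first.

```shell
# Print one compliance_query xlsx export command per audit start value.
# Usage: print_export_cmds LOGDIR YEAR REPORT TYPE OUT_DIR START...
print_export_cmds() {
    logdir="$1"; year="$2"; report="$3"; type="$4"; outdir="$5"
    shift 5
    for start in "$@"; do
        # Output file name is built from report number, type and start value.
        echo "/opt/timekeeper/release64/compliance_query -r $report -s $start" \
             "-y $year -t $type -l $logdir -f x -o $outdir/report$report-$type-$start.xlsx"
    done
}
```

After checking the printed commands, they can be run by piping the output to sh.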