Network design
This section outlines concepts, suggestions and general best practices for designing a time sync network. These recommendations are generic and meant to apply to a wide range of installations, although some site-to-site variation will be necessary.
Below are two different sections describing different aspects of a time sync network design. The first deals with getting the best accuracy and most reliable time sync while the second focuses on monitoring, reporting and management concerns.
Time sync design
Simple is better
The best design philosophy for quality time sync is to keep things simple. We find that during the initial setup and design of a network, the first stumbling blocks often come from enabling too many features in TimeKeeper, changing configurations from the defaults or making the network design too complicated. In most cases these problems can be avoided by leaving TimeKeeper at its default behavior.
The default values in TimeKeeper are designed to meet the requirements of most users and are based on years of helping ease deployment and testing. It’s unlikely that you will need to change the default values when doing an initial TimeKeeper install to obtain a working setup, so we recommend you do not. Optimizing and tuning the configuration is something we recommend be done once a working setup and installation is complete.
Simple also means having as shallow a tree as possible for the time sync hierarchy. That’s described below in more detail but a high level description would be to have no boundary clocks for PTP networks and as few (or no) stratum clocks as possible for NTP. That means ideally each client system talks directly to a top level time server connected to GPS or other high quality time source without intermediate nodes. Network traffic concerns, routing issues and time server load may require intermediate nodes, but the fewer the better as each introduces complexity and reduces the accuracy of time sync.
Separate network
A common question is whether a separate network is necessary for time sync. In general we say it is not. TimeKeeper is designed to handle the “jitter” and “noise” introduced by competition with other network traffic and can filter it out. A separate network for time sync would likely be expensive and time consuming to maintain, and would improve sync accuracy only marginally, if at all. TimeKeeper also makes very light use of the network, so it should not impact other operations. It typically sends and receives a single packet once per second.
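To put the "single packet once per second" figure in perspective, the following is a rough back-of-the-envelope sketch. It assumes a plain 48-byte NTP packet carried over UDP, IPv4 and Ethernet; the header sizes are standard, but the choice of NTP, the function name and the exclusion of preamble and frame check sequence are assumptions made for illustration, not TimeKeeper measurements.

```python
# Assumed on-wire size of one NTP packet over Ethernet/IPv4/UDP,
# excluding the Ethernet preamble and frame check sequence.
ETHERNET_HEADER = 14
IPV4_HEADER = 20
UDP_HEADER = 8
NTP_PAYLOAD = 48  # NTP packet without extension fields


def sync_traffic(packets_per_second=1.0):
    """Return (frame bytes, bits per second, bytes per day) for one direction."""
    frame_bytes = ETHERNET_HEADER + IPV4_HEADER + UDP_HEADER + NTP_PAYLOAD
    bits_per_second = frame_bytes * 8 * packets_per_second
    bytes_per_day = frame_bytes * packets_per_second * 86400
    return frame_bytes, bits_per_second, bytes_per_day


if __name__ == "__main__":
    frame, bps, per_day = sync_traffic()
    print(f"{frame} bytes per packet, {bps:.0f} bit/s, {per_day / 1e6:.1f} MB/day per direction")
```

Under these assumptions the load is well under 1 kbit/s per client in each direction, which is negligible next to ordinary application traffic.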
Protocol choice - NTP or PTP
NTP and PTP, as implemented by TimeKeeper, show equal accuracy. It is often mistakenly stated that NTP is less accurate than PTP because some NTP implementations are not as accurate as others. Some appliances use open-source implementations of the NTP protocol, so it’s best to ensure that your device has a high-quality and accurate implementation.
TimeKeeper provides equal accuracy for each protocol as it is a high-quality piece of software designed for accuracy. For additional information please see this whitepaper.
There are differences between NTP and PTP for monitoring and management. For more information please see the sections below.
Monitoring design
This section discusses some concepts when designing your time sync network for proper monitoring, logging and reporting. TimeKeeper is able to monitor client systems, share those logs with peer nodes for consolidation and alarm/alert/report when time is outside of range. The capabilities of TimeKeeper are greater than can be fully covered in this section so instead below we cover some best practices and general ideas when designing a time sync network.
TimeKeeper can report accuracy information to other monitoring systems, whether or not they run TimeKeeper, and can query both TimeKeeper and non-TimeKeeper hardware and software for time synchronization information. It queries, records and logs information over many protocols and uses many techniques in order to collect as much information from a time synchronization network as possible.
NTP client monitoring
The NTP protocol has limited capability for monitoring. There are no protocol-specific ways to query clients for detailed information. Instead the client provides its estimated accuracy to the server when sending a time update request. This accuracy information is not very fine grained and is limited to 15 microsecond increments. That is, it can report 0 microseconds, 15 microseconds, 30 microseconds and similar. TimeKeeper will record and log this information in $LOGDIR/timekeeperclients/ for each client.
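One plausible explanation for a granularity of roughly 15 microseconds is the NTP "short format" defined in RFC 5905, a fixed-point value with 16 fractional bits, giving a resolution of 1/65536 second (about 15.26 microseconds). The sketch below shows how a value is quantized when stored in such a field. Which packet field TimeKeeper clients actually use, and whether values are rounded or truncated, is not stated here and is assumed for illustration; the function name is hypothetical.

```python
# Resolution of a 16.16 fixed-point NTP short-format value, in microseconds.
SHORT_FORMAT_RESOLUTION_US = 1_000_000 / 65536  # about 15.26


def quantize_accuracy_us(accuracy_us):
    """Round an accuracy estimate to the nearest representable short-format step.

    Assumption: round-to-nearest; a real implementation may truncate instead.
    """
    steps = round(accuracy_us / SHORT_FORMAT_RESOLUTION_US)
    return steps * SHORT_FORMAT_RESOLUTION_US


if __name__ == "__main__":
    for estimate in (0.0, 4.0, 20.0, 40.0):
        print(f"{estimate:5.1f} us -> {quantize_accuracy_us(estimate):6.2f} us")
```

An estimate of 4 microseconds and one of 0 microseconds are indistinguishable after this step, which is why NTP client reports are coarser than the data collected through the other monitoring methods.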
PTP client monitoring
The PTP protocol provides more powerful and varied tools for client monitoring. As a result it is somewhat more complex. TimeKeeper attempts to use every available method to collect data from clients, servers, boundary clocks and any other nodes on the timing network. A variety of mechanisms are used, including TimeKeeper-specific techniques and generic PTP protocol methods. Other PTP implementations may report partial or no information when queried. TimeKeeper clients will respond to both standard PTP management requests and TimeKeeper-specific queries.
The primary PTP monitoring method sends multicast and unicast PTP management requests to every reachable entity on the timing network (using unicast for environments where multicast is disabled or unavailable). These include standard PTP queries that collect a limited amount of information from non-TimeKeeper entities (if they respond). TimeKeeper-specific queries are also sent out on the network for more detailed and complete monitoring information.
TimeKeeper-specific queries will collect data from entities on the network for the last 15 seconds (the query interval period). This interval period can be changed in the TimeKeeper configuration using the MANAGEMENT_QUERY_INTERVAL directive in timekeeper.conf. TimeKeeper instances on the network will respond with the requested amount of data for their local time sources as well as for any other entities that have responded to their own queries. Responses to these queries will provide the IP address of the client systems and will not use hostnames.
TimeKeeper also sends standard PTP protocol queries. These are limited in the scope of information they collect and will only collect a single data point at a given instant in time rather than the much greater amount of information provided by TimeKeeper’s other methods. Many PTP implementations (either hardware or software) do not respond to any queries at all, but some will respond to these minimal queries. TimeKeeper will record those responses in the same format as above and report them. This method will use the IP address of the responding system and will not use hostnames.
Filesync
TimeKeeper versions 8.0.19 and later also use a method called “Filesync”, which sends out queries that TimeKeeper clients v8.0.19 and later will respond to over TCP port 319 (note that time sync occurs over UDP ports 319 and 320). Filesync responses include a complete set of information for the local system and any data stored in your configured $LOGDIR/timekeeperclients/*/ collected from remote entities on the network. This is the most robust monitoring method as it will recover from cases where a TimeKeeper instance has been disabled or unreachable for an extended period of time and needs to catch up on any data it is missing.
The ENABLE_FILESYNC_QUERY setting will allow a host to send queries for Filesync client data on the network. ENABLE_FILESYNC_RESPONSE will allow a host to respond to any Filesync queries seen. Each of these options takes effect only if the corresponding ENABLE_MANAGEMENT_QUERY or ENABLE_MANAGEMENT_RESPONSE option is also on. This means that if ENABLE_MANAGEMENT_QUERY is off, ENABLE_FILESYNC_QUERY will be forced off too, and the same applies to the two response options.
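The dependency between the Filesync options and the management options can be summarized in a few lines. This is a minimal sketch of the rule as described above, not TimeKeeper's actual code; the function name and the boolean arguments are hypothetical stand-ins for the four configuration values.

```python
def effective_filesync(mgmt_query, mgmt_response, filesync_query, filesync_response):
    """Return the Filesync behavior that results from the four settings.

    A Filesync option is only effective when the matching management
    option is also on; otherwise it is forced off.
    """
    return {
        "query": bool(mgmt_query and filesync_query),
        "response": bool(mgmt_response and filesync_response),
    }


if __name__ == "__main__":
    # ENABLE_MANAGEMENT_QUERY off forces Filesync queries off,
    # while Filesync responses stay available.
    print(effective_filesync(False, True, True, True))
```

In this example the host still answers Filesync queries from others but never sends its own, because the management query option is off.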
Data collected and shared in this way is stored in your configured LOGDIR as $LOGDIR/timekeeperclients/[ipaddress]/*, which avoids confusion with the other method, which stores information in $LOGDIR/timekeeperclients/ipaddress_*.data*.
Note the Filesync method defaults to off.
Setting the management query window
Some deployments may require bulk transfers to happen at specific times of the day, for instance, after business hours. The times of the day during which TimeKeeper will query for data and respond to data requests using Filesync can be managed with the global configuration values FILESYNC_START_TIME and FILESYNC_END_TIME.
If these values are set, TimeKeeper will only request and respond to Filesync data requests between the defined start and end times. Both are configured in “HH:MM:SS” format (hours, minutes, seconds) measured from the start of the day in UTC.
For example, to enable queries and responses between 8:00 a.m. and 9:00 a.m. UTC as a global default, you would configure as follows in timekeeper.conf:
FILESYNC_START_TIME="08:00:00";
FILESYNC_END_TIME="09:00:00";
Note: TimeKeeper Compliance creates final audits for the previous UTC day 2 hours after UTC midnight. To ensure that client data is delivered to the Compliance host for audit generation on time, make sure to set your FILESYNC_START_TIME and FILESYNC_END_TIME values to ensure data delivery completes shortly after the end of the UTC day.
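The window check can be pictured as follows. This is a minimal sketch under stated assumptions, not TimeKeeper's implementation: the function names are hypothetical, the boundaries are treated as inclusive, and a window whose end is earlier than its start is assumed to wrap past UTC midnight, which the text above does not specify.

```python
from datetime import datetime, time, timezone


def parse_hms(value):
    """Parse an "HH:MM:SS" string into a time of day."""
    return datetime.strptime(value, "%H:%M:%S").time()


def in_filesync_window(now, start, end):
    """Return True if the timezone-aware datetime `now` falls in the UTC window."""
    current = now.astimezone(timezone.utc).time()
    window_start, window_end = parse_hms(start), parse_hms(end)
    if window_start <= window_end:
        return window_start <= current <= window_end
    # Assumption: a window such as 23:00:00 to 01:00:00 spans UTC midnight.
    return current >= window_start or current <= window_end


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 8, 30, tzinfo=timezone.utc)
    print(in_filesync_window(now, "08:00:00", "09:00:00"))
```

Because the comparison is made in UTC, a host in any local time zone evaluates the same window, which matters when the window is chosen to meet the Compliance audit deadline.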
Restricting data shared
Some deployments may require that only specific clients' data is shared with remote servers. The parameters FILESYNC_INCLUDE and FILESYNC_EXCLUDE control which of the client systems in the log directory (default /var/log/timekeeperclients/*/) are shared with remote systems using the Filesync method. These parameters are set on the system that is hosting the files to allow or deny transmission of client data, meaning these options apply to the system with ENABLE_FILESYNC_RESPONSE set to send client data upstream. These fields use the following globbing patterns:
? matches any single digit
* matches zero or more digits
[] matches any single digit in the brackets or range of digits, e.g. [1-7]
For instance, the include specification:
FILESYNC_INCLUDE='???.?[1-7].12.*';
Would match any client in subnet 12 of any network with three major digits and two minor digits, where the second minor digit is between 1 and 7 (like 192.17.12.122, but not 192.178.16.12).
It would match, e.g.:
192.17.12.222
but would not match, e.g.:
10.22.12.2
To expand on the example, if you only wanted hosts with a host number of less than 100 on subnet 12 on those networks you could use:
FILESYNC_EXCLUDE='*.???';
Which would prevent anything ending with a “.” followed by three digits from being included in the transferred files.
So 192.27.12.222 would no longer match, but 192.27.12.22 would.
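The include and exclude examples above can be checked with a small matcher. This is a sketch written for illustration, not TimeKeeper's own matching code, and the function names are hypothetical. It makes two assumptions: a pattern must match the whole address, and `*` may span dots as well as digits, because the exclude example (`*.???` rejecting 192.27.12.222) only works if it does. It also assumes an exclude match overrides an include match.

```python
import re


def glob_to_regex(pattern):
    """Translate the digit-oriented glob syntax into a regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "?":
            parts.append("[0-9]")       # any single digit
        elif ch == "*":
            parts.append("[0-9.]*")     # assumption: may also span dots
        elif ch == "[":
            end = pattern.index("]", i)  # copy the bracket set unchanged
            parts.append(pattern[i:end + 1])
            i = end
        else:
            parts.append(re.escape(ch))
        i += 1
    return "".join(parts)


def matches(pattern, address):
    return re.fullmatch(glob_to_regex(pattern), address) is not None


def is_shared(address, include, exclude=None):
    """Assumption: an address is shared if included and not excluded."""
    if not matches(include, address):
        return False
    return exclude is None or not matches(exclude, address)


if __name__ == "__main__":
    include, exclude = "???.?[1-7].12.*", "*.???"
    for addr in ("192.17.12.222", "10.22.12.2", "192.27.12.222", "192.27.12.22"):
        print(addr, is_shared(addr, include), is_shared(addr, include, exclude))
```

Testing candidate patterns against a list of known client addresses in this way, before putting them in timekeeper.conf, helps catch patterns that match more or fewer hosts than intended.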
Disable monitoring queries/responses
It is possible to disable responding to network PTP queries for information about sync accuracy as well as disable sending queries/recording the accuracy of other systems. This is sometimes done to reduce system and network load or when monitoring of specific clients is not necessary.
One can disable querying/recording time synchronization accuracy with:
ENABLE_MANAGEMENT_QUERY=0
Similarly one can disable responding to any queries seen with:
ENABLE_MANAGEMENT_RESPONSE=0
The Filesync method can be disabled by setting ENABLE_FILESYNC_QUERY=0 and ENABLE_FILESYNC_RESPONSE=0 in the timekeeper.conf file. These are the default values.
This can be done globally in the configuration file as well as on a per-source and per-server basis.