Tips
Collected here are some short notes about various aspects of testing, drivers, and general practices. If you have any specific questions about any of these items, please contact support@fsmlabs.com.
Assumptions about time in the past being invalid
When receiving time from any time source TimeKeeper will validate that time to make sure that it can be trusted. One of these checks includes making sure the time is not earlier than the release date of the TimeKeeper version being used. This can catch problems with time sources that are reporting erroneous time but it also means that it is not possible to run simulations with time in the past. If you need to be able to use simulated time that is in the past please contact support@fsmlabs.com and we can explain how to configure TimeKeeper to allow that.
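The release-date check described above can be pictured with a minimal sketch. The constant `RELEASE_DATE` and the function `is_plausible` are hypothetical names used only for illustration; they are not TimeKeeper settings, and the real validation includes other checks not shown here.

```python
from datetime import datetime, timezone

# Hypothetical release date, chosen only for this illustration.
RELEASE_DATE = datetime(2020, 1, 1, tzinfo=timezone.utc)


def is_plausible(reported: datetime, release_date: datetime = RELEASE_DATE) -> bool:
    """Return True when the reported time is not earlier than the release date."""
    return reported >= release_date


if __name__ == "__main__":
    # A simulated time in the past fails the check; a later time passes.
    print(is_plausible(datetime(2015, 6, 1, tzinfo=timezone.utc)))
    print(is_plausible(datetime(2024, 6, 1, tzinfo=timezone.utc)))
```

This is why a simulation that replays historical timestamps is rejected unless TimeKeeper is configured to allow it.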
Holdover
Some time devices are capable of keeping good time while not actively receiving time updates. This includes the Spectracom TSync GPS/PPS card, Symmetricom BC637 GPS/PPS card and the TimeKeeper GPS/Oscillator. By default TimeKeeper will remain in holdover for 2 hours (7200 seconds) and continue to use the time source even when it reports no GPS/PPS signal. That time limit can be changed with the HOLDOVER_LIMIT=X setting, where X is the number of seconds. Once that time limit expires, TimeKeeper will begin comparing all time sources, using the Sourcecheck algorithm (even if Sourcecheck is disabled), to determine whether the holdover time is out of range with the other time sources. To facilitate a possible failover, you must configure TimeKeeper with at least three sources so that Sourcecheck can find a quorum and therefore a source to fail over to.
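The sketch below illustrates why at least three sources are needed once the holdover limit has expired. It is not the Sourcecheck algorithm, whose details are not described here; it is a simple majority check under assumed names (`holdover_expired`, `agreeing_sources`) and an assumed agreement tolerance. Only the 7200-second default comes from the text above.

```python
from statistics import median
from typing import Dict, List

HOLDOVER_LIMIT = 7200  # seconds, the documented default


def holdover_expired(seconds_without_signal: float, limit: float = HOLDOVER_LIMIT) -> bool:
    """Return True once the source has been without a signal longer than the limit."""
    return seconds_without_signal > limit


def agreeing_sources(offsets: Dict[str, float], tolerance: float) -> List[str]:
    """Return the sources within `tolerance` of the median offset.

    The result is empty unless those sources form a strict majority,
    which is an illustrative stand-in for "finding a quorum".
    """
    if not offsets:
        return []
    mid = median(offsets.values())
    agree = [name for name, off in offsets.items() if abs(off - mid) <= tolerance]
    return sorted(agree) if len(agree) * 2 > len(offsets) else []
```

With two sources that disagree, neither can be identified as the bad one, so the function returns an empty list. With three sources, two that agree outvote a drifting holdover clock.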
Clock adjustment and steering
TimeKeeper will smoothly adjust the clock to correct offsets in most cases. However, when the offset is greater than 5 seconds TimeKeeper will reset the time to avoid the delay necessary to slew the clock. Once the clock is set in this way it will continue tracking and slewing as necessary to keep the time synchronized. In practice the only time this happens is when TimeKeeper is first started and it begins synchronizing the clock with a remote time source.
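As a rough illustration of the rule above, the following sketch chooses between stepping and slewing based on the 5-second threshold. The function name `correction_mode` is hypothetical, and the sketch ignores whatever other conditions TimeKeeper may apply.

```python
STEP_THRESHOLD_S = 5.0  # offsets larger than this are corrected by resetting the clock


def correction_mode(offset_s: float) -> str:
    """Return "step" for offsets greater than 5 seconds in magnitude, else "slew"."""
    return "step" if abs(offset_s) > STEP_THRESHOLD_S else "slew"


if __name__ == "__main__":
    for offset in (0.002, -12.0, 5.0):
        print(offset, correction_mode(offset))
```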
Once TimeKeeper is active, setting the system time is not recommended. TimeKeeper will stop other timing daemons, but it is important that no other utility tries to steer the clock. That includes the command line tool “date”, other system tools that set the time, and time synchronization software such as ntpd, all of which use system calls such as “settimeofday” and “adjtime”.
Windows
The Windows Time service, W32Time, includes an NTP daemon. It can cause problems in a couple of ways. First, it could be trying to steer the system clock in conflict with TimeKeeper. Second, even if it isn’t steering the clock, it could be bound to the same port that TimeKeeper uses. The following instructions are just a starting point for dealing with these issues. You should research Windows Time service options to determine what best suits your needs.
You can stop W32Time and prevent it from automatically starting on boot with these two commands:
sc stop w32time
w32tm /unregister
However, another process, e.g., the Windows Domain Controller, could start it up again, so it’s up to you to ensure that this can’t happen. If this becomes a problem, e.g., you require Windows Domain Controller, you can leave W32Time running but prevent it from steering the clock with a registry setting (set Type to NoSync). There are three ways to do this:
- Use this command:
  reg add HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Parameters /t REG_SZ /v Type /d NoSync /f
- Use regedit to make the same change.
- Or use this command:
  w32tm /config /syncfromflags:NO /update
After making any changes to the W32Time configuration, you should restart it with this command:
net stop w32time && net start w32time
If you have TimeKeeper configured to use NTP, there could still be a problem. W32Time could be bound to the NTP port even though it’s not using it. In that case, TimeKeeper cannot use NTP sources. Configure TimeKeeper to use something else instead, such as PTP.
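One way to find out whether the NTP port is available is to try binding to it. The sketch below is a generic socket test, not a TimeKeeper utility, and the name `can_bind_udp` is hypothetical. A `False` result means the bind failed, either because another process (such as W32Time) holds the port or because the caller lacks the privileges to bind it, so run the check from an administrator account while TimeKeeper is stopped.

```python
import socket

NTP_PORT = 123  # standard NTP port


def can_bind_udp(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a UDP socket can be bound to (host, port), else False."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Port already in use, or insufficient privileges to bind it.
            return False
    return True


if __name__ == "__main__":
    print("NTP port available:", can_bind_udp(NTP_PORT))
```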
Identifying inaccurate sources
Accuracy issues may manifest as a noisy primary clock - usually as noise in the smoothed offset plot on the web interface, or in the second column of data in timekeeper_0.data.
Comparing the noisy data file to a less noisy one can help identify where the noise is coming from. If your primary source is SOURCE0, and it appears to be smoothly handled in timekeeper_0.data, and SOURCE1 is noisy in the web interface and timekeeper_1.data, you can be fairly certain that either the network or the time server itself in SOURCE1 is noisy.
This is worth noting because if the noisy source is your primary and TimeKeeper is chasing that clock, the reported offsets for your secondary and other sources may appear to be noisy also. Put another way, if your SOURCE0 is noisy but SOURCE1 is stable, chasing the noise in SOURCE0 can appear in some situations to also be noise in SOURCE1.
Where it’s an option, configuring your primary source to be a clean and stable PPS can quickly indicate which sources are actually noisy. When TimeKeeper is not chasing a noisy clock and is instead tracking a stable PPS, much of the noise will be removed, allowing for more accurate measurements.
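To compare two data files numerically rather than by eye, a short script can compute the spread of the second column. This is a minimal sketch that assumes whitespace-separated columns with a numeric second field; the file layout beyond "second column" is not specified here, so check it against your own files. The names `second_column` and `offset_noise` are hypothetical.

```python
from statistics import pstdev
from typing import Iterable, List


def second_column(lines: Iterable[str]) -> List[float]:
    """Collect the second whitespace-separated field of each line as a float.

    Lines with fewer than two fields, or a non-numeric second field, are skipped.
    """
    values = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            values.append(float(fields[1]))
        except ValueError:
            continue
    return values


def offset_noise(lines: Iterable[str]) -> float:
    """Return the population standard deviation of the second column (0.0 if empty)."""
    values = second_column(lines)
    return pstdev(values) if values else 0.0
```

For example, open timekeeper_0.data and timekeeper_1.data from wherever your installation writes them and pass each file object to `offset_noise`; the source with the clearly larger value is the noisier one.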
Sync rate
The default sync rate of 1 and 0.9 packets/second for PTP and NTP, respectively, is a good tradeoff between network traffic, CPU load, and synchronization accuracy. A higher sync rate won’t necessarily increase accuracy. If you’re considering it, experiment first in your environment to determine if it actually results in better accuracy than with the default. Also note that higher sync rates increase storage requirements for Compliance reporting. Please refer to the Compliance “Disk space requirements” section.
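To get a feel for how the sync rate affects the amount of data recorded, the arithmetic below converts a rate into samples per source per day. It is a back-of-the-envelope sketch with a hypothetical function name, and it assumes one recorded sample per sync packet; how many bytes each sample costs is covered in the Compliance “Disk space requirements” section, not here.

```python
SECONDS_PER_DAY = 86_400


def samples_per_day(rate_per_second: float) -> float:
    """Return the number of sync samples one source produces in a day."""
    return rate_per_second * SECONDS_PER_DAY


if __name__ == "__main__":
    # Default PTP rate, default NTP rate, and a hypothetical higher rate.
    for rate in (1.0, 0.9, 8.0):
        print(rate, samples_per_day(rate))
```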
Unresponsive NTP server
When acting as an NTP client, TimeKeeper will wait many milliseconds for a response. If the server does not respond in that time, it is assumed to be unresponsive and no time synchronization occurs for that query. This could be a problem when synchronizing over a very slow WAN connection. If your network has a very long round-trip time between the client and server, please contact support@fsmlabs.com for configuration information.
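The timeout behavior can be reproduced with a small stand-alone client, which is also a quick way to see whether a slow link exceeds a given wait. This is a generic NTP query written for illustration, not TimeKeeper code; `query_ntp` and its `timeout_s` default are hypothetical, and the default is not TimeKeeper’s actual timeout.

```python
import socket
import struct
from typing import Optional

NTP_EPOCH_OFFSET = 2_208_988_800  # seconds between 1900-01-01 and 1970-01-01


def query_ntp(host: str, port: int = 123, timeout_s: float = 1.0) -> Optional[float]:
    """Send one NTP client request.

    Returns the server transmit time as Unix seconds, or None when no usable
    reply arrives within `timeout_s` (any socket error is treated the same way).
    """
    request = b"\x23" + 47 * b"\0"  # LI=0, version 4, mode 3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        try:
            sock.sendto(request, (host, port))
            reply, _ = sock.recvfrom(512)
        except OSError:
            return None
    if len(reply) < 48:
        return None
    # The transmit timestamp occupies bytes 40-47: seconds and fraction since 1900.
    seconds, fraction = struct.unpack("!II", reply[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32
```

Calling `query_ntp` against your server with progressively shorter `timeout_s` values shows roughly how much round-trip time the link needs.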
Hardware timestamping network cards
Hardware timestamp support allows TimeKeeper to most accurately measure receipt and transmission of timing data. This is supported automatically if:
- The network hardware supports timestamping
- The driver for the network hardware can enable the feature
- The host distribution supports the calls that enable the feature
- TimeKeeper finds the reported data to be accurate
Hardware timestamping is supported on these cards (but not limited to them):
- Intel 82576, 82580, I350, I210 cards
- Mellanox CX-3, CX-4, CX-5
- Most Solarflare cards
- Broadcom 5719 cards
On Red Hat Enterprise Linux 6, replace the provided driver with the most recent IGB version available from Intel, since the driver shipped with the distribution does not support all cards. Version 2.4.8 of the IGB driver or above is recommended. Some versions of these drivers need to be loaded, initialized by configuring an IP address (with ifconfig), unloaded, and then loaded once again for proper behavior when used for hardware timestamp assist.
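On Linux, the output of `ethtool -T <interface>` lists the timestamping capabilities that the driver reports, which is a convenient manual check of the first three conditions above. The helper below is a hypothetical convenience that scans that output for the hardware transmit and receive capabilities. It does not replace TimeKeeper’s own detection, and it says nothing about whether the reported data is accurate.

```python
def hardware_timestamping_reported(ethtool_output: str) -> bool:
    """Return True if `ethtool -T` output lists both hardware TX and RX timestamping."""
    text = ethtool_output.lower()
    return "hardware-transmit" in text and "hardware-receive" in text


if __name__ == "__main__":
    import subprocess
    import sys

    # Usage: python check_hwts.py eth0   (the interface name is an example)
    result = subprocess.run(
        ["ethtool", "-T", sys.argv[1]], capture_output=True, text=True, check=True
    )
    print(hardware_timestamping_reported(result.stdout))
```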
Hardware clock
TimeKeeper also updates the hardware clock every 10 minutes on Linux to keep it in sync with the current system time. This ensures there’s a good reference for any components that depend on the hardware being in sync with the system clock.
Identifying accuracy problems
Poor NTP implementations
Many NTP implementations are of very low quality and are very noisy, even if the network delivering the NTP data is fast and deterministic. In these cases it’s recommended that TimeKeeper be configured to minimize the effect that these bad NTP servers can have on your network. This can be accomplished in several ways:
- Replace the NTP server entirely and distribute NTP with TimeKeeper to get accurate timing data to your clients
- Place a TimeKeeper server between your clients and the existing server, acting as a stratum server. In this configuration, TimeKeeper accepts the noisy NTP and smooths it out in order to serve it directly to clients
- On the clients, specify LOWQUALITY=1 in the sources that use the noisy NTP server. This way TimeKeeper will perform additional smoothing to the noisy clock to extract more signal
Firewalls
Although uncommon, occasionally firewalls are used between timing clients and servers. While the clients can still sync through the firewall, generally this reduces the potential accuracy of the client greatly.
Timestamping accuracy
TimeKeeper will accurately serve or consume time on any supported host, but there are some configurations that provide more accurate results.
TimeKeeper will enable the highest resolution timestamping it can, and report which features were enabled in /var/log/timekeeper at startup. On Windows, this will be logged to %ProgramFiles%\timekeeper\var\log\timekeeper.log. If the host distribution supports the feature, the driver can enable hardware timestamping, and the device supports the feature, timestamping will be done by the card. If the card or the driver can’t support the feature, software timestamps will be requested from the network stack, if the host can provide them.
Hardware timestamps are the most accurate, followed by software timestamps. TimeKeeper uses this data to factor out propagation delays in incoming time, and also to provide more accurate data when delivering time.
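To see why timestamp accuracy matters, consider the standard NTP on-wire calculation, which derives clock offset and round-trip delay from four timestamps. This is the textbook formula, shown as an illustration; the text above does not state how TimeKeeper itself computes these values. Any error in capturing a timestamp feeds directly into the computed offset, which is why timestamps taken by the network card beat those taken in software.

```python
from typing import Tuple


def offset_and_delay(t1: float, t2: float, t3: float, t4: float) -> Tuple[float, float]:
    """Compute clock offset and round-trip delay from one request/response exchange.

    t1: client transmit time (client clock)
    t2: server receive time  (server clock)
    t3: server transmit time (server clock)
    t4: client receive time  (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


if __name__ == "__main__":
    # Client is 0.5 s behind the server; each direction takes 10 ms;
    # the server spends 1 ms between receiving and replying.
    print(offset_and_delay(100.000, 100.510, 100.511, 100.021))
```

The formula assumes the two directions take equal time; any asymmetry between them shows up as an offset error of half the difference.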
It is important to understand TimeKeeper does not need special support for your particular hardware - it will find and use the most accurate timestamping available, automatically, to the best ability of your host distribution, driver, and hardware.
If you have any questions about how to make sure you’re getting the best performance in your environment, please ask support@fsmlabs.com.
Bonded Ethernet
Bonding is supported, including the ability to use hardware timestamps with the devices in the bond. However, care should be taken to make sure that the underlying devices are of the same device type so that there’s a consistent feature set and performance across the members of the bond.
Best Practices
Here is a brief list of TimeKeeper best practices. These may not apply to all deployments, but they apply to many that depend on TimeKeeper for accurate and resilient timing services.
- Have multiple sources for both clients and Grandmasters, including a backup for the Grandmaster’s GNSS source (GNSS receivers can fail, too).
- Have a mix of sources, e.g., PTP and NTP, to guard against protocol failure.
- Unless you have extraordinary needs, use the default sync rate because a higher rate does not necessarily mean better accuracy.
- Only use TimeKeeper’s Sourcecheck feature for a set of heterogeneous sources, e.g., sources reached through different paths, different protocols, different Grandmasters.
- Disable all verbose-logging settings in production when not investigating an issue.
- Client
  - Disable all other time daemons, e.g., ntpd and W32Time, and ensure that no process restarts them.
  - Make sure TimeKeeper is not pinned to a CPU with already high load.
- Grandmaster
  - Use a GNSS receiver at the Grandmaster.
  - Have peer Grandmasters use each other as a source for failover, but, to avoid a common point of failure, make sure they don’t share a GNSS receiver antenna.
- PTP
  - Disable PTP features in PTP-aware switches.
  - We generally recommend hybrid mode, but, for long PTP paths, where multicast is not propagated or not propagated efficiently, use telecom mode (all traffic is unicast).
  - TimeKeeper supports the BMCA but can usually provide better failover by, instead, configuring it to use multiple sources and, in some instances, enabling Sourcecheck.
- NTP
  - Use public NTP servers with care, such as when pointing to a NIST server for regulatory compliance, because they can vary in accuracy and availability.
- Alerting
  - Handle alerts by specifying an email address, syslog server, or SNMP manager.
  - Specify sync-error thresholds for every source according to your accuracy expectations so that you are alerted when a threshold is exceeded.
  - Similarly, for serving time, specify sync-error thresholds per server so that you are notified when a client exceeds the threshold.
- Compliance
  - Create reports with alert/warning thresholds more restrictive than needed for regulatory reporting to identify hosts with marginal sync accuracy.
- Networking
  - Use network interfaces that support hardware timestamping.
  - For failover at the network interface, use active/backup bonding and/or access multiple sources through different interfaces.
  - Use the same speed on both ends of a connection, e.g., 10G and 10G, to avoid asymmetric delays.
  - Avoid busy or intermittently congested networks for timing data. A dedicated timing network is best.
- VMs
  - Determine if the hypervisor should set the guest VM’s clock at startup.
  - Prevent VM integrations from steering the guest clock.
  - Minimize migration, such as via vMotion. Otherwise, your applications could have to cope with sudden jumps in time.