Sourcecheck

The TimeKeeper ‘Sourcecheck’ feature is an analytics and automated threat detection tool that allows TimeKeeper to detect bad time sources and switch away from them. Bad time can come from misconfiguration, equipment and software failures, or intentional time-based attacks. Sourcecheck can detect and defeat all of these.

Sourcecheck overview

Sourcecheck is a configurable option in TimeKeeper and should only be selected if high-quality alternate sources are available.

When Sourcecheck is disabled, TimeKeeper accepts the highest priority configured source (NTP, PTP, GPS or other) as authoritative. TimeKeeper applies sophisticated filtering to reduce jitter, and it fails over to the next source and produces alerts if the highest priority source stops communicating new times (or violates protocol). If the highest priority source later resumes operation, TimeKeeper without Sourcecheck reverts to it. This behavior avoids a number of serious problems caused by NTP implementations’ attempts to second-guess time and by the simplistic “Best Master Clock” operation of PTP (IEEE 1588), while preserving interoperability.

When Sourcecheck is enabled, TimeKeeper cross-checks all the existing time sources with one another to validate them. If the highest priority time source fails the Sourcecheck validation, TimeKeeper marks it as invalid, generates alerts and switches to the next highest priority time source that is not currently marked invalid. The cross-check detects bad time from equipment failures, misconfiguration and false leap seconds, as well as malicious activities such as GPS spoofing and other time-based attacks.

How it works

Sourcecheck combines a multi-source stochastic analysis, protocol verification, and source voting to detect outlier time sources. Each time source gets a vote for the correct time. TimeKeeper then looks for patterns over an interval to see if a second source is more in sync with a critical mass of sources.

Below you can see a simple example of a tie with 3 time sources. The primary (source 0, in green) and two secondary time sources all disagree. The secondary time sources differ from the primary by +0.5 and -0.5 milliseconds respectively. Since there is no ‘quorum’ here, source 0 remains valid. Note that for some examples below, the source disagreements are large for ease of illustration. In practice these discrepancies will generally be much smaller, and Sourcecheck will identify issues with sources that are just a few microseconds apart.

Source 0 (green) disagreeing with both source 1 (blue) and source 2 (orange).

Next you can see a chart showing that source 0 and 1 agree while source 2 is offset by -0.5 milliseconds. Sourcecheck will declare source 2 as invalid in this case.

Source 0 (green) and source 1 (blue) agree, but source 2 (orange) is invalid.

Below you can see an example where source 0 would soon be declared invalid. That is because sources 1 and 2 form a quorum, allowing source 0 to be rejected. Once Sourcecheck flags source 0, TimeKeeper would fail over to source 1 so that the offset can be corrected. Again, please note that the scales of disagreement here are for illustrative purposes. In many cases Sourcecheck identifies similar issues at a much smaller scale, down in the low microsecond range.

Source 0 (green) is invalid and will be rejected since source 1 (blue) and source 2 (orange) agree.
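The three offset cases above can be sketched as a simple majority vote. This is only an illustration under stated assumptions, not the algorithm TimeKeeper actually uses (which also involves stochastic analysis over an interval and protocol verification). The function names `agreeing_groups` and `invalid_sources` are hypothetical, the greedy grouping is a simplification, and the 30 microsecond default is borrowed from the threshold described under ‘Tuning’.

```python
from typing import Dict, List, Set


def agreeing_groups(offsets: Dict[str, float], threshold: float) -> List[Set[str]]:
    """Greedily group sources whose offsets (in seconds) span at most `threshold`."""
    ordered = sorted(offsets, key=offsets.get)
    groups: List[List[str]] = []
    for name in ordered:
        # Compare against the smallest offset in the current group, so every
        # member of a group is within `threshold` of every other member.
        if groups and offsets[name] - offsets[groups[-1][0]] <= threshold:
            groups[-1].append(name)
        else:
            groups.append([name])
    return [set(group) for group in groups]


def invalid_sources(offsets: Dict[str, float], threshold: float = 30e-6) -> Set[str]:
    """Return the sources outvoted by a strict majority (a quorum), if one exists."""
    groups = agreeing_groups(offsets, threshold)
    biggest = max(groups, key=len)
    if len(biggest) * 2 <= len(offsets):
        return set()  # no quorum: nothing is invalidated
    return set(offsets) - biggest


# The three illustrated cases, with offsets in seconds.
tie = {"source0": 0.0, "source1": 0.0005, "source2": -0.0005}
source2_bad = {"source0": 0.0, "source1": 0.0, "source2": -0.0005}
source0_bad = {"source0": 0.0, "source1": 0.0005, "source2": 0.0005}

for case in (tie, source2_bad, source0_bad):
    print(invalid_sources(case))
```

With these inputs the tie invalidates nothing, the second case rejects only source 2, and the third rejects source 0 because sources 1 and 2 agree with each other.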

Sourcecheck analyzes sources by both offset and frequency. In the example below, without Sourcecheck taking action, we see the secondary and tertiary time sources start slowly drifting in offset around 17:24 because a GPS spoof attack is under way. The attack is slowly moving the time on our primary time source (source 0), but in this graph it looks as though all the other time sources are moving because TimeKeeper is still tracking source 0. Eventually Sourcecheck’s time offset analysis would catch the compromise, but frequency analysis spots the problem earlier.

Source 0 (green) is being spoofed but it can be some time before source 1 (blue) and source 2 (orange) can outvote it based on offset.

The graph below shows the same situation as above, but analyzed by the rate of change (frequency). It’s quite clear from this graph that there was a frequency disagreement at 17:23 and that Sourcecheck can invalidate source 0 based on frequency rather than waiting for the offset to grow large enough to cause an issue.

The same issue seen by frequency shows the issue with source 0 (green) much more quickly, just after 17:23:20.
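To make the difference between offset and frequency analysis concrete, here is a minimal sketch that estimates each source’s rate of change with a least-squares fit and votes on the rates. It is an assumption-laden illustration rather than TimeKeeper’s method: `slope`, `frequency_outliers`, the sample data and the 0.5 microseconds-per-second rate threshold are all hypothetical.

```python
from typing import Dict, List, Set


def slope(times: List[float], values: List[float]) -> float:
    """Least-squares rate of change of `values` over `times` (seconds per second)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def frequency_outliers(times: List[float], series: Dict[str, List[float]],
                       freq_threshold: float) -> Set[str]:
    """Return sources whose rate disagrees with a strict majority of sources."""
    rates = {name: slope(times, values) for name, values in series.items()}
    best: Set[str] = set()
    for ref in rates.values():
        # Sources whose rate is within the threshold of this reference rate.
        group = {n for n, r in rates.items() if abs(r - ref) <= freq_threshold}
        if len(group) > len(best):
            best = group
    if len(best) * 2 <= len(rates):
        return set()  # no quorum on frequency
    return set(rates) - best


# Ten one-second samples of measured offset (seconds). The system is tracking
# source 0, so source 0 looks flat while the others appear to drift.
times = [float(t) for t in range(10)]
series = {
    "source0": [0.0] * 10,
    "source1": [-2e-6 * t for t in times],
    "source2": [-2e-6 * t for t in times],
}
print(frequency_outliers(times, series, freq_threshold=0.5e-6))  # {'source0'}
```

In this made-up data the largest offset is still only 18 microseconds, below a 30 microsecond offset threshold, yet the rate disagreement already singles out source 0.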

Enabling Sourcecheck

The Sourcecheck feature is a global option and cannot be configured per-source. Enable it by selecting the ‘Sourcecheck’ box in the GUI or by setting the following in the configuration file.

SOURCE0() { NTPSERVER=...; }
SOURCE1() { NTPSERVER=...; }
SOURCE2() { NTPSERVER=...; }
SOURCECHECK=1

Note that any time sources marked as ‘MONITORONLY=1’ will not be included in the cross-check of time sources.

Restoring a previously invalid source

Once a time source corrects its time and begins passing Sourcecheck validation it will not be marked ‘valid’ immediately. Sourcecheck will wait 15 minutes before allowing a source to be considered valid. This avoids switching back to a bad time source because it momentarily shows good time or quickly switching between time sources that are in the ‘gray area’ of being marked invalid.
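The hold-down behavior described above can be sketched as a small state machine. This is an illustration only: the class name `SourceValidity` and its `update` method are hypothetical, and the sketch assumes that any failed check during the waiting period restarts the 15 minute timer, which the text above does not state explicitly.

```python
from typing import Optional

HOLD_DOWN_SECONDS = 15 * 60  # the 15 minute wait described above


class SourceValidity:
    """Tracks whether one source may be treated as valid (hypothetical helper)."""

    def __init__(self) -> None:
        self.valid = True
        self.passing_since: Optional[float] = None

    def update(self, passes_check: bool, now: float) -> bool:
        """Record one validation result at time `now` (seconds) and return validity."""
        if not passes_check:
            # Assumption: any failure invalidates the source and resets the timer.
            self.valid = False
            self.passing_since = None
        elif not self.valid:
            if self.passing_since is None:
                self.passing_since = now
            if now - self.passing_since >= HOLD_DOWN_SECONDS:
                self.valid = True
                self.passing_since = None
        return self.valid


state = SourceValidity()
state.update(False, now=0.0)          # source fails validation
print(state.update(True, now=60.0))   # passing again, but still held down
print(state.update(True, now=960.0))  # 15 minutes of passing: valid again
```

Passing timestamps in explicitly keeps the sketch deterministic; a real monitor would more likely read a monotonic clock.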

Leap second checks

In addition to the cluster analysis that Sourcecheck performs, it will execute additional checks. To avoid false leap seconds, which have plagued other solutions, Sourcecheck will confirm that a majority of time sources show a leap second before accepting the leap second as valid. Sourcecheck will also only permit leap seconds during the times of the year in which leap seconds are inserted. If a leap second is advertised by a time source outside of these ranges, it is rejected.
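The two leap second checks can be sketched as follows. This is a hedged illustration, not TimeKeeper’s implementation: `accept_leap_second` and `ALLOWED_MONTHS` are hypothetical names, and the window is assumed to be the whole of June and December (UTC), since leap seconds have in practice been scheduled for the end of those months. The exact ranges Sourcecheck uses are not given here.

```python
from datetime import datetime, timezone
from typing import Dict

# Assumption: leap seconds are only plausible in June or December (UTC).
ALLOWED_MONTHS = {6, 12}


def accept_leap_second(advertised: Dict[str, bool], now: datetime) -> bool:
    """Accept a pending leap second only with a majority vote inside the window."""
    in_window = now.astimezone(timezone.utc).month in ALLOWED_MONTHS
    votes = sum(1 for flag in advertised.values() if flag)
    majority = votes * 2 > len(advertised)
    return in_window and majority


june = datetime(2015, 6, 30, 12, 0, tzinfo=timezone.utc)
march = datetime(2015, 3, 10, 12, 0, tzinfo=timezone.utc)

print(accept_leap_second({"s0": True, "s1": True, "s2": False}, june))   # True
print(accept_leap_second({"s0": True, "s1": False, "s2": False}, june))  # False
print(accept_leap_second({"s0": True, "s1": True, "s2": True}, march))   # False
```

A single source advertising a leap second is never enough in this sketch, and even a unanimous advertisement is rejected outside the assumed window.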

Some configurations can result in unexpected behavior, so it’s important to be aware of them. For example, suppose a primary time source is configured to ‘slew’ in response to a leap second but two backup time sources are configured to ‘step’. Immediately after the leap second, Sourcecheck would see the primary time source slowly slewing the time to correct for the leap second while the backup time sources stepped immediately. That would cause Sourcecheck to reject the primary time source, since it disagrees with the two backup time sources. After the primary time source completes the slew, it would be accepted by Sourcecheck again because all time sources then agree.
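A rough calculation shows why the slewing primary stays rejected for almost the whole slew. The sketch below assumes a linear slew of one second over a fixed duration. Real slew profiles vary by source and configuration, and both function names are hypothetical.

```python
def slew_disagreement(seconds_after_leap: float, slew_duration: float,
                      leap: float = 1.0) -> float:
    """Remaining difference (seconds) between a linearly slewing source and one that stepped."""
    if seconds_after_leap >= slew_duration:
        return 0.0
    return leap * (1.0 - seconds_after_leap / slew_duration)


def seconds_until_agreement(slew_duration: float, threshold: float = 30e-6,
                            leap: float = 1.0) -> float:
    """Time after the leap second at which the difference drops to `threshold`."""
    return slew_duration * (1.0 - threshold / leap)


# Example: a one-hour linear slew compared against sources that stepped.
print(slew_disagreement(1800, 3600))  # 0.5 seconds apart at the halfway point
print(seconds_until_agreement(3600))  # just under 3600 seconds
```

Under these assumptions the sources only come back within 30 microseconds of each other in the final fraction of a second of the slew, and the 15 minute hold-down described earlier would apply after that.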

How to set up Sourcecheck properly

The most critical aspect of setting up Sourcecheck is to select good quality time sources. If you select multiple time sources that are outside of your control, unreliable and untrusted, you have given these sources the ability to invalidate your primary source. It is therefore always recommended that you use only your own corporate or internal time sources (no public NTP servers). Sourcecheck requires at least three sources, but more are better. In general we recommend 5 sources for a proper configuration.

When all time sources are within a certain threshold of one another (configurable, see ‘Tuning’ below), they are all declared ‘valid’ and Sourcecheck will not invalidate any of them. The Sourcecheck threshold setting avoids spurious source changes where all time sources are accurate within a reasonable margin of error (perhaps a few hundred nanoseconds on a decent network). In general it’s not necessary to adjust the default value (30 microseconds), but it can be increased to allow inclusion of lower quality sources in the cross-check. For example, a source reached over a known high-jitter network link, or a poor quality time source whose error is known, can be used with Sourcecheck by adjusting the threshold.

A good practice when configuring Sourcecheck for the first time or with new sources is to first set up the candidate time sources without Sourcecheck enabled. Over a day or two, TimeKeeper will collect data on how the time sources behave. The graph below shows source 0 as a local time source. Sources 1 and 2 are remote time servers. During this test the remote time sources are in agreement with each other, but they are inaccurate and noisy because they’re both affected by the same shifting network bias. If Sourcecheck had been enabled, it would have declared source 0 invalid, since it disagrees with sources 1 and 2, which agree with each other. These cross-check sources would not be good candidates here without additional stable sources that do not use this likely unstable network connection.

Source 0 (green) is correct, but it could be outvoted by sources 1 (blue) and 2 (orange), which noisily agree.
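One way to screen candidates from the collected data is to compare each source’s jitter with the Sourcecheck threshold before enabling the feature. The sketch below is only an illustration: `noisy_candidates` is a hypothetical helper, the offset samples are invented, and how you export offsets from TimeKeeper is not covered here.

```python
from statistics import pstdev
from typing import Dict, List, Set


def noisy_candidates(samples: Dict[str, List[float]],
                     threshold: float = 30e-6) -> Set[str]:
    """Return sources whose offset standard deviation (seconds) exceeds `threshold`."""
    return {name for name, values in samples.items() if pstdev(values) > threshold}


# Invented offsets in seconds: a quiet local source and two remote sources
# that swing together because they share the same network path.
samples = {
    "source0": [1e-7, -1e-7, 0.0, 1e-7],
    "source1": [200e-6, -200e-6, 200e-6, -200e-6],
    "source2": [210e-6, -190e-6, 205e-6, -195e-6],
}
print(sorted(noisy_candidates(samples)))  # ['source1', 'source2']
```

A jitter check like this does not detect a shared bias on its own, so it complements looking at the graphs rather than replacing it.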

Avoiding loops

Something that must be avoided when configuring Sourcecheck is cycles or loops in a time network. For example, one can configure 3 different systems to cross-check with all other systems. It’s possible for a situation to occur where the primary time source is rejected and a backup time source is used on all of these systems. In that case, A might track B, B might track C and C might track A. No system is tracking a valid time but they are all tracking one another which can result in time being incorrect.
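A planned deployment can be checked for such loops before it goes live. The sketch below is a hypothetical helper, not a TimeKeeper tool: it takes a mapping of which system each system would track after failover and follows the chain until it either leaves the mapping (reaching an independent source) or revisits a system.

```python
from typing import Dict, List


def find_tracking_loop(tracking: Dict[str, str], start: str) -> List[str]:
    """Return the systems forming a loop reachable from `start`, or [] if none."""
    seen: List[str] = []
    node = start
    while node in tracking:
        if node in seen:
            return seen[seen.index(node):]
        seen.append(node)
        node = tracking[node]
    return []  # the chain ended at a source that tracks nothing in this mapping


# A tracks B, B tracks C and C tracks A: nobody follows a real reference.
print(find_tracking_loop({"A": "B", "B": "C", "C": "A"}, "A"))  # ['A', 'B', 'C']

# A tracks B and B tracks an independent GPS source: no loop.
print(find_tracking_loop({"A": "B", "B": "GPS"}, "A"))  # []
```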

Tuning

Due to high network jitter or other variance in network time sources, one may want to tune Sourcecheck. For example, it may be necessary to reduce the sensitivity so that Sourcecheck won’t trigger as easily when a few time sources wander but will still catch very large changes. SOURCECHECK_AUTOVALIDATE_THRESHOLD is a configuration value that allows this. This value can be described roughly as how tight the sync between sources must be before they are declared to be in agreement (a quorum). The value is given in seconds, and the default is 0.000030 (30 microseconds). To change that value to one millisecond:

SOURCECHECK_AUTOVALIDATE_THRESHOLD=0.001