Add minstratum and maxstratum directives to specify the minimum and
maximum allowed stratum of sources to be selected. The default values
are 0 and 15 respectively, allowing all NTP sources and refclocks.
Sources that are rejected due to having too large or too small stratum
are marked with 'r' in the selection log and selectdata report.
This is similar to the "tos floor" and "tos ceiling" settings of ntpd,
except that maxstratum is interpreted as one below the ceiling.
The recent rework of refclock reachability to better work with
driver-specific filtering (PHC driver dropping samples with unexpected
delay) introduced an issue that a PPS refclock is indicated as reachable
even when its "lock" refclock is permanently unreachable, or its samples
constistently fail in other sample checks, and no actual samples can be
accumulated. This breaks the new maxunreach option.
Rework the refclock code to provide samples from drivers together with
their quality level (all drivers except PHC provide samples with
constant quality of 1) and drop samples with quality 0 after passing
all checks, right before the actual accumulation in the median sample
filter. Increment the reachability counter only for samples that would
be accumulated.
This fixes the problem with refclocks indicated as reachable when their
samples would be dropped for other reasons than the PHC-specific delay
filter, and the maxunreach option can work as expected.
Fixes: b9b338a8df ("refclock: rework update of reachability")
Add maxunreach option to NTP sources and refclocks to specify the
maximum number of polls that the source can stay selected for
synchronization when it is unreachable (i.e. no valid sample was
received in the last 8 polls).
It is an additional requirement to having at least one sample more
recent than the oldest sample of reachable sources.
The default value is 100000. Setting the option to 0 disables selection
of unreachable sources, which matches RFC 5905.
If a clock step enabled by the makestep directive or requested by the
makestep command fails, accumulate the missing step back to keep the
tracking offset valid.
This fixes time served by an instance configured with the makestep
directive and the -x option (the null driver cannot perform steps) at
the same time. It will still generate error log messages.
Don't allow the NTP support and asynchronous name resolving to be
disabled. pthreads are now a hard requirement.
NTP is the primary task of chrony. This functionality doesn't seem to be
commonly disabled (allowing only refclocks and manual input).
This removes rarely (if ever) used code and simplifies testing.
Add "waitunsynced" option to specify how long chronyd needs to wait
before it can activate the local reference when the clock is not
synchronized to give the configured sources a chance to synchronize the
local clock after start. The default is 300 seconds when the orphan
option is enabled (same as the ntpd's default orphanwait), 0 otherwise.
Add "waitsynced" option to specify how long it should wait when the
clock is synchronized. It is an additional requirement to the distance
and activate options.
Include the number of unreachable sources in the "Can't synchronise: no
selectable sources" log message to provide a hint whether it might be a
networking issue.
Add the number of sources that form an agreement (overlapping
intervals), if at least two agree with each other, and number of
reachable sources to the "Can't synchronize: no majority" log message to
better explain why synchronization is failing and hint that adding more
sources might help.
Replace the hardcoded list of open commands (accessible over UDP),
with a list that can be configured with a new "opencommands" directive.
The default matches the original list. All read-only commands except
accheck and cmdaccheck can be enabled. The naming follows the chronyc
naming. Enable the N_SOURCES request only when needed.
This makes it possible to have a full monitoring access without access
to the Unix domain socket. It also allows restricting the monitoring
access to a smaller number of commands if some commands from the default
list are not needed.
Mention in the man page that the protocol of the non-default commands is
not consider stable and the information they provide may have security
implications.
Reduce the minimum number of samples required by the filter from
min(4, length) to 1.
This makes the filtering less confusing. The sample lifetime is limited
to one poll and the default filtering of the SOCK refclock (where the
maximum number of samples per poll is unknown) is identical to the other
refclocks.
A concern with potential variability in number of samples per poll below
4 is switching between different calculations of dispersion in
combine_selected_samples() in samplefilt.c.
The 106-refclock test shows how the order of refclocks in the config can
impact the first filtered sample and selection. If the PPS refclock
follows SHM, a single low-quality PPS sample is accepted in the same
poll where SHM is selected and the initial clock correction started,
which causes larger skew later and delays the first selection of the PPS
refclock.
Update the reachability register of a refclock source by 1 if a valid
measurement is received by the drivers between source polls, and not
only when it is accumulated to sourcestats, similarly to how
reachability works with NTP sources.
This avoids drops in the reported reachability when a PHC refclock is
dropping samples due to significant changes in the measured delay (e.g.
due to high PCIe load), or a PPS refclock dropping samples due to failed
lock.
Add ntsaeads directive to specify a list of AEAD algorithms enabled for
NTS. The list is shared between the server and client. For the client it
also specifies the order of priority. The default is "30 15", matching
the previously hardcoded preference of AES-128-GCM-SIV (30) over
AES-SIV-CMAC-256 (15).
The commit c43efccf02 ("sources: update source selection with
unreachable sources") caused a high rate of failures in the
148-replacement test (1 falseticker vs 2 unreachable sources). This was
due to a larger fraction of the replacement attempts being made for the
source incorrectly marked as a falseticker instead of the second
unreachable source and the random process needed more time to get to the
expected state with both unreachable sources replaced.
When updating reachability of an unreachable source, try to request the
replacement of the source before calling the source selection, where
other sources may be replaced, to better balance the different
replacement attempts.
Add ptpdomain directive to set the domain number of transmitted and
accepted NTP-over-PTP messages. It might need to be changed in networks
using a PTP profile with the same domain number. The default domain
number of 123 follows the current NTP-over-PTP specification.
When an NTP source is specified with the offset option, the corrected
offset may get outside of the supported NTP interval (by default -50..86
years around the build date). If the source passed the source selection,
the offset would be rejected only later in the adjustment of the local
clock.
Check the offset validity as part of the NTP test A to make the source
unselectable and make it visible in the measurements log and ntpdata
report.
Allow one message about failed selection (e.g. no selectable sources)
to be logged before first successful selection when a source has
full-size reachability register (8 polls with a received or missed
response).
This should make it more obvious that chronyd has a wrong configuration
or there is a firewall/networking issue.
Add "kod" option to the ratelimit directive to respond with the KoD
RATE code to randomly selected requests exceeding the configured limit.
This complements the client support of KoD RATE. It's disabled by
default.
There can be only one KoD code in one response. If both NTS NAK and RATE
codes are triggered, drop the response. The KoD RATE code can be set in
an NTS-authenticated response.
If the reload sources command was received in the chronyd start-up
sequence with initstepslew and/or RTC init (-s option), the sources
loaded from sourcedirs caused a crash due to failed assertion after
adding sources specified in the config.
Ignore the reload sources command until chronyd enters the normal
operation mode.
Fixes: 519796de37 ("conf: add sourcedirs directive")
Use leapseclist instead of leapsectz and test also negative leap
seconds. Add a test for leapsectz when the date command indicates
right/UTC is available on the system and mktime() works as expected.
Check TAI offset in the server's log.
The presend option in interleaved mode uses two presend requests instead
of one to get an interleaved response from servers like chrony which
delay the first interleaved response due to an optimization saving
timestamps only for clients actually using the interleaved mode.
After commit 0ae6f2485b ("ntp: don't use first response in interleaved
mode") the first interleaved response following the two presend
responses in basic mode is dropped as the preferred set of timestamps
minimizing error in delay was already used by the second sample in
basic mode. There are only three responses in the burst and no sample is
accumulated.
Increasing the number of presend requests to three to get a fourth
sample would be wasteful. Instead, allow reusing timestamps of the
second presend sample in basic mode, which is never accumulated.
Reported-by: Aaron Thompson
Fixes: 0ae6f2485b ("ntp: don't use first response in interleaved mode")
If the network correction is known for both the request and response,
and their sum is not larger that the measured peer delay, allowing the
transparent clocks to be running up to 100 ppm faster than the client's
clock, apply the corrections to the NTP offset and peer delay. Don't
correct the root delay to not change the estimated maximum error.
Rename the exp1 extension field to exp_mono_root (monotonic timestamp +
root delay/dispersion) to better distinguish it from future experimental
extension fields.
Avoid frequently ending in the middle of a client/server exchange with
long delays. This changed after commit 4a11399c2e ("ntp: rework
calculation of transmit timeout").
If noselect is present in the configured options, don't assume it
cannot change and the effective options are equal. This fixes chronyc
selectopts +noselect command.
Fixes: 3877734814 ("sources: add function to modify selection options")
Refresh NTP sources specified by hostname periodically (every 2 weeks
by default) to avoid long-running instances using a server which is no
longer intended for service, even if it is still responding correctly
and would not be replaced as unreachable, and help redistributing load
in large pools like pool.ntp.org. Only one source is refreshed at a time
to not interrupt clock updates if there are multiple selectable servers.
The refresh directive configures the interval. A value of 0 disables
the periodic refreshment.
Suggested-by: Ask Bjørn Hansen <ask@develooper.com>