This refclock uses an RTC as reference source. If the RTC doesn't
support reporting an update event this source is quite coarse as it
usually needs a slow bus access to be read and has a precision of only
one second. If reporting an update event is available, the time is read
just after such an event which improves precision.
Depending on hardware capabilities you might want to combine it with a
PPS reference clock sourced from the same chip.
Note that you can enable UIE emulation in the Linux kernel to make a RTC
without interrupt support look like one with irqs in return for some
system and bus overhead.
Co-authored-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
For use with RCL_AddSample in the incoming RTC reference clock driver,
we'll want access not to the cooked timestamps, but to the raw ones.
Otherwise, the core reference clock code complains:
(valid_sample_time) RTC0 refclock sample time 1721242673.092211891
not valid age=-3.092007
Support both use cases by have the RTC_Linux_ReadTime_* functions take
two nullable pointers, one for cooked and the other for raw time.
We have code to read RTC time and handle the error associated with
having no UIE interrupt, which we currently use as part of maintaining
the correction file.
In a later commit, we will need the same functionality for using the RTC
as reference clock, so export the function and give it a descriptive
name appropriate for a globally visible function.
We have code to read time after an RTC's UIE interrupt, which we
currently use as part of maintaining the correction file.
In a later commit, we will need the same functionality for using the RTC
as reference clock, so export the function and give it a descriptive
name appropriate for a globally visible function.
We have code to enable and disable the RTC's UIE interrupt, as well as
check whether it occurred and skip the first interrupt as it may be
bogus.
This is currently used as part of maintaining the correction file.
In a later commit, we will need the same functionality for using the RTC
as reference clock, so export the functions and give them descriptive
names appropriate for globally visible functions.
rtc_from_t() and t_from_rtc() call either gmtime or localtime depending
on the value of a global rtc_on_utc variable.
This will not be appropriate anymore when we start exporting functions
that call rtc_from_t() and t_from_rtc() for use outside of rtc_linux.c
as the rtc_on_utc variable may not have been initialized yet or at all.
Therefore make whether the RTC is on UTC a function parameter of these
functions, so the value can be propagated from the callers.
Both functions operate on struct tm even though the struct tm is only
used as intermediary format from struct rtc_time.
In the case of rtc_from_t, the code to convert from struct tm to struct
rtc is even duplicated.
Let's simplify code a bit by moving the struct translation code into
these conversion functions.
Close the other end of the pipe in the grandparent process before exit
to avoid valgrind error.
Also, in the daemon process avoid closing the pipe for second time in
the 0-1024 close() loop to avoid another error.
Add a new parameter to limit the negative value of the step state
variable. It's set as a maximum delay in number of updates before the
actual step applied to the quantile estimate starts growing from the
minimum step when the input value is consistently larger or smaller than
the estimate.
This prevents the algorithm from effectively becoming the slower 1U
variant if the quantile estimate is stable most of the time.
Set it to 100 updates for the NTP delay and 1000 updates for the hwclock
delay. An option could be added later to make it configurable.
The algorithm was designed for estimating quantiles in streams of
integer values. When the estimate is equal to the input value, the
step state variable does not change. This causes problems for the
floating-point adaptation used for measurents of delay in chrony.
One problem is numerical instability due to the strict comparison of
the input value and the current estimate.
Another problem is with signals that are so stable that the nanosecond
resolution of the system functions becomes the limitation. There is a
large difference in the value of the step state variable, which
determines how quickly the estimate will adapt to a new distribution,
between signals that are constant in the nanosecond resolution and
signals that can move in two nanoseconds.
Change the estimate update to never consider the input value equal to
the current estimate and don't set the estimate exactly to the input
value. Keep it off by a quarter of the minimum step to force jumping
around the input value if it's constant and decreasing the step variable
to negative values. Also fix the initial adjustment to step at least by
the minimum step (the original algorithm is described with ceil(), not
fabs()).
Extend the interval of accepted delays by half of the quantile minimum
step in both directions to make room for floating-point errors in the
quantile calculation and an error that will be intentionally added in
the next commit.
Nettle (>=3.6) and GnuTLS (>=3.6.14) with the AES-SIV-CMAC support
required for NTS are now widely available in operating systems. Drop
the internal Nettle-based implementation.
Reduce the minimum number of samples required by the filter from
min(4, length) to 1.
This makes the filtering less confusing. The sample lifetime is limited
to one poll and the default filtering of the SOCK refclock (where the
maximum number of samples per poll is unknown) is identical to the other
refclocks.
A concern with potential variability in number of samples per poll below
4 is switching between different calculations of dispersion in
combine_selected_samples() in samplefilt.c.
The 106-refclock test shows how the order of refclocks in the config can
impact the first filtered sample and selection. If the PPS refclock
follows SHM, a single low-quality PPS sample is accepted in the same
poll where SHM is selected and the initial clock correction started,
which causes larger skew later and delays the first selection of the PPS
refclock.
Update the reachability register of a refclock source by 1 if a valid
measurement is received by the drivers between source polls, and not
only when it is accumulated to sourcestats, similarly to how
reachability works with NTP sources.
This avoids drops in the reported reachability when a PHC refclock is
dropping samples due to significant changes in the measured delay (e.g.
due to high PCIe load), or a PPS refclock dropping samples due to failed
lock.
If the client included the NTS-KE record requesting compliant key
exporter context for AES-128-GCM-SIV, but the server doesn't select this
AEAD algorithm (it's not supported by the crypto library or it is
disabled by the ntsaeads directive), don't include the NTS-KE record in
the response. It's not relevant to the other AEAD algorithms.
Add ntsaeads directive to specify a list of AEAD algorithms enabled for
NTS. The list is shared between the server and client. For the client it
also specifies the order of priority. The default is "30 15", matching
the previously hardcoded preference of AES-128-GCM-SIV (30) over
AES-SIV-CMAC-256 (15).
Make sure the TLS session is not NULL in NKSN_GetKeys() before trying to
export the keys in case some future code tried to call the function
outside of the NTS-KE message handler.
Implement a fallback for the NTS-NTP client to switch to the compliant
AES-128-GCM-SIV exporter context when the server is using the compliant
context, but does not support the new NTS-KE record negotiating its use,
assuming it can respond with an NTS NAK to the request authenticated
with the incorrect key.
Export both sets of keys when processing the NTS-KE response. If an NTS
NAK is the only valid response from the server after the last NTS-KE
session, switch to the keys exported with the compliant context for the
following requests instead of dropping all cookies and restarting
NTS-KE. Don't switch back to the original keys if an NTS NAK is received
again.
When the NTS client and server negotiated use of AES-128-GCM-SIV keys,
the keys exported from the TLS session and used for authentication and
encryption of NTP messages do not comply to RFC8915. The exporter
context value specified in the section 5.1 of RFC8915 function is
incorrect. It is a hardcoded string which contains 15 (AES-SIV-CMAC-256)
instead of 30 (AES-128-GCM-SIV). This causes chrony to not interoperate
with NTS implementations that follow RFC8915 correctly. (At this time,
there doesn't seem to be another implementation with AES-128-GCM-SIV
support yet.)
Replace the string with a proper construction of the exporter context
from a specified AEAD ID and next protocol.
Keep using the incorrect AEAD ID for AES-128-GCM-SIV to not break
compatibility with existing chrony servers and clients. A new NTS-KE
record will be added to negotiate the compliant exporter context.
Reported-by: Martin Mayer <martin.mayer@m2-it-solutions.de>
The commit c43efccf02 ("sources: update source selection with
unreachable sources") caused a high rate of failures in the
148-replacement test (1 falseticker vs 2 unreachable sources). This was
due to a larger fraction of the replacement attempts being made for the
source incorrectly marked as a falseticker instead of the second
unreachable source and the random process needed more time to get to the
expected state with both unreachable sources replaced.
When updating reachability of an unreachable source, try to request the
replacement of the source before calling the source selection, where
other sources may be replaced, to better balance the different
replacement attempts.
If the pid file path is specified as '/', skip handling it,
as it is not only unnecessary but complicates managing the
service. A systemd unit can manage the program without any
need for this functionality, and it makes process tracking
simpler and more robust.
The implementation matches the bindcmdaddress directive.
DNSSEC requires the system time to be synced in order to work,
as the signature date and expiration need to be checked by
resolvers. But it is possible that syncing the times requires
doing DNS queries. Add a paragraph to the FAQ explaining how
to break this cycle by asking nss-resolved to always avoid
DNSSEC when chronyd tries to resolve hostnames.