SCK_AcceptConnection() always returns a non-blocking socket. Clear the
O_NONBLOCK flag in the socket unit test, which relies on blocking, to
avoid failures.
Reported-by: Matthias Andree <matthias.andree@gmx.de>
Modify the HCL_ProcessReadings() function to try to always provide
a valid sample. Instead of dropping a sample outside of the expected
delay, provide its assumed quality level as a small integer (relative to
already accumulated samples), and let the caller decide what quality is
acceptable.
Add maxunreach option to NTP sources and refclocks to specify the
maximum number of polls that the source can stay selected for
synchronization when it is unreachable (i.e. no valid sample was
received in the last 8 polls).
It is an additional requirement to having at least one sample more
recent than the oldest sample of reachable sources.
The default value is 100000. Setting the option to 0 disables selection
of unreachable sources, which matches RFC 5905.
Switch from memcmp() to the new constant-time function to compare the
received and expected authentication data generated with a symmetric key
(NTP MAC or AES CMAC).
While this doesn't seem to be strictly necessary with the current
code, it is a recommended practice to prevent timing attacks. If
memcmp() compared the MACs one byte at a time (a typical memcmp()
implementation works with wider integers for better performance) and
chronyd as an NTP client/server/peer was leaking the timing of the
comparison (e.g. in the monitoring protocol), an attacker might be able
for a given NTP request or response find in a sequence the individual
bytes of the MAC by observing differences in the timing over a large
number of attempts. However, this process would likely be so slow the
authenticated request or response would not be useful in a MITM attack
as the expected origin timestamp is changing with each poll.
Extend the keys unit test to compare the time the function takes to
compare two identical MACs and MACs differing in the first byte
(maximizing the timing difference). It should fail if the compiler's
optimizations figure out the function can return early. The test is not
included in the util unit test to avoid compile-time optimizations with
the function and its caller together. The test can be disabled by
setting NO_TIMING_TESTS environment variable if it turns out to be
unreliable.
Add a function to check if two buffers of the same length contain the
same data, but do the comparison in a constant time with respect to the
returned value to avoid creating a timing side channel, i.e. the time
depends only on the buffer length, not on the content.
Use the gnutls_memcmp() or nettle_memeql_sec() functions if available,
otherwise use the same algorithm as nettle - bitwise ORing XORed data.
Don't allow the NTP support and asynchronous name resolving to be
disabled. pthreads are now a hard requirement.
NTP is the primary task of chrony. This functionality doesn't seem to be
commonly disabled (allowing only refclocks and manual input).
This removes rarely (if ever) used code and simplifies testing.
Add a new parameter to limit the negative value of the step state
variable. It's set as a maximum delay in number of updates before the
actual step applied to the quantile estimate starts growing from the
minimum step when the input value is consistently larger or smaller than
the estimate.
This prevents the algorithm from effectively becoming the slower 1U
variant if the quantile estimate is stable most of the time.
Set it to 100 updates for the NTP delay and 1000 updates for the hwclock
delay. An option could be added later to make it configurable.
The algorithm was designed for estimating quantiles in streams of
integer values. When the estimate is equal to the input value, the
step state variable does not change. This causes problems for the
floating-point adaptation used for measurents of delay in chrony.
One problem is numerical instability due to the strict comparison of
the input value and the current estimate.
Another problem is with signals that are so stable that the nanosecond
resolution of the system functions becomes the limitation. There is a
large difference in the value of the step state variable, which
determines how quickly the estimate will adapt to a new distribution,
between signals that are constant in the nanosecond resolution and
signals that can move in two nanoseconds.
Change the estimate update to never consider the input value equal to
the current estimate and don't set the estimate exactly to the input
value. Keep it off by a quarter of the minimum step to force jumping
around the input value if it's constant and decreasing the step variable
to negative values. Also fix the initial adjustment to step at least by
the minimum step (the original algorithm is described with ceil(), not
fabs()).
Make sure the TLS session is not NULL in NKSN_GetKeys() before trying to
export the keys in case some future code tried to call the function
outside of the NTS-KE message handler.
Implement a fallback for the NTS-NTP client to switch to the compliant
AES-128-GCM-SIV exporter context when the server is using the compliant
context, but does not support the new NTS-KE record negotiating its use,
assuming it can respond with an NTS NAK to the request authenticated
with the incorrect key.
Export both sets of keys when processing the NTS-KE response. If an NTS
NAK is the only valid response from the server after the last NTS-KE
session, switch to the keys exported with the compliant context for the
following requests instead of dropping all cookies and restarting
NTS-KE. Don't switch back to the original keys if an NTS NAK is received
again.
When the NTS client and server negotiated use of AES-128-GCM-SIV keys,
the keys exported from the TLS session and used for authentication and
encryption of NTP messages do not comply to RFC8915. The exporter
context value specified in the section 5.1 of RFC8915 function is
incorrect. It is a hardcoded string which contains 15 (AES-SIV-CMAC-256)
instead of 30 (AES-128-GCM-SIV). This causes chrony to not interoperate
with NTS implementations that follow RFC8915 correctly. (At this time,
there doesn't seem to be another implementation with AES-128-GCM-SIV
support yet.)
Replace the string with a proper construction of the exporter context
from a specified AEAD ID and next protocol.
Keep using the incorrect AEAD ID for AES-128-GCM-SIV to not break
compatibility with existing chrony servers and clients. A new NTS-KE
record will be added to negotiate the compliant exporter context.
Reported-by: Martin Mayer <martin.mayer@m2-it-solutions.de>
Add a third return value to CLG_LimitServiceRate() to indicate the
server should send a response requesting the client to reduce its
polling rate. It randomly selects from a fraction (configurable to 1/2,
1/4, 1/8, 1/16, or disabled) of responses which would be dropped
(after selecting responses for the leak option).
Add a new parameter to the NSR_AddSourceByName() function to allow
individual sources to be limited to IPv4 or IPv6 addresses. This doesn't
change the options passed to the resolver. It's just an additional
filter in the processing of resolved addresses following the -4/-6
command-line option of chronyd.
Before opening new IPv4/IPv6 server sockets, chronyd will check for
matching reusable sockets passed from the service manager (for example,
passed via systemd socket activation:
https://www.freedesktop.org/software/systemd/man/latest/sd_listen_fds.html)
and use those instead.
Aside from IPV6_V6ONLY (which cannot be set on already-bound sockets),
the daemon sets the same socket options on reusable sockets as it would
on sockets it opens itself.
Unit tests test the correct parsing of the LISTEN_FDS environment
variable.
Add 011-systemd system test to test socket activation for DGRAM and
STREAM sockets (both IPv4 and IPv6). The tests use the
systemd-socket-activate test tool, which has some limitations requiring
workarounds discussed in inline comments.
Add two new fields to the NTP_Local_Timestamp structure:
- receive duration as the time it takes to receive the ethernet frame,
currently known only with HW timestamping
- network correction as a generalized PTP correction
The PTP correction is provided by transparent clocks in the correction
field of PTP messages to remove the receive, processing and queueing
delays of network switches and routers. Only one-step end-to-end unicast
transparent clocks are useful for NTP-over-PTP. Two-step transparent
clocks use follow-up messages and peer-to-peer transparent clocks don't
handle delay requests.
The RX duration will be included in the network correction to compensate
for asymmetric link speeds of the server and client as the NTP RX
timestamp corresponds to the end of the reception (in order to
compensate for the asymmetry in the normal case when no corrections
are applied).
Clients sockets are closed immediately after receiving valid response.
Don't wait for the first early HW TX timestamp to enable waiting for
late timestamps. It may take a long time or never come if the HW/driver
is consistently slow. It's a chicken and egg problem.
Instead, simply check if HW timestamping is enabled on at least one
interface. Responses from NTP sources on other interfaces will always be
saved (for 1 millisecond by default).
Initialize the unused part of shorter server NTS keys (AES-128-GCM-SIV)
loaded from ntsdumpdir to avoid sending uninitialized data in requests
to the NTS-KE helper process.
Do that also for newly generated keys in case the memory will be
allocated dynamically.
Fixes: b1230efac3 ("nts: add support for encrypting cookies with AES-128-GCM-SIV")
Rework handling of late HW TX timestamps. Instead of suspending reading
from client-only sockets that have HW TX timestamping enabled, save the
whole response if it is valid and a HW TX timestamp was received for the
source before. When the timestamp is received, or the configurable
timeout is reached, process the saved response again, but skip the
authentication test as the NTS code allows only one response per
request. Only one valid response per source can be saved. If a second
valid response is received while waiting for the timestamp, process both
responses immediately in the order they were received.
The main advantage of this approach is that it works on all sockets, i.e.
even in the symmetric mode and with NTP-over-PTP, and the kernel does
not need to buffer invalid responses.
Add a structure for 64-bit integers without requiring 64-bit alignment
to be usable in CMD_Reply without struct packing.
Add utility functions for conversion to/from network order. Avoid using
be64toh() and htobe64() as they don't seem to be available on all
supported systems.
If the authenticator SIV encryption fails (e.g. due to wrong nonce
length), decrement the number of extension fields to keep the packet
info consistent.
Keep a server SIV instance for each available algorithm.
Select AES-128-GCM-SIV if requested by NTS-KE client as the first
supported algorithm.
Instead of encoding the AEAD ID in the cookie, select the algorithm
according to the length of decrypted keys. (This can work as a long as
all supported algorithms use keys with different lengths.)
If AES-128-GCM-SIV is available on the client, add it to the requested
algorithms in NTS-KE as the first (preferred) entry.
If supported on the server, it will make the cookies shorter, which
will get the length of NTP messages containing only one cookie below
200 octets. This should make NTS more reliable in networks where longer
NTP packets are filtered as a mitigation against amplification attacks
exploiting the ntpd mode 6/7 protocol.
While AES-SIV-CMAC allows nonces of any length, AES-GCM-SIV requires
exactly 12 bytes, which is less than the unpadded minimum length of 16
used in the NTS authenticator field. These functions will be needed to
support both ciphers in the NTS code.
This is a newer nonce misuse-resistant cipher specified in RFC 8452,
which is now supported in the development code of the Nettle library.
The advantages over AES-SIV-CMAC-256 are shorter keys and better
performance.
If the randomly generated timestamps are close to the current time, the
source can be selected for synchronization, which causes a crash when
logging the source name due to uninitialized ntp_sources.
Specify the source with the noselect option to prevent selection.
Call the function with current time instead of latest sample of the
first source to avoid undefined conversion of negative double to long
int.
Fixes: 07600cbd71 ("test: extend sources unit test")