Commit Graph

3041 Commits

Author SHA1 Message Date
Miroslav Lichvar
bf964673cb reference: add wait options for local reference activation
Add "waitunsynced" option to specify how long chronyd needs to wait
before it can activate the local reference when the clock is not
synchronized to give the configured sources a chance to synchronize the
local clock after start. The default is 300 seconds when the orphan
option is enabled (same as the ntpd's default orphanwait), 0 otherwise.

Add "waitsynced" option to specify how long it should wait when the
clock is synchronized. It is an additional requirement to the distance
and activate options.
2025-03-13 16:30:27 +01:00
Miroslav Lichvar
c66c774e26 sources: increase severity of can't-synchronise log messages
Switch the info-level "Can't synchronise" selection messages to
warnings.
2025-03-13 16:25:54 +01:00
Miroslav Lichvar
274c1767f2 sources: improve no-selectable-sources log message
Include the number of unreachable sources in the "Can't synchronise: no
selectable sources" log message to provide a hint whether it might be a
networking issue.
2025-03-13 16:25:54 +01:00
Miroslav Lichvar
c27a298d9c sources: switch unselect_selected_source() to variable arguments
Switch the function to the full printf style, which will be needed to
log an integer.
2025-03-13 16:25:42 +01:00
Miroslav Lichvar
072ec20e2c sources: delay maxjitter/maxdistance warning messages
Avoid logging the new warning messages about exceeded maxjitter or
maxdistance when only a small number of samples is collected after the
source becomes reachable and the values are unstable. Log the messages
only when a replacement attempt is made.
2025-03-13 16:16:28 +01:00
Miroslav Lichvar
711c7c0c8a rewrite some assertions for better readability
Some assertions are written as "if (x) assert(0)" to avoid having
the text of a long argument compiled in the binary. Rewrite them
to use a new BRIEF_ASSERT macro to make the condition easier to read in
its non-negated form and make it easier to turn it back to the full-text
assert if needed.
2025-03-06 15:22:36 +01:00
Miroslav Lichvar
454ce62672 sources: improve no majority log message
Add the number of sources that form an agreement (overlapping
intervals), if at least two agree with each other, and number of
reachable sources to the "Can't synchronize: no majority" log message to
better explain why synchronization is failing and hint that adding more
sources might help.
2025-03-06 15:22:36 +01:00
Miroslav Lichvar
ab850d41f4 sources: warn about sources exceeding maxdistance or maxjitter
Log a warning message if a source is rejected in the source selecting
due to exceeding the maxdistance or maxjitter limit to make it more
obvious when synchronization is failing for this reason. Delay the
message until the reachability register is full (8 updates), or a
replacement of the source is attempted.
2025-03-06 11:44:36 +01:00
Miroslav Lichvar
744768306b sources: refactor logging of source-specific selection status
Move logging of falsetickers and system peers to the mark_source()
function. Use an array to flag already logged messages in preparation
for logging of other status. Support variable number of arguments in the
logging functions.
2025-03-06 11:41:40 +01:00
Miroslav Lichvar
26d4f0c401 privops: mark res_fatal() for printf format compiler check 2025-03-05 16:39:13 +01:00
Miroslav Lichvar
717db2cfdd doc: improve description of refresh directive 2025-03-05 16:38:56 +01:00
Miroslav Lichvar
55898e9b07 client: fix memory leak of empty readline() string 2025-02-12 15:41:10 +01:00
Miroslav Lichvar
f7bb283536 doc: mention localhost exception in cmdallow description 2025-02-12 15:41:10 +01:00
Miroslav Lichvar
1967fbf1f2 cmdmon: make open commands configurable
Replace the hardcoded list of open commands (accessible over UDP),
with a list that can be configured with a new "opencommands" directive.
The default matches the original list. All read-only commands except
accheck and cmdaccheck can be enabled. The naming follows the chronyc
naming. Enable the N_SOURCES request only when needed.

This makes it possible to have a full monitoring access without access
to the Unix domain socket. It also allows restricting the monitoring
access to a smaller number of commands if some commands from the default
list are not needed.

Mention in the man page that the protocol of the non-default commands is
not consider stable and the information they provide may have security
implications.
2025-02-12 15:40:13 +01:00
Miroslav Lichvar
51da7a0694 cmdmon: refactor command authorization checks
Try to simplify the code and make it more robust to potential bugs.

Instead of maintaing a table mapping all commands to open/auth
permissions, use a short list of open commands. Split the processing
of the commands into two groups, read-write commands and read-only
(monitoring) commands, where the first group is processed only with full
access. Check both the socket descriptor and address type before giving
full access. While moving the code, reorder the commands alphabetically.
2025-02-12 15:10:56 +01:00
Miroslav Lichvar
9ba6e7655c cmdmon: drop handling of NULL and LOGON requests
Handle the NULL and LOGON requests as unknown (invalid) instead of
returning the success and failed status respectively. They have
been unused for very long time now.
2025-02-12 14:52:19 +01:00
Miroslav Lichvar
3dea7dd723 socket: rework setting of struct sockaddr_un
Add a function to fill the Unix sockaddr for binding, connecting and
sending to avoid code duplication. Use memcpy() instead of snprintf()
and provide the minimum length of the address containing the terminating
null byte. This makes it easier to support abstract sockets if needed in
future (e.g. for systemd support).
2025-02-05 16:10:31 +01:00
Miroslav Lichvar
cb1118456b main: add support for systemd notification
On Linux, if the NOTIFY_SOCKET variable is set, send a "READY=1"
and "STOPPING=1" message to the Unix domain socket after initialization
and before finalization respectively. This is used with the systemd
"notify" service type as documented in the sd_notity(3) man page. It's
a recommended alternative to the "forking" service type, which does not
need the PID file to determine the main process.

Support pathname Unix sockets only. Abstract sockets don't seem to be
used by systemd for notifications since version 212.

Switch the example services to the notify type, but keep the PID
file. It's still useful to prevent start of other chronyd instances.
systemd doesn't seem to care about the content of the file and should
just remove it in case chronyd didn't terminate cleanly.

Suggested-by: Luca Boccassi <bluca@debian.org>
2025-02-05 16:10:04 +01:00
Miroslav Lichvar
8ee725ff18 refclock: drop filter length adjustment for polling drivers
In the refclock initialization, if the driver provides a poll()
function and 2^(poll-dpoll) is smaller than the configured length of the
median filter (64 by default), the filter is shortened to 2^(poll-dpoll)
samples, assuming the driver provides samples only in the poll()
function and at most one per call, to avoid wasting memory and before
commit 12237bf283 ("refclock: stop requiring 4 samples in median
filter") also simplify configuration (for polling drivers only)

But this assumption is not always correct. The PHC driver can read
external PPS timestamps independently from the driver polling and the
RTC driver can timestamp interrupts. If the dpoll was too large to cover
the sample rate, some samples would be lost.

Drop the adjustment of the filter length to avoid this unexpected impact
on filtering and make it work as documented.
2024-12-11 11:45:13 +01:00
Bryan Christianson
a90a7cf51c refclock: fix build on non-Linux systems
Fixes: 5fd71e27831f ("refclock: add new refclock for RTCs")
2024-12-10 10:14:43 +01:00
Uwe Kleine-König
4f22883f4e refclock: add new refclock for RTCs
This refclock uses an RTC as reference source. If the RTC doesn't
support reporting an update event this source is quite coarse as it
usually needs a slow bus access to be read and has a precision of only
one second. If reporting an update event is available, the time is read
just after such an event which improves precision.

Depending on hardware capabilities you might want to combine it with a
PPS reference clock sourced from the same chip.

Note that you can enable UIE emulation in the Linux kernel to make a RTC
without interrupt support look like one with irqs in return for some
system and bus overhead.

Co-authored-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
2024-12-04 15:58:10 +01:00
Ahmad Fatoum
65be9d9a02 rtc: optionally return raw time from RTC_Linux_ReadTime*
For use with RCL_AddSample in the incoming RTC reference clock driver,
we'll want access not to the cooked timestamps, but to the raw ones.

Otherwise, the core reference clock code complains:

  (valid_sample_time) RTC0 refclock sample time 1721242673.092211891
    not valid age=-3.092007

Support both use cases by have the RTC_Linux_ReadTime_* functions take
two nullable pointers, one for cooked and the other for raw time.
2024-12-04 15:24:52 +01:00
Ahmad Fatoum
a1191f8392 rtc: factor out RTC_Linux_ReadTimeNow
We have code to read RTC time and handle the error associated with
having no UIE interrupt, which we currently use as part of maintaining
the correction file.

In a later commit, we will need the same functionality for using the RTC
as reference clock, so export the function and give it a descriptive
name appropriate for a globally visible function.
2024-12-04 15:24:48 +01:00
Ahmad Fatoum
924199ef69 rtc: factor out RTC_Linux_ReadTimeAfterInterrupt
We have code to read time after an RTC's UIE interrupt, which we
currently use as part of maintaining the correction file.

In a later commit, we will need the same functionality for using the RTC
as reference clock, so export the function and give it a descriptive
name appropriate for a globally visible function.
2024-12-04 14:54:04 +01:00
Ahmad Fatoum
8028984f34 rtc: export RTC_Linux_{SwitchInterrupt,CheckInterrupt}
We have code to enable and disable the RTC's UIE interrupt, as well as
check whether it occurred and skip the first interrupt as it may be
bogus.
This is currently used as part of maintaining the correction file.

In a later commit, we will need the same functionality for using the RTC
as reference clock, so export the functions and give them descriptive
names appropriate for globally visible functions.
2024-12-04 14:51:25 +01:00
Ahmad Fatoum
68301238a0 rtc: pass info whether RTC is on UTC as a function parameter
rtc_from_t() and t_from_rtc() call either gmtime or localtime depending
on the value of a global rtc_on_utc variable.

This will not be appropriate anymore when we start exporting functions
that call rtc_from_t() and t_from_rtc() for use outside of rtc_linux.c
as the rtc_on_utc variable may not have been initialized yet or at all.

Therefore make whether the RTC is on UTC a function parameter of these
functions, so the value can be propagated from the callers.
2024-12-04 12:09:03 +01:00
Ahmad Fatoum
13db3d6902 rtc: let t_from_rtc/rtc_from_t operate on struct rtc_time directly
Both functions operate on struct tm even though the struct tm is only
used as intermediary format from struct rtc_time.

In the case of rtc_from_t, the code to convert from struct tm to struct
rtc is even duplicated.

Let's simplify code a bit by moving the struct translation code into
these conversion functions.
2024-12-04 12:09:03 +01:00
Miroslav Lichvar
8c24086c6d test: pass make options in features compilation test 2024-12-04 12:08:57 +01:00
Miroslav Lichvar
e30d5a6e6f test: extend socket unit test 2024-12-04 12:06:29 +01:00
Miroslav Lichvar
a0d34eb372 test: add valgrind support to system tests 2024-11-28 16:28:58 +01:00
Miroslav Lichvar
7196943f11 nts: close socket in helper process on exit
Close the socket used for receiving helper requests before exit to avoid
another valgrind error.
2024-11-28 16:09:19 +01:00
Miroslav Lichvar
26fef1751a main: close pipe in grandparent process
Close the other end of the pipe in the grandparent process before exit
to avoid valgrind error.

Also, in the daemon process avoid closing the pipe for second time in
the 0-1024 close() loop to avoid another error.
2024-11-28 14:35:58 +01:00
Miroslav Lichvar
d71e22594e client: close /dev/urandom on exit 2024-11-28 14:35:01 +01:00
Miroslav Lichvar
d7a578fed2 test: enable valgrind fd tracking 2024-11-28 11:17:51 +01:00
Miroslav Lichvar
014a67b29e test: update 106-refclock
With the latest clknetsim all PHC readings should be valid (no
timestamps from future).

Also don't forget to remove refclock.log between tests.
2024-11-28 10:02:58 +01:00
Miroslav Lichvar
d22c8fbcb2 quantiles: add parameter to limit negative step
Add a new parameter to limit the negative value of the step state
variable. It's set as a maximum delay in number of updates before the
actual step applied to the quantile estimate starts growing from the
minimum step when the input value is consistently larger or smaller than
the estimate.

This prevents the algorithm from effectively becoming the slower 1U
variant if the quantile estimate is stable most of the time.

Set it to 100 updates for the NTP delay and 1000 updates for the hwclock
delay. An option could be added later to make it configurable.
2024-11-21 16:00:23 +01:00
Miroslav Lichvar
2da4e3ce53 quantiles: force step update with stable input values
The algorithm was designed for estimating quantiles in streams of
integer values. When the estimate is equal to the input value, the
step state variable does not change. This causes problems for the
floating-point adaptation used for measurents of delay in chrony.

One problem is numerical instability due to the strict comparison of
the input value and the current estimate.

Another problem is with signals that are so stable that the nanosecond
resolution of the system functions becomes the limitation. There is a
large difference in the value of the step state variable, which
determines how quickly the estimate will adapt to a new distribution,
between signals that are constant in the nanosecond resolution and
signals that can move in two nanoseconds.

Change the estimate update to never consider the input value equal to
the current estimate and don't set the estimate exactly to the input
value. Keep it off by a quarter of the minimum step to force jumping
around the input value if it's constant and decreasing the step variable
to negative values. Also fix the initial adjustment to step at least by
the minimum step (the original algorithm is described with ceil(), not
fabs()).
2024-11-21 15:59:56 +01:00
Miroslav Lichvar
c92358e041 ntp+hwclock: add margin to estimated delay quantiles
Extend the interval of accepted delays by half of the quantile minimum
step in both directions to make room for floating-point errors in the
quantile calculation and an error that will be intentionally added in
the next commit.
2024-11-21 15:59:47 +01:00
Miroslav Lichvar
77962bb527 quantiles: add functions to get max k and min step 2024-11-21 09:05:58 +01:00
Miroslav Lichvar
65b5f111d2 quantiles: fix assertion for requested k 2024-11-18 15:51:42 +01:00
Miroslav Lichvar
9856bf2d5f test: improve nts_ntp_auth test 2024-11-14 13:43:22 +01:00
Miroslav Lichvar
2205eae2a1 test: add 014-intermittent test 2024-11-14 13:43:18 +01:00
Miroslav Lichvar
6f381fa78b test: indicate failed sync stats 2024-11-14 13:43:18 +01:00
Miroslav Lichvar
bb8050d884 test: make system test users configurable 2024-11-14 13:43:18 +01:00
Miroslav Lichvar
c85ec4ff0f siv: drop internal implementation
Nettle (>=3.6) and GnuTLS (>=3.6.14) with the AES-SIV-CMAC support
required for NTS are now widely available in operating systems. Drop
the internal Nettle-based implementation.
2024-11-14 13:43:18 +01:00
Miroslav Lichvar
d9d97f76d7 reference: make driftfile update interval configurable
Add interval option to the driftfile directive to specify the minimum
interval between updates of the recorded frequency offset and skew.
2024-11-14 13:43:18 +01:00
lvgenggeng
837323d687 clientlog: simplify code 2024-11-07 12:10:04 +01:00
Miroslav Lichvar
12237bf283 refclock: stop requiring 4 samples in median filter
Reduce the minimum number of samples required by the filter from
min(4, length) to 1.

This makes the filtering less confusing. The sample lifetime is limited
to one poll and the default filtering of the SOCK refclock (where the
maximum number of samples per poll is unknown) is identical to the other
refclocks.

A concern with potential variability in number of samples per poll below
4 is switching between different calculations of dispersion in
combine_selected_samples() in samplefilt.c.

The 106-refclock test shows how the order of refclocks in the config can
impact the first filtered sample and selection. If the PPS refclock
follows SHM, a single low-quality PPS sample is accepted in the same
poll where SHM is selected and the initial clock correction started,
which causes larger skew later and delays the first selection of the PPS
refclock.
2024-11-05 16:03:40 +01:00
Miroslav Lichvar
b9b338a8df refclock: rework update of reachability
Update the reachability register of a refclock source by 1 if a valid
measurement is received by the drivers between source polls, and not
only when it is accumulated to sourcestats, similarly to how
reachability works with NTP sources.

This avoids drops in the reported reachability when a PHC refclock is
dropping samples due to significant changes in the measured delay (e.g.
due to high PCIe load), or a PPS refclock dropping samples due to failed
lock.
2024-11-05 15:51:43 +01:00
Vincent Blut
a5d73f692f doc: fix typo in chrony.conf man page 2024-10-09 09:05:41 +02:00