Does Time Matter?
Computers rely on clocks to coordinate operations. Clocks are used for logging, scheduling processes, and measuring elapsed time for things such as timeouts and execution durations. In systems that depend heavily on accurate time, wrong assumptions about time can make the whole system behave unexpectedly and eventually cause failures that are very hard to detect and to trace back to where and why the problem started.
Every computer has its own clock, usually driven by a quartz crystal oscillator. These clocks are not perfectly accurate: they drift because of environmental factors such as temperature, voltage fluctuations, and hardware imperfections. This clock drift leads different machines to disagree about the current time.
Systems therefore synchronise with accurate time sources to reduce this drift. Two of the most common mechanisms are the Network Time Protocol (NTP) and the Global Positioning System (GPS). Together they form a hierarchy of time sources.
Most high-precision time systems are based on atomic clocks. An atomic clock keeps time by measuring the oscillation frequency of atoms and can be accurate to within nanoseconds. Atomic clocks are mostly used in satellite systems and dedicated time servers. GPS satellites carry atomic clocks and broadcast a precise timestamp as part of their signal. GPS receivers listen to this signal and extract the timestamp, so systems that need precise time use GPS receivers as their reference time source.
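To get a feel for how quickly drift adds up, here is a small worked calculation. The drift rate of 50 parts per million (ppm) is a hypothetical figure chosen only for illustration, and drift_seconds is a made-up helper name; real oscillators vary.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def drift_seconds(ppm: float, elapsed_seconds: float) -> float:
    """Accumulated error of a clock whose rate is off by `ppm` parts per million."""
    return ppm * 1e-6 * elapsed_seconds


# Assumption: a 50 ppm drift rate, picked purely as an example.
per_day = drift_seconds(50, SECONDS_PER_DAY)
print(f"{per_day:.2f} seconds of drift per day")
```

Under that assumption, two unsynchronised machines could disagree by several seconds after a single day, which is already far too much for ordering events.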
Network Time Protocol (NTP)
NTP organises time sources into a hierarchy of layers called strata.
Stratum 0 is the layer of devices that provide precise timestamps, such as atomic clocks and GPS receivers. These devices are not connected to the network themselves but feed precise time to the servers directly attached to them.
Stratum 1 servers are directly connected to stratum 0 devices and act as the primary time servers for the network.
Stratum 2 servers synchronise their time with stratum 1, and organisations’ servers in turn connect to stratum 2 to synchronise their own clocks.
From stratum 3 onwards, each layer synchronises its time with the layer above and distributes it to the layer below.
At this point you might think it would be inefficient for every machine to contact a GPS receiver or an atomic clock directly to correct its clock, and that is exactly what the hierarchy avoids.
In practice, a computer corrects its drifted clock by exchanging several timestamps with a nearby time server. From those timestamps the client estimates the network delay and its clock offset, and uses them to adjust its clock. The adjustment is made in one of two ways, stepping or slewing. If the difference between the clocks is large, the clock is stepped: it jumps straight forward or backward to the correct time. Otherwise it is slewed: the system gradually speeds up or slows down its clock until it is accurate again.
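The timestamp exchange can be sketched as follows. The offset and delay formulas are the standard NTP on-wire calculation. The names NtpSample, estimate and choose_adjustment are made up for this example, the sample values are invented, and the 0.128 second step threshold is an illustrative assumption (real daemons make it configurable).

```python
from dataclasses import dataclass


@dataclass
class NtpSample:
    t1: float  # client sends request   (client clock)
    t2: float  # server receives it     (server clock)
    t3: float  # server sends reply     (server clock)
    t4: float  # client receives reply  (client clock)


def estimate(s: NtpSample) -> tuple[float, float]:
    """Return (offset, round_trip_delay) in seconds."""
    # Round trip seen by the client, minus the time the server spent processing.
    delay = (s.t4 - s.t1) - (s.t3 - s.t2)
    # How far the server clock is ahead of the client clock,
    # assuming the outbound and return paths take equally long.
    offset = ((s.t2 - s.t1) + (s.t3 - s.t4)) / 2
    return offset, delay


# Assumption: an illustrative threshold, not taken from the article.
STEP_THRESHOLD = 0.128


def choose_adjustment(offset: float, threshold: float = STEP_THRESHOLD) -> str:
    """Step for large errors, slew for small ones."""
    return "step" if abs(offset) > threshold else "slew"


# Invented sample: the client is 0.5 s behind the server, each network
# leg takes 20 ms and the server needs 10 ms to answer.
sample = NtpSample(t1=100.00, t2=100.52, t3=100.53, t4=100.05)
offset, delay = estimate(sample)
print(f"offset={offset:.3f}s delay={delay:.3f}s -> {choose_adjustment(offset)}")
```

The offset estimate is only exact when the two network legs are symmetric. When they are not, the error can be as large as half the round-trip delay, which is one reason synchronisation is never perfect.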
However, clocks cannot be perfectly synchronised even with NTP and GPS. The main reasons are unpredictable network latency, variation in packet delay, hardware drift, and processing delays. Because of these uncertainties, some systems represent time as a range with an error bound rather than as a single value.
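A minimal sketch of time as a range, assuming each clock reading comes with a known uncertainty bound. TimeInterval and its methods are hypothetical names, and the 5 ms uncertainty is an invented figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeInterval:
    earliest: float
    latest: float

    @classmethod
    def from_reading(cls, reading: float, uncertainty: float) -> "TimeInterval":
        """Turn a single clock reading into [reading - u, reading + u]."""
        return cls(reading - uncertainty, reading + uncertainty)

    def definitely_before(self, other: "TimeInterval") -> bool:
        """True only when the two ranges do not overlap at all."""
        return self.latest < other.earliest


# Assumption: every reading is accurate to within 5 ms.
a = TimeInterval.from_reading(100.000, 0.005)
b = TimeInterval.from_reading(100.004, 0.005)
c = TimeInterval.from_reading(100.020, 0.005)

print(a.definitely_before(b))  # ranges overlap, so the order is unknown
print(a.definitely_before(c))  # ranges are disjoint, so a really came first
```

When two ranges overlap, the honest answer is that the order of the two events cannot be determined from the clocks alone.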
Operating systems provide two notions of time, the time-of-day clock and the monotonic clock, each designed for a different purpose. Many systems have run into unexpected problems by using the wrong one.
A time-of-day clock returns the current date and time according to the calendar, e.g. 2026-02-17 10:05:00. It is usually exposed as a count of seconds, milliseconds, or nanoseconds since a fixed epoch. It can be adjusted by NTP or GPS, and if the time difference is too large it jumps forward or backward to synchronise. Time-of-day clocks are used for file timestamps and logging, but they are not suitable for measuring elapsed time, because they can jump and because the time-of-day clocks of different computers do not match exactly.
A monotonic clock returns a value that never decreases; “monotonic” means that it only ever moves in one direction. Unlike a time-of-day clock, it never moves backward, and NTP or GPS cannot make it jump (at most, NTP may slightly adjust the rate at which it advances). Monotonic clocks are used to measure elapsed time, such as process execution time, timeouts, retries, and the scheduling of internal processes. The absolute value of a monotonic clock is meaningless and differs on every computer, so using it as a real-world date in a distributed system would cause chaos.
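In Python, the two notions map onto time.time() (time-of-day) and time.monotonic(). The sketch below uses each for the job it suits; the timed helper is a made-up name for this example.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    # Monotonic clock: only differences between two readings are meaningful.
    start = time.monotonic()
    result = fn(*args)
    elapsed = time.monotonic() - start
    return result, elapsed


# Time-of-day clock: seconds since the Unix epoch, suitable for a log line.
logged_at = time.time()

result, elapsed = timed(sum, range(1_000_000))
print(f"logged_at={logged_at:.0f} result={result} elapsed={elapsed:.6f}s")
```

Had the duration been measured with time.time() instead, a clock step in the middle of the call could have produced a negative or wildly wrong elapsed time.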
Time is rarely something you have to think about if you run a single server node with a single database. Most DBMSs provide a built-in function such as now(), so all timestamps come from one source. Now imagine multiple servers and multiple databases (shards or replicas): every machine (node) has its own clock, which is independent of and slightly different from all the others. Writes can happen concurrently across the database, and in that case relying on each machine’s time-of-day clock becomes a serious problem that has to be solved.
Inconsistent ordering of events
Imagine two users editing a document from different parts of the world, where one hits “save” first and the other hits it a moment later. You might expect the second change to win, but that is not always the case in distributed systems. Sometimes the later write is discarded because the clock of the machine that handled it was running slightly behind.
Many databases rely on timestamps to decide which operation happened last. In a distributed system, if the clocks across nodes are out of sync, an operation that happened later may appear to have happened earlier, and updates are then applied in the wrong order.
Lost Updates (Last Write Wins)
Many database systems use last write wins (LWW) as a simple conflict resolution strategy. It works only if timestamps are reliable. With clock skew, the latest write can be treated as older and discarded, which leads to silent data loss.
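The failure can be reproduced in a few lines. This is a toy model, not how any particular database implements LWW: Write and LwwRegister are hypothetical names, and the 0.5 second skew on node B is an invented value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Write:
    value: str
    timestamp: float  # time-of-day reading on the node that accepted the write


class LwwRegister:
    """Keeps only the write with the highest timestamp."""

    def __init__(self) -> None:
        self.current: Optional[Write] = None

    def apply(self, write: Write) -> bool:
        """Return True if the write was kept, False if it was discarded."""
        if self.current is None or write.timestamp > self.current.timestamp:
            self.current = write
            return True
        return False


# Assumption: node A's clock is correct, node B's clock runs 0.5 s behind.
SKEW_B = -0.5
first = Write("draft saved on A", timestamp=1000.0)            # real time 1000.0
second = Write("edit saved on B", timestamp=1000.2 + SKEW_B)   # real time 1000.2

register = LwwRegister()
register.apply(first)
kept = register.apply(second)
print(kept, register.current.value)  # False draft saved on A
```

The edit on B really happened later, yet it is dropped without any error being raised, which is what makes this kind of data loss silent.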
Data replication
Distributed databases replicate their data across different nodes and rely on applying changes in a consistent order. When concurrent writes have to be replicated across multiple nodes, clock skew can cause the latest updates to be lost and leave the replicas inconsistent with one another.
Distributed systems don’t try to make time perfect; they work around its imperfections. Instead of trusting unreliable clocks, they rely on logical ordering, versioning, and careful conflict resolution strategies to keep data consistent. In the end, correctness doesn’t come from knowing the exact time but from understanding how events relate to each other.
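One well-known way to get logical ordering is a Lamport clock, sketched below. The article does not name a specific algorithm, so treat this as one possible illustration rather than the approach any given database takes; LamportClock and its method names are chosen for this example.

```python
class LamportClock:
    """A counter that orders events by causality instead of wall-clock time."""

    def __init__(self) -> None:
        self.counter = 0

    def tick(self) -> int:
        """Call for every local event, including sending a message."""
        self.counter += 1
        return self.counter

    def receive(self, remote_counter: int) -> int:
        """Call when a message arrives carrying the sender's counter."""
        self.counter = max(self.counter, remote_counter) + 1
        return self.counter


node_a = LamportClock()
node_b = LamportClock()

save_on_a = node_a.tick()           # A saves the document
sent = node_a.tick()                # A sends the change to B
node_b.receive(sent)                # B learns about A's change
save_on_b = node_b.tick()           # B saves an edit that builds on it

print(save_on_a, save_on_b)  # 1 4
```

Because B's counter is pushed past everything it has seen from A, B's later edit always gets the larger number, no matter how far apart the two machines' physical clocks are.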