In this article, we will take a detailed look at how ClickHouse parses dates and times depending on the date_time_input_format
setting. We will cover general information and technical implementation details provide practical examples compare performance and offer recommendations on choosing the optimal parsing mode.
date_time_input_format
SettingClickHouse has a special setting called date_time_input_format
, which determines how the server interprets incoming string data when converting it to DateTime
, DateTime64
types, and during input/output formats (e.g., FORMAT JSONEachRow
, FORMAT CSV
, etc.).
'base'
and 'best_effort'
The most commonly mentioned modes are:
base
(or alias basic
):
YYYY-MM-DD HH:MM:SS
or with an additional fractional part of seconds YYYY-MM-DD HH:MM:SS[.fractional]
(e.g., 2025-01-14 09:31:30.123456
).best_effort
:
T
separator, Z
suffix, fractional seconds, timezone offsets, etc.).YYYY-MM-DDTHH:MM:SSZ
, 2025-01-14T09:31:30.725617089Z
, and others.Different applications, systems, and libraries may output date and time differently. The most common standard is ISO 8601, but there are other variations (with or without spaces, fractional seconds, timezone letters, etc.). To simplify integration as much as possible, ClickHouse introduced the idea of having a "strict" mode (base
), which expects a single format, and a "flexible" mode (best_effort
), which attempts to recognize the input date in several popular formats.
The implementation of date and time parsing in ClickHouse is divided into several functions. Two key ones are:
parseDateTimeBestEffort.cpp
— for the 'best_effort'
mode.parseDateTime.cpp
— for the 'base'
mode (sometimes referred to as 'basic'
).
The relevant source code links in ClickHouse (GitHub repository) are:
parseDateTimeBestEffort.cpp#L86
— where the "best effort" mode iterates over a list of formats.parseDateTime.cpp#L638
— the code section where strict date and time reading in the YYYY-MM-DD HH:MM:SS[.fractional]
format occurs.best_effort
ModeIn best_effort
mode, the input string is sequentially matched against predefined templates in ClickHouse. These include:
T
separator (e.g., YYYY-MM-DDTHH:MM:SS
).Z
suffix (UTC) (YYYY-MM-DDTHH:MM:SSZ
).YYYY-MM-DDTHH:MM:SS.ssssss[Z]
.+HH:MM
or HH:MM
.
If the string doesn't fit one format, the algorithm tries the next one, and so on. If no matches are found, parsing fails with an error.
base
Mode (or basic
)The base
mode (often referred to as basic
in documentation and code) assumes a strictly defined format:
YYYY-MM-DD HH:MM:SS
(e.g., 2025-01-14 09:31:30
).YYYY-MM-DD HH:MM:SS.fffffffff
(e.g., 2025-01-14 09:31:30.123456789
).
Any deviation (e.g., T
instead of a space, presence of Z
, missing seconds section) will result in an error:
Code: 41. DB::ParsingException: Cannot parse datetime ...
Below are several scenarios that illustrate the differences between base
and best_effort
.
best_effort
Is EssentialImagine receiving data in ISO 8601 format, including a T
separator and Z
(UTC) suffix. For instance, 2025-01-14T09:31:30.725617089Z
.
In base
mode, ClickHouse expects a string like 2025-01-14 09:31:30.725617089
(with a space, without Z
). Therefore, an insertion attempt will produce an error. In best_effort
mode, ClickHouse correctly recognizes this timestamp, accounts for fractional seconds, and interprets Z
as UTC.
Suppose we have a table:
CREATE TABLE events (
event_date_time DateTime64(9),
event_value Int32
)
ENGINE = MergeTree
ORDER BY event_date_time;
And we want to insert data in JSONEachRow
format:
INSERT INTO events FORMAT JSONEachRow
{"event_date_time": "2025-01-14T09:31:30.725617089Z", "event_value": 100}
With date_time_input_format = 'base'
:
A parsing error occurs because 2025-01-14 09:31:30.725617089
(without T
and Z
) is expected.
With date_time_input_format = 'best_effort'
:
The string is successfully parsed and written to the table.
SET date_time_input_format='best_effort'
To enable flexible parsing locally within the current session, you can execute:
SET date_time_input_format = 'best_effort';
After this, any date-time insertion/read within the session will work in "best effort" mode.
An analogous example with INSERT
:
INSERT INTO events
SETTINGS date_time_input_format='best_effort'
FORMAT JSONEachRow
{"event_date_time": "2025-01-14T09:31:30.725617089Z", "event_value": 100}
best_effort
via users.xml
If you want all requests from a particular user or role profile to use best_effort
by default, you can configure this in the users.xml
configuration file. For example:
<yandex>
<profiles>
<my_profile>
...
<date_time_input_format>best_effort</date_time_input_format>
</my_profile>
</profiles>
<users>
<my_user>
<profile>my_profile</profile>
...
</my_user>
</users>
</yandex>
When ClickHouse is restarted, all new sessions of user my_user
will run in best_effort
mode.
When parsing JSON data in ClickHouse, you often encounter a situation where a JSON file may contain fields that are not present in the target table. For example:
{"event_date_time": "2025-01-14T09:31:30Z", "event_value": 42, "some_extra_field": "ignored"}
If FORMAT JSONEachRow
is used, the default “extra” fields may cause an error. To skip them, you can enable the setting:
SET input_format_skip_unknown_fields = 1;
Then ClickHouse will safely ignore some_extra_field
, and parse the event_date_time
according to the logic of the current date_time_input_format
. If we have best_effort
, then 2025-01-14T09:31:30Z
will be successfully read.
base
is fasterThe base
mode uses a clearly fixed parsing function: it checks strictly positively that the year consists of 4 digits, followed by a hyphen, followed by 2 digits of the month, and so on. Virtually no branching or checks for “what if there's a T or a space, if there's a Z suffix” and the like.
The best_effort
mode, on the other hand, stores a set of patterns - sometimes quite large (especially in modern versions of ClickHouse that support time zone offsets, different delimiter options, fractional seconds, etc.). Each string is checked sequentially by several rules, which gives additional load.
On small amounts of data, the difference is unlikely to be noticeable. However, if you are loading, say, billions of rows, and a large number of values have to be converted from rows to dates/times, the parsing time can become a bottle neck.
To test the difference, you can prepare a dataset yourself (e.g. with 10^7 or 10^8 rows in CSV/JSON format) and run the insert twice - once with date_time_input_format='base'
, another time with best_effort'
. Measure the query execution time or average insertion speed (rows/sec).
There are no official benchmarks from ClickHouse developers on this topic (at least at the time of writing), so you will have to make your own estimation based on actual usage conditions.
Benchmark:
-- Data preparation (10M rows)
SELECT now() + number AS ts FROM numbers(10000000) INTO OUTFILE 'data.tsv'
-- Insertion time measurement
clickhouse-client --query "INSERT INTO test_table FORMAT TSV" < data.tsv
best_effort
T
, Z
, or timezone offsets).base
DateTime64(N)
)For DateTime64(N)
types, fractional second parsing is particularly important. In base
mode, the exact number of digits after the decimal point must match the declared precision. In best_effort
, ClickHouse can recognize and truncate fractional seconds to the required precision.
T
, Z
, and Timezone Offsetsbase
mode expects a space between the date and time and does not account for Z
(UTC).best_effort
mode recognizes T
as a separator, interprets the Z
suffix as UTC, and parses timezone offsets (e.g., +03:00
, 05:00
).Use INSERT INTO ... FORMAT JSONEachRow
with additional settings:
SET input_format_skip_unknown_fields = 1;
Functions like parseDateTime64BestEffort(...)
or parseDateTimeBestEffort(...)
can assist during SELECT
/INSERT
operations to convert string columns into dates/times.
Check ClickHouse logs (clickhouse-server.log
) for specific parsing error messages.
For detailed logs, increase the logging level carefully (large logs may be generated).
The choice between 'base'
and 'best_effort'
primarily depends on the nature of your data and performance requirements:
base
(or basic
):
YYYY-MM-DD HH:MM:SS[.fractional]
template.best_effort
:
T
, Z
, offsets, fractional seconds).
In practice, if you exchange data with microservices that output ISO 8601 formats (T
, Z
), enabling 'best_effort'
is reasonable. For fixed formats with performance-critical requirements, 'base'
is better.
Thus, the date_time_input_format
setting in ClickHouse helps balance strict format control with flexibility in handling diverse date/time formats. Choose the approach that best fits your project.