Scenario
HikariCP + mariadb-java-client + MySQL
Even after the network or database has recovered, the application does not recover on its own; it only works again after a restart.
java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 30000ms.
Thread Dump
"HikariPool-1 connection adder" ... runnable ...
How HikariCP creates connections
The main thread call stack is as follows:
|-com.zaxxer.hikari.HikariDataSource#getConnection()
addConnectionExecutor is responsible for creating connections, but it is a thread pool with only one thread. Connection creation is therefore a blocking operation, and creation requests queue up behind one another.
// com.zaxxer.hikari.pool.HikariPool
The HikariPool-1 connection adder thread call stack is as follows:
|-com.zaxxer.hikari.pool.HikariPool.PoolEntryCreator#call()
The org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol#createSocket method establishes a TCP connection through java.net.Socket.
xxx@xxx ~ % lsof -i tcp:3306
The default socketTimeout is 0, which means no timeout, and the default connectTimeout is 30s.
// org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol
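The difference between the two timeouts can be reproduced with plain java.net.Socket, independent of any JDBC driver. The following is a minimal sketch, not the driver's code: the class name SocketTimeoutDemo and the local "silent server" are made up for illustration. The server completes the TCP handshake but never sends a greeting, so the connect succeeds and only the read timeout (SO_TIMEOUT) can end the wait.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Illustrative only: shows SO_TIMEOUT bounding a read against a silent peer. */
class SocketTimeoutDemo {

    /**
     * Connects to a local server that never sends a byte and attempts a read.
     * Returns true if the read was aborted by a SocketTimeoutException.
     */
    static boolean readTimesOut(int connectTimeoutMs, int socketTimeoutMs) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for a free port. The listen backlog completes the
        // TCP handshake even though accept() is never called.
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket()) {
            // Connect timeout: bounds only the TCP handshake.
            client.connect(
                new InetSocketAddress(loopback, silentServer.getLocalPort()),
                connectTimeoutMs);
            // Read timeout: 0 would mean "block until data arrives".
            client.setSoTimeout(socketTimeoutMs);
            InputStream in = client.getInputStream();
            in.read(); // the server never sends anything
            return false;
        } catch (SocketTimeoutException e) {
            return true;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(1_000, 200));
    }
}
```

With a socket timeout of 0 the same read would have nothing to interrupt it, which is the situation described in the next section.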
After the connection is established, parse the server greeting packet.
// org.mariadb.jdbc.internal.com.read.ReadInitialHandShakePacket
What went wrong
Suppose the network or database becomes unstable at some point. The thread reading the server response gets stuck in java.net.SocketInputStream#socketRead0() until it has read the complete response, and with socketTimeout set to 0 nothing ever interrupts that read. What's more serious is that addConnectionExecutor has only one thread, so the stuck read prevents the POOL_ENTRY_CREATOR task from creating new connections, even after the network or database has returned to normal.
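The stall can be illustrated without HikariCP at all. The sketch below is an assumption-laden stand-in, not HikariCP's implementation: a plain single-thread executor plays the role of addConnectionExecutor, and a CountDownLatch that is never counted down plays the role of the read that never returns. The class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Illustrative only: one blocked task starves a single-thread executor. */
class SingleThreadStallDemo {

    /**
     * Submits a task that blocks, followed by a healthy task, and reports
     * whether the healthy task was still waiting after waitMs milliseconds.
     */
    static boolean secondTaskStalls(long waitMs) {
        ExecutorService adder = Executors.newSingleThreadExecutor();
        CountDownLatch neverAnswered = new CountDownLatch(1);
        try {
            // Stand-in for a socket read with no timeout.
            adder.submit(() -> { neverAnswered.await(); return "stuck"; });
            // Stand-in for a later connection attempt that would succeed.
            Future<String> healthy = adder.submit(() -> "connected");
            try {
                healthy.get(waitMs, TimeUnit.MILLISECONDS);
                return false; // the healthy task got to run
            } catch (TimeoutException e) {
                return true;  // still queued behind the blocked task
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        } finally {
            // Interrupts the latch wait so the demo can exit.
            adder.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("second task stalled: " + secondTaskStalls(300));
    }
}
```

One difference from the real incident is worth noting: the latch wait responds to interruption, whereas a blocking socket read generally does not, which is part of why the real pool stays stuck until a restart.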
How to fix
Configure socketTimeout on the spring.datasource.url property.
spring:
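As a rough illustration of what the resulting URL can look like, the sketch below appends the connectTimeout and socketTimeout parameters (both in milliseconds) to a JDBC URL. The helper class JdbcUrlExample, the host, the schema name and the timeout values are all placeholders chosen for this example and do not come from the original configuration.

```java
/** Illustrative only: builds a JDBC URL carrying explicit timeouts. */
class JdbcUrlExample {

    /** Appends connectTimeout and socketTimeout (milliseconds) to a JDBC URL. */
    static String withTimeouts(String baseUrl, int connectTimeoutMs, int socketTimeoutMs) {
        // Use '&' if the URL already has query parameters, '?' otherwise.
        String separator = baseUrl.contains("?") ? "&" : "?";
        return baseUrl + separator
            + "connectTimeout=" + connectTimeoutMs
            + "&socketTimeout=" + socketTimeoutMs;
    }

    public static void main(String[] args) {
        // Placeholder host and schema; pick timeouts that suit your workload.
        System.out.println(withTimeouts("jdbc:mariadb://localhost:3306/demo", 30_000, 60_000));
    }
}
```

The string produced this way is what would go into spring.datasource.url.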
Update
Faster PostgreSQL connection recovery
Unacknowledged TCP
The reason that HikariCP is powerless to recover connections that are out of the pool is due to unacknowledged TCP traffic. TCP is a synchronous communication scheme that requires "handshaking" from both sides of the connection as packets are exchanged (SYN and ACK packets).
When TCP communication is abruptly interrupted, the client or server can be left awaiting the acknowledgement of a packet that will never come. The connection is therefore "stuck", until an operating system level TCP timeout occurs. This can be as long as several hours, depending on the operating system TCP stack tuning.
TCP Timeouts
In order to avoid this condition, it is imperative that the application configures the driver-level TCP socket timeout. Each driver differs in how this timeout is set, but nearly all drivers support it.
HikariCP recommends that the driver-level socket timeout be set to (at least) 2-3x the longest running SQL transaction, or 30 seconds, whichever is longer. However, your own recovery time targets should determine the appropriate timeout for your application.
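The rule of thumb above amounts to taking the larger of two durations. A small sketch follows, assuming you already know your longest-running transaction; the class SocketTimeoutRule and its method are hypothetical helpers, not part of HikariCP.

```java
import java.time.Duration;

/** Illustrative only: the "2-3x longest transaction, or 30 seconds" rule. */
class SocketTimeoutRule {

    /** Lower bound from the recommendation. */
    static final Duration FLOOR = Duration.ofSeconds(30);

    /**
     * Returns the larger of (multiplier x longestTransaction) and 30 seconds.
     * The recommendation suggests a multiplier of 2 to 3.
     */
    static Duration recommended(Duration longestTransaction, int multiplier) {
        Duration scaled = longestTransaction.multipliedBy(multiplier);
        return scaled.compareTo(FLOOR) > 0 ? scaled : FLOOR;
    }

    public static void main(String[] args) {
        // A 5 s transaction stays at the 30 s floor; a 20 s one scales to 60 s.
        System.out.println(recommended(Duration.ofSeconds(5), 3).toMillis());
        System.out.println(recommended(Duration.ofSeconds(20), 3).toMillis());
    }
}
```

The millisecond value is the form most drivers expect for their socket timeout setting, but check the unit for your driver before using it.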
See the specific database sections below for some common configurations.