网上很多文章,都说MySQL驱动并没有实现"真正的"batchUpdate,执行的时候还是一条一条按顺序将SQL发送到MySQL服务器,其实这是错误的。
先贴一段源码(基于MySQL 5.1.40驱动),执行batchUpdate的时候最终执行如下方法:executeBatchInternal
protected long[] executeBatchInternal() throws SQLException {
synchronized (checkClosed().getConnectionMutex()) {
if (this.connection.isReadOnly()) {
throw new SQLException(Messages.getString("PreparedStatement.25") + Messages.getString("PreparedStatement.26"),
SQLError.SQL_STATE_ILLEGAL_ARGUMENT);
}
if (this.batchedArgs == null || this.batchedArgs.size() == 0) {
return new long[0];
}
// we timeout the entire batch, not individual statements
int batchTimeout = this.timeoutInMillis;
this.timeoutInMillis = 0;
resetCancelledState();
try {
statementBegins();
clearWarnings();
if (!this.batchHasPlainStatements && this.connection.getRewriteBatchedStatements()) {
if (canRewriteAsMultiValueInsertAtSqlLevel()) {
return executeBatchedInserts(batchTimeout);
}
if (this.connection.versionMeetsMinimum(4, 1, 0) && !this.batchHasPlainStatements && this.batchedArgs != null
&& this.batchedArgs.size() > 3 /* cost of option setting rt-wise */) {
return executePreparedBatchAsMultiStatement(batchTimeout);
}
}
return executeBatchSerially(batchTimeout);
} finally {
this.statementExecuting.set(false);
clearBatch();
}
}
}
为了测试网上文章说法的正误,本地测试写了一个batch批量执行,跟踪源码的时候发现最后进入到
return executeBatchSerially(batchTimeout);
该方法的javadoc 如是说:
Executes the current batch of statements by executing them one-by-one.
executeBatchSerially核心源码如下:
for (this.batchCommandIndex = 0; this.batchCommandIndex < nbrCommands; this.batchCommandIndex++) {
Object arg = this.batchedArgs.get(this.batchCommandIndex);
try {
if (arg instanceof String) {
updateCounts[this.batchCommandIndex] = executeUpdateInternal((String) arg, true, this.retrieveGeneratedKeys);
// limit one generated key per OnDuplicateKey statement
getBatchedGeneratedKeys(this.results.getFirstCharOfQuery() == 'I' && containsOnDuplicateKeyInString((String) arg) ? 1 : 0);
} else {
BatchParams paramArg = (BatchParams) arg;
//核心代码,for循环执行每一条SQL
updateCounts[this.batchCommandIndex] = executeUpdateInternal(paramArg.parameterStrings, paramArg.parameterStreams,
paramArg.isStream, paramArg.streamLengths, paramArg.isNull, true);
// limit one generated key per OnDuplicateKey statement
getBatchedGeneratedKeys(containsOnDuplicateKeyUpdateInSQL() ? 1 : 0);
}
} catch (SQLException ex) {
updateCounts[this.batchCommandIndex] = EXECUTE_FAILED;
if (this.continueBatchOnError && !(ex instanceof MySQLTimeoutException) && !(ex instanceof MySQLStatementCancelledException)
&& !hasDeadlockOrTimeoutRolledBackTx(ex)) {
sqlEx = ex;
} else {
long[] newUpdateCounts = new long[this.batchCommandIndex];
System.arraycopy(updateCounts, 0, newUpdateCounts, 0, this.batchCommandIndex);
throw SQLError.createBatchUpdateException(ex, newUpdateCounts, getExceptionInterceptor());
}
}
}
通过代码分析,也确实是一条一条SQL执行,而不是把batch的SQL发送到服务器
但是
重点来了,执行executeBatchSerially是有条件的,再次贴一下源码:
if (!this.batchHasPlainStatements && this.connection.getRewriteBatchedStatements()) {
...
}
return executeBatchSerially(batchTimeout);
也就是说,如果没做任何配置,默认情况下if条件是进不去的,会直接执行if块后边的
return executeBatchSerially(batchTimeout);
那,我们来看一下if条件是什么:
!this.batchHasPlainStatements && this.connection.getRewriteBatchedStatements()
batchHasPlainStatements默认初始化就是false,可以不用管,重点是connection.getRewriteBatchedStatements()。这个是Connection的一个参数rewriteBatchedStatements,会在读取jdbcUrl的时候读取进来:jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true
我们将jdbcUrl添加一个rewriteBatchedStatements试试,即变成:jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true&rewriteBatchedStatements=true
这时候就会进if块了。进入if块之后,再根据执行的是insert 还是update、 delete,会走不同方法
- 如果是insert语句,满成条件情况下,会整合成形如:"insert into xxx_table values (xx),(yy),(zz)..."这样的语句
- 如果是update\delete语句,满成条件情况下,会整合成形如:"update t set … where id = 1; update t set … where id = 2; update t set … where id = 3 ..."这样的语句
然后分批次发送给MySQL(会有一次发送的package大小限制,所以需要拆分批次)
int maxAllowedPacket = this.connection.getMaxAllowedPacket();
if (!this.batchHasPlainStatements && this.connection.getRewriteBatchedStatements()) {
//insert会在这里进行再次判断
if (canRewriteAsMultiValueInsertAtSqlLevel()) {
return executeBatchedInserts(batchTimeout);
}
//update、delete会在这里进行再次判断
//1. mysql版本>=4.1.0
//2. batchHasPlainStatements为false
//3. batch的数量>3
if (this.connection.versionMeetsMinimum(4, 1, 0) && !this.batchHasPlainStatements && this.batchedArgs != null
&& this.batchedArgs.size() > 3 /* cost of option setting rt-wise */) {
return executePreparedBatchAsMultiStatement(batchTimeout);
}
}
return executeBatchSerially(batchTimeout);
在这里总结一下,如果想要达到MySQL真正batchUpdate效果,需要有以下几个条件:
- 需要在jdbcUrl后添加参数rewriteBatchedStatements=true
- this.batchHasPlainStatements 为false
- 如果是update \ delete 语句,还需要mysql版本>=4.1.0,并且batch的数量>3
因此,如果可能的情况下,请在jdbcUrl后添加参数rewriteBatchedStatements=true,尽可能利用上MySQL给我们提供的便利,提高性能。