How to insert more than 100 thousand records into a table?

1

I have a ResultSet that returns more than 100 thousand rows to be inserted. The problem is that my method inserts them one at a time, building each query value by value because some fields are empty. The method is as follows (this is only a portion of it):

try
        {
            rs.beforeFirst();
            stmt = cnn.createStatement();
            while(rs.next())
            {
               documentID = rs.getInt(1);
               folio = rs.getString(5);
               ficha = rs.getString(6);
               nombre = rs.getString(7);
               tipoArchivo = "PDF";
               created = rs.getString(9);
               created = created.substring(0, 19);
               createdBy = rs.getString(10);
               modified = rs.getString(11);
               modified = modified.substring(0, 19);
               modifiedBy = rs.getString(12);
               String Qry = "INSERT INTO wfattxdoc(document_id, folio, ficha, nombre, document_created_date, document_createdby, document_modified_date, document_modifiedby)" +
                       " VALUES (" + documentID + ", " + "'" + folio + "'" + ", " + "'" + ficha + "'" + ", " + "'" + nombre + "'" + ", " +
                       "'" + created + "'" + ", " + "'" + createdBy + "'" + ", " +
                       "'" + modified + "'" + ", " + "'" + modifiedBy + "'" + ");";
               stmt.executeUpdate(Qry);
            }
            stmt.close();
        }

This makes it very slow: the whole process takes more than 6 hours (I am not exaggerating). My question is: is there a more efficient way to insert thousands of records? I do not get any error; the only problem is the slowness.

    
asked by Kevin M. 18.11.2016 at 17:29

2 answers

5

You should use a PreparedStatement and execute the statements in batches. For this, there are the methods addBatch and executeBatch (inherited from Statement). This lets you send a batch of statements to the database instead of sending them one at a time. This approach works with any database engine that has a compatible JDBC driver, not just MySQL.

Your code would look like this:

//declare this outside the method
private static final int REGISTROS_BATCH = 1000;

//inside your method
String sql = "INSERT INTO wfattxdoc(document_id, folio, ficha, nombre, document_created_date, document_createdby, document_modified_date, document_modifiedby) VALUES (?, ?, ?, ?, ?, ?, ?, ?)";
try {
    rs.beforeFirst();
    try (PreparedStatement pstmt = cnn.prepareStatement(sql)) {
        int counter = 0;
        while(rs.next()) {
            documentID = rs.getInt(1);
            folio = rs.getString(5);
            ficha = rs.getString(6);
            nombre = rs.getString(7);
            tipoArchivo = "PDF";
            created = rs.getString(9);
            created = created.substring(0, 19);
            createdBy = rs.getString(10);
            modified = rs.getString(11);
            modified = modified.substring(0, 19);
            modifiedBy = rs.getString(12);

            pstmt.setInt(1, documentID);
            pstmt.setString(2, folio);
            pstmt.setString(3, ficha);
            pstmt.setString(4, nombre);
            //it would be better to use java.sql.Timestamp instead of String
            pstmt.setString(5, created);
            pstmt.setString(6, createdBy);
            pstmt.setString(7, modified);
            pstmt.setString(8, modifiedBy);
            //add the statement to the batch
            pstmt.addBatch();
            //increment the batch counter
            counter++;
            //once the batch holds 1000 statements, execute them all
            //and reset the counter
            if (counter == REGISTROS_BATCH) {
                pstmt.executeBatch();
                counter = 0;
            }
        }
        //check whether there are still statements pending execution
        if (counter > 0) {
            pstmt.executeBatch();
        }
    }
} catch (SQLException e) {
    //handle your exceptions...
}
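
The comment in the code above suggests java.sql.Timestamp instead of String for the date columns. A minimal sketch of that conversion follows. The class and method names (TimestampParsing, toTimestamp) are hypothetical, and it assumes the source text looks like "2016-11-18 17:29:00.0", as the substring(0, 19) calls imply. It also returns null for empty input, since the question mentions that some fields are empty and substring would otherwise fail on a null value.

```java
import java.sql.Timestamp;

// Hypothetical helper class, not part of the original answer.
final class TimestampParsing {

    // Converts text such as "2016-11-18 17:29:00.0" into a Timestamp.
    // Returns null for null or blank input so the caller can bind SQL NULL.
    // Assumption: non-blank input starts with "yyyy-MM-dd HH:mm:ss";
    // anything else makes Timestamp.valueOf throw IllegalArgumentException.
    static Timestamp toTimestamp(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return null;
        }
        String trimmed = raw.trim();
        // Keep at most the first 19 characters, like substring(0, 19) in the answer,
        // but without failing when the text is shorter than that.
        String text = trimmed.length() > 19 ? trimmed.substring(0, 19) : trimmed;
        return Timestamp.valueOf(text);
    }

    public static void main(String[] args) {
        System.out.println(toTimestamp("2016-11-18 17:29:00.0"));
        System.out.println(toTimestamp(null));
    }
}
```

With a helper like this, a date parameter could be bound with pstmt.setTimestamp(5, ts) when ts is not null and with pstmt.setNull(5, java.sql.Types.TIMESTAMP) otherwise. If the source column is already a date/time type, reading it with rs.getTimestamp(9) may avoid the String round trip altogether.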

In MySQL, to further improve the performance of these statements, you can add the parameters useServerPrepStmts and rewriteBatchedStatements when opening the connection (see note 1). For example, your connection string may look like this:

jdbc:mysql://<server>:<port>/<database name>?useServerPrepStmts=false&rewriteBatchedStatements=true

1 Adapted from JDBC batch insert performance
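
As a small illustration of the connection string above, the sketch below assembles the URL from its parts. The class name MySqlBatchUrl, the method buildUrl, and the sample host, port and database name are all hypothetical; only the two parameters and their values come from the answer.

```java
// Hypothetical helper class, not part of the original answer.
final class MySqlBatchUrl {

    // Builds a MySQL JDBC URL carrying the two parameters mentioned in the answer.
    // host, port and database are placeholders supplied by the caller.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useServerPrepStmts=false&rewriteBatchedStatements=true";
    }

    public static void main(String[] args) {
        // Sample values for illustration only.
        System.out.println(buildUrl("localhost", 3306, "mydb"));
    }
}
```

The resulting string would then be passed to DriverManager.getConnection(url, user, password), assuming the MySQL JDBC driver is on the classpath.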

    
answered 18.11.2016 at 17:43

0

In this case, transferring the data via SQL is slow because the statements are executed sequentially, and each INSERT is expensive because the table has to stay ordered by some field and keep its indexes up to date. For every INSERT, a binary search over the records is needed to find where to place the new one, and the indexes have to be updated as well.

First of all, if you do not need queries on the database to be fast, drop all the indexes the table may have, along with its triggers and constraints (do this deliberately, knowing the consequences); this will speed up the INSERT statements.

A priori, the bottleneck is most likely in sending each query and waiting for its response.

At first glance, your code looks correct, although to rule out other things that may be slowing down the process, I recommend doing it in two steps:

  • Parse the data set and generate an SQL file with all the statements.
  • Execute that file, with all its SQL statements, directly on the database.

This will show you where the bottleneck is. I hope this helps.
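
The first of the two steps above could be sketched roughly as follows. The class SqlFileSketch and its methods are hypothetical, the sample row is made up, and the escaping assumes MySQL with its default settings (single quotes doubled, backslashes doubled). It is meant for a one-off diagnostic file, not as a replacement for PreparedStatement parameters.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of the original answer.
final class SqlFileSketch {

    // Renders a text value as a SQL string literal; null becomes NULL.
    // Assumption: MySQL default mode, where backslash is an escape character.
    static String quote(String value) {
        if (value == null) {
            return "NULL";
        }
        return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'";
    }

    // Builds one INSERT statement for the wfattxdoc table from the question.
    static String insertLine(int documentId, String folio, String ficha, String nombre,
                             String created, String createdBy,
                             String modified, String modifiedBy) {
        return "INSERT INTO wfattxdoc(document_id, folio, ficha, nombre, "
                + "document_created_date, document_createdby, "
                + "document_modified_date, document_modifiedby) VALUES ("
                + documentId + ", " + quote(folio) + ", " + quote(ficha) + ", "
                + quote(nombre) + ", " + quote(created) + ", " + quote(createdBy) + ", "
                + quote(modified) + ", " + quote(modifiedBy) + ");";
    }

    // Writes one statement per line to the target file.
    static void writeFile(Path target, List<String> statements) throws IOException {
        Files.write(target, statements, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample row; in practice the values would come from the ResultSet.
        List<String> statements = Arrays.asList(
                insertLine(1, "F-001", null, "O'Brien",
                        "2016-11-18 17:29:00", "kevin", "2016-11-18 17:43:00", "kevin"));
        Path target = Files.createTempFile("wfattxdoc", ".sql");
        writeFile(target, statements);
        System.out.println("Wrote " + statements.size() + " statement(s) to " + target);
    }
}
```

Timing the file generation and the file execution separately (for example, running the file with the mysql command-line client) should indicate whether reading the data or inserting it dominates the total time.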

        
answered 18.11.2016 at 17:43