Spring Data JPA Batch Insertion
Posted By : Rahul Chauhan | 23-Apr-2019
1. Overview
Going out to the database is expensive. We may be able to improve performance and consistency by batching multiple inserts into one.
In this tutorial, we’ll look at how to do this with Spring Data JPA.
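For context, here is a minimal sketch of what batching means at the plain JDBC level, which is roughly what Hibernate does for us underneath. The class name JdbcBatchSketch, the insertAll helper, and the customer table with its first_name and last_name columns are assumptions made for illustration; they are not part of this tutorial's project.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class JdbcBatchSketch {

    // Hypothetical table and column names; adjust them to the real schema.
    static final String INSERT_SQL =
        "INSERT INTO customer (first_name, last_name) VALUES (?, ?)";

    // Queues one insert per {firstName, lastName} pair, then submits them together.
    static int[] insertAll(Connection connection, List<String[]> names) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(INSERT_SQL)) {
            for (String[] name : names) {
                statement.setString(1, name[0]);
                statement.setString(2, name[1]);
                statement.addBatch(); // adds this insert to the pending batch
            }
            // A single call submits every queued insert to the driver.
            return statement.executeBatch();
        }
    }
}
```

How many network round trips the driver actually makes for one executeBatch call depends on the database and the driver settings, so it is worth measuring rather than assuming.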
2. Spring JPA Repository
First, we’ll need a simple entity. Let’s call it Customer:
@Entity
public class Customer {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Long id;

    private String firstName;
    private String lastName;

    // constructor, getters, setters
}
And then, we need our repository:
public interface CustomerRepository extends CrudRepository<Customer, Long> {
}
This exposes a saveAll method for us, which saves a whole collection of entities in one call and lets Hibernate batch the inserts once batching is enabled.
So, let’s leverage that in a controller:
@RestController
public class CustomerController {

    @Autowired
    CustomerRepository customerRepository;

    @PostMapping("/customers")
    public ResponseEntity<String> insertCustomers() {
        Customer c1 = new Customer("James", "Gosling");
        Customer c2 = new Customer("Doug", "Lea");
        Customer c3 = new Customer("Martin", "Fowler");
        Customer c4 = new Customer("Brian", "Goetz");
        List<Customer> customers = Arrays.asList(c1, c2, c3, c4);
        customerRepository.saveAll(customers);
        // created() expects a java.net.URI and returns a builder, so we finish with build()
        return ResponseEntity.created(URI.create("/customers")).build();
    }

    // ... @GetMapping to read customers
}
3. Testing Our Endpoint
Testing our code is simple with MockMvc:
@Autowired
private MockMvc mockMvc;

@Test
public void whenInsertingCustomers_thenCustomersAreCreated() throws Exception {
    // post() and status() are static imports from MockMvcRequestBuilders and MockMvcResultMatchers
    this.mockMvc.perform(post("/customers"))
        .andExpect(status().isCreated());
}
4. Are We Sure We’re Batching?
Actually, there is just a bit more configuration to do – let’s do a quick demo to illustrate the difference.
First, let’s add the following property to application.properties to see some statistics:
spring.jpa.properties.hibernate.generate_statistics=true
At this point, if we run the test, we’ll see stats like the following:
11232586 nanoseconds spent preparing 4 JDBC statements;
4076610 nanoseconds spent executing 4 JDBC statements;
0 nanoseconds spent executing 0 JDBC batches;
So, we created four customers, which is great, but note that none of them were inside a batch.
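The reason is that JDBC batching is not switched on by default.
Until hibernate.jdbc.batch_size is set, Hibernate sends every insert as a separate statement, so saveAll issues four individual inserts. The id generation strategy matters too: Hibernate disables insert batching for entities that use GenerationType.IDENTITY, so with AUTO it's worth checking which strategy is actually picked for our database.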
So, let’s switch it on:
spring.jpa.properties.hibernate.jdbc.batch_size=4
spring.jpa.properties.hibernate.order_inserts=true
The first property tells Hibernate to collect inserts in batches of four. The order_inserts property tells Hibernate to take the time to group inserts by entity, creating larger batches.
So, the second time we run our test, we’ll see the inserts were batched:
16577314 nanoseconds spent preparing 4 JDBC statements;
2207548 nanoseconds spent executing 4 JDBC statements;
2003005 nanoseconds spent executing 1 JDBC batches;
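To get a feel for how the two properties interact, here is a simplified model in plain Java. It is only a sketch and not Hibernate's actual implementation: it assumes a batch is closed whenever the entity type changes or the batch already holds batch_size statements. The BatchCountSketch class, its methods, and the Order entity are hypothetical names introduced for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BatchCountSketch {

    // Simplified model: a batch is closed when the entity type changes
    // or when it already holds batchSize statements.
    static int countBatches(List<String> entityTypes, int batchSize) {
        int batches = 0;
        int currentSize = 0;
        String currentType = null;
        for (String type : entityTypes) {
            boolean fitsCurrentBatch = type.equals(currentType) && currentSize < batchSize;
            if (!fitsCurrentBatch) {
                batches++;          // start a new batch
                currentSize = 0;
                currentType = type;
            }
            currentSize++;
        }
        return batches;
    }

    // Stand-in for order_inserts: puts inserts for the same entity next to each other.
    static List<String> groupByEntity(List<String> entityTypes) {
        List<String> grouped = new ArrayList<>(entityTypes);
        Collections.sort(grouped);
        return grouped;
    }

    public static void main(String[] args) {
        // Four customers with a batch size of four fit into one batch.
        System.out.println(countBatches(Collections.nCopies(4, "Customer"), 4)); // 1

        // Interleaved inserts for two entity types (Order is hypothetical).
        List<String> mixed = Arrays.asList("Customer", "Order", "Customer", "Order");
        System.out.println(countBatches(mixed, 4));                // 4
        System.out.println(countBatches(groupByEntity(mixed), 4)); // 2
    }
}
```

Under this model, four Customer inserts with a batch size of four end up in a single batch, which matches the statistics above, while interleaved inserts for different entities only batch well once they are grouped.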
We can apply the same approach to deletes and updates (remembering that Hibernate also has an order_updates property).
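As a sketch, the update side could be configured with a fragment like the one below. It assumes the same spring.jpa.properties prefix used for the insert settings, and the batch size of four is simply carried over from the insert example rather than a recommended value.

```properties
# Batch size reused from the insert example (an assumption); tune it for the real workload
spring.jpa.properties.hibernate.jdbc.batch_size=4
# Group updates by entity, analogous to order_inserts
spring.jpa.properties.hibernate.order_updates=true
```

Re-running the test with generate_statistics enabled is a simple way to confirm that updates are actually being batched.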
5. Conclusion
By batching inserts, we can see some performance gains.
We need to be aware that batching is automatically disabled in some cases, and we should check and plan for this before we ship.
About Author
Rahul Chauhan
Rahul Chauhan is a Java developer with good knowledge of Java and Spring Boot.