I had a task of inserting 50,000 records read from a CSV file, and I initially used the following code:
// CSVReader comes from the OpenCSV library (import it as appropriate for your OpenCSV version)
def reader = new CSVReader(new FileReader(file))
def fields = null
while ((fields = reader.readNext()) != null) {
    // one save and one flush per row -- every record hits the database immediately
    new RawData(name: fields[0]).save(flush: true)
}
The above code took around 25 minutes to save the 50,000 records, because every save(flush:true) forces an immediate round trip to the database and the Hibernate session keeps growing. I then switched to batch processing, and you will not believe what happened. Here is the other way of doing the same task:
def reader = new CSVReader(new FileReader(file))
def batch = []
def fields = null
while ((fields = reader.readNext()) != null) {
    batch.add(new RawData(name: fields[0]))
    if (batch.size() >= 1000) {
        // save the whole chunk inside a single transaction
        RawData.withTransaction {
            for (RawData r in batch) {
                r.save(flush: true)
            }
        }
        batch.clear()
        // detach the saved objects so the Hibernate session does not keep growing
        RawData.withSession { session -> session.clear() }
    }
}
// save whatever is left over once the file has been read (fewer than 1000 records)
if (batch) {
    RawData.withTransaction {
        for (RawData r in batch) {
            r.save(flush: true)
        }
    }
    RawData.withSession { session -> session.clear() }
}
The above code saves the data in chunks of 1,000 records, and the withTransaction closure marks the start and end of each transaction. The associated Hibernate session is cleared after each chunk to avoid memory problems, since Hibernate keeps every saved object in the session until it is cleared or flushed out. With these changes, the original 25 minutes, which kept growing as the data grew, dropped to only 71 seconds. This can be a good way to optimize your CRUD operations on large amounts of data.
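If you want to reproduce a comparison like this on your own data, a simple timing wrapper around the import is enough. The sketch below is my addition rather than part of the original post; importRawData(file) is a hypothetical method assumed to contain the batched loop shown above.

// Minimal timing sketch (assumption: importRawData(file) wraps the batched loop above)
def start = System.currentTimeMillis()
importRawData(file)
def elapsed = (System.currentTimeMillis() - start) / 1000
println "Imported 50,000 records in ${elapsed} seconds"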
About the Author
Ravindra Jha
Ravindra is a seasoned Java and Grails lead developer with excellent experience in the deployment, monitoring, and optimisation of web applications for scalability and performance on Amazon EC2 and other Amazon Web Services.