Posted on May 31st 2016
Recently I had to optimize a method that was a bottleneck in one of our applications. Among other things, the method created a file from a String concatenation. This is a very common task that usually takes a couple of lines along with a try statement. This time, however, I was trying to optimize the run time. So, like anyone looking for answers to programming questions these days, I turned to Google, and it led me to Stack Overflow, specifically to this thread: http://stackoverflow.com/questions/1677194/dumping-a-java-stringbuilder-to-file.
There are several answers regarding how to dump a StringBuilder to file on the thread and I didn’t know which one to trust. I decided to run some tests and make my choice based on the results.
The Plan
My tests were very simple. For each algorithm I wanted to know:
- Execution time
- Memory usage
To measure the time I just used the following code:
long start = System.currentTimeMillis();
runnableMethod.run();
long elapsed = System.currentTimeMillis() - start; // elapsed time in ms
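The log entries in this article report an iteration count, a total time, and an average time. A small harness along the following lines could produce such numbers. This is only a sketch: the class and method names (TimingSketch, averageMillis) are hypothetical, and it uses System.nanoTime() rather than the currentTimeMillis() call shown above, because nanoTime() is a monotonic clock intended for measuring elapsed time.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical timing helper; the names are illustrative and not from the original tests. */
class TimingSketch {

    /** Runs the task the given number of times and returns the average elapsed time in ms. */
    static long averageMillis(Runnable task, int iterations) {
        long totalNanos = 0;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime(); // monotonic, unaffected by wall-clock changes
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return TimeUnit.NANOSECONDS.toMillis(totalNanos / iterations);
    }

    public static void main(String[] args) {
        // Example task: build a one-million-character StringBuilder.
        long average = averageMillis(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1_000_000; i++) {
                sb.append('a');
            }
        }, 5);
        System.out.println("Average time in ms: " + average);
    }
}
```

Running more than one iteration also gives the JIT compiler a chance to warm up, so the first run is usually the slowest.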
Memory profiling turned out to be more difficult so I used Java VisualVM to monitor the memory usage. I used a little bit of code to generate a log with information to help me figure out what was going on. Some of the data I added to the log were:
- Date and time just before starting the execution
- Name for each algorithm
- Execution time
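VisualVM was the tool used for the measurements in this article. As a much coarser complement, heap usage can also be written to the log from code. The following is a minimal sketch, assuming a hypothetical helper named usedHeapMb; the reading depends on when the garbage collector last ran, so treat it as a rough indicator only.

```java
/** Hypothetical helper for logging approximate heap usage; not part of the original tests. */
class HeapLogSketch {

    /** Returns the approximate amount of heap currently in use, in megabytes. */
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        // totalMemory() is the heap currently reserved by the JVM; freeMemory() is the unused part of it.
        return (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Used heap before allocation (MB): " + usedHeapMb());
        byte[] block = new byte[32 * 1024 * 1024]; // allocate 32 MB and keep it reachable
        System.out.println("Used heap after allocation (MB): " + usedHeapMb());
        System.out.println("Block length: " + block.length); // keeps the array alive until here
    }
}
```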
Since garbage collection can be a time-consuming task, I used the following VM configuration option to log it: -verbose:gc.
I used a 100 MB string so that the impact on memory would be more evident while monitoring the VM. Now it was time to get to work! I took the answers in the thread with the most votes and started the tests.
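For readers who want to repeat the tests, the input could be generated along these lines. This is a sketch under assumptions: the article does not show how the test string was built, so the helper name (build) and the choice of filling it with the character 'a' (borrowed from the loop in the fifth answer's code) are illustrative.

```java
/** Hypothetical test-data generator; the original article does not show this step. */
class TestDataSketch {

    /** Builds a StringBuilder holding sizeInChars copies of the character 'a'. */
    static StringBuilder build(int sizeInChars) {
        // Presizing the builder avoids repeated internal array copies while it grows.
        StringBuilder sb = new StringBuilder(sizeInChars);
        for (int i = 0; i < sizeInChars; i++) {
            sb.append('a');
        }
        return sb;
    }

    public static void main(String[] args) {
        StringBuilder aSB = build(1_000_000); // scale this up to reproduce a large input
        System.out.println("Characters in builder: " + aSB.length());
    }
}
```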
The Question
The thread starts by asking the following question:
So I started with the algorithm from the question as a reference to compare against the answers in the thread:
try (FileOutputStream oS = new FileOutputStream(new File("aFile"))) {
    oS.write(aSB.toString().getBytes());
} catch (IOException e) {
    e.printStackTrace();
}
Log:
2016/05/20 21:15:45
Iteration: 1
Total time in ms: 1787
Average time in ms: 1787
In terms of time, it is pretty fast, taking just 1787 ms to complete, but in terms of memory, usage rises above 527 MB.
The Top Answer
The top answer (37 votes) suggested using a buffered writer and appending the StringBuffer/Builder to it:
Code:
// Append the string buffer/builder to a buffered writer
try (BufferedWriter bw = new BufferedWriter(new FileWriter("TempFile2"))) {
    bw.append(aSB); // Internally it calls aSB.toString()
    bw.flush();
} catch (IOException e) {
    e.printStackTrace();
}
Internally, BufferedWriter inherits append from Writer, which does:
// Writer internal code
public Writer append(CharSequence csq) throws IOException {
    if (csq == null)
        write("null");
    else
        write(csq.toString());
    return this;
}
Therefore the following code is equivalent:
// Convert to String and then write to bw
try (BufferedWriter bw = new BufferedWriter(new FileWriter("TempFile2"))) {
    bw.write(aSB.toString());
    bw.flush();
} catch (IOException e) {
    e.printStackTrace();
}
Log:
2016/05/19 01:59:07
Iteration: 1
Total time in ms: 1596
Average time in ms: 1596
In terms of time, it seems to be slightly better than the algorithm in the original question.
If you look at the charts and the log, you can see that memory usage starts to grow at 01:59:07 and peaks at 422 MB. We know that 100 MB are allocated for our StringBuilder. Converting it to a String allocates another 100 MB for the String object, and the remaining 200 MB or so are used to get the bytes from the String and write the data to the file.
The Second Answer
The second answer (13 votes), which is the one preferred by the author of the question, suggested using the Apache Commons IO library.
Code:
try {
    FileUtils.writeStringToFile(new File("aFile"), aSB.toString(), java.nio.charset.StandardCharsets.UTF_8);
} catch (IOException e) {
    e.printStackTrace();
}
Log:
2016/05/20 19:03:26
Iteration: 1
Total time in ms: 3248
Average time in ms: 3248
Surprisingly, the charts reveal that it uses 800 MB. Honestly, I was expecting something similar to the previous algorithm. An additional 700 MB to process 100 MB is not efficient in terms of memory, but finding out things like this is the reason I wanted to run the tests in the first place.
The Third Answer
The third answer (also with 13 votes), stated that toString().getBytes() would require 2 or 3 times the size of the string and suggested the following:
Log:
Using 1kb buffer to read data and creating new CharReader for every kb read
2016/05/19 02:00:08
Iteration: 1
Total time in ms: 362540
Average time in ms: 362540
In terms of memory, this is the most efficient so far, with a maximum of 245 MB. In terms of time, however, it has the worst performance, taking about 6 minutes (362,540 ms) to complete.
Note: The fourth answer (10 votes) is equivalent to the first answer so I skipped it.
The Fifth Answer
This answer suggests avoiding the StringBuffer/Builder altogether and instead appending data directly to the BufferedWriter.
Code:
// Not using string builder/buffer
try (BufferedWriter bw = new BufferedWriter(new FileWriter("TempFile3"))) {
    for (int i = 0; i < Size; i++) {
        bw.append('a');
    }
    bw.flush();
} catch (IOException e) {
    e.printStackTrace();
}
Log:
Append data directly to bufferedWriter
2016/05/19 01:50:31
Iteration: 1
Total time in ms: 3748
Average time in ms: 3748
That is a huge improvement in terms of memory usage, which reaches a maximum of 5 MB. It turns out that not using a StringBuilder/Buffer at all and appending data to the BufferedWriter directly is the most memory-efficient way to dump a string concatenation, although at 3748 ms it is not the fastest. However, if you are working with an API you can’t change, this is not something you can do.
My Answer
In the end, I wasn’t completely satisfied with any of the answers, so I decided to write my own code based on what I had learned. I used the third answer as my base and applied some knowledge gained while reading the Apache Commons IO documentation:
Buffering streams
IO performance depends a lot from the buffering strategy. Usually, it’s quite fast to read packets with the size of 512 or 1024 bytes because these sizes match well with the packet sizes used on harddisks in file systems or file system caches. But as soon as you have to read only a few bytes and that many times performance drops significantly.
This is exactly what I needed to know to improve the performance and it was consistent with the results I got from the previous runs. My own answer ended up like this:
try (BufferedWriter bw = new BufferedWriter(new FileWriter("TempFile1mod"))) {
    final int aLength = aSB.length();
    final int aChunk = 1024; // Read the data in chunks of 1024 chars
    final char[] aChars = new char[aChunk];
    for (int aPosStart = 0; aPosStart < aLength; aPosStart += aChunk) {
        final int aPosEnd = Math.min(aPosStart + aChunk, aLength);
        aSB.getChars(aPosStart, aPosEnd, aChars, 0); // Reuses aChars; creates no new buffer
        bw.write(aChars, 0, aPosEnd - aPosStart); // Faster than copying one byte at a time
    }
    bw.flush();
} catch (IOException e) {
    e.printStackTrace();
}
Log:
2016/05/19 01:52:22
Iteration: 1
Total time in ms: 1246
Average time in ms: 1246
As the logs show, my code was the most balanced in terms of memory consumption and run time. The method returns in a little more than one second, and all of its work completes in about five seconds. In terms of memory, it reaches a maximum of 215 MB.
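The chunked approach can be packaged as a reusable method. The following is a sketch rather than the code that was measured: the class and method names (ChunkedDumpSketch, dump) are hypothetical, and it swaps FileWriter for Files.newBufferedWriter with an explicit UTF-8 charset so that the output does not depend on the platform default encoding. That change was not part of the tests above, so its effect on the timings is unknown.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical reusable form of the chunked dump; names are illustrative. */
class ChunkedDumpSketch {

    private static final int CHUNK_CHARS = 1024; // same chunk size as in the article

    /** Writes the contents of source to target in fixed-size chunks, without calling toString(). */
    static void dump(StringBuilder source, Path target) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            char[] chunk = new char[CHUNK_CHARS]; // reused for every chunk
            int length = source.length();
            for (int start = 0; start < length; start += CHUNK_CHARS) {
                int end = Math.min(start + CHUNK_CHARS, length);
                source.getChars(start, end, chunk, 0);
                writer.write(chunk, 0, end - start);
            }
        } // try-with-resources flushes and closes the writer
    }

    public static void main(String[] args) throws IOException {
        StringBuilder data = new StringBuilder();
        for (int i = 0; i < 5000; i++) {
            data.append((char) ('a' + i % 26));
        }
        Path target = Files.createTempFile("chunked-dump", ".txt");
        dump(data, target);
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

Letting the method throw IOException, instead of catching it and printing the stack trace, leaves the decision about error handling to the caller.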
Conclusion
While Stack Overflow is a great resource, the top answer is not always the best answer. When searching for answers, use it as a tool, but do your own research to determine whether the top answer really is the best one. As for the original question about dumping a Java StringBuilder to a file, I hope you are able to use a BufferedWriter directly. If not, I hope the tests in this article help you choose the most appropriate option.
Let Us Hear from You!
If you have any comments or questions, we would love to hear from you @MyEclipseIDE on twitter or via the MyEclipse forum. Good luck coding!