Skip to content

Cropping lines

August 2, 2010

One thing I’ve written quickly a few weeks ago was an algorithm to crop the lines. The basic idea is to have a text where all the text is in an unique line and has to be written in separate lines. This is a typical problem when you try to send a text in a mail where the lines should not be larger than 72 characters.

I know many mail readers broke the lines if they can not be displayed correctly but I have written the following code you can reuse:

public static String cropLines(Reader in, int maxChars, boolean justify) throws IOException {
 StringBuilder buf = new StringBuilder(250);

 // Using a reader avoids issues with the fucking
 // end of lines.
 LineNumberReader reader = new LineNumberReader(in);
 String line;
 while ((line = reader.readLine()) != null) {
 // Read multiple lines if the "\" is discovered at the end of
 // the line (means the line is extended).
 while (line.endsWith("\\")) {
 line = line.substring(0, line.length() - 1)    + reader.readLine();

 if( line.length() == 0 ){
 else {
 // Calculate the number of spaces used for indentation.
 StringBuilder indent = new StringBuilder();
 int len = line.length();
 char c;
 while (indent.length() < len
 && Character.isWhitespace(c = line.charAt(indent.length())))

 // For indented lines, remove the first spaces as they will
 // added at the contenation part.
 if (indent.length() > 0) {
 line = line.trim();
 len = line.length();

 // Until the line is exhausted.
 while (len > 0) {
 int pos = 0;
 int last = 0;

 // Look for the biggest part we can cut
 while (pos < len
 && (pos + indent.length() < maxChars || last == 0)) {
 c = line.charAt(pos);
 if (c == ' ') last = pos;

 if (last == 0 || pos == len) {
 // No space, then put it all!
 len = 0;
 else {
 // Add the beginning of the line and truncate.
 StringBuilder cropped = new StringBuilder(line
 .substring(0, last));

 if (justify && pos < len) {
 // Justify on the left and on the right.
 boolean hasSpaces = true;
 while (cropped.length() < maxChars && hasSpaces ) {
 int i = cropped.length() - 2;
 hasSpaces = false;

 // Add spaces for the end to the beginning of
 // the cropped line.
 while (cropped.length() < maxChars && i > 0 ) {
 if (Character.isWhitespace(c = cropped.charAt(i--))) {
 while (i > 0 && Character.isWhitespace(c = cropped.charAt(i))){
 cropped.insert(i+1, ' ');
 hasSpaces = true;
 line = line.substring(last + 1);
 len = line.length();
 return buf.toString();

 * Crop the lines to a number of chars. The algorithm does
 * at its best. If there are no spaces, the line will be
 * not broken.
 * <p>
 * If your text begins with some spqces, those spaces are
 * kept if the line is divided (to keep the correct alignment).
 * </p>
 * <p>
 * The algorithm is not optimized. But, the book of Jules Verne,
 * <i>20000 lieux sous les mers<i> (about 800Kb), is processed in
 * about 350ms for lines of 30 characters only with justification
 * on the right and on the left. The times falls to 200 ms for
 * the same text with lines of 70 characters.
 * </p>
 * @param str the string contqining the text.
 * @param maxChars the maximum of characters per line. A value
 *         of 72 is reasonnqble, below 30 characters, the work to do
 *         is quite important.
 * @param justify <code>true</code> for a
 *         full justification (on the left and on the
 *         right).
 * @return the new string.
 public static String cropLines(String str, int maxChars, boolean justify) {
 String str2 = null;
 try {
 str2 = cropLines(new StringReader(str), maxChars, justify);
 } catch (IOException e) {
 LOG.error("I/O ERROR: " + e.getMessage());
 return str2;

The most interesting thing in this code are the performances. I was very sure that my code was very bad (and I think it is) because I manipulate Strings in many ways: adding chars, cutting, copying and so on. My first concern was to say: this algorithm has very bad performances and I should not use it on a heavy loaded server. But, rather than refactoring my code which was working fine (with a full justification by adding spaces for example), I’ve just tried to do the line breaking on a very big text, not a very simple mail.

The results are speaking by themselves: 350 milliseconds for an entire book in the worst case (my computer is not a fast one, you can trust me). When I’ve seen the results, I’ve said: Whouah, don’t bother with performances…!

This is a regular issue in development teams: the programmer knows the code is not performant and he tries to enhance it. This is a costly mistake. A performance-killer for the team: because even if the code is bad in terms of performance, this is not annoying because this part of code is called only twice a day! And sometimes, the solution could be worse than the original one, then follow my advice:

  1. Check the performance of the current code;
  2. Be sure the code is called often;
  3. Optimize only the revelant parts.

If you follow these 3 rules (in this order), you will have more time to drink a beer…


From → Other

  1. Hello William,

    Your program sounds good to me too compared to par unix command!

    I am running a Dell xps M1710 on ubuntu 10.4 and here the results for a 1143 lines book with 120162 words:

    time cat “l’éthique de la liberté.txt” | par 72j > test.txt

    real 0m0.119s
    user 0m0.108s
    sys 0m0.012s

    The book is available here:

  2. Hi Nicolas,

    I’ve done the test with my UNIX box (UBUNTU). JAVA takes about 300 ms and the “par” command about 100 ms with your file.

    That’s mean my code is 3 times slower than an optimized code. Not so bad, in the sense the usage is very limited and not for big volumes. The JAVA code is suitable for a variety of usages without any further optimization. And you confirm the rules I gave: do not optimize first!

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: