Tuesday, May 11, 2010

google's guava library tutorial part 1: fun with string-related stuff

I was planning to write a single Guava tutorial, but it would be too large for one post, so I opted to split it into several parts. This first part covers everything related to Strings. Four main classes are explained:
  • CharMatcher (which can be considered a light form of JDK's Pattern+Matcher with string manipulation capabilities)
  • Joiner and MapJoiner (which are useful for joining iterables, arrays or maps into string representations)
  • Splitter (which is JDK's split() on steroids)
  • Strings (a handful of utilities for null/empty checks, repeating and padding).


CharMatcher can be thought of as a simpler, more practical form of JDK's Pattern+Matcher. It's not a full-fledged replacement, because it matches individual characters and you can't use regular expressions as you do with the JDK classes.

String string = "Scream 4";
// I get a predefined CharMatcher
// which accepts letters or digits
CharMatcher matcher = CharMatcher.JAVA_LETTER_OR_DIGIT;
// You can count how many times a letter
// or a digit occurs. This is much more
// practical than creating a Pattern and
// a Matcher and then iterating over the
// Matcher results to count them
int count = matcher.countIn(string);
System.out.println("Letter or digit count: "+count);
// 7 characters

/*
* matchesAllOf (matchesNoneOf) checks
* if all (none) of the characters
* in the given string match the
* matcher in hand.
* */
System.out.println(matcher.matchesAllOf("scream"));
// true
System.out.println(matcher.matchesAllOf("scream "));
// false because there's a
// space at the end
System.out.println(matcher.matchesNoneOf("_?=)("));
// true because there are no letters
// or digits in it
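For comparison, here is roughly what the Pattern+Matcher counting loop mentioned in the comments could look like with the JDK alone. This is only a sketch: the class and method names are made up for illustration, and the regex is my approximation of "letter or digit", not the exact definition Guava uses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of Guava or the JDK.
final class RegexCountSketch {
    // \p{L} matches any letter, \p{Nd} matches any decimal digit.
    private static final Pattern LETTER_OR_DIGIT =
            Pattern.compile("[\\p{L}\\p{Nd}]");

    // Counts matching characters by iterating over the Matcher results.
    static int countLettersOrDigits(String input) {
        Matcher m = LETTER_OR_DIGIT.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countLettersOrDigits("Scream 4")); // 7
    }
}
```

The one-liner countIn() replaces the whole loop, which is the point of the comparison.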


You can negate a matcher so that it accepts the complementary character set. For example, if our CharMatcher accepted {a, b, c}, the negated one accepts any character except {a, b, c}.

CharMatcher negatedMatcher = matcher.negate();
/*
* You might think that true, false,
* true will become false, true, false
* because our matcher is now a
* non-letter, non-digit matcher.
* But no, the result is false,
* false, false.
* The interesting one is the second one.
* The negatedMatcher matches only
* the trailing space of "scream ",
* not all of its characters, so
* matchesAllOf() returns false.
* */

System.out.println(negatedMatcher.matchesAllOf("scream")); 
//false
System.out.println(negatedMatcher.matchesAllOf("scream "));
//false
System.out.println(negatedMatcher.matchesNoneOf("_?=)("));
//false
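If the false, false, false result still looks odd, it may help to model a matcher as a plain per-character predicate. The sketch below uses JDK's IntPredicate as a stand-in for CharMatcher; the class, field and method names are mine, chosen to mirror the Guava ones, and this is not how Guava implements them.

```java
import java.util.function.IntPredicate;

// Hypothetical model of a matcher as a per-character predicate.
final class NegationSketch {
    static final IntPredicate LETTER_OR_DIGIT = Character::isLetterOrDigit;
    static final IntPredicate NEGATED = LETTER_OR_DIGIT.negate();

    // true only if every character satisfies the predicate
    static boolean matchesAllOf(IntPredicate matcher, String s) {
        return s.chars().allMatch(matcher);
    }

    // true only if no character satisfies the predicate
    static boolean matchesNoneOf(IntPredicate matcher, String s) {
        return s.chars().noneMatch(matcher);
    }

    public static void main(String[] args) {
        System.out.println(matchesAllOf(NEGATED, "scream"));  // false
        System.out.println(matchesAllOf(NEGATED, "scream ")); // false
        System.out.println(matchesNoneOf(NEGATED, "_?=)("));  // false
    }
}
```

Negating the matcher flips the answer for each character, not the answer of the "all" or "none" question as a whole.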



removeFrom() and retainFrom() are really convenient methods. The first one removes every matching character from the string, while the second one keeps only the matching characters.


String review = "Scream 4 is the #1 teen-slasher!";
CharMatcher whitespaceMatcher = CharMatcher.JAVA_WHITESPACE;
String result = whitespaceMatcher.removeFrom(review); 
// This matcher will remove the 
//matching characters (whitespaces)
System.out.println("The sentence without whitespaces: "+result);
//output: Scream4isthe#1teen-slasher!

/*
* I want the digits in the text above.
* I can get them by taking
* the predefined digit CharMatcher and
* then calling retainFrom() on
* the string in hand.
* */
String result2 = CharMatcher.DIGIT.retainFrom(review);
System.out.println("Retained digits: "+result2); 
// I'll get '41' as a result 
// because I have 4 and 1 as digits
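For those who want to compare, the closest JDK-only equivalents I can think of go through regular expressions. This is a sketch with made-up class and method names, and it assumes ASCII digits only, whereas the Guava digit matcher also knows about other Unicode digits.

```java
// Hypothetical helper class, not part of Guava or the JDK.
final class RegexFilterSketch {
    // Removes every whitespace character, like removeFrom() above.
    static String removeWhitespace(String s) {
        return s.replaceAll("\\s", "");
    }

    // Keeps only the ASCII digits, like retainFrom() above.
    static String retainAsciiDigits(String s) {
        return s.replaceAll("[^0-9]", "");
    }

    public static void main(String[] args) {
        String review = "Scream 4 is the #1 teen-slasher!";
        System.out.println(removeWhitespace(review));  // Scream4isthe#1teen-slasher!
        System.out.println(retainAsciiDigits(review)); // 41
    }
}
```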



indexIn() returns the index of the first matching character.

// I'd like to know the index
// of the first digit too.
// The first digit is '4'

int indexOfDigit = CharMatcher.DIGIT.indexIn(review);
System.out.println("index Of Digit: "+indexOfDigit); 
// 4's index is 7


Although you can use the predefined CharMatchers, you can just as well build your own.

CharMatcher onlyEvenNumbersMatcher = CharMatcher.anyOf("2468");
// This accepts the even digits 2, 4, 6 and 8
// (note that '0' is not in the set)
CharMatcher noEvenNumbersMatcher = CharMatcher.noneOf("2468");
// This accepts everything
// but those even digits
CharMatcher largeAtoZ = CharMatcher.inRange('A', 'Z');
CharMatcher aToZ = CharMatcher.inRange('a', 'z').or(largeAtoZ);
// We added A-Z with or() here.
// You can combine CharMatchers
// with and() too.

System.out.println(
"Even numbers matcher result: "
+onlyEvenNumbersMatcher.matchesAllOf("1354")); 
// false. 1,3,5 are not ok

System.out.println(
"Even numbers matcher result: "
+onlyEvenNumbersMatcher.matchesAllOf("starwars")); 
// false. only even numbers are ok

System.out.println(
"Even numbers matcher result: "
+onlyEvenNumbersMatcher.matchesAllOf("2466")); 
// true. all of them are even

System.out.println(
"No even numbers matcher result: "
+noEvenNumbersMatcher.matchesAllOf("1354")); 
// false. 4 is not ok

System.out.println(
"No even numbers matcher result: "
+noEvenNumbersMatcher.matchesAllOf("1337")); 
// true. none of them are even

System.out.println(
"No even numbers matcher result: "
+noEvenNumbersMatcher.matchesAllOf("supermario")); 
// true. none of them are even

System.out.println(
"a to Z matcher result: "+aToZ.matchesAllOf("sezin")); 
System.out.println(
"a to Z matcher result: "+aToZ.matchesAllOf("Sezin")); 
System.out.println(
"a to Z matcher result: "+aToZ.matchesAllOf("SeZiN")); 
System.out.println(
"a to Z matcher result: "+aToZ.matchesAllOf("SEZIN")); 
// true. all strings are ok.
// All of the characters are 
// in {a, .., z} and {A, .., Z} range

System.out.println(
"a to Z matcher result: "+aToZ.matchesAllOf("scream4")); 
// false. Without the 4, every
// character would be in [a-zA-Z]


You can use trimFrom(), trimLeadingFrom() and trimTrailingFrom() for enhanced trimming capability.

The next class is Joiner. You probably know the splitting capabilities of the JDK; it's a mystery why a string joining mechanism was not added as well. Guava's Joiner is here to help in case you need one. Joiner basically takes an iterable or an array and joins all the elements inside as Strings. You can append the result directly to a StringBuilder or an Appendable (like PrintWriter, BufferedWriter, etc.), or obtain a String in the "element1 SEPARATOR element2 ..." form. We choose the separator with the on() method of the Joiner class. Unlike Splitter, Joiner takes only a String or a char as its separator.

// Let's build an array list
// containing 4 letters
ArrayList<String> charList = Lists.newArrayList("a", "b", "c", "d");
StringBuilder buffer = new StringBuilder();

// You can easily add the joined
// element list to a StringBuilder
buffer = Joiner.on("|").appendTo(buffer, charList);
System.out.println(
"Joined char list appended to buffer: "+buffer.toString());
// Joined char list appended to buffer: a|b|c|d

//Below we join a list with ", "
// separator for obtaining a String
String joinedCharList = Joiner.on(", ").join(charList);
System.out.println(
"Joined char list as String: "+joinedCharList);

//Joined char list as String: a, b, c, d

//I'm adding a null value for
// further exploration of Joiner features
charList.add(null);
System.out.println(charList);
//  null at the end: 
//[a, b, c, d, null]

// Below the Joiner will skip
// null valued elements automatically
String join4 = Joiner.on(" - ").skipNulls().join(charList);
System.out.println(join4); 
// output: a - b - c - d

// Below, the Joiner substitutes
// a default value for null elements
join4 = Joiner.on(" - ")
.useForNull("defaultValue").join(charList);
System.out.println(join4);
// output: a - b - c - d - defaultValue
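To make the null handling concrete, here is a JDK-only sketch of the same two behaviours. It assumes Java 8 streams, which arrived after this post was written, and the class and method names are hypothetical; it is not how Joiner is implemented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of Guava or the JDK.
final class JdkJoinSketch {
    // Drops null elements before joining, like skipNulls().
    static String joinSkippingNulls(String separator, List<String> items) {
        return items.stream()
                .filter(Objects::nonNull)
                .collect(Collectors.joining(separator));
    }

    // Replaces null elements with a default, like useForNull().
    static String joinWithDefault(String separator, String forNull, List<String> items) {
        return items.stream()
                .map(s -> s == null ? forNull : s)
                .collect(Collectors.joining(separator));
    }

    public static void main(String[] args) {
        List<String> chars = Arrays.asList("a", "b", "c", "d", null);
        System.out.println(joinSkippingNulls(" - ", chars));
        // a - b - c - d
        System.out.println(joinWithDefault(" - ", "defaultValue", chars));
        // a - b - c - d - defaultValue
    }
}
```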




If you have predefined String values, there is no need to create an array or an iterable for joining them. Notice that you can join an arbitrary number of objects (two or more) with the method below, because it works with var-args.

join4 = Joiner.on("|").
join("first", "second", "third", "fourth", "rest");
System.out.println(join4); 
//output: first|second|third|fourth|rest


Notice that if neither skipNulls() nor useForNull(String) is used, the joining methods will throw a NullPointerException if any given element is null.

Joiner is for iterables and arrays. The Joiner.MapJoiner inner class is its map counterpart, which lets you join the content of a map directly. First you build a Joiner and give it separator(1) using on(); this one goes between the entries. Then you call withKeyValueSeparator(), which takes separator(2), the one placed between a key and its value. The resulting map joiner can produce a String from a map or append it to an Appendable. The form of the result is "key1 SEPARATOR(2) value1 SEPARATOR(1) key2 SEPARATOR(2) value2 ..." without the spaces.

Map < String, Long > employeeToNumber = Maps.newHashMap(); 
// Create a Map using static
// method of Maps

employeeToNumber.put("obi wan", 1L);
employeeToNumber.put("bobba", 2L);

MapJoiner mapJoiner = 
Joiner.on("|").withKeyValueSeparator("->"); 
// | between each key-value pair 
//and -> between a key and its value
String join5 = mapJoiner.join(employeeToNumber);
System.out.println(join5);
//output is "obi wan->1|bobba->2"
//(a HashMap does not guarantee iteration
//order, so the pairs may come out swapped)
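To see where each of the two separators ends up, here is a small JDK-only sketch of the same idea. It assumes Java 8 streams, the names are hypothetical, and I use a LinkedHashMap only so that the printed order is predictable.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of Guava or the JDK.
final class MapJoinSketch {
    // entrySeparator goes between entries,
    // keyValueSeparator goes between a key and its value.
    static <K, V> String join(Map<K, V> map,
                              String entrySeparator,
                              String keyValueSeparator) {
        return map.entrySet().stream()
                .map(e -> e.getKey() + keyValueSeparator + e.getValue())
                .collect(Collectors.joining(entrySeparator));
    }

    public static void main(String[] args) {
        Map<String, Long> employeeToNumber = new LinkedHashMap<>();
        employeeToNumber.put("obi wan", 1L);
        employeeToNumber.put("bobba", 2L);
        System.out.println(join(employeeToNumber, "|", "->"));
        // obi wan->1|bobba->2
    }
}
```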


The Google Guava library contains a cool Splitter class that harnesses more power than the JDK's split functionality.

String text = "I have to test my string splitter, "
+ " for this purpose I'm writing this text,  ";

// I want to split the text
// above using ","
// I'll get three elements
// with JDK's split():
// the first sentence, then the
// second sentence and the spaces at the end.

String[] split = text.split(","); // split with ","
System.out.println(Arrays.asList(split)); 
//output: [I have to test my string splitter,   
//for this purpose I'm writing this text,   ]


I want to trim each element to remove the unnecessary spaces around it and drop the elements that end up empty. With JDK's split() this takes several steps; it's quite easy with Guava's Splitter.

// Again, the on() parameter is the separator.
// You can use a CharMatcher,
// a Pattern or a String as a separator.
Iterable<String> split2 = Splitter.on(",").omitEmptyStrings()
.trimResults().split(text);
System.out.println(Lists.newArrayList(split2));

// output: 
//[I have to test my string splitter,
// for this purpose I'm writing this text]

// I can also split the string in hand
// into tokens of length 5
Iterable<String> split3 = Splitter.fixedLength(5).split(text);
System.out.println(Lists.newArrayList(split3));
// each token's length is 5, except the last
// one, which holds the 2 remaining characters
//output:
//[I hav, e to , test , my st, ring , split,
// ter, ,  for , this , purpo, se I', m wri,
// ting , this , text,,   ]
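For reference, this is about what the "several steps" look like without Guava. It is only an approximation with hypothetical names: it assumes Java 8 streams, String.trim() does not treat exactly the same characters as whitespace as Guava does, and the edge cases (an empty input, for example) are not guaranteed to match Splitter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of Guava or the JDK.
final class JdkSplitSketch {
    // Split, then trim every piece, then drop the empty ones.
    static List<String> splitTrimOmitEmpty(String text, String separatorRegex) {
        return Arrays.stream(text.split(separatorRegex))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    // Cuts the text into pieces of the given length;
    // the last piece may be shorter.
    static List<String> fixedLength(String text, int length) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); i += length) {
            tokens.add(text.substring(i, Math.min(i + length, text.length())));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(splitTrimOmitEmpty("a , b,,  ", ",")); // [a, b]
        System.out.println(fixedLength("abcdefg", 3));            // [abc, def, g]
    }
}
```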


Notice that trimming is applied before checking for an empty result, regardless of the order in which the trimResults() and omitEmptyStrings() methods were invoked.

The Strings class contains a number of utility methods, most of them for checking String objects. emptyToNull() and nullToEmpty() are quite similar: emptyToNull() returns the given string if it is non-empty and null otherwise, while nullToEmpty() returns the given string if it is non-null and an empty string otherwise.

String emptyToNull = Strings.emptyToNull("test");
System.out.println(emptyToNull); 
// returns "test" because it's not empty

emptyToNull = Strings.emptyToNull("");
System.out.println(emptyToNull); 
// returns null because the argument is empty


isNullOrEmpty() is quite practical. I don't remember how many times I had to write (string != null && !string.isEmpty()) in my code, which is simply !Strings.isNullOrEmpty(string).

String arg = "";
boolean nullOrEmpty = Strings.isNullOrEmpty(arg); 
// arg is empty
System.out.println("Null or Empty?: "+nullOrEmpty); 
// true because it's empty

arg =  null;
nullOrEmpty = Strings.isNullOrEmpty(arg); // arg is null
System.out.println("Null or Empty?: "+nullOrEmpty); 
// true because it's null

arg = "something";
nullOrEmpty = Strings.isNullOrEmpty(arg); 
// arg is not null or empty so the result is 'false'
System.out.println("Null or Empty?: "+nullOrEmpty);
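In plain Java, these three helpers boil down to one-liners. The sketch below is my own reading of their behaviour, written with the same method names in a hypothetical class; it is not Guava's source.

```java
// Hypothetical helper class mirroring the behaviour described above.
final class NullEmptySketch {
    // null stays null, "" becomes null, anything else is returned as is
    static String emptyToNull(String s) {
        return (s == null || s.isEmpty()) ? null : s;
    }

    // null becomes "", anything else is returned as is
    static String nullToEmpty(String s) {
        return s == null ? "" : s;
    }

    static boolean isNullOrEmpty(String s) {
        return s == null || s.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(emptyToNull(""));          // null
        System.out.println(nullToEmpty(null).isEmpty()); // true
        System.out.println(isNullOrEmpty("something")); // false
    }
}
```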



I'll show you repeat() which returns a string consisting of the given number of concatenated copies of the input string.

String repeat = Strings.repeat("beetlejuice", 3); 
System.out.println(repeat); 
// output is "beetlejuicebeetlejuicebeetlejuice"


padEnd() and padStart() are quite similar. The first one appends the given char to the end of the given string until it reaches the given minimum length; the second one does the same at the start. A string that is already long enough is returned unchanged.

String padEnd = Strings.padEnd("star wars", 15, 'X');
String padStart = Strings.padStart("star wars", 15, 'X');
System.out.println("padEnd: "+padEnd); 
// padEnd: star warsXXXXXX

System.out.println("padStart: "+padStart); 
// padStart: XXXXXXstar wars

System.out.println(padStart.length() == 15);
// true, because we gave 15 as the minimum length
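The integer argument is a minimum length, not a number of pad characters, which is easy to get wrong. Here is a hand-written sketch of that behaviour, with hypothetical names and no Guava involved, just to illustrate the rule.

```java
// Hypothetical helper class illustrating minimum-length padding.
final class PadSketch {
    // Builds the run of pad characters needed to reach minLength
    // (empty if the string is already long enough).
    private static String padding(String s, int minLength, char pad) {
        StringBuilder sb = new StringBuilder();
        for (int i = s.length(); i < minLength; i++) {
            sb.append(pad);
        }
        return sb.toString();
    }

    static String padEnd(String s, int minLength, char pad) {
        return s + padding(s, minLength, pad);
    }

    static String padStart(String s, int minLength, char pad) {
        return padding(s, minLength, pad) + s;
    }

    public static void main(String[] args) {
        System.out.println(padEnd("star wars", 15, 'X'));   // star warsXXXXXX
        System.out.println(padStart("star wars", 15, 'X')); // XXXXXXstar wars
        System.out.println(padStart("star wars", 5, 'X'));  // star wars
    }
}
```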


This is all for string-related classes of Guava.

6 comments:

  1. Hi there, love your articles, very nice.

    Keep it up!

  2. Thanks for a good tech read

  3. Thanks, very informative with many nuggets

  4. I liked it. Also i think there's mistake.

    Map employeeToNumber = Maps.newHashMap();

    Doesn't seem to be valid in Java ?

  5. Map < String, Long > employeeToNumber = Maps.newHashMap();

    should work with guava. I had a typo there I assume.

  6. Bro, I stumbled upon your blog by accident while searching for something about Guava. A foreign friend had linked to this page. It's a small world; who knew we'd run into each other here too. You've done nice work. ;)
