Tuesday, May 11, 2010

Google's Guava library tutorial part 1: fun with string-related stuff

I was planning to create a Guava tutorial, but it seems like it'll be too large for a single post, so I opted to split it into several parts. The first part contains everything related to Strings. Four main classes are explained, followed by the Strings utility class:
  • CharMatcher (which can be considered as a light form of JDK's Pattern+Matcher with string manipulation capabilities)
  • Joiner and MapJoiner (which are useful for joining iterables or arrays into string representations)
  • Splitter (which is split() of JDK on steroids).


CharMatcher can be thought of as the JDK's Pattern+Matcher in a simpler and more practical form. It's not a full-fledged replacement because you can't use regular expressions as you do with the JDK.

import com.google.common.base.CharMatcher;

String string = "Scream 4";
// Get a predefined CharMatcher that accepts letters or digits
CharMatcher matcher = CharMatcher.JAVA_LETTER_OR_DIGIT;
// Count how many letters or digits the string contains.
// This is much more practical than using a Pattern and a Matcher
// and then iterating over the Matcher results to count them.
int count = matcher.countIn(string);
System.out.println("Letter or digit count: " + count);
// 7 characters

/*
 * matchesAllOf (matchesNoneOf) checks whether all (none) of the
 * characters in the given string match the matcher in hand.
 */
System.out.println(matcher.matchesAllOf("scream"));
// true
System.out.println(matcher.matchesAllOf("scream "));
// false because there's a space at the end
System.out.println(matcher.matchesNoneOf("_?=)("));
// true because there are no letters or digits in it
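
For comparison, the Pattern-plus-Matcher counting that the comment above alludes to could look roughly like the sketch below. It uses only the JDK. The class name RegexCount and the helper countLettersOrDigits are made up for this illustration, and the \p{Alnum} class covers ASCII letters and digits only, so it is narrower than JAVA_LETTER_OR_DIGIT.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexCount {
    // ASCII letters and digits only (an assumption made for this sketch).
    private static final Pattern LETTER_OR_DIGIT = Pattern.compile("\\p{Alnum}");

    // Counts matches by walking the Matcher results one by one.
    static int countLettersOrDigits(String input) {
        Matcher m = LETTER_OR_DIGIT.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countLettersOrDigits("Scream 4")); // prints 7
    }
}
```

The loop is the part that countIn() spares you.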


You can negate the matcher so it accepts the complementary character set. For example, if our CharMatcher accepted {a, b, c}, the negated matcher accepts any character except {a, b, c}.

CharMatcher negatedMatcher = matcher.negate();
/*
 * You might expect the earlier results (true, false, true)
 * to flip to (false, true, false), because our matcher is now
 * a non-letter, non-digit matcher.
 * But no, the result will be false, false, false.
 * The interesting one is the second one.
 * The negatedMatcher matches only the space in "scream ",
 * not every character, so matchesAllOf returns false.
 */

System.out.println(negatedMatcher.matchesAllOf("scream"));
// false
System.out.println(negatedMatcher.matchesAllOf("scream "));
// false
System.out.println(negatedMatcher.matchesNoneOf("_?=)("));
// false


removeFrom() and retainFrom() are really convenient methods. The first one removes the matching characters from the string, while the second one keeps only the matching characters.


String review = "Scream 4 is the #1 teen-slasher!";
CharMatcher whitespaceMatcher = CharMatcher.JAVA_WHITESPACE;
// This matcher will remove the matching characters (whitespace)
String result = whitespaceMatcher.removeFrom(review);
System.out.println("The sentence without whitespaces: " + result);
// output: Scream4isthe#1teen-slasher!

/*
 * I want the numbers in the text above.
 * I can get them by taking the predefined digit CharMatcher
 * and then calling retainFrom() on the string in hand.
 */
String result2 = CharMatcher.DIGIT.retainFrom(review);
System.out.println("Retained digits: " + result2);
// I'll get "41" as a result
// because the digits in the text are 4 and 1


indexIn() returns the index of the first matching character.

// I'd like to learn the index of the first digit too.
// The first digit is '4'.

int indexOfDigit = CharMatcher.DIGIT.indexIn(review);
System.out.println("index Of Digit: " + indexOfDigit);
// 4's index is 7
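
Two related cases are worth a quick sketch: what indexIn() returns when nothing matches, and how to find the last match instead of the first. The example assumes Guava is on the classpath. IndexDemo and ASCII_DIGIT are names invented for this illustration, and the matcher is built with inRange() so that it only covers the ASCII digits 0 to 9.

```java
import com.google.common.base.CharMatcher;

class IndexDemo {
    // ASCII digits only, built by hand instead of using a predefined matcher.
    static final CharMatcher ASCII_DIGIT = CharMatcher.inRange('0', '9');

    public static void main(String[] args) {
        String review = "Scream 4 is the #1 teen-slasher!";
        System.out.println(ASCII_DIGIT.indexIn(review));           // 7
        System.out.println(ASCII_DIGIT.lastIndexIn(review));       // 17
        System.out.println(ASCII_DIGIT.indexIn("no digits here")); // -1
    }
}
```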


Although you can use CharMatcher with the predefined matchers, you can build your own as well.

// This accepts the even digits 2, 4, 6 and 8
CharMatcher onlyEvenNumbersMatcher = CharMatcher.anyOf("2468");
// This accepts everything but those even digits
CharMatcher noEvenNumbersMatcher = CharMatcher.noneOf("2468");
CharMatcher largeAtoZ = CharMatcher.inRange('A', 'Z');
// We add A-Z with or() here.
// You can combine CharMatchers with and() too.
CharMatcher aToZ = CharMatcher.inRange('a', 'z').or(largeAtoZ);

System.out.println("Even numbers matcher result: "
        + onlyEvenNumbersMatcher.matchesAllOf("1354"));
// false. 1, 3 and 5 are not ok

System.out.println("Even numbers matcher result: "
        + onlyEvenNumbersMatcher.matchesAllOf("starwars"));
// false. only even digits are ok

System.out.println("Even numbers matcher result: "
        + onlyEvenNumbersMatcher.matchesAllOf("2466"));
// true. all of them are even

System.out.println("No even numbers matcher result: "
        + noEvenNumbersMatcher.matchesAllOf("1354"));
// false. 4 is not ok

System.out.println("No even numbers matcher result: "
        + noEvenNumbersMatcher.matchesAllOf("1337"));
// true. none of them are even

System.out.println("No even numbers matcher result: "
        + noEvenNumbersMatcher.matchesAllOf("supermario"));
// true. none of the characters is an even digit

System.out.println("a to Z matcher result: " + aToZ.matchesAllOf("sezin"));
System.out.println("a to Z matcher result: " + aToZ.matchesAllOf("Sezin"));
System.out.println("a to Z matcher result: " + aToZ.matchesAllOf("SeZiN"));
System.out.println("a to Z matcher result: " + aToZ.matchesAllOf("SEZIN"));
// true. all strings are ok.
// All of the characters are in the {a, .., z} or {A, .., Z} range

System.out.println("a to Z matcher result: " + aToZ.matchesAllOf("scream4"));
// false. Without the 4, every character
// would be in [a-zA-Z]
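
Custom matchers are not limited to matching and counting. They can also strip characters from the ends of a string. Here is a minimal sketch, assuming Guava is on the classpath. The class name TrimDemo, the PADDING matcher and the sample string are invented for this illustration.

```java
import com.google.common.base.CharMatcher;

class TrimDemo {
    // Matches the two padding characters used in the sample below.
    static final CharMatcher PADDING = CharMatcher.anyOf("-*");

    public static void main(String[] args) {
        String raw = "--*scream-4*--";
        // Only leading and trailing matches are removed; the inner '-' stays.
        System.out.println(PADDING.trimFrom(raw));         // scream-4
        System.out.println(PADDING.trimLeadingFrom(raw));  // scream-4*--
        System.out.println(PADDING.trimTrailingFrom(raw)); // --*scream-4
    }
}
```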


You can use trimFrom(), trimLeadingFrom() and trimTrailingFrom() for enhanced trimming capability.

The next class is Joiner. You probably know the splitting capabilities of the JDK. It's a mystery why a string joining mechanism has not been added to the JDK as well. Guava's Joiner is here to help in case you need one. Joiner basically takes an iterable or an array and joins all the elements inside as Strings. You can append the result directly to a StringBuilder or an Appendable (like PrintWriter, BufferedWriter, etc.), or obtain a String in the "element1 SEPARATOR element2 ..." form. We choose the separator with the on() method of the Joiner class, which takes a String or a single char.

import com.google.common.base.Joiner;
import com.google.common.collect.Lists;
import java.util.ArrayList;

// Let's build an array list containing 4 letters
ArrayList<String> charList = Lists.newArrayList("a", "b", "c", "d");
StringBuilder buffer = new StringBuilder();

// You can easily append the joined elements to a StringBuilder
buffer = Joiner.on("|").appendTo(buffer, charList);
System.out.println("Joined char list appended to buffer: " + buffer.toString());
// Joined char list appended to buffer: a|b|c|d

// Below we join the list with a ", " separator to obtain a String
String joinedCharList = Joiner.on(", ").join(charList);
System.out.println("Joined char list as String: " + joinedCharList);
// Joined char list as String: a, b, c, d

// I'm adding a null value to explore more Joiner features
charList.add(null);
System.out.println(charList);
// null at the end:
// [a, b, c, d, null]

// Below, the Joiner skips null elements automatically
String join4 = Joiner.on(" - ").skipNulls().join(charList);
System.out.println(join4);
// output: a - b - c - d

// Below, the Joiner substitutes a default value for null elements
join4 = Joiner.on(" - ").useForNull("defaultValue").join(charList);
System.out.println(join4);
// output: a - b - c - d - defaultValue



If you have predefined String values, there is no need to create an array or an iterable to join them. The join() method below works with var-args, so you can pass it an arbitrary number of objects.

join4 = Joiner.on("|")
        .join("first", "second", "third", "fourth", "rest");
System.out.println(join4);
// output: first|second|third|fourth|rest


Notice that if neither skipNulls() nor useForNull(String) is used, the joining methods will throw a NullPointerException if any given element is null.
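
The three null-handling behaviours can be put side by side in a small sketch, assuming Guava is on the classpath. JoinerNullDemo and throwsOnNull are hypothetical names used only here.

```java
import com.google.common.base.Joiner;
import java.util.Arrays;
import java.util.List;

class JoinerNullDemo {
    // Returns true if a plain Joiner rejects the list with a NullPointerException.
    static boolean throwsOnNull(List<String> items) {
        try {
            Joiner.on(", ").join(items);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> withNull = Arrays.asList("a", null, "c");
        System.out.println(throwsOnNull(withNull));                         // true
        System.out.println(Joiner.on(", ").skipNulls().join(withNull));      // a, c
        System.out.println(Joiner.on(", ").useForNull("?").join(withNull));  // a, ?, c
    }
}
```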

Joiner is for iterables and arrays. The Joiner.MapJoiner inner class is the map counterpart of Joiner, and it lets you join the content of a map directly. First you build a Joiner and assign it separator(1) using on(). Then you call withKeyValueSeparator(), which takes separator(2), the one placed between a key and its value. The resulting map joiner can be used to join a map into a string, or its output can be appended to an Appendable. The form of the result is "key1 SEPARATOR(2) value1 SEPARATOR(1) key2 SEPARATOR(2) value2 ..." without the spaces.

import com.google.common.base.Joiner;
import com.google.common.base.Joiner.MapJoiner;
import com.google.common.collect.Maps;
import java.util.Map;

// Create a Map using the static factory method of Maps
Map<String, Long> employeeToNumber = Maps.newHashMap();

employeeToNumber.put("obi wan", 1L);
employeeToNumber.put("bobba", 2L);

// | between each key-value pair and -> between a key and its value
MapJoiner mapJoiner = Joiner.on("|").withKeyValueSeparator("->");
String join5 = mapJoiner.join(employeeToNumber);
System.out.println(join5);
// output is "obi wan->1|bobba->2"
// (a HashMap does not guarantee iteration order, so the pairs may be swapped)
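
The Appendable route mentioned above can be sketched as follows, again assuming Guava is on the classpath. MapJoinerDemo and describe are illustrative names, and a LinkedHashMap is used here (instead of the HashMap in the snippet above) purely so that the entry order in the output is predictable.

```java
import com.google.common.base.Joiner;
import java.util.LinkedHashMap;
import java.util.Map;

class MapJoinerDemo {
    static String describe(Map<String, Long> employees) {
        StringBuilder out = new StringBuilder("employees: ");
        // "|" separates entries, "->" separates each key from its value.
        Joiner.on("|").withKeyValueSeparator("->").appendTo(out, employees);
        return out.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output is stable.
        Map<String, Long> employees = new LinkedHashMap<>();
        employees.put("obi wan", 1L);
        employees.put("bobba", 2L);
        System.out.println(describe(employees)); // employees: obi wan->1|bobba->2
    }
}
```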


The Google Guava library contains a cool Splitter class that offers more power than the JDK's split functionality.

import java.util.Arrays;

String text = "I have to test my string splitter,  "
        + "for this purpose I'm writing this text,  ";

// I want to split the text above using ",".
// With the usual JDK split I'll get three elements:
// the first sentence, the second sentence
// and the whitespace at the end.

String[] split = text.split(","); // split with ","
System.out.println(Arrays.asList(split));
// output: [I have to test my string splitter,   for this purpose I'm writing this text,   ]

I'd like to remove the empty elements and trim each element to get rid of the unnecessary whitespace before and after it. With the JDK's split() this takes several steps. It's quite easy with Guava's Splitter.
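
To make the contrast concrete, here is roughly what the several-step, JDK-only version could look like. This is only a sketch; the class name JdkSplitCleanup and the helper splitAndClean are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

class JdkSplitCleanup {
    static List<String> splitAndClean(String text) {
        List<String> cleaned = new ArrayList<>();
        for (String piece : text.split(",")) { // step 1: split
            String trimmed = piece.trim();     // step 2: trim
            if (!trimmed.isEmpty()) {          // step 3: drop empties
                cleaned.add(trimmed);
            }
        }
        return cleaned;
    }

    public static void main(String[] args) {
        String text = "I have to test my string splitter,  "
                + "for this purpose I'm writing this text,  ";
        System.out.println(splitAndClean(text));
        // [I have to test my string splitter, for this purpose I'm writing this text]
    }
}
```

Guava's Splitter collapses these three steps into one chained call.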

import com.google.common.base.Splitter;
import com.google.common.collect.Lists;

// Again, the on() parameter is the separator.
// You can use a CharMatcher, a Pattern or a String as a separator.
Iterable<String> split2 = Splitter.on(",").omitEmptyStrings()
        .trimResults().split(text);
System.out.println(Lists.newArrayList(split2));

// output:
// [I have to test my string splitter, for this purpose I'm writing this text]

// I can also cut the string in hand into tokens of length 5
Iterable<String> split3 = Splitter.fixedLength(5).split(text);
System.out.println(Lists.newArrayList(split3));
// each token's length is 5 (except possibly the last one)
// output:
// [I hav, e to , test , my st, ring , split,
// ter, ,  for , this , purpo, se I', m wri,
// ting , this , text,,   ]


Notice that trimming is applied before checking for an empty result, regardless of the order in which the trimResults() and omitEmptyStrings() methods were invoked.
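
A quick way to convince yourself of this is to build the same splitter in both orders and compare the results. The sketch below assumes Guava is on the classpath. SplitterOrderDemo and its two helper methods are names made up for this example.

```java
import com.google.common.base.Splitter;
import com.google.common.collect.Lists;
import java.util.List;

class SplitterOrderDemo {
    static List<String> trimThenOmit(String text) {
        return Lists.newArrayList(
                Splitter.on(",").trimResults().omitEmptyStrings().split(text));
    }

    static List<String> omitThenTrim(String text) {
        return Lists.newArrayList(
                Splitter.on(",").omitEmptyStrings().trimResults().split(text));
    }

    public static void main(String[] args) {
        // The second piece is whitespace only, the last piece is empty.
        String text = "a,  ,b,";
        System.out.println(trimThenOmit(text)); // [a, b]
        System.out.println(omitThenTrim(text)); // [a, b]
    }
}
```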

The Strings class contains a number of utility methods. Most of them check the state of String objects. emptyToNull() and nullToEmpty() are quite similar: emptyToNull() returns the given string if it is non-empty and null otherwise, while nullToEmpty() returns the given string if it is non-null and an empty string otherwise.

import com.google.common.base.Strings;

String emptyToNull = Strings.emptyToNull("test");
System.out.println(emptyToNull);
// returns "test" because it's not empty

emptyToNull = Strings.emptyToNull("");
System.out.println(emptyToNull);
// returns null because the argument is empty
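
The opposite direction, nullToEmpty(), is handy when you want to call String methods without a null check. A minimal sketch, assuming Guava is on the classpath; NullToEmptyDemo and safeLength are illustrative names only.

```java
import com.google.common.base.Strings;

class NullToEmptyDemo {
    // Length of a string that may be null, without an explicit null check.
    static int safeLength(String maybeNull) {
        return Strings.nullToEmpty(maybeNull).length();
    }

    public static void main(String[] args) {
        System.out.println(safeLength(null));   // 0
        System.out.println(safeLength("test")); // 4
        System.out.println(Strings.nullToEmpty(null).isEmpty()); // true
    }
}
```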


isNullOrEmpty() is quite practical. I don't remember how many times I had to write (string != null && !string.isEmpty()) in my code.

String arg = "";
boolean nullOrEmpty = Strings.isNullOrEmpty(arg);
// arg is empty
System.out.println("Null or Empty?: " + nullOrEmpty);
// true because it's empty

arg = null;
nullOrEmpty = Strings.isNullOrEmpty(arg); // arg is null
System.out.println("Null or Empty?: " + nullOrEmpty);
// true because it's null

arg = "something";
nullOrEmpty = Strings.isNullOrEmpty(arg);
// arg is neither null nor empty so the result is false
System.out.println("Null or Empty?: " + nullOrEmpty);


Next is repeat(), which returns a string consisting of the given number of concatenated copies of the input string.

String repeat = Strings.repeat("beetlejuice", 3);
System.out.println(repeat);
// output is "beetlejuicebeetlejuicebeetlejuice"


padEnd() and padStart() are quite similar. The first one appends the given char to the end of the given string until the string reaches the given minimum length. The second one does the same at the start of the string.

String padEnd = Strings.padEnd("star wars", 15, 'X');
String padStart = Strings.padStart("star wars", 15, 'X');
System.out.println("padEnd: " + padEnd);
// padEnd: star warsXXXXXX

System.out.println("padStart: " + padStart);
// padStart: XXXXXXstar wars

System.out.println(padStart.length() == 15);
// true, because we gave 15 as the minimum length
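
A typical use of padStart() is zero-padding numbers. The sketch below assumes Guava is on the classpath. PadDemo and zeroPad are hypothetical names for this illustration. It also shows that a string already at least as long as the requested length comes back unchanged.

```java
import com.google.common.base.Strings;

class PadDemo {
    // Left-pads the decimal form of a non-negative number with zeros.
    static String zeroPad(int number, int width) {
        return Strings.padStart(String.valueOf(number), width, '0');
    }

    public static void main(String[] args) {
        System.out.println(zeroPad(7, 3));     // 007
        System.out.println(zeroPad(12345, 3)); // 12345 (already long enough)
    }
}
```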

This is all for string-related classes of Guava.

6 comments:

  1. Hi there, love your articles, very nice.

    Keep it up!

  2. Thanks for a good tech read

  3. Thanks, very informative with many nuggets

  4. I liked it. Also i think there's mistake.

    Map employeeToNumber = Maps.newHashMap();

    Doesn't seem to be valid in Java ?

  5. Map < String, Long > employeeToNumber = Maps.newHashMap();

    should work with guava. I had a typo there I assume.

  6. Bro, while I was searching for something about Guava I stumbled upon your blog by accident. A foreign friend had linked to this page. Small world, really. Turns out we were meant to run into each other here too. You've done a nice job. ;)
