Map & Reduce in Java


Streams rundown

Since Java 8, you can use the Stream API to manipulate Collections.
Streams are a fairly declarative way to apply a series of transformations to a Collection, producing a different Collection (or object, or primitive value) as a result.
The stream operations themselves do not modify the source Collection.

Generally speaking, you turn a Collection into a Stream with .stream() and pipe a series of operators to produce the desired result.

Each intermediate operator (map, filter, sorted, etc.) takes a Stream as input, while outputting another Stream.

You close the Stream by calling one of the terminal operators: collect typically returns a Collection (no surprises here), forEach returns nothing (void), while reduce, findFirst and findAny are covered below.
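
As a minimal sketch of the pipeline described above, the following chains two intermediate operators and one terminal operator, and shows that the source list is left untouched. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
class PipelineSketch {

    // filter and sorted are intermediate operators (Stream in, Stream out);
    // collect is the terminal operator that produces the final List.
    static List<Integer> evensSorted(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(4, 1, 8, 3, 2);
        List<Integer> result = evensSorted(source);
        System.out.println(result); // [2, 4, 8]
        System.out.println(source); // [4, 1, 8, 3, 2], the source is unchanged
    }
}
```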

Map

Runs a given function on each element of the Stream.

Something like inputStream > Map(myFunction(element)) > outputStream.

For example:

java
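import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

List<String> list = Arrays.asList("this", "is", "a", "test");
// The last map turns each String into its length, so the result is a List<Integer>
List<Integer> answer = list.stream()
        .map(String::toUpperCase)
        .map(str -> str + ".txt")
        .map(str -> str.length())
        .collect(Collectors.toList());
System.out.println(answer);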

Output: [8, 6, 5, 8]

Filter

Similar to map, but in this case, the function it receives needs to return a boolean.
This is because filter will only output the elements of the incoming Stream that return true after being evaluated with the function it received.

Something like inputStream > Filter(myFilter()) > outputStream, where myFilter() returns a boolean.

java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

List<String> list = Arrays.asList("this", "is", "another", "test");
List<String> answer = list.stream()
        .filter(str -> str.length() > 3 && str.startsWith("a"))
        .collect(Collectors.toList());
System.out.println(answer);

Output: [another]
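
The function passed to filter is a java.util.function.Predicate, so it can also be stored in a variable and composed. A small sketch, assuming the same two conditions as above (the class, constant and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
class FilterSketch {

    // Each condition lives in its own named Predicate.
    static final Predicate<String> LONGER_THAN_THREE = str -> str.length() > 3;
    static final Predicate<String> STARTS_WITH_A = str -> str.startsWith("a");

    // Predicate.and() builds a new Predicate that is true only when both are true.
    static List<String> select(List<String> words) {
        return words.stream()
                .filter(LONGER_THAN_THREE.and(STARTS_WITH_A))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("this", "is", "another", "test");
        System.out.println(select(list)); // [another]
    }
}
```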

Reduce

Produces a single result from a Stream, by applying a given combining operation to the incoming elements.

There are three possible components in this operation: the identity, the accumulator and the combiner.

Something like inputStream > Reduce(myIdentity, myAccumulator, myCombiner) > result.

Accumulator

The accumulator is the function that merges the partial result with the next element of the Stream.
Unless it has some complexity to it, you’ll usually see it as a lambda:

java
import java.util.Arrays;
import java.util.Optional;

String[] array = { "Java", "Streams", "Rule" };
Optional<String> combined = Arrays.stream(array).reduce((str1, str2) -> str1 + "-" + str2);
if (combined.isPresent()) {
    System.out.println(combined.get());
}

Output: Java-Streams-Rule

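When called with only an accumulator, reduce returns an Optional of the type it finds in the incoming Stream (the Stream may be empty, in which case there is nothing to return), hence the if statement at the end.

You can avoid that part by chaining orElse() onto the Optional that reduce returns.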
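
A minimal sketch of that orElse() variant, assuming an empty String is an acceptable fallback (the class and method names are made up for illustration):

```java
import java.util.Arrays;

// Hypothetical class name, used only for this illustration.
class ReduceOrElseSketch {

    // reduce returns an Optional<String>; orElse unwraps it,
    // falling back to "" when the array has no elements.
    static String join(String[] words) {
        return Arrays.stream(words)
                .reduce((str1, str2) -> str1 + "-" + str2)
                .orElse(""); // fallback value chosen for this sketch
    }

    public static void main(String[] args) {
        System.out.println(join(new String[] { "Java", "Streams", "Rule" })); // Java-Streams-Rule
        System.out.println(join(new String[0]).isEmpty()); // true
    }
}
```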

Identity

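The starting value of the reduction, and also the result you get back when the Stream is empty.
Since there is always a value to return, this variant of reduce gives you the result directly instead of an Optional, which is especially handy when reducing complex objects.
The identity should be a neutral element for the accumulator (0 for a sum, 1 for a product), otherwise it skews the result.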

java
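import java.util.stream.IntStream;

// range(2, 8) yields 2..7; the identity for a product must be 1, since starting from 0 would always give 0
int product = IntStream.range(2, 8)
        .reduce(1, (num1, num2) -> num1 * num2);
System.out.println("The product is: " + product);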

Output: The product is: 5040
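
To see the identity at work on an empty Stream, here is a small sketch that uses 1 as the identity for a product. The class and method names are hypothetical.

```java
import java.util.stream.IntStream;

// Hypothetical class name, used only for this illustration.
class IdentitySketch {

    // With an identity, reduce returns a plain int instead of an OptionalInt.
    static int product(int fromInclusive, int toExclusive) {
        return IntStream.range(fromInclusive, toExclusive)
                .reduce(1, (num1, num2) -> num1 * num2);
    }

    public static void main(String[] args) {
        System.out.println(product(2, 8)); // 5040 (2 * 3 * 4 * 5 * 6 * 7)
        System.out.println(product(5, 5)); // 1, the range is empty so the identity is returned
    }
}
```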

Combiner

When a Stream runs in parallel, it is split into sub-streams that are reduced independently, so we need a way to combine the partial results of each sub-stream into one. That is the job of the combiner.

A simple example with the three reduce components explicitly set might look something like:

java
import java.util.Arrays;

int sumAges = Arrays.asList(25, 30, 45, 28, 32)
        .parallelStream()
        .reduce(0, (a, b) -> a + b, Integer::sum);
System.out.println(sumAges);

Output: 160
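
As a rough sketch of the same reduction, the following runs it both sequentially and in parallel. As long as the identity is neutral and the accumulator and combiner agree with each other, both runs should give the same total. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical class name, used only for this illustration.
class CombinerSketch {

    static int sumAges(List<Integer> ages, boolean parallel) {
        Stream<Integer> stream = parallel ? ages.parallelStream() : ages.stream();
        // The accumulator adds one element to a partial sum;
        // the combiner (Integer::sum) merges the partial sums of the sub-streams.
        return stream.reduce(0, (partial, age) -> partial + age, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(25, 30, 45, 28, 32);
        System.out.println(sumAges(ages, false)); // 160
        System.out.println(sumAges(ages, true));  // 160
    }
}
```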

The Combiner is also necessary when the Accumulator's partial result has a different type than the elements of the Stream.
In the example below, the Accumulator has an int as partial result, but a User as next element:

java
import java.util.Arrays;
import java.util.List;

List<User> users = Arrays.asList(new User("Dacil", 30), new User("Gabriel", 35));
int result = users.stream()
        .reduce(0, (partialAge, user) -> partialAge + user.getAge(), Integer::sum);
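
The snippet above relies on a User class that is not shown. A self-contained version might look like the sketch below, where User is an assumed minimal class with a (name, age) constructor and a getAge() getter, nested here only to keep everything in one file:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical class name, used only for this illustration.
class UserAgeSketch {

    // Assumed shape of User; the real class may well look different.
    static class User {
        private final String name;
        private final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }

        int getAge() { return age; }
    }

    // The partial result is an Integer while the elements are Users,
    // so the three-argument reduce (with a combiner) is required.
    static int totalAge(List<User> users) {
        return users.stream()
                .reduce(0, (partialAge, user) -> partialAge + user.getAge(), Integer::sum);
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(new User("Dacil", 30), new User("Gabriel", 35));
        System.out.println(totalAge(users)); // 65
    }
}
```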

Find

There are two variants of the find function in Java: findFirst() and findAny().

findFirst() always returns the same element (given the same ordered input Stream), namely the first one in encounter order, while findAny() is free to return any element and does not guarantee which one.
Bear in mind that in simple single-threaded examples like these, both are likely to behave in the same way.
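
The difference is easier to picture on a parallel Stream. In the sketch below (class and method names are hypothetical), findFirst() keeps returning the first element of the list, while findAny() may return any of them, so no output is shown for it:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical class name, used only for this illustration.
class FindSketch {

    // Respects the encounter order of the list, even in parallel.
    static Optional<String> first(List<String> words) {
        return words.parallelStream().findFirst();
    }

    // Free to return whichever element is found first by any thread.
    static Optional<String> any(List<String> words) {
        return words.parallelStream().findAny();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Stream", "Java", "Rule");
        System.out.println(first(words).get()); // Stream
        System.out.println(any(words).get());   // one of the three words, not guaranteed which
    }
}
```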

java
import java.util.Arrays;
import java.util.Optional;

String[] array = { "Stream", "Java", "Rule" };
Optional<String> first = Arrays.stream(array).sorted().findFirst();
if (first.isPresent()) {
    System.out.println(first.get());
}

Output: Java

findFirst returns an Optional of the type it finds in the incoming Stream, since the Stream may be empty.
Just as mentioned for reduce, you can avoid handling the Optional explicitly by chaining an orElse() onto it.

