Getting Started with the Java Streams API

June 11, 2019
Written by Matthew Gilliard, Developer Evangelist

The Streams API was added in 2014 with the release of Java 8, so you’ve almost certainly got it available today. It is used to pass a series of objects through a chain of operations, so we can program in a more functional style than plain iteration allows. Even so, when working with collections many developers still reach for the classic for loop.

In this post, I’ll introduce the terms used when talking about Streams, show some examples of each term and how they can be used together to create compact and descriptive code. Then I’ll show a real-world example of Streams code I wrote recently to pick winners in a raffle.

What’s a Stream?

A Stream is a (possibly never-ending) series of Objects. A Stream starts from a Source. The objects in a Stream flow through Intermediate Operations, each of which results in another stream, and are gathered up at the end in a Terminal Operation.  Streams are lazy, which means the source and intermediate operations do nothing until objects are needed by the terminal operation.
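To make the laziness concrete, here is a minimal sketch. The class name LazyStreamDemo and the log list are my own illustration, not part of the Streams API. It records what happens and when, showing that nothing flows through the pipeline until the terminal operation asks for it:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class LazyStreamDemo {

    // Records the order of events so the laziness is visible.
    static List<String> run() {
        List<String> log = new ArrayList<>();

        Stream<String> upper = Stream.of("apple", "banana")
                .peek(fruit -> log.add("saw " + fruit))   // intermediate: not run yet
                .map(String::toUpperCase);                // intermediate: not run yet

        log.add("pipeline built");

        // The terminal operation pulls each element through the whole pipeline.
        upper.forEach(fruit -> log.add("got " + fruit));
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The "pipeline built" entry is logged first, and only then do the elements pass through peek and map, one at a time.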

If we were making a fruit salad using the Streams API it might look like this:

A Collection<Fruit> is streamed through Wash, Peel and Chop. Then the terminal operation "Put in a big bowl" results in a TastyFruitSalad

Streams vs Collections

At first look, Streams may appear similar to Collections - here are a few differences:

Streams

  • may be infinite
  • do not allow you to directly access individual items
  • may include a way to create more elements on demand
  • can only be read once

 

Collections

  • designed for holding a finite number of existing objects
  • provide efficient ways to access individual items
  • may be accessed as many times as you need to

 

The ways of using Streams and Collections are quite different. As you will see below, it’s straightforward to create a Stream from a Collection and vice-versa.
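As a small sketch of that round trip (the class and method names here are made up for illustration), a List can be turned into a Stream and gathered back into a different kind of Collection:

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class CollectionRoundTrip {

    // Collection -> Stream -> Collection: the distinct first letters, sorted.
    // Assumes every name has at least one character.
    static Set<String> firstLetters(List<String> fruits) {
        return fruits.stream()
                .map(fruit -> fruit.substring(0, 1))
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        List<String> fruits = List.of("apple", "avocado", "banana");
        System.out.println(firstLetters(fruits));     // prints [a, b]

        // The List itself is untouched and can be streamed again as often as needed.
        System.out.println(fruits.stream().count());  // prints 3
    }
}
```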

Stream Sources

A Stream Source is something which creates a Stream, so what kinds of sources are there?

Collections

A common source for a Stream is a Collection, like a List or a Set (you cannot stream a Map directly, but you can stream its entrySet). Here’s a Stream from a List:

Stream<String> typesOfFruit = List.of(
        "apple", "banana", "clementine", "dragonfruit").stream();

I/O

I/O (or Input/Output) concerns moving data in or out of your application, for example reading data over a network, or reading a file from disk. A Stream of lines can be made from a File like this:

Stream<String> lines = Files.lines(Path.of("my-fruit-list.txt"));
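One thing worth knowing about Files.lines is that the Stream holds the file open, so it is usually wrapped in try-with-resources. Here is a sketch of how that might look. The class FruitFileReader and its methods are hypothetical, and a temporary file stands in for my-fruit-list.txt:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FruitFileReader {

    // Reads the file lazily; the file is closed when the try block exits.
    static List<String> readFruit(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.map(String::trim)
                    .filter(line -> !line.isEmpty())
                    .collect(Collectors.toList());
        }
    }

    // Writes a small temporary file, reads it back, then deletes it.
    static List<String> demo() {
        try {
            Path file = Files.createTempFile("my-fruit-list", ".txt");
            try {
                Files.write(file, List.of("apple", "", "  banana  "));
                return readFruit(file);
            } finally {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [apple, banana]
    }
}
```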

Generators

A Stream can be generated from scratch by providing the contents up-front using Stream.of(...), or providing a function which returns each object for the Stream with generate or iterate, for example:

Random random = new Random();
Stream<Integer> randomNumbers = Stream.generate(random::nextInt);

The Stream class also has concat for combining multiple Streams into one.
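Here is a short sketch of those generator methods working together. The class GeneratorDemo is just for illustration. Note that iterate and generate produce infinite Streams, so limit is used to cut them down to size:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GeneratorDemo {

    // iterate: start from a seed and keep applying a function to the last value.
    static List<Integer> powersOfTwo(int howMany) {
        return Stream.iterate(1, n -> n * 2)
                .limit(howMany)               // without this the Stream never ends
                .collect(Collectors.toList());
    }

    // of + generate + concat: a fixed Stream followed by a generated one.
    static List<String> combined() {
        Stream<String> fixed = Stream.of("apple", "banana");
        Stream<String> generated = Stream.generate(() -> "cherry").limit(2);
        return Stream.concat(fixed, generated).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(powersOfTwo(5)); // prints [1, 2, 4, 8, 16]
        System.out.println(combined());     // prints [apple, banana, cherry, cherry]
    }
}
```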

Intermediate Stream Operations

An Intermediate Operation is something which takes a Stream and transforms it into another Stream, possibly containing a different type of Object. Each element in the original Stream will be passed one at a time to the operation, and depending on how the operation works the resulting Stream may be the same length as the original, but could also be shorter or longer:

// Same length: returns a Stream of "APPLE", "BANANA", "CLEMENTINE", "DRAGONFRUIT"
typesOfFruit.map(String::toUpperCase);

// Shorter: this returns a Stream of "clementine", "dragonfruit"
typesOfFruit.filter(fruit -> fruit.length() > 7);

// Longer: returns a Stream of "a", "p", "p", "l", "e", "b", "a" ...
typesOfFruit.flatMap(fruit -> Stream.of(fruit.split("")));

The way .flatMap(...) works is worth a closer look. The lambda returns a Stream<String>, so if this were a regular .map(...) we would have a Stream<Stream<String>>. What flatMap does is join the inner Streams together to make a Stream<String> again.

Remember that you can only traverse a Stream once, so you won’t be able to do all three of those operations on the same typesOfFruit Stream. If you try it you will see java.lang.IllegalStateException: stream has already been operated upon or closed.
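A small sketch of that behaviour, and one way around it: keep the List and ask it for a fresh Stream each time. The class and method names are invented for this example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SingleUseDemo {

    // Returns true if using the same Stream object twice is rejected.
    static boolean secondUseFails() {
        Stream<String> typesOfFruit = Stream.of("apple", "banana");
        typesOfFruit.map(String::toUpperCase);               // first use: fine
        try {
            typesOfFruit.filter(fruit -> fruit.length() > 7); // second use of the same object
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Each call to stream() on the List gives a brand new Stream.
    static List<String> longNames(List<String> fruits) {
        return fruits.stream()
                .filter(fruit -> fruit.length() > 7)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> fruits = List.of("apple", "banana", "clementine", "dragonfruit");
        System.out.println(secondUseFails());   // prints true
        System.out.println(longNames(fruits));  // prints [clementine, dragonfruit]
        System.out.println(longNames(fruits));  // same again: the List can be re-streamed
    }
}
```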

Chaining Intermediate Operations

The superpower of the Streams API is that Intermediate Operations work on a Stream and return another Stream, so they can be chained together:

// These operations return a Stream of all the letters in the fruit names which 
// are after "G" in the alphabet, uppercased: "P", "P", "L", "N" ...
typesOfFruit
   .map(String::toUpperCase)
   .flatMap(fruit -> Stream.of(fruit.split("")))
   .filter(s -> s.compareTo("G") > 0);

If you try writing that without using Streams it will take far more code. You’ll have to create a lot of variables to hold intermediate values and the result will be far less compact and easy to read.
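For comparison, here is one way the same logic could be written with nested loops, next to the Streams version with a collect added so both return a List. This is only a sketch and the class name is made up; other loop-based versions are possible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LoopVersusStream {

    // Plain iteration: a result list, a temporary variable and two nested loops.
    static List<String> withLoops(List<String> fruits) {
        List<String> result = new ArrayList<>();
        for (String fruit : fruits) {
            String upper = fruit.toUpperCase();
            for (String letter : upper.split("")) {
                if (letter.compareTo("G") > 0) {
                    result.add(letter);
                }
            }
        }
        return result;
    }

    // The chained version from above, ending in a terminal collect.
    static List<String> withStreams(List<String> fruits) {
        return fruits.stream()
                .map(String::toUpperCase)
                .flatMap(fruit -> Stream.of(fruit.split("")))
                .filter(s -> s.compareTo("G") > 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> fruits = List.of("apple", "banana");
        System.out.println(withLoops(fruits));   // prints [P, P, L, N, N]
        System.out.println(withStreams(fruits)); // prints [P, P, L, N, N]
    }
}
```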

Terminal Stream Operations

Remember that Streams are lazy, which means that Sources and Intermediate Operations don’t do any work until they need to. It’s the Terminal Operations that create that need.

Another way to say that is that Terminal Operations turn a Stream back into something else. Until you need that something else the Stream just sits there, chilling.

There are lots of possible Terminal Operations, including .collect(...) with a Collector for gathering the Stream back into a Collection like a List or a Map, .forEach(...) to run some operation on each item, and methods like .count() and .reduce(...) to compute some value from the Stream. Here are a few examples (remember that each one needs its own fresh Stream):

// Collecting the Stream elements into a List
List<String> fruitList = typesOfFruit.collect(Collectors.toList());

// Doing something with each element of the Stream
typesOfFruit.forEach(System.out::println);

// How many elements are in the Stream?
long fruitCount = typesOfFruit.count();
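The other Terminal Operations mentioned above, .reduce(...) and collecting into a Map, might look something like this. The class TerminalOpsDemo and its method names are hypothetical, and each method starts from a fresh Stream:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TerminalOpsDemo {

    // reduce: fold all the elements into one value, starting from 0.
    static int totalLetters(List<String> fruits) {
        return fruits.stream()
                .map(String::length)
                .reduce(0, Integer::sum);
    }

    // Collecting into a single String.
    static String joined(List<String> fruits) {
        return fruits.stream().collect(Collectors.joining(", "));
    }

    // Collecting into a Map of name -> length.
    // Note: toMap throws IllegalStateException if the same key appears twice.
    static Map<String, Integer> lengthsByName(List<String> fruits) {
        return fruits.stream()
                .collect(Collectors.toMap(fruit -> fruit, String::length));
    }

    public static void main(String[] args) {
        List<String> fruits = List.of("apple", "banana");
        System.out.println(totalLetters(fruits));  // prints 11
        System.out.println(joined(fruits));        // prints apple, banana
        System.out.println(lengthsByName(fruits)); // a Map with apple=5 and banana=6
    }
}
```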

A Real-World Example of Streams

By now you’ve probably thought of lots of ways you could use Streams in your own code, but I’d like to end with an example where I used Streams to pick winners for a raffle we decided to run on the Twilio booth at JBcnConf. Folks could enter by sending an SMS to a phone number I manage with Twilio and we needed to pick three people to win space-themed Lego. The Streams API made this code pretty nice to write.

Three Space Legos as prizes for a raffle, which could be entered by SMS.

 

The code had to:

  • Fetch all the messages we received, using the Twilio Java helper library
  • Remove entries which arrived too early or too late
  • Grab the number each message was sent from
  • Remove duplicates so that each person has an equal chance of winning
  • Randomly shuffle the numbers
  • Take only as many numbers as we had prizes to give
  • Send an SMS to each winner

 

This looks like a case for the Streams API:

Flowchart: A ResourceSet<Message> is streamed through operations which result in 3 winners being sent an SMS.

 

Here’s the code (it’s also on GitHub):

First, fetch all the SMSs sent to the CONTEST_NUMBER:

Twilio.init(System.getenv("ACCOUNT_SID"), System.getenv("AUTH_TOKEN"));
ResourceSet<Message> allMessages = Message.reader().setTo(CONTEST_NUMBER).read();

ResourceSet is an Iterable so we can use it as a Source via the StreamSupport class in java.util.stream. I thought Iterable might have its own .stream() method but it doesn’t:

StreamSupport.stream(allMessages.spliterator(), false)
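The same trick works for any Iterable, not just Twilio's ResourceSet. Here is a self-contained sketch with a plain Iterable<String> standing in for the messages. The helper streamOf is my own name for illustration, not something from the JDK or the Twilio library.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class IterableStreams {

    // Hypothetical helper: turns any Iterable into a sequential Stream.
    static <T> Stream<T> streamOf(Iterable<T> iterable) {
        // false = sequential; true would ask for a parallel Stream.
        return StreamSupport.stream(iterable.spliterator(), false);
    }

    public static void main(String[] args) {
        // An Iterable which is not a Collection, so it has no stream() method.
        Iterable<String> messages = () -> List.of("first", "second", "third").iterator();

        List<String> shouted = streamOf(messages)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
        System.out.println(shouted); // prints [FIRST, SECOND, THIRD]
    }
}
```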

Now for the Intermediate Operations:


.filter(m -> m.getDateSent().isAfter(DRAW_START))
.filter(m -> m.getDateSent().isBefore(DRAW_END))
.map(Message::getFrom)        // returns a PhoneNumber
.distinct()
.collect(Collectors.collectingAndThen(Collectors.toList(), list -> {
       Collections.shuffle(list);   // shuffle the collected List in place
       return list.stream();        // then carry on with a new Stream
   }))
.limit(NUMBER_OF_WINNERS)

The trickiest operation is the shuffle (the .collect(...) step). Although Streams have .sorted(), they don’t have a .shuffled(), so I used .collect(...) as if it were an Intermediate Operation: it collects into a List, shuffles it and returns a new Stream of that List.
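If you want to try the shuffle trick without a Twilio account, here is a sketch using plain Strings as phone numbers. The names RaffleDraw, shuffled and pickWinners are invented for this example. It differs from my raffle code in two small ways, both of which are my own choices here: it collects with toCollection(ArrayList::new), because the documentation for Collectors.toList() makes no promise that the returned List can be modified, and it takes a Random so a draw can be repeated with a seed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class RaffleDraw {

    // Hypothetical helper: collects into a mutable list, shuffles it and
    // hands back a new Stream over the shuffled elements.
    static <T> Collector<T, ?, Stream<T>> shuffled(Random random) {
        return Collectors.collectingAndThen(
                Collectors.toCollection(ArrayList::new),
                list -> {
                    Collections.shuffle(list, random);
                    return list.stream();
                });
    }

    static List<String> pickWinners(List<String> entrants, int numberOfWinners, Random random) {
        return entrants.stream()
                .distinct()                  // one entry per phone number
                .collect(shuffled(random))   // ends one Stream, starts another
                .limit(numberOfWinners)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> entrants = List.of("+1555", "+1556", "+1556", "+1557", "+1558");
        // Prints three distinct numbers in a random order.
        System.out.println(pickWinners(entrants, 3, new Random()));
    }
}
```

Because the collect in the middle has to see every entry before it can shuffle, this only works for finite Streams.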

At this point we have a Stream<PhoneNumber> with the right number of items in it. Finally, the Terminal Operation takes each phone number and uses the Twilio API to send a message someone will be happy to receive:

.forEach( winner -> {
   Message.creator(
       winner, CONTEST_NUMBER,
       CONGRATULATIONS_YOU_WON_MESSAGE).create();
});

The code is on GitHub including the Maven config.

Wrapping up

The Streams API can help you write shorter and more direct code for dealing with bulk operations on Collections and more. In this post I’ve shown some of the basics - there’s more to Streams, such as Parallel Streams and Primitive Streams (Streams of int, double or long), both of which can help with performance.
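As a taste of those two features, here is a minimal sketch (the class and method names are made up). Whether a Parallel Stream is actually faster depends a lot on the workload, so treat this as an illustration of the API rather than a performance recommendation:

```java
import java.util.List;
import java.util.stream.IntStream;

class PrimitiveAndParallel {

    // mapToInt gives an IntStream, which avoids boxing each int into an Integer
    // and has extra methods like sum().
    static int sumOfLengths(List<String> fruits) {
        return fruits.stream()
                .mapToInt(String::length)
                .sum();
    }

    // parallel() asks the Stream to split the work across threads.
    static long countEvens(int upTo) {
        return IntStream.rangeClosed(1, upTo)
                .parallel()
                .filter(n -> n % 2 == 0)
                .count();
    }

    public static void main(String[] args) {
        System.out.println(sumOfLengths(List.of("apple", "banana"))); // prints 11
        System.out.println(countEvens(1_000_000));                    // prints 500000
    }
}
```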

If you find some code which could be improved with Streams, your IDE might even suggest refactorings for you, and I’d love to hear how you get on.

@MaximumGilliard

mgilliard@twilio.com