Structure Your Concurrency

2024 04 01 head

Writing concurrent software is one of the greatest challenges for software developers [1, 2].

A professional software developer must understand the principles of concurrent programming and the tools available to write concurrent software.

Only concurrent applications can access all cores of modern multicore processors and provide the best performance.

Java thread model makes it a strong contender among concurrent languages, but multithreading has always been inherently tricky.

The introduction of virtual threads empowered Java to provide a unique and highly-optimized threading system that is also easy to understand. You can now create millions of threads without the overhead of creating a native operating system thread.

Now you need mechanisms to manage this huge number of threads.

Virtual threads, now an official feature of Java, create the possibility of cheaply spawning threads to gain concurrent performance. As a result, Java now has a unique and highly-optimized threading system that is also easy to understand.

Application designers often face the following questions:

  • How do you cope with a huge number of threads?

  • How do you process huge collections?

  • How do you process data flows with back pressure?

  • How do you design distributed and reactive systems?

Structured Concurrency

Structured concurrency enhances the maintainability, reliability, and observability of multithreaded code. It adopts a concurrent programming style that reduces the likelihood of thread leaks and cancellation delays. These are common risks associated with cancellation and shutdown.

As the JEP for structured concurrency says, If a task splits into concurrent subtasks, then they all return to the same place, namely the task’s code block.

Structured concurrency is a programming paradigm that provides a way to write concurrent software using familiar program flows and constructs. It guarantees that all concurrent tasks are properly managed and cleaned up when leaving the scope of the try with resources block.

try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
    Future<Shelter> shelter = scope.fork(this::getShelter);
    Future<List<Dog>> dogs = scope.fork(this::getDogs);
    scope.join();
    Response response = new Response(shelter.resultNow(), dogs.resultNow());
    // ...
}

When exiting the try-with-resources block, the structured concurrency framework ensures that all concurrent tasks are properly managed and cleaned up.

Modern Java programmers never create threads directly.

Complex or slow algorithms shall be parallelized using structured concurrency to exploit modern multicore processors.

Concurrent Stream Processing

2024 04 01 parallel stream

Java 8 introduced the Stream API, which provides a way to process data in a declarative manner [3].

Java Streams support parallel processing, but the parallelism is not structured.

The parallelism is processing bound and not I/O bound. Therefore, the maximum number of active threads should be limited to the number of available processor cores.

The parallel stream is cleanup upon completion of the terminal operation.

List<Integer> listOfNumbers = Arrays.asList(1, 2, 3, 4);
int sum = listOfNumbers.parallelStream().reduce(5, Integer::sum);

Java streams library provides a rich set of functionalities that can work with any stream. The approach is similar to the sequence library in functional programming languages such as Clojure.

Modern Java code processes any collection through streams. Huge collections are very efficiently processed in parallel streams on modern multicore processors.

This approach transforms imperative code into a declarative style, which is easier to read and maintain.

Concurrent Data Flow Processing

2024 04 01 reactive

Concurrent data flow processing is based on the Reactive Streams API. It is a specification for asynchronous stream processing with non-blocking back pressure.

On one side, functional programming is the process of building software by composing pure functions, avoiding shared state, mutable data, and side effects.

On the other side, reactive programming is an asynchronous programming paradigm concerned with data streams and the propagation of change.

Together, functional reactive programming forms a combination of functional and reactive techniques that can represent an elegant approach to event-driven programming. Values change over time and where the consumer reacts to the data as it comes in.

The processing pipeline is composed of a source, a set of operators, and a sink.

String[] letters = {"a", "b", "c", "d", "e", "f", "g"};
Observable<String> observable = Observable.from(letters);
observable.subscribe(
  i -> result += i,                                        //OnNext
  Throwable::printStackTrace,                              //OnError
  () -> result += "_Completed"                             //OnCompleted
);
assertTrue(result.equals("abcdefg_Completed"));

The reactive library takes care of the threading and back pressure.

Distributed and Reactive Systems

2024 04 01 actor

The actor model is a programming model for concurrency in a distributed system. It is based on the concept of actors, which are independent entities that communicate with each other by sending messages.

Each actor has its own mailbox and processes messages one at a time. An actor only accesses its own state and does not share state with other actors.

An actor is an active object that encapsulates state and behavior and is implemented as a concurrent process.

I strongly recommend using the actor model when designing distributed and reactive systems. Avoid low-level concurrency primitives and thread pools.

Design your system as a set of actors that communicate with each other by sending messages. Eliminate shared mutable state and use message passing to communicate between actors.

Lessons Learnt

Between virtual threads and structured concurrency, Java developers have a compelling new mechanism for breaking up almost any code into concurrent tasks without much overhead. An application developer almost never uses concurrency primitives or thread pools directly. Beware of these design approaches and select wisely based on the requirements of the application.

Library developers can use the new concurrency primitives to build high-performance libraries that are straightforward to use.

Use parallel streams when processing huge collections.

Use reactive programming when processing data flows with back pressure, different sampling rates, and complex event processing.

Use the actor model when designing distributed and reactive systems.

References

[1] D. Farley, Modern Software Engineering. Pearson Education, Limited, 2022 [Online]. Available: https://www.amazon.com/dp/B09GG6XKS4

[2] J. Bloch, Effective Java, Third. Addison-Wesley Professional, 2017 [Online]. Available: https://www.amazon.com/dp/B078H61SCH

[3] V. Subramaniam, Functional Programming In Java Harnessing The Power Of Java 8 Lambda Expressions. The Pragmatic Programmers, 2014 [Online]. Available: https://www.amazon.com/dp/B0CJL7VKFL