Structure Your Concurrency
A professional software developer must understand the principles of concurrent programming and the tools available to write concurrent software.
Only concurrent applications can access all cores of modern multicore processors and provide the best performance.
Java's thread model makes it a strong contender among concurrent languages, but multithreading has always been inherently tricky.
The introduction of virtual threads empowered Java to provide a unique and highly optimized threading system that is also easy to understand. You can now create millions of threads without the overhead of creating a native operating system thread for each one.
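To make the claim about cheap threads concrete, here is a minimal sketch, assuming JDK 21 or later. The class name `VirtualThreadsSketch`, the method `runBlockingTasks`, and the 10 ms sleep standing in for blocking I/O are illustrative assumptions, not part of any library.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class VirtualThreadsSketch {

    // Runs taskCount blocking tasks, each on its own virtual thread,
    // and returns how many of them completed.
    static int runBlockingTasks(int taskCount) {
        AtomicInteger completed = new AtomicInteger();
        // The executor starts one new virtual thread per submitted task.
        // close(), called by try-with-resources, waits for all tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                executor.submit(() -> {
                    Thread.sleep(Duration.ofMillis(10)); // simulated blocking I/O (assumption)
                    completed.incrementAndGet();
                    return null; // makes the lambda a Callable, so sleep may throw
                });
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runBlockingTasks(10_000) + " tasks completed");
    }
}
```

The same loop with one platform thread per task would reserve a native thread and its stack for every task. With virtual threads, a blocked task simply releases its carrier thread.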
Now you need mechanisms to manage this huge number of threads.
Virtual threads, an official feature of Java since Java 21, make it cheap to spawn threads to gain concurrent performance.
Application designers often face the following questions:
- How do you cope with a huge number of threads?
- How do you process huge collections?
- How do you process data flows with back pressure?
- How do you design distributed and reactive systems?
Structured Concurrency
Structured concurrency enhances the maintainability, reliability, and observability of multithreaded code. It adopts a concurrent programming style that reduces the likelihood of thread leaks and cancellation delays, two common risks arising from cancellation and shutdown.
As the JEP for structured concurrency says: "If a task splits into concurrent subtasks, then they all return to the same place, namely the task's code block."
Structured concurrency is a programming paradigm that provides a way to write concurrent software using familiar program flows and constructs. It guarantees that all concurrent tasks are properly managed and cleaned up when leaving the scope of the try-with-resources block.
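// Incubator API shape (JDK 19/20). In the later preview API, fork() returns a Subtask instead of a Future.
try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
    Future<Shelter> shelter = scope.fork(this::getShelter);
    Future<List<Dog>> dogs = scope.fork(this::getDogs);
    scope.join();           // wait until both subtasks have finished
    scope.throwIfFailed();  // propagate the first failure, if any, before reading results
    Response response = new Response(shelter.resultNow(), dogs.resultNow());
    // ...
}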
When the try-with-resources block exits, the scope is closed: subtasks that are still running are cancelled, and the block does not complete until they have finished. No thread outlives the code block that started it.
Modern Java programmers never create threads directly. Complex or slow algorithms should be parallelized using structured concurrency to exploit modern multicore processors.
Concurrent Stream Processing
Java 8 introduced the Stream API, which provides a way to process data in a declarative manner [3].
Java Streams support parallel processing, but the parallelism is not structured.
Parallel streams are meant for CPU-bound work, not I/O-bound work. Therefore, the maximum number of active threads should be limited to the number of available processor cores.
The parallel stream is cleaned up upon completion of the terminal operation.
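List<Integer> listOfNumbers = Arrays.asList(1, 2, 3, 4);
// The first argument must be the identity of the operation (0 for addition).
// A parallel stream applies it once per chunk, so a value such as 5 would be added several times.
int sum = listOfNumbers.parallelStream().reduce(0, Integer::sum);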
The Java streams library provides a rich set of functionalities that can work with any stream. The approach is similar to the sequence library in functional programming languages such as Clojure. Modern Java code processes any collection through streams. Huge collections are processed very efficiently in parallel streams on modern multicore processors. This approach transforms imperative code into a declarative style, which is easier to read and maintain.
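As an illustration of processing a large collection in parallel, the following sketch counts primes with a parallel stream. The class `ParallelStreamSketch` and its methods are hypothetical names chosen for this example. The trial-division test is deliberately simple and CPU-bound, which is the kind of work parallel streams are suited for.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

class ParallelStreamSketch {

    // Trial-division primality test: CPU-bound and free of side effects,
    // so it is safe to call from several threads at once.
    static boolean isPrime(long n) {
        if (n < 2) return false;
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Declarative pipeline: the stream splits the list into chunks and
    // evaluates the filter on the common fork-join pool.
    static long countPrimes(List<Long> numbers) {
        return numbers.parallelStream()
                .filter(ParallelStreamSketch::isPrime)
                .count();
    }

    // Helper that builds the input collection 1..limit.
    static List<Long> rangeList(long limit) {
        return LongStream.rangeClosed(1, limit).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(countPrimes(rangeList(1_000))); // prints 168
    }
}
```

Replacing `parallelStream()` with `stream()` gives the same result sequentially, which is a convenient way to check that a pipeline has no hidden shared state before parallelizing it.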
Concurrent Data Flow Processing
Concurrent data flow processing is based on the Reactive Streams API. It is a specification for asynchronous stream processing with non-blocking back pressure.
On one side, functional programming is the process of building software by composing pure functions, avoiding shared state, mutable data, and side effects.
On the other side, reactive programming is an asynchronous programming paradigm concerned with data streams and the propagation of change.
Together, functional reactive programming forms a combination of functional and reactive techniques that can represent an elegant approach to event-driven programming, where values change over time and the consumer reacts to the data as it arrives.
The processing pipeline is composed of a source, a set of operators, and a sink.
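// RxJava 1.x API. `result` is assumed to be a String field of the enclosing test class, initially empty.
String[] letters = {"a", "b", "c", "d", "e", "f", "g"};
Observable<String> observable = Observable.from(letters);
observable.subscribe(
    i -> result += i,                // OnNext: append each emitted letter
    Throwable::printStackTrace,      // OnError
    () -> result += "_Completed"     // OnCompleted: mark the end of the stream
);
assertTrue(result.equals("abcdefg_Completed"));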
The reactive library takes care of the threading and back pressure.
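To show what non-blocking back pressure looks like at the protocol level, here is a minimal sketch that uses the JDK's own `java.util.concurrent.Flow` interfaces and `SubmissionPublisher` (JDK 9 and later) instead of RxJava. The names `BackPressureSketch`, `OneAtATimeSubscriber`, and `publishAndCollect` are hypothetical. The subscriber controls the pace by requesting exactly one item at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

class BackPressureSketch {

    // Sink: requests one item at a time, so the source never overruns it.
    static final class OneAtATimeSubscriber implements Flow.Subscriber<String> {
        final List<String> received = new ArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override public void onSubscribe(Flow.Subscription subscription) {
            this.subscription = subscription;
            subscription.request(1); // initial demand
        }

        @Override public void onNext(String item) {
            received.add(item);
            subscription.request(1); // signal readiness for the next item
        }

        @Override public void onError(Throwable error) { done.countDown(); }

        @Override public void onComplete() { done.countDown(); }
    }

    // Source: publishes all items, then completes the stream.
    static List<String> publishAndCollect(List<String> items) {
        OneAtATimeSubscriber subscriber = new OneAtATimeSubscriber();
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            // submit() blocks the producer when the subscriber's buffer is full.
            items.forEach(publisher::submit);
        } // close() signals onComplete after the buffered items are delivered
        try {
            subscriber.done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for completion", e);
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(publishAndCollect(List.of("a", "b", "c"))); // prints [a, b, c]
    }
}
```

Reactive libraries implement the same request-based protocol and add operators on top, so application code rarely needs to write a subscriber like this by hand.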
Distributed and Reactive Systems
The actor model is a programming model for concurrency in a distributed system. It is based on the concept of actors, which are independent entities that communicate with each other by sending messages.
Each actor has its own mailbox and processes messages one at a time. An actor only accesses its own state and does not share state with other actors.
An actor is an active object that encapsulates state and behavior and is implemented as a concurrent process.
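The mailbox idea can be sketched in a few lines of plain Java, assuming JDK 21 or later. This is not the API of any actor library: `CounterActorSketch`, `CounterActor`, the `Increment` and `GetCount` messages, and the `demo` method are hypothetical names. The sketch only illustrates what such a library does internally: one queue per actor, one thread of control draining it, and state that no other thread touches.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

class CounterActorSketch {

    // Message protocol of the actor (hypothetical, for illustration only).
    sealed interface Message permits Increment, GetCount {}
    record Increment(int amount) implements Message {}
    record GetCount(CompletableFuture<Integer> replyTo) implements Message {}

    static final class CounterActor {
        private final BlockingQueue<Message> mailbox = new LinkedBlockingQueue<>();
        private int count; // private state, touched only by the actor's own thread

        static CounterActor start() {
            CounterActor actor = new CounterActor();
            // Virtual threads are daemon threads, so an idle actor does not keep the JVM alive.
            Thread.startVirtualThread(actor::processMailbox);
            return actor;
        }

        // The only way to interact with the actor: enqueue a message.
        void tell(Message message) {
            mailbox.add(message);
        }

        private void processMailbox() {
            try {
                while (true) {
                    Message message = mailbox.take(); // one message at a time
                    if (message instanceof Increment inc) {
                        count += inc.amount();
                    } else if (message instanceof GetCount get) {
                        get.replyTo().complete(count);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop when interrupted
            }
        }
    }

    // Several concurrent senders increment the counter, then one query reads it.
    static int demo(int senders, int incrementsPerSender) {
        CounterActor actor = CounterActor.start();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int s = 0; s < senders; s++) {
                executor.submit(() -> {
                    for (int i = 0; i < incrementsPerSender; i++) {
                        actor.tell(new Increment(1));
                    }
                });
            }
        } // all senders have finished enqueueing here
        CompletableFuture<Integer> reply = new CompletableFuture<>();
        actor.tell(new GetCount(reply));
        return reply.join();
    }
}
```

No lock or atomic variable protects `count`. Correctness follows from the fact that only the actor's own thread reads or writes it, and the mailbox serializes all requests.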
I strongly recommend using the actor model when designing distributed and reactive systems. Avoid low-level concurrency primitives and thread pools. Design your system as a set of actors that communicate with each other by sending messages. Eliminate shared mutable state and use message passing to communicate between actors.
Lessons Learnt
Between virtual threads and structured concurrency, Java developers have a compelling new mechanism for breaking up almost any code into concurrent tasks without much overhead. An application developer almost never uses concurrency primitives or thread pools directly. Be aware of the design approaches below and select wisely based on the requirements of the application.
Library developers can use the new concurrency primitives to build high-performance libraries that are straightforward to use.
Use parallel streams when processing huge collections.
Use reactive programming when processing data flows with back pressure, different sampling rates, and complex event processing.
Use the actor model when designing distributed and reactive systems.
Links
- [1] Data Classes, Sealed Types and Pattern Matching. Marcel Baumann. 2024
- [2] Advanced Streams. Marcel Baumann. 2024
- [3] Java Modules. Marcel Baumann. 2024
- [4] Structure Your Concurrency. Marcel Baumann. 2024
References
[1] D. Farley, Modern Software Engineering. Pearson Education, Limited, 2022 [Online]. Available: https://www.amazon.com/dp/B09GG6XKS4
[2] J. Bloch, Effective Java, 3rd ed. Addison-Wesley Professional, 2017 [Online]. Available: https://www.amazon.com/dp/B078H61SCH
[3] V. Subramaniam, Functional Programming in Java: Harnessing the Power of Java 8 Lambda Expressions. The Pragmatic Programmers, 2014 [Online]. Available: https://www.amazon.com/dp/B0CJL7VKFL