Optional<T> left = Optional.of(value).flatMap(F).flatMap(G);
Optional<T> right = Optional.of(value).flatMap(F.apply(value).flatMap(G));
assert(left.equals(right));
Functional Programming
Introduction
Functional  ObjectOriented 

Order of execution is not always important. It can be asynchronous. 
A specific execution sequence of statements is crucial 
Functions are basic building elements to work with data structures. 
Objects are the main abstractions for data 
Follows the declarative paradigm. 
Follows the imperative paradigm. 
Focuses on the result desired and its final conditions. 
Focuses on how the desired result can be achieved. 
Ensure immutability; programs should be stateless. 
State changes are an important part of the execution. 
Primary activity is writing new functions and 
Primary activity is building, extending and 
Concepts
Definitions
 Referential Transparency

A function, or more generally an expression, is called referentially transparent if a call can be replaced by its value without affecting the behavior of the program. Simply spoken, given the same input implies the output is always the same.
 Persistent Data Structure

A persistent data structure is a data structure that always preserves the previous version of itself when it is modified. Such data structures are effectively immutable, as their operations do not visibly update the structure inplace, but instead yield a new updated structure.
 Higher Order Functions

In computer science, a higherorder function is a function that does at least one of the following:

Takes one or more functions as arguments

Returns a function as its result

 Pure Functions

The function return values are identical for identical arguments (no variation with local static variables, nonlocal variables, mutable reference arguments or input streams). The function application has no side effects (no mutation of local static variables, nonlocal variables, mutable reference arguments or input/output streams).
 Functional Composition

In mathematics, function composition is an operation that takes two functions f and g and produces a function h such that h(x) = g(f(x)) Composition requires higher order functions. For instance, the functions \$f(x): X → Y\$ and \$g(y): Y → Z\$ are composed to yield a function \$h(x): g(f(x))\$ which maps \$X → Z\$.
 Lambda Lifting

is a metaprocess that restructures a computer program so that functions are defined independently of each other in a global scope. An individual "lift" transforms a local function into a global function. It is a twosteps process: Eliminating free variables in the function by adding parameters Moving functions from a restricted scope to broader or global scope.
 Currying

In computer science, currying is the technique of converting a function that takes multiple arguments into a sequence of functions that each take a single argument.
Monads
Monad is just a monoid in the category of endofunctors.
Examples of monads in modern day Java programming language:

Stream<T>

Optional<T>

Try<T>
Monad Laws
The last thing that needs to mention while speaking of monads is their laws. If we want to consider our implementation a real monad, we must obey them. There are three laws: left identity, right identity and associativity. In my opinion, it can be somewhat hard to understand what they actually mean. With the help of Optional<T>, I will try to explain the above laws in a more detailed way.
First a few assumptions:

\$F\$ is a function with the signature: \$(T > Optional<U>) = Optional<U>\$.

\$G\$ is a function with the signature \$(A > Optional<B>) = Optional<B>\$.

\$FG = F.apply(value).flatMap(G)\$ with the signature: \$(T > Optional<B>) = Optional<B>\$.
Left identity
If we create a new monad and bind it to the function, the result should be the same as applying the function to the value:
\$Optional.of(value).flatMap(F).equals(F.apply(value))\$
Right identity
The result of binding a unit function to a monad should be the same as the creation of a new monad:
\$Optional.of(value).flatMap(Optional::of).equals(Optional.of(value))\$
Associativity
In the chain of function applications, it should not matter how functions are nested:
Creation of Monad
The first thing we need is a parameterized type M<T>, which is a wrapper for our value of type T. Our type must implement two functions:

Unit which is used to wrap our value and has a following signature \$(T) = M<T>\$.

Bind is responsible for performing operations. Here we pass a function, which operates on value in our context and returns it with another type already wrapped in context. This method should have the following signature \$(T > M<U>) = M<U>\$.
To make it more understandable, I will use Optional one more time and show what the above structure looks like in its case.
Here, the first condition is met right away because Optional is a parameterized type. The role of the unit function is fulfilled by ofNullable and of methods. FlatMap plays the role of the bind function. Of course, in the case of Optional, type boundaries allow us to use more complex types than in the definition above. == Streams