Entities, Identifiers, External identifiers and Names

2020-08-24

Tags:

A software architect models customer domains and maps them to powerful software abstractions [1]. Soon you correlate internal efficient identifiers, meaningful external identifiers, and domain entities.

You enjoy long conversations with the enterprise data architect and identify the system owning a specific external identifier.

How do you relate to external systems and communicate with other companies?

How do you define internal identifiers, which grow with application success?

How can you bridge the customer world with your software solution?

What are the good practices to create a maintainable, legible, and efficient model of your domain model?

Entity Concept

We need an approach to model customer domain entities and map them to a legible and maintainable software construct [2, 3].

An entity is a user domain model abstraction and is mapped to a software type. The entity has:

A unique mandatory internal object identifier oid, we recommend the use of numeric values to improve performance.
A public and external identifier id, we recommend the use of text to adequately support various external identification schemes. An example is the European enterprise identifier EUID identifying a company or a certified natural person in Europe.
A human-readable name. A name does not need to be unique in the system.

A more sophisticated variant provides additional features.

A set of tags to classify the instance through a crowd-based ontology. Tags are often called labels.
A list of comments to add human-readable information to the instance. Comments have a timeline and can be sorted by creation date. The comments can be extended to add audit information such as functional change information or activities related to the instance.

These entity features should be defined as a set of mixin interfaces.

Internal Object Identifiers oid

The object identifier oid uniquely identifies an instance of a specific type or belonging to a specific type hierarchy.

This identifier shall be a numerical value to increase the performance of persistent solutions. It is used as an internal identifier in the application.

It should never be visible outside the system or published through an API.

Ideally, the identifier is universally valid and uniquely identifies an instance in all contexts. For example, the concept of UUID tries to provide such an identifier. The drawback is that the UUID is not a numerical value and cumbersome for a developer or a user to memorize.

An interesting approach is to support unique identifiers in the context of a bounded domain. All entities of the domain will have unique object identifier.

The implementation can use a sequence from the domain database or schema. A programmatic identity generator can also be established in the domain.

Because the oid is never exported, the solution is powerful and simple to realize.

History has taught developers not to spare on the size of identifiers. Please use a long value meaning 64 bits.

Avoid using the internal identifier to communicate with external systems.

If you respect this rule, you are free to migrate your objects to another identification scheme in the future. New schemes are often helpful when the application grows, or you have later to import and take over a lot of legacy data.

External Object Identifiers id

The external unique identifier id shall uniquely identify an entity instance. It is used as an external identifier to communicate with other systems. It should always be visible outside the system and is used in any public API.

Try to have exactly one external identifier per object to communicate with external systems. This restriction is a corollary of the rule stating an instance is owned by exactly one system. This system is the one defining and managing the external unique identifiers for the related instances.

You should clearly define the owning system for the external identification scheme. This information shall be documented and accessible to all involved parties. This process is part of the enterprise architecture activities of your ecosystem.

External identifier ownership is often a murky situation when working with legacy systems. You often have multiple sources of external identifiers, sometimes overlapping and sometimes not covering all instances. You have to formulate a long-term strategy to clean up your landscape and handle the problems until these cleanup activities are completed. By handling, we mean administrative and import rules matching the various external identifiers to the same object.

One possible solution is to use tags. Store externally defined identifiers as tags. Document these tags as specific for the external system. This approach scales to multiple external identifiers and multiple systems managing the same external identifier. So you have a scalable approach and do not pollute your domain model with spurious information defined in external systems. Upon completion of the refactoring activities, these tags document historical information and could safely be removed.

Multiple external object identifiers imply the existence of mapping functions to identify the object referenced. Because the ownership of external identifiers is outside your system, you are dependent on these systems and have to hope they are good citizens. The key rule of good systems is that they never change external identifiers. As soon as you modify identifiers, they are, per definition, no more real identifiers.

Names name

The entity name is a human-readable name to distinguish between entity instances. Ideally, it should be almost unique. The fallback is to use the external unique identifier id known to all external systems. The drawback is that we have no guarantee it is legible for users. Names are essential for well-designed user interfaces. Never require from your users memorizing external identifiers, please provide names.

For example, the first and last names of a person are the name for a natural entity. Social security number is a possible external identifier An internal identifier is used as a primary key in the persistence storage.

Advices

Internal object identifiers are identifiers. An identifier is immutable and should be numerical for performance reasons.
External object identifiers have exactly one application in charge to create them.
Names are human-readable and improve the legibility and usability of the user interface and reports.

External identifiers can be tricky. In Switzerland, we had an old social security number, which is still referenced in a lot of legal systems. For example, it is still part of your tax salary yearly form. This is the reason I strongly advocate internal identifiers. You have no control over external systems providing accepted external identifiers.

Identifiers are a key element to model entities using the domain driven design DDD approach.

We have a new social security number, which is used in social insurance workflows. The same number is also used in medical insurance workflows.

We also have a federal identity card number, a federal passport number, and a federal driver’s license number. Additional used identifiers are medical card insurance numbers, a state tax personal identification number.

All these external identifiers shall reference the same natural person.

More interesting is that a tourist living in the European zone has none of these numbers.

Please implement the internal identifier feature as an interface marker. The external identifier and name features can be grouped into one interface.

Additional information is available under models. Below, the source code in modern Java is:

public interface HasOid {
    long oid();
}

public interface HasId extends HasOid {
    String id();
    void id(String id);
    String name();
    void name(String name);
}

We provide a Java library core implementing these constructs. More information is available under tangly open source components.

Extensions

See our blog how to extend the entity concept with the powerful and flexible concepts of tags and comment approaches.

Another blog shows a constrained form of tags using the concept of reference codes, also called reference tables or lookup tables.

Related concepts are discussed in our blog series

Entities, Identifiers, External identifiers and Names Marcel Baumann. 2020.
The power of Tags and Comments Marcel Baumann. 2020.
Reference Codes Marcel Baumann. 2020.
Value Objects as Embedded Entities Marcel Baumann. 2021.
Meaningful Identifiers Marcel Baumann. 2021.

References

[1] S. Hofer and H. Schwentner, Domain Storytelling. Pearson Education, Limited, 2021 [Online]. Available: https://www.amazon.com/dp/B099ZNXCJT

[2] E. Evans, Domain-driven design. Addison-Wesley, 2004 [Online]. Available: https://www.amazon.com/dp/0321125215

[3] V. Vernon, Implementing Domain driven Design. Addison-Wesley Professional, 2012 [Online]. Available: https://www.amazon.com/dp/B00BCLEBN8