Code Scene as Crime Scene
You have a successful product and happy users. Slowly, the cost of adding new features is creeping up and product margins are shrinking. Which crimes put you in this dreaded situation?
How can you analyze the history of your product’s source code?
How can you explore the social dimension of your product development? How can you find good approaches to reduce time to market and lower development costs?
This article presents a set of tools you can use to better understand the alternatives and select the best approach for your product.
Our approach is based on Git, the de facto standard for source code version management. If your development department is using another tool, you should consider moving to another company.
First Steps
We want to investigate various contexts in our source code history and identify suspects. A context can be a single file, a class, a package, or a module.
Gather overall information about your repository. Use the options --before and --after to restrict the time range you are interested in.
git log --numstat --pretty=format:'[%h] %an %ad %s' --date=short --before='YYYY-MM-DD' --after='YYYY-MM-DD'
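The --numstat output lists added and deleted lines per file for every commit, which is hard to read in raw form. The following is a minimal sketch of how it could be condensed into one line per file. The function name churn_per_file is hypothetical, and the sketch assumes the log format shown above, where commit header lines never consist of three tab-separated fields.

```shell
# Sum added and deleted lines per file from `git log --numstat` output on stdin.
# churn_per_file is a hypothetical helper name, not a git command.
churn_per_file() {
  awk -F'\t' '
    # numstat lines are "added<TAB>deleted<TAB>path"; binary files show "-" and are skipped
    NF == 3 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
      added[$3]   += $1
      deleted[$3] += $2
    }
    END {
      # output columns: total churn, added lines, deleted lines, path
      for (f in added)
        printf "%d\t%d\t%d\t%s\n", added[f] + deleted[f], added[f], deleted[f], f
    }
  ' | sort -rn
}

# Usage inside a repository:
# git log --numstat --pretty=format:'[%h] %an %ad %s' --date=short | churn_per_file | head -20
```

Files at the top of this list have seen the most line-level change in the selected time range.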
Find out which contexts are most often modified:
git log --pretty=format: --name-only | grep -v '^$' | sort | uniq -c | sort -rg | head -100
or find out which contexts with a specific extension are most often modified:
git log --pretty=format: --name-only | grep -i '\.extension$' | sort | uniq -c | sort -rg | head -100
Find out how many times a specific context was modified, broken down by author:
git log --pretty=format:'%an' -- contextPath | sort | uniq -c | sort -rg
Find out how many errors were fixed in a specific context, for example by counting the commit messages that contain close #TicketId:
git log -- contextPath | grep -c 'close #'
Deeper Insights
Analyses of various products have found that only 5% to 10% of the source code is under active development. This is where your money is spent to make your customers happy. Concentrate your refactoring efforts on this small part of the product, because it is the part that currently affects customer satisfaction.
Prioritize technical debt reduction in these hotspots. Use the approach described above to identify the most frequently changed files during a specific time interval.
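To check how concentrated the activity is in your own repository, you can measure which share of all changes falls on the most frequently changed files. The following is a rough sketch, not a calibrated metric. The function name hotspot_share is hypothetical, and it expects the "count path" lines produced by uniq -c on standard input.

```shell
# Report how much of the total change activity falls on the top PERCENT of files.
# stdin: "count path" lines as produced by `uniq -c`
# $1:    percentage of files to treat as hotspots (defaults to 10)
# hotspot_share is a hypothetical helper name.
hotspot_share() {
  percent="${1:-10}"
  sort -rn | awk -v p="$percent" '
    { count[NR] = $1; total += $1 }
    END {
      if (NR == 0) exit
      top = int(NR * p / 100)
      if (top < 1) top = 1            # always look at one file at least
      for (i = 1; i <= top; i++) sum += count[i]
      printf "%d of %d files (top %d%%) received %d of %d changes (%.0f%%)\n",
             top, NR, p, sum, total, 100 * sum / total
    }
  '
}

# Usage inside a repository, restricted to a time interval:
# git log --after='YYYY-MM-DD' --before='YYYY-MM-DD' --pretty=format: --name-only \
#   | grep -v '^$' | sort | uniq -c | hotspot_share 10
```

If a small share of the files receives a large share of the changes, those files are the candidates for focused refactoring.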
Identify Team and Social Metrics
To understand team dynamics, map individual developers to their teams. On the command line, filter for the members of a team with grep, or replace each member's name with the team name using sed. Now you can analyze the set of files each team mainly modifies and maintains. If no such set emerges for a team, you have diffuse code ownership and should expect quality issues.
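With more than a handful of developers, a lookup file is easier to maintain than a chain of sed expressions. The sketch below assumes a tab-separated mapping file of author names and team names that you maintain yourself. The function name changes_per_team and the '@' prefix that marks author lines are illustrative choices, and the sketch assumes a non-empty mapping file and no file paths that start with '@'.

```shell
# Count changes per team and file.
# stdin: output of `git log --pretty=format:'@%an' --name-only`
# $1:    tab-separated mapping file "author name<TAB>team name" (hypothetical, maintained by you)
# changes_per_team is a hypothetical helper name.
changes_per_team() {
  awk -F'\t' '
    NR == FNR { team[$1] = $2; next }        # first input: load the author-to-team mapping
    /^@/      { author = substr($0, 2); next }   # commit header line: remember the author
    NF        {                               # any other non-empty line is a file path
      t = (author in team) ? team[author] : "unknown"
      n[t "\t" $0]++
    }
    END { for (k in n) printf "%d\t%s\n", n[k], k }
  ' "$1" - | sort -t "$(printf '\t')" -k2,2 -k1,1nr
}

# Usage inside a repository:
# git log --pretty=format:'@%an' --name-only | changes_per_team teams.tsv
```

The output lists the change count, the team, and the file, grouped by team. Authors missing from the mapping file are reported under "unknown".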
In domain-driven design (DDD) terminology, each bounded context is developed and maintained by one team. This approach encourages architectural purity, continuous refactoring, and accountability.
Manage Off-boarding Risks
Each time you find a set of source code artifacts only modified by a single individual, you have identified a major off-boarding risk. When this developer leaves the department or the company, all knowledge associated with the development and the maintenance of these components will immediately be lost.
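Such single-owner files can be listed directly from the history. The following is a minimal sketch. The function name single_author_files and the '@' prefix that marks author lines are illustrative choices, and the sketch assumes that no file path starts with '@'.

```shell
# List files that only one author has ever modified.
# stdin: output of `git log --pretty=format:'@%an' --name-only`
# single_author_files is a hypothetical helper name.
single_author_files() {
  awk '
    /^@/ { author = substr($0, 2); next }    # commit header line: remember the author
    # count each (file, author) pair only once
    NF && !seen[$0 SUBSEP author]++ { authors[$0]++; owner[$0] = author }
    END {
      for (f in authors)
        if (authors[f] == 1) printf "%s\t%s\n", owner[f], f
    }
  ' | sort
}

# Usage inside a repository:
# git log --pretty=format:'@%an' --name-only | single_author_files
```

The output is sorted by author, so the developers who own the most files alone stand out. The list also contains files that have since been deleted, and one person committing under several names counts as several authors. Using %aN instead of %an makes git apply a .mailmap file, if one exists, to unify such names.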
Visualize
The above scripts provide information in textual form. Over time, your inquiries will become more sophisticated and you will need better tools. Either visualize your findings with D3.js or Plotly.js, or buy a commercial tool such as CodeScene.
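Charting libraries usually consume structured data rather than raw command output. As one possible bridge, the sketch below converts the "count path" lines produced by uniq -c into CSV. The function name to_csv and the column names changes and path are assumptions made for illustration.

```shell
# Convert "count path" lines (the output of `uniq -c`) into CSV on stdout.
# to_csv is a hypothetical helper name; the column names are illustrative.
to_csv() {
  awk '
    BEGIN { print "changes,path" }
    NF >= 2 {
      count = $1
      path = $0
      sub(/^[ \t]*[0-9]+[ \t]+/, "", path)   # strip the leading count, keep spaces inside the path
      gsub(/"/, "\"\"", path)                # CSV-escape embedded double quotes
      printf "%d,\"%s\"\n", count, path
    }
  '
}

# Usage inside a repository:
# git log --pretty=format: --name-only | grep -v '^$' | sort | uniq -c | sort -rg | to_csv > changes.csv
```

The resulting file can be loaded by most charting and spreadsheet tools.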
Another approach is to write a small framework to analyze the data and to implement more complex queries in your preferred environment.
Next Steps
The above techniques are part of the toolbox of professional development departments. Establish a software craftsmanship culture in your company. It helps you to avoid the invasion of gangs and eradicate crime in your neighborhood.
Find similar ideas in our blogs Pragmatic Craftsmanship, SonarLint for the impatient, and You need an Engineering Culture.
References
[1] A. Tornhill, Your Code as a Crime Scene. Pragmatic Bookshelf [Online]. Available: https://www.amazon.com/dp/1680500384
[2] A. Tornhill, Software Design X-Rays. Pragmatic Bookshelf [Online]. Available: https://www.amazon.com/dp/B07BVRLZ87