Data modelling as a BA technique is in decline

In my experience, very few business analysts produce data models.

Of course my experience is that of one person across the whole IT industry, so my view is a thin slice through a very big cake. What’s more, my experience is limited to process-driven projects. I have no experience in, for example, Data Warehousing.

By “data model” I mean a representation of data needs. For the purposes of business architecture, such a model must be technology-agnostic and easily understood by the people who operate the business (both the BIZBOK and the BABOK use the term ‘information model’). Lately, I have started using the term “business taxonomy” but it’s important to state that this is not just a stenographed glossary of terms as dictated by the business (see ‘Things aren’t the same just because you treat them the same‘).

I started in IT in 1996 as a graduate trainee ‘Systems Engineer’ with Electronic Data Systems (EDS). The first technical skill everyone on the training programme was taught was modelling a business’s data needs using an Entity Relationship Diagram. On my first big project a few years later, we also modelled data needs in the form of a Logical Class Model (which, although technology-agnostic, did not provide an adequate taxonomy of the business). However, once I got into the BPM world, I found the emphasis was on implementing not business processes but what amounted to screen flows. In the BPM world there was pressure to implement flows as quickly as possible and the business’s data needs were not analysed, they were merely documented on a screen-by-screen basis. What’s more, it was not unusual for the same data to appear as fields on multiple screens in several functional areas, often labelled differently. The result was that fields which were logically the same were being implemented multiple times in physical data bases. Outcome: the new systems passed function tests and user acceptance tests but over time became clogged with inconsistent and redundant data that caused technical performance problems as well as business problems.

By the way, there is no “tweaking” a database whose underlying data architecture does not reflect business reality. There is no avoiding a costly refactoring and data migration exercise.

Along with most of my colleagues, the skill of data modelling had withered because we weren’t allowed to use it and so we forgot its importance in describing the very nature of a business. However, over the past ten years, I have come to see data modelling as a basic skill not just for business analysts but also for software developers and testers. As a freelance consultant, I have rarely had the pleasure of working on a greenfields project. I am usually hired for BPM projects where a system is being rebuilt because the first implementation didn’t work, or for projects where the physical data architecture has already been laid down. In every case, there was no data architecture on the business side and the physical data architecture was based on screen requirements alone. Yet none of the highly experienced people on those projects saw that as a problem.

To be clear, I am not criticising those people, after all, until 2009 I had been working along the same lines. I’m criticising what I see as characteristics of BPM projects:

  • It’s all about process (with little or no attention to data or decision modelling)
  • Build screen flows (instead of realising business processes)
  • Get the solution into production as quickly as possible (delivery, instead of adoption by the business users, is
  • the measure of success)
  • You can always refine the process later (but you can’t refine a flawed physical data architecture)

Over the years, I’ve come to realise that a clear and unambiguous business taxonomy is key to getting the right physical data architecture for a software solution. It is the very foundation upon which a solution is built. I have realised this more with each project and since 2009, I have emphasised data modelling more and more in my analysis of a business. Let me be clear: I am not saying that in my role as a business architect I should be involved in designing the physical architecture. Far from it. However, in my experience, the absence of a clear, unambiguous, technology-agnostic business taxonomy leads to a physical data architecture that does not represent the business reality.

Example:

In the UK, if you wish to make a complaint to a business which is regulated by the Financial Conduct Authority (FCA), there are essentially three stages:

  1. You make a complaint directly to the business
  2. If you’re not satisfied with the response, you can take your complaint to the Financial Ombudsman’s Office (FOS)
  3. If you’re not satisfied with the outcome of that, you can take it to court

Note that businesses only have to report complaints to the FCA if they have not resolved them within a certain time limit.

Even that small amount of information tells you to start modelling a single business entity called “Complaint” and have the following attributes against it:

  • Date and time received (for determining the age of the Complaint, e.g., for identifying Complaints to be reported to the FCA)
  • Stage

The data held against an instance of the Complaint entity will change or be added to as it passes through the stages above but it remains a single thing. Some years ago I was involved in the complete rebuilding of a complaints handling BPM system (built using Pega). The previous version (which had gone into production two years earlier but was not deemed fit for purpose – see my blog post “Software delivery does not equal success“) had four different Complaint entities in its physical data architecture:

  • A Complaint that has been received but is not yet reportable to the FCA
  • A Complaint that is reportable to the FCA
  • A Complaint that is being addressed by the FOS
  • A Complaint that has gone to court

In the original BPM solution, as the process passed from one stage to the next, an instance of the next type of Complaint was created and the data from the previous type was copied into it, resulting in duplication of data across four Complaint Cases that were actually just a single case. The result of this (and several other problems caused by the data architecture’s not matching the business reality) was that after two years in production, the database became bloated and data was inconsistent and unreliable. Bear in mind that there was no business taxonomy in place for Version 1, the data needs were defined only in screen designs.

The root cause of the problem was that the business described it as four different complaints, instead of a single complaint that could pass through four stages, and the Lead Systems Architect built exactly what was described to him. Unfortunately, the tendency in software development is to assume that what the business says is correct, when in reality most business people do not have the linguistic rigour to clearly and unambiguously define their taxonomy. I have quoted John Ciardi in a previous blog post:

‘The language of experience is not the language of classification.’

Business people have not been trained in such rigour simply because the day-to-day operational needs of their business do not require it but the needs of business architecture and software development do require such rigour. Business analysts (in particular), software developers and testers should be so rigorous but they don’t seem to be. While they are not, businesses will continue to spend millions on rebuilds of software which they already spent millions on. A further problem is that staff churn means people move on from one project to another and are not around to learn from their mistakes. What mistakes? They delivered didn’t they? The software went into production, didn’t it? Oh, wait: Software delivery does not equal success.

I’ll write more on business taxonomy and data modelling in a future post.

Kind regards.

Declan Chellar

Source: http://www.chellar.com/AnalysisFu/?p=2382