Who is at fault – the language or the speaker?

As researchers, Jan Recker and I find it challenging to strike a balance between meeting the academic standards required by the wider research community and meeting the demands for accessibility, relevance, timeliness and appropriateness set by the wider practitioner communities. We were happy to find that our blogging about research results inspires the BPM community not only to take an interest in our research but also to critically assess this work and to post replies to it. We find this most welcome.

Our previous post on the frequency of BPMN construct usage has generated a passionate response from Bruce Silver, whom we know and respect as a very active contributor not only to BPM blogging in general but also to BPMN education and application specifically. Bruce makes many good points in his post and raises a number of interesting challenges. However, we disagree with a number of the inferences he draws, so we want to clarify some aspects of our original post.

First of all, the paper and post are the result of a joint research effort between Jan Recker (QUT Brisbane) and myself (Stevens Institute of Technology), as we have stated. Jan and I started working together due to the complementary nature of our interests – standards in BPM (myself) and practical usage of modeling methods (Jan). Our study was motivated by the fact that we know so very little about how standards such as BPMN are actually used – as opposed to how vendors, consultants and trainers think they should be (or might be) used – and this is what we try to explore and understand. The post and the related study are but one snapshot of our combined research. Agree with our results or not, but please give credit where credit is due. Jan is one of the most prolific researchers on BPMN, and it would be unfair to ignore his substantial contribution to this research.

We have outlined our research method in great detail in the full version of the BPMN paper, the PDF of which is linked from the original post. If you missed it, click here. Of course, we could have written a great deal more about the mode of analysis, but let’s be frank: how many blog readers would want to see this information in the post? (Let us know if you do!) And for those of you taking an interest in research methods – both Jan and I are more than happy to discuss the ways in which academic research is conducted. You are more than welcome.

We started with a simple question: BPMN is divided into a core and an extended set of constructs – does this separation hold in practice, or are there other common subsets (dialects or creoles) that can be found in practice? If there is such a common subset, we would expect a sizable number of models to share it. We found no evidence of a larger common core. Only 6 model pairs out of the 126 models used similar BPMN subsets (i.e. there were 6 subsets shared by 2 models).

We looked at the similarity among all subsets by coding the occurrence of symbols as a 50-bit string and computing the pairwise Hamming distance. On average, 7 symbols differed between the BPMN subsets, and since the average model used only 9 symbols, that makes the true common core very, very small.
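A minimal sketch of this kind of comparison, for readers who want to try it on their own models. It is an illustration only, not the analysis code used in the study: the symbol names, the 8-symbol vocabulary (instead of 50) and the three toy models are all made up for the example.

```python
from itertools import combinations

# Hypothetical, shortened vocabulary; the study coded 50 symbols.
VOCABULARY = [
    "task", "sequence_flow", "start_event", "end_event",
    "xor_gateway", "pool", "lane", "text_annotation",
]

def encode(symbols_used, vocabulary=VOCABULARY):
    """Encode the set of symbols a model uses as an integer bit string."""
    index = {symbol: position for position, symbol in enumerate(vocabulary)}
    bits = 0
    for symbol in symbols_used:
        bits |= 1 << index[symbol]
    return bits

def hamming(a, b):
    """Number of symbols used by exactly one of the two models."""
    return bin(a ^ b).count("1")

def mean_pairwise_distance(models):
    """Average Hamming distance over all unordered pairs of models."""
    pairs = list(combinations(models, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)

# Three invented models, for illustration only.
models = [
    encode({"task", "sequence_flow", "start_event", "end_event"}),
    encode({"task", "sequence_flow", "xor_gateway"}),
    encode({"task", "sequence_flow", "start_event", "end_event", "pool", "lane"}),
]

print(hamming(models[0], models[1]))
print(mean_pairwise_distance(models))
```

A distance of 0 between two models means they use exactly the same subset of symbols, regardless of how often each symbol appears or how the diagram is laid out.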

We performed a hierarchical cluster analysis on the models, trying to find the constructs that were used in groups. Indeed, several well-defined clusters emerged from this analysis: Basic Modeling Constructs, Annotations and Explanations (which include the blank XOR Gateway – not something we expected), Organization Modeling Constructs, and Control Flow Refinements. Users that move beyond these clusters seem to add individual constructs as needed, but in a rather random fashion.
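The idea of grouping constructs that tend to be used together can be sketched as follows. This is a hedged illustration rather than the procedure from the paper: the Jaccard distance, the single-linkage rule, the cut-off threshold and the toy usage data are all assumptions chosen to keep the example small.

```python
def jaccard_distance(a, b):
    """1 minus the share of models that use both constructs among those using either."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def cluster_constructs(usage, threshold):
    """Agglomerative (single-linkage) clustering of constructs.

    usage maps a construct name to the set of model ids that use it.
    Clusters are merged until the closest pair is farther apart than threshold.
    """
    clusters = [[name] for name in sorted(usage)]

    def linkage(c1, c2):
        # Single linkage: distance between the two closest members.
        return min(jaccard_distance(usage[x], usage[y]) for x in c1 for y in c2)

    while len(clusters) > 1:
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(cluster) for cluster in clusters]

# Invented usage data: which (hypothetical) model ids contain each construct.
usage = {
    "task": {1, 2, 3, 4},
    "sequence_flow": {1, 2, 3, 4},
    "pool": {3, 4},
    "lane": {3, 4},
    "text_annotation": {2},
}

print(cluster_constructs(usage, threshold=0.4))
```

In this toy data the basic flow constructs end up in one group and the organizational constructs in another, while the annotation stays on its own. A different distance measure or linkage rule can yield different groupings.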

Whether the 126 models we gathered are representative of all BPMN uses is a good question. Of course, we don’t claim this to be the case, and we are in fact expanding our collection of models (hey Bruce, want to send us some of your seminar models?). However, so far our results have proven stable. We spend a great deal of our time with organizations using BPMN and we can assure you – this is indeed indicative of how people use BPMN.

Bruce likens a frequency count of BPMN symbols to a character count in a document. We disagree – BPMN symbols are more like words, since they have semantics and are governed by formation rules. There are no formation rules at the character level in most languages. One could liken the frequency count to the frequency with which words in the English language are used – and that provides a much more useful metric than a character count. Linguists talk about the difference between an active and a passive vocabulary – words that we use versus words that we understand. It is possible that the use of BPMN is emerging along the same lines – a modeler might understand many of the symbols, but will frequently restrict him or herself to a more limited subset. To illustrate this: You may understand many entries in Merriam-Webster’s dictionary of the English language, but you do not use them frequently (or at all).

Do the models we collected have errors? Absolutely. Some of them we find useful in modeling courses – to show the types of errors usually made in practice. Our intention was not to analyze perfect BPMN models – we find those in every training course, in tool documentation, and so on. The BPM reality looks different. Our intention was to analyze the current practice of BPMN modeling, not the intended application of the language. English speakers abuse their language – I know I do – but that does not mean that their sentences are meaningless.

Turning to some of the conclusions we draw from our research, we would like to clarify some aspects: What we call ‘the real core set of BPMN’ is what our analysis showed to be the most frequently used BPMN symbols found in the models considered. This does not mean we imply this set to be the core set of BPMN to be used by everyone. Rather, this is the minimal set of BPMN constructs actually used in practice so far. Is this set little more than flowcharting? Absolutely true. Absolutely.

But what does that tell us? People and organizations use BPMN for purposes similar to those of the organizations that employed flowcharting ten or twenty years ago – they want to describe their operations in simple, graphical terms. The process modeling efforts in most organizations at this stage are simply not advanced or mature enough to start specifying service-enabled workflows with exception behavior in BPMN. No, most people use it simply for flowcharting.

What we conclude from this observation is that the ecosystem of vendors, consultants and trainers should be aware of this and should plan, manage and direct their efforts (be it tool development, BPMN training or modeling workshops) accordingly. We present a number of conjectures based on these observations, some of which appear to be troubling to Bruce. This is worrisome to us, and we hope we can clarify this a bit more:

First, we see a great many training programs introducing the full BPMN specification to a large number of stakeholders. Our results show, however, that most of this training is in fact only applicable to a small number of BPMN application areas. So we have to ask: Are there any tailored BPMN training programs? What should the ‘BPMN beginner’ course look like, and how should this body of knowledge then be extended by specialist courses? One of the suggestions we raise is indeed to start with the set of BPMN symbols that are in fact widely used in practice. Why? Because this would allow the BPMN beginner to instantly grasp, understand and use the majority of models in practice. Sure, (s)he would not yet be an expert; sure, (s)he would not yet have learned about the benefits and expressive power of advanced BPMN. But (s)he can go out and leverage the knowledge instantly and make contributions, without having to digest the complexity of a full-blown course. We do not imply that business users do not understand the more refined BPMN symbols; we have just found little evidence that they use them frequently.
Second, we suggest that tool vendors rely more on empirical information about BPMN use when they have to make trade-off decisions in BPMN support. Let’s face it – many BPMS do not support the full set of BPMN constructs. This makes sense, because if the system does not have the capability to execute the semantics of a specific construct (say, a transaction around a set of activities) then it would not make sense to allow a system analyst to draw this symbol. So which constructs can a vendor neglect initially and which need to be supported? We would argue that it is in the best interest of vendors to focus on those constructs heavily used in practice. Why? Because this would give them access to the widest share of the market. Simple as that. This does not mean that our suggestion is of a static nature. Of course not. Over time, full support should be given – and (relating to our previous conclusion) BPMN users should also learn the advanced features of BPMN. But organizations and tool vendors alike often face a need to achieve results very, very fast, which also means that releases are built and deployed that are far from finished.
Third, we think that our last conclusion was misread. Our intention is not to discredit the sizable development effort that went into the BPMN specification. More than 120 people participated in more than 120 interactions, be they face to face or conference calls. That’s a lot of BPM expertise leading to the current specification. We do not discourage advancement. We actually like BPMN’s advanced vocabulary. But have you asked end users what they think? Well, we did. Not only in this study but also in Jan’s large-scale BPMN usability studies we did find that users are in fact very troubled by the sheer number of, for example, event constructs. Are they used at a large scale? No. Do users understand their full capacity? Typically not. Why is this not at all reflected in BPMN development? That is exactly our point. Sure, our argument is a somewhat provocative statement. But if it helps to channel some attention to end usage, that’s fair by our standards.
We know a great deal about what BPMN can do in theory, how it is implemented in tools, what training programs (like Bruce’s) look like and even how we generate code from the diagrams and how the semantics can be tested and rigorously verified. But what do we know about how organizations engaged in BPM initiatives use it? Very little. Again, we were motivated by exactly this dearth of knowledge about real-life BPMN practice. Why? Because our own experiences with BPMN and with those organizations using it gave us the hunch that the theoretical usage (what vendors, consultants and trainers tell us) often has little to do with what the end users think or do (the practical usage). And why is it important to know what the end users think and do? Because it can help the researchers, vendors, consultants and trainers of this world to channel their attention and efforts to the problems real users face, instead of the problems we think exist in practice.

We try to feed our empirical research back to the BPMN community – in the form of blogs, practitioner papers, or even directly by knocking on the door of OMG. Whether we are heard, and whether our findings have the type of impact we hoped for, is a different story. But we are always open to debate.
