BPMN is the de facto standard for graphical process modeling. While there are other graphical languages to represent processes (EPCs, IDEF, Flowcharts, Petri Nets, among others), no other notation has seen such an uptake in such a short time as BPMN has. It is widely supported by both free and commercial process modeling tools, the WfMC has made XPDL 2.0 and 2.1 a de facto persistence format for BPMN diagrams, and a large number of courses on modeling processes with BPMN are being offered.
Now, BPMN is a complex language. The current incarnation (BPMN 1.1) consists of 52 distinct graphical elements: 41 flow objects, 6 connecting objects, 2 grouping objects, and 3 artifacts. That’s a lot of vocabulary to learn, given that each graphical element has meaning and rules associated with it. So what is the minimum subset of BPMN that a process modeler should know? The answer: Less than you think.
To answer this question we collected a large number of BPMN 1.0 diagrams (126 in total), from consultants, seminar participants, and online sources. We analyzed which BPMN symbols were actually used in these diagrams. The full version of our research, which we will present at the Conference on Advanced Information Systems Engineering in June, can be found here. But since this is an academic paper, here are the practical highlights of our study.
None of the diagrams we looked at used more than 15 different BPMN constructs, and none used fewer than 3. The models themselves contained considerably more elements, but we counted each construct only once per model: a model with, e.g., 5 tasks connected by sequence flow was recorded as using the task symbol and the sequence flow symbol. The average subset of BPMN used in these models consisted of just 9 different symbols. That means that the average BPMN model uses less than 20% of the available vocabulary.
Figure 1 shows the percentage of the collected diagrams in which each construct appears.
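The counting rule described above can be sketched in a few lines of Python. This is only an illustration, not the tooling used in the study: the `(element_id, construct_type)` representation and the toy model are assumptions, and the vocabulary size of 52 is the BPMN 1.1 figure quoted earlier.

```python
# Illustrative sketch only; the data layout and the toy model are hypothetical.
BPMN_VOCABULARY_SIZE = 52  # number of graphical elements quoted above for BPMN 1.1


def constructs_used(model_elements):
    """Return the set of distinct construct types occurring in one model.

    Each element is an (element_id, construct_type) pair, so five tasks
    contribute the construct "task" only once.
    """
    return {construct for _, construct in model_elements}


def vocabulary_share(distinct_count, vocabulary_size=BPMN_VOCABULARY_SIZE):
    """Fraction of the available vocabulary covered by `distinct_count` symbols."""
    return distinct_count / vocabulary_size


# Hypothetical model: 5 tasks connected by 4 sequence flows.
toy_model = [(f"t{i}", "task") for i in range(5)] + [
    (f"f{i}", "sequence flow") for i in range(4)
]

used = constructs_used(toy_model)
print(sorted(used))                  # ['sequence flow', 'task']
print(f"{vocabulary_share(9):.1%}")  # 17.3%
```

With the average of 9 symbols reported above, the share works out to 9/52, or roughly 17%, which is where the "less than 20%" figure comes from.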
Figure 1: Frequency distribution of BPMN construct usage
The results of our study are:
- Only five elements (normal flow, task, end event, start event, and pool) were used in more than 50% of the models we analyzed. These, plus the data-based XOR gateway, form what we call the common core of BPMN (marked in yellow in fig. 1).
- Six additional elements were found in at least 25% of the models: gateways (parallel and unmarked XOR), lanes, text annotations, message flow, and start messages. We call these the extended core of BPMN (marked in green in fig. 1).
- 17 elements were used in fewer than three models: seven occurred in just two models, five in just one, and five were not used in any of the models we studied.
We then looked at the co-occurrence of BPMN symbols – i.e., are certain constructs used in combination, and how frequently? The combination of certain elements is mandated by the BPMN specification – you cannot use lanes without pools, or data objects without associations. But if there is a common subset used by many models, this would constitute a true “common core”. A detailed analysis revealed that BPMN elements fall into several well-defined groups. Figure 2 shows these groups as frames around the respective BPMN elements. The numbers within each frame represent the number of models (out of 126) that contain all elements within the frame.
Figure 2: Grouping of BPMN elements
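The number attached to each frame is a simple co-occurrence count: how many models contain every element in the frame. A minimal sketch of that count follows, assuming each model has already been reduced to its set of construct names. The four toy models are invented for illustration and are not taken from our sample.

```python
# Illustrative sketch only; the toy models below are hypothetical.
def count_models_containing(models, required):
    """Count the models whose construct set includes every construct in `required`."""
    required = set(required)
    return sum(1 for constructs in models if required <= constructs)


toy_models = [
    {"task", "sequence flow"},
    {"task", "sequence flow", "start event", "end event"},
    {"task", "sequence flow", "start event", "end event", "parallel gateway"},
    {"task", "sequence flow", "pool", "lane"},
]

basic = {"task", "sequence flow"}
with_events = basic | {"start event", "end event"}
with_parallel = with_events | {"parallel gateway"}

# Each added construct can only shrink (or keep) the number of matching models.
print(count_models_containing(toy_models, basic))          # 4
print(count_models_containing(toy_models, with_events))    # 2
print(count_models_containing(toy_models, with_parallel))  # 1
```

Because a larger required set can never match more models than a smaller one, the counts can only fall as symbols are added.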
Our findings are:
- The common core of BPMN is very small. The subset of BPMN used varied considerably across models. While nearly all models contain tasks and sequence flow, adding symbols to this set leads to a near-exponential drop in the number of models that share the (bigger) set of symbols. For example, while 65 models contain tasks, sequence flow, and start and end events, only 25 also contain parallel gateways, and just 10 contain both parallel gateways and data-based XOR gateways.
- There are two types of BPMN modelers. While our sample is too small to explore this proposition in detail, we found anecdotal evidence that two groups of modelers use BPMN: those who use pools and lanes to represent organizational responsibility for tasks, and those who use gateways to represent the control-flow rules of the process in detail. In other words, one group uses BPMN to specify inter-organizational settings (process choreography). Mostly, these users will be consultants or process analysts working on organizational (re-)engineering and process improvement. The other BPMN user group leans more towards workflow engineering (process orchestration). These users will likely be designers and analysts seeking to articulate precise flow conditions, for instance in the context of workflow engineering or process simulation.
Our findings have implications for practitioners, software vendors, and standards makers alike.
- Practitioners can begin studying the use of BPMN by focusing on the most commonly used symbols first, leaving more specialized and lesser-used constructs for those who need more specialized BPMN training (e.g. systems analysts).
- Software vendors that are not supporting the entire BPMN vocabulary can assess what percentage of BPMN diagrams can be represented in their tool, and where enhancements should be made.
- Finally, standards makers should review whether a more complete, but also more complex, language is a desirable result of the standardization process. Creating BPMN took six years. How much time was spent on defining those seventeen symbols that we found are hardly used? And will the extensions of BPMN 1.1 entice users to expand their commonly used vocabulary, or will they go unused?
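The assessment suggested for software vendors can be approximated as follows. This is a rough sketch under stated assumptions: a model counts as representable only if every construct it uses is supported by the tool, and both the supported set and the toy models are hypothetical.

```python
# Illustrative sketch only; the supported set and toy models are hypothetical.
from collections import Counter


def representable_share(models, supported):
    """Fraction of models that use only constructs in `supported`."""
    supported = set(supported)
    if not models:
        return 0.0
    representable = sum(1 for constructs in models if constructs <= supported)
    return representable / len(models)


def missing_constructs(models, supported):
    """For each unsupported construct, count how many models need it."""
    supported = set(supported)
    needed = Counter()
    for constructs in models:
        needed.update(constructs - supported)
    return needed


toy_models = [
    {"task", "sequence flow"},
    {"task", "sequence flow", "start event", "end event"},
    {"task", "sequence flow", "start event", "end event", "parallel gateway"},
    {"task", "sequence flow", "pool", "lane"},
]
tool_supports = {"task", "sequence flow", "start event", "end event"}

print(representable_share(toy_models, tool_supports))  # 0.5
print(missing_constructs(toy_models, tool_supports).most_common())
```

Sorting the missing constructs by how many models need them gives a first indication of where enhancements would pay off most.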
If you would like to learn more about this research, we encourage you to read the full version of our paper:
- Michael zur Muehlen, Jan Recker. (Jun 16, 2008). “How Much Language is Enough? Theoretical and Practical Use of the Business Process Modeling Notation”, 20th International Conference on Advanced Information Systems Engineering (CAiSE 2008), Montpellier, France, June 16-20, 2008, Springer LNCS.
As always, your questions or comments are much appreciated.