The purpose of my work is to design a method to transform a textual process description (in English) into a business process model. This is of practical relevance, since process models are often designed by business analysts starting from textual documentation. The method to be designed aims at automating the text-to-diagram conversion phase as much as possible.
Natural languages are known to be highly complex and ambiguous. Accordingly, for this project we will approach the problem using a best-effort approach, meaning that the method is not intended to work always. Instead, the proposed approach will be able to detect certain sentence structures and extract actors, actions and objects/artifacts from them. Coordinating and subordinating conjunctions, as well as punctuation and other markers, will be used to identify sequencing, parallelism, conditional branching and repetition. The output of the method will be a block-structured process model.
The method is being implemented in Java based on open-source Natural-Language Processing (NLP) libraries. Specifically, Part-of-Speech (POS) tagging is performed using the Stanford parser and according to the POS tags, corresponding process entities are identified using Tregex and Tsurgeon. The current implementation is already able to identify actors, actions/tasks and artifacts from sentences that abide to certain common structures. Additionally the implementation is able to correctly interpret passive voice construction, avoid articles, parenthesis and other complex structures for the purpose of extracting essential information about the process.