Computing and Information Systems - Research Publications

  • Item
    Advanced Process Discovery Techniques
    Augusto, A ; Carmona, J ; Verbeek, E ; van der Aalst, WMP ; Carmona, J (Springer Cham, 2022-01-01)
    Given the challenges associated with the process discovery task, more than a hundred research studies have addressed the problem over the past two decades. Despite the richness of proposals, many state-of-the-art automated process discovery techniques, especially the oldest ones, struggle to systematically discover accurate and simple process models. In general, when the behavior recorded in the input event log is simple (e.g., exhibiting little parallelism, repetitions, or inclusive choices) or noise free, some basic algorithms such as the alpha miner can output accurate and simple process models. However, as the complexity of the input data increases, the quality of the discovered process models can worsen quickly. Given that oftentimes real-life event logs record very complex and unstructured process behavior containing many repetitions, infrequent traces, and incomplete data, some state-of-the-art techniques become unreliable and unfit for purpose. Specifically, they tend to discover process models that either have limited accuracy (i.e., low fitness and/or precision) or are syntactically incorrect. While currently there exists no perfect automated process discovery technique, some are better than others when discovering a process model from event logs recording complex process behavior. In this chapter, we introduce four such techniques, discussing their underlying approach and algorithmic ideas, reporting their benefits and limitations, and comparing their performance with the algorithms introduced in the previous chapter.
  • Item
    The connection between process complexity of event sequences and models discovered by process mining
    Augusto, A ; Mendling, J ; Vidgof, M ; Wurm, B (ELSEVIER SCIENCE INC, 2022-06)
  • Item
    Process Mining-Driven Analysis of the COVID19 Impact on the Vaccinations of Victorian Patients
    Augusto, A ; Deitz, T ; Faux, N ; Manski-Nankervis, J-A ; Capurro, D (2021-12-08)
    Process mining is a discipline sitting between data mining and process science, whose goal is to provide theoretical methods and software tools to analyse process execution data, known as event logs. Although process mining was originally conceived to facilitate business process management activities, research studies have shown the benefit of leveraging process mining tools in different contexts, including healthcare. However, applying process mining tools to analyse healthcare process execution data is not straightforward. In this paper, we report the analysis of an event log recording more than 30 million events capturing the general practice healthcare processes of more than one million patients in Victoria, Australia, over five years. Our analysis allowed us to understand benefits and limitations of the state-of-the-art process mining techniques when dealing with highly variable processes and large datasets. While we provide solutions to the identified limitations, the overarching goal of this study was to detect differences between the patients' health services utilization pattern observed in 2020, during the COVID-19 pandemic and mandatory lockdowns, and the one observed in the prior four years, 2016 to 2019. By using a combination of process mining techniques and traditional data mining, we were able to demonstrate that vaccinations in Victoria did not drop drastically, as other interactions did. On the contrary, we observed a surge of influenza and pneumococcus vaccinations in 2020, contradicting research findings of similar studies conducted in different geographical areas.
  • Item
    Discovering data transfer routines from user interaction logs
    Leno, V ; Augusto, A ; Dumas, M ; La Rosa, M ; Maggi, FM ; Polyvyanyy, A (PERGAMON-ELSEVIER SCIENCE LTD, 2022-07)
    Robotic Process Automation (RPA) is a technology to automate routine work such as copying data across applications or filling in document templates using data from multiple applications. RPA tools allow organizations to automate a wide range of routines. However, identifying and scoping routines that can be automated using RPA tools is time-consuming. Manual identification of candidate routines via interviews, walk-throughs, or job shadowing allows analysts to identify the most visible routines, but these methods are not suitable when it comes to identifying the long tail of routines in an organization. This article proposes an approach to discover automatable routines from logs of user interactions with IT systems and to synthesize executable specifications for such routines. The proposed approach focuses on discovering routines where a user transfers data from a set of fields (or cells) in an application, to another set of fields in the same or in a different application (data transfer routines). The approach starts by discovering frequent routines at a control-flow level (candidate routines). It then determines which of these candidate routines are automatable and it synthesizes an executable specification for each such routine. Finally, it identifies semantically equivalent routines so as to output a set of non-redundant routines. The article reports on an evaluation of the approach using a combination of synthetic and real-life logs. The evaluation results show that the approach can discover automatable routines that are known to be present in a UI log, and that it discovers routines that users recognize as such in real-life logs.
  • Item
    Discovering executable routine specifications from user interaction logs
    Leno, V ; Augusto, A ; La Rosa, M ; Polyvyanyy, A ; Dumas, M ; Maggi, F (2021)
  • Item
    Detection of statistically significant differences between process variants through declarative rules
    Augusto, A ; Cecconi, A ; Di Ciccio, C ; van der Aalst, W ; Mylopoulos, J ; Rosemann, M ; Shaw, MJ ; Szyperski, C (Springer, 2020)
    Services and products are often offered via the execution of processes that vary according to the context, requirements, or customisation needs. The analysis of such process variants can highlight differences in the service outcome or quality, leading to process adjustments and improvement. Research in the area of process mining has provided several methods for process variant analysis. However, very few of those account for a statistical significance analysis of their output. Moreover, those techniques detect differences at the level of process traces, single activities, or performance. In this paper, we aim at describing the distinctive behavioural characteristics between variants expressed in the form of declarative process rules. The contribution to the research area is two-pronged: the use of declarative rules for the explanation of the process variants and the statistical significance analysis of the outcome. We assess the proposed method by comparing its results to the most recent process variant analysis methods. Our results demonstrate not only that declarative rules reveal differences at an unprecedented level of expressiveness, but also that our method outperforms the state of the art in terms of execution time.
  • Item
    Automated Discovery of Process Models with True Concurrency and Inclusive Choices
    Augusto, A ; Dumas, M ; La Rosa, M (Springer International Publishing, 2021-01-01)
    Enterprise information systems allow companies to maintain detailed records of their business process executions. These records can be extracted in the form of event logs, which capture the execution of activities across multiple instances of a business process. Event logs may be used to analyze business processes at a fine level of detail using process mining techniques. Among other things, process mining techniques allow us to discover a process model from an event log – an operation known as automated process discovery. Despite a rich body of research in the field, existing automated process discovery techniques do not fully capture the concurrency inherent in a business process. Specifically, the bulk of these techniques treat two activities A and B as concurrent if sometimes A completes before B and other times B completes before A. Typically though, activities in a business process are executed in a true concurrency setting, meaning that two or more activity executions overlap temporally. This paper addresses this gap by presenting a refined version of an automated process discovery technique, namely Split Miner, that discovers true concurrency relations from event logs containing start and end timestamps for each activity. The proposed technique is also able to differentiate between exclusive and inclusive choices. We evaluate the proposed technique relative to existing baselines using 11 real-life logs drawn from different industries.
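    The distinction drawn in this abstract, between interleaving-based concurrency and temporally overlapping executions, can be illustrated with a small interval check. This is a minimal sketch and not the Split Miner algorithm: the function name, the `(activity, start, end)` tuple layout, and the numeric timestamps are all assumptions made for illustration.

```python
from itertools import combinations


def overlapping_pairs(trace):
    """Return the unordered activity pairs whose execution intervals overlap.

    `trace` is a hypothetical list of (activity, start, end) tuples for one
    process instance. Timestamps may be any mutually comparable values.
    """
    pairs = set()
    for (a, a_start, a_end), (b, b_start, b_end) in combinations(trace, 2):
        # Two intervals overlap when each one starts before the other ends.
        if a != b and a_start < b_end and b_start < a_end:
            pairs.add(frozenset((a, b)))
    return pairs


# Illustrative trace: A and B overlap in time, C runs on its own afterwards.
example_trace = [("A", 0, 5), ("B", 3, 8), ("C", 9, 10)]
print(overlapping_pairs(example_trace))
```

    A log that records only completion timestamps cannot support this check, which is why the abstract requires start and end timestamps for each activity.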
  • Item
    Identifying candidate routines for Robotic Process Automation from unsegmented UI logs
    Leno, V ; Augusto, A ; Dumas, M ; La Rosa, M ; Maggi, FM ; Polyvyanyy, A ; van Dongen, B ; Montali, M ; Wynn, MT (IEEE, 2020-10-22)
    Robotic Process Automation (RPA) is a technology to develop software bots that automate repetitive sequences of interactions between users and software applications (a.k.a. routines). To take full advantage of this technology, organizations need to identify and to scope their routines. This is a challenging endeavor in large organizations, as routines are usually not concentrated in a handful of processes, but rather scattered across the process landscape. Accordingly, the identification of routines from User Interaction (UI) logs has received significant attention. Existing approaches to this problem assume that the UI log is segmented, meaning that it consists of traces of a task that is presupposed to contain one or more routines. However, a UI log usually takes the form of a single unsegmented sequence of events. This paper presents an approach to discover candidate routines from unsegmented UI logs in the presence of noise, i.e. events within or between routine instances that do not belong to any routine. The approach is implemented as an open-source tool and evaluated using synthetic and real-life UI logs.
  • Item
    Automatic Repair of Same-Timestamp Errors in Business Process Event Logs
    Conforti, R ; La Rosa, M ; ter Hofstede, A ; Augusto, A ; Fahland, D ; Ghidini, C ; Becker, J ; Dumas, M (Springer, 2020)
    This paper contributes an approach for automatically correcting “same timestamp” errors in business process event logs. These errors consist in multiple events exhibiting the same timestamp within a given process instance. Such errors are common in practice and can be due to the logging granularity or the performance load of the logging system. Analyzing logs that have not been properly screened for such problems is likely to lead to wrong or misleading process insights. The proposed approach revolves around two techniques: one to reorder events with same-timestamp errors, the other to assign an estimated timestamp to each such event. The approach has been implemented in a software prototype and extensively evaluated in different settings, using both artificial and real-life logs. The experiments show that the approach significantly reduces the number of inaccurate timestamps, while the reordering of events scales well to large and complex datasets. The evaluation is complemented by a case study in the meat & livestock domain showing the usefulness of the approach in practice.
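    The error class described in this abstract, multiple events sharing one timestamp within a process instance, can be detected with a simple grouping step. The sketch below covers detection only and does not reproduce the reordering or timestamp-estimation techniques of the paper; the function name and the `(case_id, activity, timestamp)` layout are assumptions made for illustration.

```python
from collections import defaultdict


def same_timestamp_groups(events):
    """Find events that share a timestamp within the same process instance.

    `events` is a hypothetical list of (case_id, activity, timestamp) tuples.
    Returns a dict mapping (case_id, timestamp) to the list of activities
    recorded at that instant, keeping only groups with more than one event.
    """
    groups = defaultdict(list)
    for case_id, activity, timestamp in events:
        groups[(case_id, timestamp)].append(activity)
    return {key: acts for key, acts in groups.items() if len(acts) > 1}


# Illustrative log: A and B collide in case c1; case c2 is unaffected.
example_log = [
    ("c1", "A", 10),
    ("c1", "B", 10),
    ("c1", "C", 11),
    ("c2", "A", 10),
]
print(same_timestamp_groups(example_log))
```

    Events from different cases may legitimately share a timestamp, so the grouping key includes the case identifier.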
  • Item
    Optimization Framework for DFG-based Automated Process Discovery Approaches
    Augusto, A ; Dumas, M ; La Rosa, M ; Leemans, S ; Vanden Broucke, S (Springer Verlag, 2020)
    The problem of automatically discovering business process models from event logs has been intensely investigated in the past two decades, leading to a wide range of approaches that strike various trade-offs between accuracy, model complexity, and execution time. A few studies have suggested that the accuracy of automated process discovery approaches can be enhanced by means of metaheuristic optimization techniques. However, these studies have remained at the level of proposals without validation on real-life datasets or they have only considered one metaheuristic in isolation. This article presents a metaheuristic optimization framework for automated process discovery. The key idea of the framework is to construct a Directly-Follows Graph (DFG) from the event log, to perturb this DFG so as to generate new candidate solutions, and to apply a DFG-based automated process discovery approach in order to derive a process model from each DFG. The framework can be instantiated by linking it to an automated process discovery approach, an optimization metaheuristic, and the quality measure to be optimized (e.g. fitness, precision, F-score). The article considers several instantiations of the framework corresponding to four optimization metaheuristics, three automated process discovery approaches (Inductive Miner – directly follows, Fodina, and Split Miner), and one accuracy measure (Markovian F-score). These framework instances are compared using a set of 20 real-life event logs. The evaluation shows that metaheuristic optimization consistently yields visible improvements in F-score for all the three automated process discovery approaches, at the cost of execution times in the order of minutes, versus seconds for the baseline approaches.
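    The first step of the framework, constructing a Directly-Follows Graph from the event log, can be sketched as follows, together with a generic F-score computed as the harmonic mean of fitness and precision. This is a minimal illustration under stated assumptions: the function names and the list-of-traces input are hypothetical, the perturbation and discovery steps are omitted, and the F-score shown is the generic formula rather than the Markovian variant used in the article.

```python
from collections import Counter


def build_dfg(traces):
    """Count how often one activity is directly followed by another.

    `traces` is a hypothetical list of traces, each a list of activity labels
    in execution order. The result maps (source, target) arcs to frequencies.
    """
    dfg = Counter()
    for trace in traces:
        for source, target in zip(trace, trace[1:]):
            dfg[(source, target)] += 1
    return dfg


def f_score(fitness, precision):
    """Harmonic mean of fitness and precision (0.0 when both are zero)."""
    if fitness + precision == 0:
        return 0.0
    return 2 * fitness * precision / (fitness + precision)


# Illustrative log with three traces.
example_traces = [["A", "B", "C"], ["A", "C"], ["A", "B", "C"]]
print(build_dfg(example_traces))
print(f_score(0.5, 1.0))
```

    A perturbation step in the sense of the abstract would add or remove arcs in this graph to generate candidate DFGs, each of which is then passed to a DFG-based discovery approach and scored.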