Computing and Information Systems - Research Publications

  • Item
    Discovering data transfer routines from user interaction logs
    Leno, V ; Augusto, A ; Dumas, M ; La Rosa, M ; Maggi, FM ; Polyvyanyy, A (PERGAMON-ELSEVIER SCIENCE LTD, 2022-07)
    Robotic Process Automation (RPA) is a technology to automate routine work such as copying data across applications or filling in document templates using data from multiple applications. RPA tools allow organizations to automate a wide range of routines. However, identifying and scoping routines that can be automated using RPA tools is time-consuming. Manual identification of candidate routines via interviews, walk-throughs, or job shadowing allows analysts to identify the most visible routines, but these methods are not suitable when it comes to identifying the long tail of routines in an organization. This article proposes an approach to discover automatable routines from logs of user interactions with IT systems and to synthesize executable specifications for such routines. The proposed approach focuses on discovering routines where a user transfers data from a set of fields (or cells) in an application, to another set of fields in the same or in a different application (data transfer routines). The approach starts by discovering frequent routines at a control-flow level (candidate routines). It then determines which of these candidate routines are automatable and it synthesizes an executable specification for each such routine. Finally, it identifies semantically equivalent routines so as to output a set of non-redundant routines. The article reports on an evaluation of the approach using a combination of synthetic and real-life logs. The evaluation results show that the approach can discover automatable routines that are known to be present in a UI log, and that it discovers routines that users recognize as such in real-life logs.
  • Item
    Automated Discovery of Process Models with True Concurrency and Inclusive Choices
    Augusto, A ; Dumas, M ; La Rosa, M (Springer International Publishing, 2021-01-01)
    Enterprise information systems allow companies to maintain detailed records of their business process executions. These records can be extracted in the form of event logs, which capture the execution of activities across multiple instances of a business process. Event logs may be used to analyze business processes at a fine level of detail using process mining techniques. Among other things, process mining techniques allow us to discover a process model from an event log – an operation known as automated process discovery. Despite a rich body of research in the field, existing automated process discovery techniques do not fully capture the concurrency inherent in a business process. Specifically, the bulk of these techniques treat two activities A and B as concurrent if sometimes A completes before B and other times B completes before A. Typically though, activities in a business process are executed in a true concurrency setting, meaning that two or more activity executions overlap temporally. This paper addresses this gap by presenting a refined version of an automated process discovery technique, namely Split Miner, that discovers true concurrency relations from event logs containing start and end timestamps for each activity. The proposed technique is also able to differentiate between exclusive and inclusive choices. We evaluate the proposed technique relative to existing baselines using 11 real-life logs drawn from different industries.
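The difference between the completion-order notion of concurrency and true concurrency described in this abstract can be illustrated with a minimal Python sketch. It is not Split Miner's actual procedure: the trace representation (a dict mapping an activity label to a `(start, end)` pair) and the function names `overlaps`, `interleaved`, and `truly_concurrent` are hypothetical, and "at least one overlapping pair of executions" is an assumed criterion chosen only for illustration.

```python
def overlaps(exec_a, exec_b):
    """Return True if two (start, end) executions overlap in time."""
    start_a, end_a = exec_a
    start_b, end_b = exec_b
    return start_a < end_b and start_b < end_a


def interleaved(traces, a, b):
    """Completion-order heuristic: a finishes before b in some trace
    and b finishes before a in some other trace."""
    a_first = any(t[a][1] < t[b][1] for t in traces)
    b_first = any(t[b][1] < t[a][1] for t in traces)
    return a_first and b_first


def truly_concurrent(traces, a, b):
    """Assumed criterion: a and b overlap temporally in at least one trace."""
    return any(overlaps(t[a], t[b]) for t in traces)


# Hypothetical log: each trace maps an activity to its (start, end) times.
sequential_traces = [
    {"A": (0, 5), "B": (6, 9)},  # A runs, then B
    {"A": (6, 9), "B": (0, 5)},  # B runs, then A
]
overlapping_traces = [
    {"A": (0, 5), "B": (3, 8)},  # A and B run at the same time
]
```

On `sequential_traces` the completion-order heuristic reports concurrency although no two executions ever overlap, which is the gap the abstract points to; start and end timestamps are what make the overlap check possible.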
  • Item
    Identifying candidate routines for Robotic Process Automation from unsegmented UI logs
    Leno, V ; Augusto, A ; Dumas, M ; La Rosa, M ; Maggi, FM ; Polyvyanyy, A ; vanDongen, B ; Montali, M ; Wynn, MT (IEEE, 2020-10-22)
    Robotic Process Automation (RPA) is a technology to develop software bots that automate repetitive sequences of interactions between users and software applications (a.k.a. routines). To take full advantage of this technology, organizations need to identify and to scope their routines. This is a challenging endeavor in large organizations, as routines are usually not concentrated in a handful of processes, but rather scattered across the process landscape. Accordingly, the identification of routines from User Interaction (UI) logs has received significant attention. Existing approaches to this problem assume that the UI log is segmented, meaning that it consists of traces of a task that is presupposed to contain one or more routines. However, a UI log usually takes the form of a single unsegmented sequence of events. This paper presents an approach to discover candidate routines from unsegmented UI logs in the presence of noise, i.e. events within or between routine instances that do not belong to any routine. The approach is implemented as an open-source tool and evaluated using synthetic and real-life UI logs.
  • Item
    Automatic Repair of Same-Timestamp Errors in Business Process Event Logs
    Conforti, R ; La Rosa, M ; ter Hofstede, A ; Augusto, A ; Fahland, D ; Ghidini, C ; Becker, J ; Dumas, M (Springer, 2020)
    This paper contributes an approach for automatically correcting “same timestamp” errors in business process event logs. These errors consist in multiple events exhibiting the same timestamp within a given process instance. Such errors are common in practice and can be due to the logging granularity or the performance load of the logging system. Analyzing logs that have not been properly screened for such problems is likely to lead to wrong or misleading process insights. The proposed approach revolves around two techniques: one to reorder events with same-timestamp errors, the other to assign an estimated timestamp to each such event. The approach has been implemented in a software prototype and extensively evaluated in different settings, using both artificial and real-life logs. The experiments show that the approach significantly reduces the number of inaccurate timestamps, while the reordering of events scales well to large and complex datasets. The evaluation is complemented by a case study in the meat & livestock domain showing the usefulness of the approach in practice.
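As a rough illustration of what a "same timestamp" error looks like in data, the Python sketch below only detects such errors; it does not attempt the reordering or timestamp estimation that the paper contributes. The event representation (`(case_id, activity, timestamp)` tuples) and the function name `same_timestamp_groups` are assumptions made for this example.

```python
from collections import defaultdict


def same_timestamp_groups(events):
    """Group activities that share a timestamp within one process instance.

    `events` is a list of (case_id, activity, timestamp) tuples
    (a simplified, hypothetical log format). Only groups containing
    more than one event are returned.
    """
    buckets = defaultdict(list)
    for case_id, activity, timestamp in events:
        buckets[(case_id, timestamp)].append(activity)
    return {key: acts for key, acts in buckets.items() if len(acts) > 1}


# Hypothetical events: two events of case c1 share timestamp 10.
events = [
    ("c1", "register", 10),
    ("c1", "check", 10),
    ("c1", "approve", 12),
    ("c2", "register", 10),
]
groups = same_timestamp_groups(events)
```

Here `groups` contains a single entry for case `c1` at timestamp 10; the event of case `c2` with the same timestamp is not flagged, because the error is defined within a given process instance.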
  • Item
    Optimization Framework for DFG-based Automated Process Discovery Approaches
    Augusto, A ; Dumas, M ; La Rosa, M ; Leemans, S ; Vanden Broucke, S (Springer Verlag, 2020)
    The problem of automatically discovering business process models from event logs has been intensely investigated in the past two decades, leading to a wide range of approaches that strike various trade-offs between accuracy, model complexity, and execution time. A few studies have suggested that the accuracy of automated process discovery approaches can be enhanced by means of metaheuristic optimization techniques. However, these studies have remained at the level of proposals without validation on real-life datasets or they have only considered one metaheuristic in isolation. This article presents a metaheuristic optimization framework for automated process discovery. The key idea of the framework is to construct a Directly-Follows Graph (DFG) from the event log, to perturb this DFG so as to generate new candidate solutions, and to apply a DFG-based automated process discovery approach in order to derive a process model from each DFG. The framework can be instantiated by linking it to an automated process discovery approach, an optimization metaheuristic, and the quality measure to be optimized (e.g. fitness, precision, F-score). The article considers several instantiations of the framework corresponding to four optimization metaheuristics, three automated process discovery approaches (Inductive Miner – directly follows, Fodina, and Split Miner), and one accuracy measure (Markovian F-score). These framework instances are compared using a set of 20 real-life event logs. The evaluation shows that metaheuristic optimization consistently yields visible improvements in F-score for all the three automated process discovery approaches, at the cost of execution times in the order of minutes, versus seconds for the baseline approaches.
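The first two steps of the framework, building a Directly-Follows Graph and perturbing it, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the framework's implementation: the log format (a list of traces, each a list of activity labels) is simplified, the names `directly_follows_graph` and `drop_edge` are hypothetical, and removing a single edge is just one assumed kind of perturbation.

```python
from collections import Counter


def directly_follows_graph(log):
    """Count how often activity a is immediately followed by activity b.

    `log` is a list of traces; each trace is a list of activity labels
    in execution order (a simplified, hypothetical log format).
    """
    dfg = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg


def drop_edge(dfg, edge):
    """Return a perturbed copy of the DFG with one edge removed.

    Edge removal is an assumed example of a perturbation; the original
    DFG is left unchanged.
    """
    perturbed = Counter(dfg)
    perturbed.pop(edge, None)
    return perturbed


# Hypothetical event log with three traces.
log = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["a", "b", "c", "d"],
]
dfg = directly_follows_graph(log)
candidate = drop_edge(dfg, ("c", "b"))
```

In the framework described above, each candidate DFG would then be handed to a DFG-based discovery approach and the resulting model scored with the chosen quality measure; those steps are outside the scope of this sketch.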
  • Item
    Metaheuristic Optimization for Automated Business Process Discovery
    Augusto, A ; Dumas, M ; La Rosa, M ; Hildebrandt, T ; VanDongen, BF ; Roglinger, M ; Mendling, J (Springer, 2019-09-01)
    The problem of automated discovery of process models from event logs has been intensely investigated in the past two decades, leading to a range of approaches that strike various trade-offs between accuracy, model complexity, and execution time. A few studies have suggested that the accuracy of automated process discovery approaches can be enhanced by using metaheuristic optimization. However, these studies have remained at the level of proposals without validation on real-life logs or they have only considered one metaheuristic in isolation. In this setting, this paper studies the following question: To what extent can the accuracy of automated process discovery approaches be improved by applying different optimization metaheuristics? To address this question, the paper proposes an approach to enhance automated process discovery approaches with metaheuristic optimization. The approach is instantiated to define an extension of a state-of-the-art automated process discovery approach, namely Split Miner. The paper compares the accuracy gains yielded by four optimization metaheuristics relative to each other and relative to state-of-the-art baselines, on a benchmark comprising 20 real-life logs. The results show that metaheuristic optimization improves the accuracy of Split Miner in a majority of cases, at the cost of execution times in the order of minutes, versus seconds for the base algorithm.
  • Item
    Discovering Automatable Routines From User Interaction Logs
    Bosco, A ; Augusto, A ; Dumas, M ; La Rosa, M ; Fortino, G (Springer, Cham, 2019)
    The complexity and rigidity of legacy applications in modern organizations engender situations where workers need to perform repetitive routines to transfer data from one application to another via their user interfaces, e.g. moving data from a spreadsheet to a Web application or vice-versa. Discovering and automating such routines can help to eliminate tedious work, reduce cycle times, and improve data quality. Advances in Robotic Process Automation (RPA) technology make it possible to conveniently automate such routines, but not to discover them in the first place. This paper presents a method to analyse user interactions in order to discover routines that are fully deterministic and thus amenable to automation. The proposed method identifies sequences of actions that are always triggered when a given activation condition holds and such that the parameters of each action can be deterministically derived from data produced by previous actions. To this end, the method combines a technique for compressing a set of sequences into an acyclic automaton, with techniques for rule mining and for discovering data transformations. An initial evaluation shows that the method can discover automatable routines from user interaction logs with acceptable execution times, particularly when there are one-to-one correspondences between parameters of an action and those of previous actions, as is the case for copy-pasting routines.
  • Item
    Measuring Fitness and Precision of Automatically Discovered Process Models: A Principled and Scalable Approach
    Augusto, A ; Conforti, R ; Armas-Cervantes, A ; Dumas, M ; La Rosa, M (IEEE COMPUTER SOC, 2022-04-01)
  • Item
    Automated Discovery of Process Models from Event Logs: Review and Benchmark
    Augusto, A ; Conforti, R ; Dumas, M ; La Rosa, M ; Maggi, FM ; Marrella, A ; Mecella, M ; Soo, A (Institute of Electrical and Electronics Engineers, 2019-04-01)
    Process mining allows analysts to exploit logs of historical executions of business processes to extract insights regarding the actual performance of these processes. One of the most widely studied process mining operations is automated process discovery. An automated process discovery method takes as input an event log, and produces as output a business process model that captures the control-flow relations between tasks that are observed in or implied by the event log. Various automated process discovery methods have been proposed in the past two decades, striking different tradeoffs between scalability, accuracy, and complexity of the resulting models. However, these methods have been evaluated in an ad-hoc manner, employing different datasets, experimental setups, evaluation measures, and baselines, often leading to incomparable conclusions and sometimes unreproducible results due to the use of closed datasets. This article provides a systematic review and comparative evaluation of automated process discovery methods, using an open-source benchmark and covering 12 publicly-available real-life event logs, 12 proprietary real-life event logs, and nine quality metrics. The results highlight gaps and unexplored tradeoffs in the field, including the lack of scalability of some methods and a strong divergence in their performance with respect to the different quality metrics used.
  • Item
    Split Miner: Automated Discovery of Accurate and Simple Business Process Models from Event Logs
    Augusto, A ; Conforti, R ; Dumas, M ; La Rosa, M ; Polyvyanyy, A (Springer Verlag, 2019-05)
    The problem of automated discovery of process models from event logs has been intensively researched in the past two decades. Despite a rich field of proposals, state-of-the-art automated process discovery methods suffer from two recurrent deficiencies when applied to real-life logs: (i) they produce large and spaghetti-like models; and (ii) they produce models that either poorly fit the event log (low fitness) or over-generalize it (low precision). Striking a trade-off between these quality dimensions in a robust and scalable manner has proved elusive. This paper presents an automated process discovery method, namely Split Miner, which produces simple process models with low branching complexity and consistently high and balanced fitness and precision, while achieving considerably faster execution times than state-of-the-art methods, measured on a benchmark covering twelve real-life event logs. Split Miner combines a novel approach to filter the directly-follows graph induced by an event log, with an approach to identify combinations of split gateways that accurately capture the concurrency, conflict and causal relations between neighbors in the directly-follows graph. Split Miner is also the first automated process discovery method that is guaranteed to produce deadlock-free process models with concurrency, while not being restricted to producing block-structured process models.