The goal of process mining is to use event data to extract process-related information.
2.1 Limitations of Modeling
Process mining can be viewed as the missing link between data science and process science.
Process mining starts from event data and uses process models in various ways. The example used here is a process model expressed in terms of a Petri net.
The model describes the handling of a request for compensation within an airline.
The process starts by registering a request. This activity is modeled by the transition register request. Each transition is represented by a square.
Transitions are connected through places that model possible states of the process. Each place is represented by a circle.
In a Petri net, a transition is enabled if each of its input places holds a token.
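The enabling rule can be sketched in a few lines of Python. This is only an illustration, not the book's notation: the place names (`start`, `c1`, ...), the net fragment, and the helper functions `is_enabled` and `fire` are assumptions made for the example.

```python
from collections import Counter

# Hypothetical fragment of a net: the place and transition names are
# illustrative assumptions. Each transition maps to (inputs, outputs).
TRANSITIONS = {
    "register request": (["start"], ["c1", "c2"]),
    "examine casually": (["c1"], ["c3"]),
    "check ticket": (["c2"], ["c4"]),
    "decide": (["c3", "c4"], ["c5"]),
}


def is_enabled(marking, transition):
    """A transition is enabled if every input place holds at least one token."""
    inputs, _ = TRANSITIONS[transition]
    return all(marking[place] >= 1 for place in inputs)


def fire(marking, transition):
    """Consume one token per input place and produce one per output place."""
    if not is_enabled(marking, transition):
        raise ValueError(f"{transition!r} is not enabled")
    inputs, outputs = TRANSITIONS[transition]
    new_marking = Counter(marking)
    new_marking.subtract(inputs)
    new_marking.update(outputs)
    return +new_marking  # unary plus drops places that now hold zero tokens


marking = Counter({"start": 1})
print(is_enabled(marking, "register request"))
marking = fire(marking, "register request")
print(sorted(marking.items()))
print(is_enabled(marking, "decide"))
```

A marking is stored as a `Counter` from place name to token count, so a place without tokens simply reads as zero.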
A process model used to configure a workflow management system is probably well-aligned with reality as the model is used to force people to work in a particular way. Unfortunately, most hand-made models are disconnected from reality and provide only an idealized view of the process.
The value of models is limited if too little attention is paid to the alignment of model and reality.
Given (a) the interest in process models, (b) the abundance of event data, and (c) the limited quality of hand-made models, it seems worthwhile to relate event data to process models. This way the actual processes can be discovered and existing process models can be evaluated and enhanced. This is precisely what process mining aims to achieve.
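My current understanding of Section 2.1:
Using models to analyze real-world situations has limitations, because a model may be too idealized and detached from reality.
In practice the model is usually defined first and implemented afterwards, which makes it easy to drift away from reality. The goal of process mining is to extract the existing process and then analyze, improve, and enrich it.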
2.2 Process Mining
In the BPM life-cycle, process models play a dominant role in the (re)design and configuration/implementation phases, whereas data plays a dominant role in the enactment/monitoring and diagnosis/requirements phases.
Process mining offers the possibility to truly "close" the BPM life-cycle.
The idea of process mining is to discover, monitor and improve real processes by extracting knowledge from event logs readily available in today's systems.
Most information systems store such information in unstructured form. Data extraction is an integral part of any process mining effort.
Event logs can be used to conduct three types of process mining: discovery, conformance, and enhancement.
- Discovery: A discovery technique takes an event log and produces a model without using any a-priori information. An example is the α-algorithm, which takes an event log and produces a Petri net explaining the behavior recorded in the log.
- Conformance: Here, an existing process model is compared with an event log of the same process. Conformance checking can be used to check if reality, as recorded in the log, conforms to the model and vice versa.
- Enhancement: Here, the idea is to extend or improve an existing process model using information about the actual process recorded in some event log. It aims at changing or extending the a-priori model.
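To make discovery a little more concrete, the sketch below computes the directly-follows pairs of a small log and classifies each activity pair as causal, parallel, or unrelated. This is only the ordering-relation step that the α-algorithm builds on, not the full algorithm, and it produces no Petri net. The log `LOG` and the function names are hypothetical.

```python
from itertools import product

# Hypothetical event log: each trace is the ordered activity list of one case.
LOG = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["a", "e", "d"],
]


def directly_follows(log):
    """Collect every pair (x, y) where y immediately follows x in some trace."""
    return {(x, y) for trace in log for x, y in zip(trace, trace[1:])}


def footprint(log):
    """Classify every ordered activity pair based on the directly-follows pairs."""
    df = directly_follows(log)
    activities = sorted({act for trace in log for act in trace})
    relations = {}
    for x, y in product(activities, repeat=2):
        forward, backward = (x, y) in df, (y, x) in df
        if forward and backward:
            relations[(x, y)] = "||"  # both orders observed: parallel
        elif forward:
            relations[(x, y)] = "->"  # only x before y observed: causality
        elif backward:
            relations[(x, y)] = "<-"  # only y before x observed
        else:
            relations[(x, y)] = "#"   # never directly adjacent
    return relations


relations = footprint(LOG)
for pair in [("a", "b"), ("b", "c"), ("b", "e")]:
    print(pair, relations[pair])
```

In this toy log `b` and `c` occur in both orders, so they are classified as parallel, while `b` and `e` never follow each other directly.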
More and more process mining techniques can also be used in an online setting.
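My current understanding of Section 2.2:
Today's information systems record a lot of data, but most of it is not organized effectively and therefore cannot be used directly for process mining. Data extraction is an indispensable part of process mining.
The three uses of process mining:
(1) Discovering a model: the α-algorithm mentioned in the text generates a Petri net directly from a log, without any a-priori model.
(2) Checking conformance: one can check whether the model matches the log and, conversely, whether the log matches the model.
(3) Enhancing a model: the a-priori model can be extended based on the log.
Process mining is not limited to after-the-fact analysis and improvement; it can also be applied online.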
2.3 Analyzing an Example Log
One of the challenges of process mining is to balance between "overfitting" (the model is too specific and only allows for the "accidental behavior" observed) and "underfitting" (the model is too general and allows for behavior unrelated to the behavior observed).
The α-algorithm is just one of many possible process discovery algorithms. For real-life logs more advanced algorithms are needed to better balance between "overfitting" and "underfitting" and to deal with "incompleteness" and "noise".
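The two extremes can be illustrated with a toy sketch. The traces and both "models" below are hypothetical: one model accepts exactly the observed traces (overfitting), the other accepts any sequence over the observed activities (underfitting). Neither corresponds to a real discovery algorithm.

```python
# Hypothetical log in which activity "f" sends the case back to "b".
OBSERVED = [
    ("a", "b", "d"),
    ("a", "b", "f", "b", "d"),
]
PLAUSIBLE_UNSEEN = ("a", "b", "f", "b", "f", "b", "d")  # the loop taken twice
UNRELATED = ("d", "f", "a")                             # no sensible ordering


def enumerating_model(log):
    """Overfitting extreme: accept only the traces seen in the log."""
    allowed = set(log)

    def accepts(trace):
        return trace in allowed

    return accepts


def flower_model(log):
    """Underfitting extreme: accept any sequence of observed activities."""
    activities = {act for trace in log for act in trace}

    def accepts(trace):
        return all(act in activities for act in trace)

    return accepts


overfit = enumerating_model(OBSERVED)
underfit = flower_model(OBSERVED)
for name, trace in [("plausible", PLAUSIBLE_UNSEEN), ("unrelated", UNRELATED)]:
    print(name, "overfit:", overfit(trace), "underfit:", underfit(trace))
```

The overfitting model rejects the plausible trace only because it was never observed, while the underfitting model also accepts the unrelated one. A useful discovered model should sit between these two.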
Note that conformance can be viewed from two angles: (a) the model does not capture the real behavior ("the model is wrong") and (b) reality deviates from the desired model ("the event log is wrong").
Using process mining, the different perspectives can be cross-correlated to find surprising insights.
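My current understanding of Section 2.3:
Section 2.3 analyzes the airline compensation model in more detail.
Process mining has to find a balance between being "too specific" (overfitting) and "too general" (underfitting); only then is the resulting model valuable.
There are many algorithms besides the α-algorithm; more advanced ones are better at finding that balance and at avoiding problems such as incompleteness and noise.
Process mining is not limited to the control-flow model; other perspectives such as time can be added to uncover further insights.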
2.4 Play-In, Play-Out, and Replay
One of the key elements of process mining is the emphasis on establishing a strong relation between a process model and "reality" captured in the form of an event log.
There are three ways of relating event logs and process models:
- Play-Out: Play-Out refers to the classical use of process models. Given a Petri net, it is possible to generate behavior.
- Play-In: Play-In is the opposite of Play-Out, i.e., example behavior is taken as input and the goal is to construct a model. Play-In is often referred to as inference.
- Replay: Replay uses an event log and a process model as input. The event log is "replayed" on top of the process model.
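Play-Out and Replay can be sketched on a tiny net. Everything here is an assumption for illustration: the net `NET`, its place names, and the simplified token counting in `replay`, which only reports missing and leftover tokens and is not a full conformance metric.

```python
import random
from collections import Counter

# Hypothetical net: "a", then a choice between "b" and "c", then "d".
# Each transition maps to (input places, output places).
NET = {
    "a": (["start"], ["p1"]),
    "b": (["p1"], ["p2"]),
    "c": (["p1"], ["p2"]),
    "d": (["p2"], ["end"]),
}
INITIAL = Counter({"start": 1})
FINAL = Counter({"end": 1})


def enabled(marking):
    """Return all transitions whose input places each hold a token."""
    return [t for t, (ins, _) in NET.items() if all(marking[p] >= 1 for p in ins)]


def play_out(rng):
    """Play-Out: fire randomly chosen enabled transitions until none is left."""
    marking, trace = Counter(INITIAL), []
    choices = enabled(marking)
    while choices:
        t = rng.choice(choices)
        ins, outs = NET[t]
        marking.subtract(ins)
        marking.update(outs)
        trace.append(t)
        choices = enabled(marking)
    return trace


def replay(trace):
    """Replay: force the trace through the net, counting token problems."""
    marking, missing = Counter(INITIAL), 0
    for t in trace:
        ins, outs = NET[t]
        for p in ins:
            if marking[p] >= 1:
                marking[p] -= 1
            else:
                missing += 1  # the model does not allow this step here
        marking.update(outs)
    for p, n in FINAL.items():  # consume the expected final marking
        taken = min(marking[p], n)
        marking[p] -= taken
        missing += n - taken
    remaining = sum(marking.values())  # tokens left behind in the net
    return missing, remaining


print(play_out(random.Random(0)))
print(replay(["a", "b", "d"]))
print(replay(["a", "d"]))
```

A trace that fits the model replays with no missing and no remaining tokens. Skipping the choice between `b` and `c` leaves one token missing and one token stranded, which is the kind of deviation conformance checking reports.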
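My current understanding of Section 2.4:
This section describes three relations.
- Play-In: given an event log, produce a process model (the α-algorithm, etc.).
- Play-Out: given a process model, generate an event log (the traditional use of a process model).
- Replay: given an event log and a process model, derive results from the two together (used for conformance checking and for extending the model).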