The Workflow Fragment Description Ontology (wf-fd) is a simple ontology for describing the common workflow fragments that graph mining techniques detect in a collection of workflows. Specifically, wf-fd links each common workflow fragment to the workflows where it appears.
The latest OWL encoding of the wf-fd ontology can be found here
A scientific workflow can be seen as a digital instrument that allows scientists to encode a scientific experiment as a set of computational or data manipulation steps. Scientific workflows play an important role in the reproducibility and replicability of scientific experiments, as well as in repurposing and reusing results from previous experiments [Goderis et al.].
Given their importance in the research lifecycle, scientific workflows are beginning to be included in scientific publications, together with datasets and other elements used in the context of an experiment. At the same time, repositories of workflows like myExperiment, Crowdlabs or Galaxy facilitate workflow publication, exchange and reuse. These repositories currently store thousands of workflows (referred to as workflow templates), which have been uploaded by scientists in many different domains (ranging from life sciences to text analytics or astronomy).
These workflow templates and the provenance associated with their executions are used for different purposes: detecting the source of an error in a particular execution, determining the similarity between workflows, automatically mining workflows to help in workflow design, or detecting common workflow fragments across the workflow dataset.
The Workflow Fragment Description ontology (Wf-fd) aims to model workflow fragments by connecting them to the different workflows to which they correspond. The objective of the ontology is twofold:
The Wf-fd ontology complements the work described in [Garijo et al.], which presents an approach for detecting common workflow fragments. Some of the descriptions and motivation used in this document are extracted from that publication.
Wf-fd extends the Ontology for provenance and plans (P-plan) to link workflow fragments to the workflow templates where they can be found. The following tables summarize the classes and properties that have been used to extend or complement P-plan to adapt it to our particular domain. No data properties are included, because Wf-fd does not define any:
As stated, the Wf-fd ontology aims to link workflow fragments to their occurrences in workflow templates. A workflow fragment is a p-plan:Plan, since a fragment of a workflow is a workflow itself (which is also a type of p-plan:Plan). A workflow fragment has steps (p-plan:Step), which represent the individual data manipulation steps of a particular fragment.
There are two types of workflow fragments: wffd:DetectedResultWorkflowFragment and wffd:TiedWorkflowFragment. The former refers to those workflow fragments found as a result of applying graph mining techniques to a workflow collection (i.e., the results of the algorithms). The latter represents how a particular result workflow fragment is bound to a part of a workflow template. This separation is necessary to point properly to the different parts of a workflow where a fragment appears. For instance, if a fragment appears twice in a workflow, then we need two wffd:TiedWorkflowFragments to group the workflow steps belonging to each occurrence. For this we use the relationship wffd:foundAs, which links a wffd:DetectedResultWorkflowFragment to the wffd:TiedWorkflowFragment that represents it in the workflow.
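The case of a fragment that appears twice in one workflow can be sketched with plain Python tuples standing in for RDF triples. This is only an illustration under stated assumptions: every ":"-prefixed resource name is invented for the example, and using p-plan:isStepOfPlan to attach workflow steps to a tied fragment is an assumption rather than something this document specifies.

```python
# Triples are plain (subject, predicate, object) tuples of prefixed names.
# All ":"-prefixed resource names are hypothetical, invented for this sketch.
RDF_TYPE = "rdf:type"

fragment_triples = [
    # One fragment returned by the graph mining algorithm.
    (":fragment1", RDF_TYPE, "wffd:DetectedResultWorkflowFragment"),
    # The fragment occurs twice in the same workflow, so it has two tied fragments.
    (":fragment1", "wffd:foundAs", ":tiedFragment1a"),
    (":fragment1", "wffd:foundAs", ":tiedFragment1b"),
    (":tiedFragment1a", RDF_TYPE, "wffd:TiedWorkflowFragment"),
    (":tiedFragment1b", RDF_TYPE, "wffd:TiedWorkflowFragment"),
    # Assumption: p-plan:isStepOfPlan groups workflow steps under a tied fragment.
    (":step1W1", "p-plan:isStepOfPlan", ":tiedFragment1a"),
    (":step2W1", "p-plan:isStepOfPlan", ":tiedFragment1a"),
    (":step4W1", "p-plan:isStepOfPlan", ":tiedFragment1b"),
    (":step5W1", "p-plan:isStepOfPlan", ":tiedFragment1b"),
]


def objects(triples, subject, predicate):
    """Return every object o such that (subject, predicate, o) is asserted."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects(triples, predicate, obj):
    """Return every subject s such that (s, predicate, obj) is asserted."""
    return [s for s, p, o in triples if p == predicate and o == obj]


def occurrences(triples, fragment):
    """Map each tied fragment of `fragment` to the workflow steps it groups."""
    return {
        tied: sorted(subjects(triples, "p-plan:isStepOfPlan", tied))
        for tied in objects(triples, fragment, "wffd:foundAs")
    }
```

Calling occurrences(fragment_triples, ":fragment1") returns one entry per tied fragment, so the two occurrences of the fragment stay separate even though they belong to the same workflow.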
Workflow fragments may be included in other workflow fragments. In order to capture this overlap among the detected result workflow fragments, we use the relationship
This facilitates querying the results, since fragments that are related to each other can be retrieved efficiently.
Another property that facilitates relating fragments is wffd:foundIn. It connects a wffd:DetectedResultWorkflowFragment to the workflow (p-plan:Plan) where it was found (i.e., where one or more of its wffd:TiedWorkflowFragments occur).
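As a rough sketch of how wffd:foundIn statements could be queried, the snippet below indexes detected fragments by workflow and finds the fragments that two workflows have in common. The triples are plain Python tuples, and the fragment and workflow names are hypothetical; a real deployment would more likely run an equivalent query against an RDF store.

```python
from collections import defaultdict

# Hypothetical wffd:foundIn statements, written as (subject, predicate, object).
found_in_triples = [
    (":fragment1", "wffd:foundIn", ":workflow1"),
    (":fragment1", "wffd:foundIn", ":workflow2"),
    (":fragment2", "wffd:foundIn", ":workflow2"),
    (":fragment2", "wffd:foundIn", ":workflow3"),
]


def workflows_by_fragment(triples):
    """Index each detected fragment by the set of workflows it was found in."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "wffd:foundIn":
            index[subject].add(obj)
    return dict(index)


def fragments_shared_by(triples, workflow_a, workflow_b):
    """Return, in sorted order, the fragments found in both workflows."""
    index = workflows_by_fragment(triples)
    return sorted(
        fragment
        for fragment, workflows in index.items()
        if workflow_a in workflows and workflow_b in workflows
    )
```

With these sample statements, fragments_shared_by(found_in_triples, ":workflow2", ":workflow3") selects only the fragments asserted for both workflows, without having to traverse the tied fragments.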
Workflow fragments are composed of steps (p-plan:Step). Since we are interested in representing the ordering of the steps in the result fragment, we use the property
between detected result fragment steps. The ordering of other workflow fragment steps, such as the ones belonging to a tied workflow fragment, is out of the scope of Wf-fd.
The Wf-fd complete diagram can be seen in Figure 1 below.
An example of usage of the Wf-fd ontology can be seen in Figure 2. Three workflows are represented at the top of the figure (Workflow 1, Workflow 2 and Workflow 3) with their steps coloured in orange. Each step has its URI shown above it (e.g., :step1W1), while its type is shown in angle brackets. To keep the figure simple, only the step type that is relevant for detecting the fragments is shown (e.g., <A>, <B>, etc.).
At the bottom of the figure, the two detected result workflow fragments are shown. As depicted in Figure 1, each workflow fragment step belongs to a workflow fragment. The detected result workflow fragments (shown in yellow) are linked to their tied workflow fragments (the parts of the original workflows where they appear, shown in green) with the wffd:foundAs relationship. Each fragment is also directly connected to the original workflow with the property wffd:foundIn.