PubFlow is being tested in the context of Kiel Marine Science, where it is used to move research data out of the institutional archive, the OCN database, and into Pangaea (www.pangaea.de), the World Data Center for oceanographic data. We created the workflow depicted above as an evaluation scenario for this purpose.
In our evaluation scenario, users interact with the system through the Jira ticket system (www.atlassian.com/jira) and start the data publication workflow for a specific data set simply by creating a ticket. The PubFlow system then loads the workflow and executes it automatically. Whenever human interaction is needed, PubFlow contacts the person responsible for the ticket by posting a response to the ticket in Jira.
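The ticket-driven interaction described above can be illustrated with a minimal Python sketch. It is not PubFlow's implementation and does not use the Jira API: `Ticket`, `NeedsHumanInput`, `run_for_ticket`, and `demo_workflow` are hypothetical names, and an in-memory object stands in for a Jira ticket. It only shows the pattern of running a workflow for a new ticket and falling back to a ticket comment when a person has to step in.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    """Hypothetical stand-in for a Jira ticket; not the Jira API."""
    key: str
    reporter: str
    dataset_id: str
    comments: list = field(default_factory=list)


class NeedsHumanInput(Exception):
    """Raised by a workflow when it cannot continue without a person."""


def run_for_ticket(ticket, workflow):
    """Run a workflow for a new ticket; ask the reporter via a comment if it stalls."""
    try:
        result = workflow(ticket.dataset_id)
    except NeedsHumanInput as question:
        # Human interaction is needed: respond on the ticket instead of failing.
        ticket.comments.append(f"@{ticket.reporter}: {question}")
        return None
    ticket.comments.append(f"Publication finished: {result}")
    return result


def demo_workflow(dataset_id):
    # Illustrative rule only: without a data set ID the workflow cannot proceed.
    if not dataset_id:
        raise NeedsHumanInput("Which data set should be published?")
    return f"exported {dataset_id}"
```

In this sketch, creating a `Ticket` and passing it to `run_for_ticket` plays the role of a user opening a ticket; the comment list plays the role of the responses PubFlow posts in Jira.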
The first publication workflow we implemented for PubFlow publishes data collected by a CTD probe. This data is stored in an institutional repository and has to be transferred to the World Data Center MARE (Pangaea). The figure shows this workflow, which consists mainly of five steps. First, the information the scientist provided when starting the task is loaded from the ticket system into the workflow engine. Next, the data is fetched from the institutional repository. PubFlow then performs predefined mapping and conversion operations on the data set, so that the output is compatible with the format used by the world data center. Finally, the data is written to the specified output format and exported. Although it is still under active development, PubFlow has already proven to be a helpful tool for data managers by automating simple or periodic data management tasks.
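The steps of the CTD workflow can be sketched as a small pipeline. This is an illustrative Python sketch under stated assumptions, not PubFlow's code: the dictionaries `TICKETS` and `REPOSITORY` are in-memory stand-ins for the ticket system and the institutional repository, the column names in `COLUMN_MAP` and the tab-separated output are invented for the example and are not Pangaea's actual import format, and all function names are hypothetical.

```python
import csv
import io

# Hypothetical in-memory stand-ins for the ticket system and the repository.
TICKETS = {"PUB-1": {"dataset_id": "ctd-42", "reporter": "alice"}}
REPOSITORY = {
    "ctd-42": [
        {"depth_m": 5.0, "temp_c": 12.1, "sal_psu": 34.9},
        {"depth_m": 10.0, "temp_c": 11.8, "sal_psu": 35.0},
    ]
}

# Assumed mapping from repository columns to target columns (illustrative only).
COLUMN_MAP = {
    "depth_m": "Depth [m]",
    "temp_c": "Temperature [deg C]",
    "sal_psu": "Salinity",
}


def load_ticket_info(ticket_key):
    """Step 1: load what the scientist provided when creating the ticket."""
    return TICKETS[ticket_key]


def fetch_data(dataset_id):
    """Step 2: fetch the raw rows from the institutional repository."""
    return REPOSITORY[dataset_id]


def map_and_convert(rows):
    """Step 3: rename columns so they match the target format."""
    return [{COLUMN_MAP[name]: value for name, value in row.items()} for row in rows]


def write_output(rows):
    """Step 4: serialize the mapped rows as tab-separated text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(COLUMN_MAP.values()),
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def publish(ticket_key, outbox):
    """Run all steps; step 5 'exports' by placing the text in an outbox dict."""
    info = load_ticket_info(ticket_key)
    rows = map_and_convert(fetch_data(info["dataset_id"]))
    text = write_output(rows)
    outbox[info["dataset_id"]] = text
    return text
```

Calling `publish("PUB-1", outbox)` with an empty dictionary as `outbox` runs the whole chain for one ticket. Keeping each step in its own function mirrors the way the workflow is described: any single step, such as the mapping, can be replaced without touching the others.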
More information about how PubFlow works is available in our papers section.