
Computing Location-Based Lineage from Workflow Specifications to Optimize Provenance Queries

Full APA Reference

Dey, S., Köhler, S., Bowers, S., & Ludäscher, B. (2015). Computing Location-Based Lineage from Workflow Specifications to Optimize Provenance Queries. In B. Ludäscher & B. Plale (Eds.), Provenance and Annotation of Data and Processes (Vol. 8628, pp. 180-193). Springer International Publishing. doi:10.1007/978-3-319-16462-5_14

Publication Abstract

We present a location-based approach for executing provenance lineage queries that significantly reduces query execution cost without incurring additional storage costs. The key idea of our approach is to exploit the fact that provenance graphs resemble the workflow graphs that generated them and that many workflow computation models assume workflow steps have statically defined data consumption-production (i.e., data input-output) rates. We describe a new lineage computation technique that uses the structure of workflow specifications together with consumption-production rates to pre-compute (i.e., to forecast) the access paths of all dependent data items prior to workflow execution. We also present experimental results showing that our approach can significantly outperform traditional data lineage query techniques. © Springer International Publishing Switzerland 2015.
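
Illustrative sketch (not taken from the paper): the idea of forecasting lineage from fixed consumption-production rates can be shown on a simple linear pipeline. The model below is an assumption made for illustration only: each step reads a fixed number of tokens and writes a fixed number of tokens per firing, and tokens are identified by their position in a channel. The names `Step`, `input_range`, and `lineage` are hypothetical and do not come from the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """A workflow step with fixed per-firing rates (hypothetical model)."""
    name: str
    consume: int  # tokens read from the input channel per firing
    produce: int  # tokens written to the output channel per firing


def input_range(step: Step, out_index: int) -> range:
    """Input positions that the output token at `out_index` depends on."""
    firing = out_index // step.produce  # which firing produced this token
    return range(firing * step.consume, (firing + 1) * step.consume)


def lineage(pipeline: list[Step], out_index: int) -> list[tuple[str, list[int]]]:
    """Walk a linear pipeline backwards using index arithmetic only.

    Returns, for each step from last to first, the input positions that the
    requested final output token depends on. No provenance trace is read.
    """
    positions = {out_index}
    result = []
    for step in reversed(pipeline):
        deps = set()
        for pos in positions:
            deps.update(input_range(step, pos))
        result.append((step.name, sorted(deps)))
        positions = deps  # inputs of this step are outputs of the previous one
    return result


# Step A merges two tokens into one; step B expands one token into three.
pipeline = [Step("A", consume=2, produce=1), Step("B", consume=1, produce=3)]
print(lineage(pipeline, 4))
```

Because the dependencies follow from the rates alone, the positions can be computed before the workflow runs and then used to look up stored data items directly, rather than traversing a recorded provenance graph edge by edge.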