FORTE: an extensible framework for robustness and efficiency in data transfer pipelines

Bibliographic details
Title: FORTE: an extensible framework for robustness and efficiency in data transfer pipelines
Authors: Hilgendorf, Martin, 1999; Gulisano, Vincenzo Massimiliano, 1984; Papatriantafilou, Marina, 1966; Engström, Jan; Mishra, Binay
Source: Relaxed Semantics Across the Data Analytics Stack (RELAX); EPITOME - Sammanfattning och strukturering av kontinuerlig data i pipelines för samtidig behandling; 17th ACM International Conference on Distributed and Event-based Systems, DEBS 2023, Neuchatel, Switzerland; DEBS 2023 - Proceedings of the 17th ACM International Conference on Distributed and Event-based Systems, pp. 139-150
Subject terms: resource utilization, distributed processing, data transfer efficiency, data pipelines, internet of things, edge computing
Description: In the age of big data and growing product complexity, it is common to monitor many aspects of a product or system in order to extract well-founded intelligence, draw conclusions, and continue driving innovation. Automating and scaling processes in data pipelines becomes essential to keep pace with the increasing rates of data generated by such practices, while meeting security, governance, scalability and resource-efficiency demands. We present FORTE, an extensible framework for robustness and transfer efficiency in data pipelines. We identify sources of potential bottlenecks and explore the design space of approaches to deal with the challenges they pose. We study and evaluate the synergetic effects of data compression, in-memory processing and task scheduling on pipeline performance. A prototype of FORTE is implemented and studied in a use case at Volvo Trucks for high-volume production-level data sets, on the order of hundreds of gigabytes to terabytes per burst. Various general-purpose lossless data compression algorithms are evaluated in order to balance compression effectiveness against time in the pipeline. All in all, FORTE makes it possible to manage these trade-offs and achieve benefits in latency and sustainable rate (up to 1.8 times better) and in the effectiveness of resource utilisation, while also enabling additional features such as integrity verification, logging, monitoring and traceability, as well as cataloguing of transferred data. We also note that the resource-efficiency improvements achievable with FORTE, together with its extensibility, can imply further benefits for scheduling, orchestration and energy efficiency in such pipelines.
File description: electronic
Open access: https://research.chalmers.se/publication/537528
https://research.chalmers.se/publication/537528/file/537528_Fulltext.pdf
Database: SwePub
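
The description mentions evaluating general-purpose lossless compression algorithms to balance compression effectiveness against time, alongside integrity verification of transferred data. The following is a minimal sketch of what such a comparison could look like, not FORTE's implementation: the candidate codecs (Python's standard-library zlib, bz2 and lzma), the names `CODECS`, `sha256_hex` and `compare_codecs`, and the synthetic sample payload are all assumptions made for illustration. The record does not state which algorithms or checksums the authors used.

```python
import bz2
import hashlib
import lzma
import time
import zlib

# Hypothetical candidate set: standard-library lossless codecs only.
# The paper's actual candidates are not listed in this record.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def compare_codecs(payload: bytes) -> list[dict]:
    """Compress payload with each codec and report ratio, time and integrity."""
    source_digest = sha256_hex(payload)
    results = []
    for name, (compress, decompress) in CODECS.items():
        start = time.perf_counter()
        packed = compress(payload)
        elapsed = time.perf_counter() - start
        # Integrity check: the decompressed bytes must hash to the source digest.
        restored = decompress(packed)
        results.append({
            "codec": name,
            "ratio": len(payload) / len(packed),
            "seconds": elapsed,
            "intact": sha256_hex(restored) == source_digest,
        })
    return results


if __name__ == "__main__":
    # Synthetic, highly repetitive sample; real measurement data will behave differently.
    sample = b"timestamp=0;speed=80;rpm=1500\n" * 50_000
    for row in compare_codecs(sample):
        print(f"{row['codec']:>5}  ratio={row['ratio']:.1f}  "
              f"time={row['seconds']:.4f}s  intact={row['intact']}")
```

Ratios and timings from this sketch depend heavily on the payload and the machine, so they say nothing about the figures reported in the paper. The point is only the shape of the trade-off: a codec with a higher ratio may cost more compression time, and a digest comparison after decompression is one simple way to verify integrity.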