JALAD: Joint Accuracy- and Latency-Aware Deep Structure Decoupling for Edge-Cloud Execution

التفاصيل البيبلوغرافية
العنوان: JALAD: Joint Accuracy- and Latency-Aware Deep Structure Decoupling for Edge-Cloud Execution
المؤلفون: Li, Hongshan, Hu, Chenghao, Jiang, Jingyan, Wang, Zhi, Wen, Yonggang, Zhu, Wenwu
سنة النشر: 2018
المجموعة: Computer Science
مصطلحات موضوعية: Computer Science - Distributed, Parallel, and Cluster Computing
الوصف: Recent years have witnessed a rapid growth of deep-network based services and applications. A practical and critical problem thus has emerged: how to effectively deploy the deep neural network models such that they can be executed efficiently. Conventional cloud-based approaches usually run the deep models in data center servers, causing large latency because a significant amount of data has to be transferred from the edge of network to the data center. In this paper, we propose JALAD, a joint accuracy- and latency-aware execution framework, which decouples a deep neural network so that a part of it will run at edge devices and the other part inside the conventional cloud, while only a minimum amount of data has to be transferred between them. Though the idea seems straightforward, we are facing challenges including i) how to find the best partition of a deep structure; ii) how to deploy the component at an edge device that only has limited computation power; and iii) how to minimize the overall execution latency. Our answers to these questions are a set of strategies in JALAD, including 1) A normalization based in-layer data compression strategy by jointly considering compression rate and model accuracy; 2) A latency-aware deep decoupling strategy to minimize the overall execution latency; and 3) An edge-cloud structure adaptation strategy that dynamically changes the decoupling for different network conditions. Experiments demonstrate that our solution can significantly reduce the execution latency: it speeds up the overall inference execution with a guaranteed model accuracy loss.
Comment: conference, copyright transfered to IEEE
نوع الوثيقة: Working Paper
DOI: 10.1109/PADSW.2018.8645013
الوصول الحر: http://arxiv.org/abs/1812.10027Test
رقم الانضمام: edsarx.1812.10027
قاعدة البيانات: arXiv