Bibliographic details
Title: |
EA-CG: An Approximate Second-Order Method for Training Fully-Connected Neural Networks |
Authors: |
Chen, Sheng-Wei; Chou, Chun-Nan; Chang, Edward Y. |
Publication year: |
2018 |
Collection: |
ArXiv.org (Cornell University Library) |
Subject terms: |
Computer Science - Machine Learning |
Description: |
For training fully-connected neural networks (FCNNs), we propose a practical approximate second-order method comprising 1) an approximation of the Hessian matrix and 2) a conjugate gradient (CG) based method. The proposed approximate Hessian matrix is memory-efficient and can be applied to any FCNN whose activation and criterion functions are twice differentiable. We devise a CG-based method that incorporates a rank-one approximation to derive Newton directions for training FCNNs, which significantly reduces both space and time complexity. This CG-based method can be employed to solve any linear system whose coefficient matrix is Kronecker-factored, symmetric, and positive definite. Empirical studies show the efficacy and efficiency of the proposed method. ; Comment: Changed to the AAAI-19 version |
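The description's last technical claim, that CG can solve a linear system whose coefficient matrix is Kronecker-factored, symmetric, and positive definite, can be illustrated with a minimal sketch. This is not the authors' EA-CG algorithm and omits their Hessian approximation and rank-one step; it is plain textbook CG in NumPy, assuming a coefficient matrix of the form A ⊗ B with both factors symmetric positive definite. The function names `kron_matvec` and `conjugate_gradient` are hypothetical, chosen here for illustration only.

```python
import numpy as np


def kron_matvec(A, B, x):
    """Compute (A kron B) @ x without forming the Kronecker product.

    Uses the identity (A kron B) vec(X) = vec(A X B^T), where vec is
    NumPy's row-major flattening and X has shape (A.shape[0], B.shape[0]).
    """
    n, m = A.shape[0], B.shape[0]
    X = x.reshape(n, m)
    return (A @ X @ B.T).ravel()


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Textbook CG for a symmetric positive definite operator.

    `matvec` is any function returning the operator applied to a vector,
    so the coefficient matrix never has to be stored explicitly.
    """
    x = np.zeros_like(b, dtype=float)
    r = b.astype(float)            # residual b - M x, with x = 0
    p = r.copy()                   # first search direction
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Mp = matvec(p)
        alpha = rs / (p @ Mp)      # step length along p
        x += alpha * p
        r -= alpha * Mp
        rs_new = r @ r
        p = r + (rs_new / rs) * p  # next conjugate direction
        rs = rs_new
    return x
```

With factors of size n × n and m × m, each matrix-vector product here costs O(nm(n + m)) instead of the O(n²m²) needed for the explicit Kronecker product, and only the two factors are stored. A call would look like `conjugate_gradient(lambda v: kron_matvec(A, B, v), b)`.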
Document type: |
text |
Language: |
unknown |
Relation: |
http://arxiv.org/abs/1802.06502 |
Availability: |
http://arxiv.org/abs/1802.06502 |
Accession number: |
edsbas.DD9AF9D3 |
Database: |
BASE |