Language Model Using Differentiable Neural Computer Based on Forget Gate-Based Memory Deallocation

Lee, Donghyun; Park, Hosung; Seo, Soonshin; Kim, Changmin; Son, Hyunsoo; Kim, Gyujin; Kim, Ji-Hwan

doi:10.32604/cmc.2021.015430

Open Access

ARTICLE

by Donghyun Lee, Hosung Park, Soonshin Seo, Changmin Kim, Hyunsoo Son, Gyujin Kim, Ji-Hwan Kim*

Department of Computer Science and Engineering, Sogang University, Seoul, 04107, Korea

* Corresponding Author: Ji-Hwan Kim.

Computers, Materials & Continua 2021, 68(1), 537-551. https://doi.org/10.32604/cmc.2021.015430

Received 20 November 2020; Accepted 02 February 2021; Issue published 22 March 2021

Abstract

A differentiable neural computer (DNC) is analogous to the Von Neumann machine, with a neural network controller that interacts with an external memory through an attention mechanism. DNCs offer a generalized method for task-specific deep learning models and have demonstrated reliability on reasoning problems. In this study, we apply a DNC to a language model (LM) task. The LM task is a reasoning problem because it requires predicting the next word from the preceding word sequence. However, memory deallocation is a problem in DNCs: some information unrelated to the input sequence is not deallocated and remains in the external memory, which degrades performance. Therefore, we propose a forget gate-based memory deallocation (FMD) method, which searches for the minimum value among the elements of a forget gate-based retention vector. This vector indicates the degree to which the information stored at each external memory address is retained. In experiments, we applied our proposed DNC architecture to LM tasks as a task-specific example and to rescoring for speech recognition as a general-purpose example. For LM tasks, we evaluated the DNC on the Penn Treebank and enwik8 LM tasks. Although it does not yield state-of-the-art (SOTA) results on LM tasks, the FMD method shows a relative improvement over the DNC in terms of bits-per-character. For the speech recognition rescoring task, FMD again showed a relative improvement in word error rate on the LibriSpeech data.
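The minimum search described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. The abstract does not say how the retention vector is computed from the forget gate, so the vector is taken here as a given input. Clearing the selected memory row to zeros is likewise an assumption about what deallocation does, and the function name `deallocate_min_retention` is hypothetical.

```python
def deallocate_min_retention(memory, retention):
    """Clear the memory row whose retention value is smallest.

    memory:    list of rows, one row per external memory address
    retention: one retention value per address (assumed to be given;
               its derivation from the forget gate is not shown here)

    Returns a new memory (the input is not modified) and the index of
    the address that was cleared. Zeroing the row is an assumption
    made for illustration only.
    """
    if len(memory) != len(retention):
        raise ValueError("memory and retention must have the same length")
    # Search for the address holding the minimum retention value.
    target = min(range(len(retention)), key=retention.__getitem__)
    # Copy every row so the caller's memory is left untouched.
    new_memory = [list(row) for row in memory]
    new_memory[target] = [0.0] * len(new_memory[target])
    return new_memory, target


# Toy example: three addresses, two values stored at each.
memory = [[0.5, 0.1], [0.3, 0.8], [0.9, 0.4]]
retention = [0.9, 0.2, 0.7]
new_memory, cleared = deallocate_min_retention(memory, retention)
print(cleared, new_memory)
```

In this toy example the second address has the lowest retention value, so it is the one selected for clearing, while the other rows are copied unchanged.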

Keywords

Forget gate-based memory deallocation; differentiable neural computer; language model; forget gate-based retention vector

Cite This Article

APA Style

Lee, D., Park, H., Seo, S., Kim, C., Son, H. et al. (2021). Language model using differentiable neural computer based on forget gate-based memory deallocation. Computers, Materials & Continua, 68(1), 537-551. https://doi.org/10.32604/cmc.2021.015430

Vancouver Style

Lee D, Park H, Seo S, Kim C, Son H, Kim G, et al. Language model using differentiable neural computer based on forget gate-based memory deallocation. Comput Mater Contin. 2021;68(1):537-551. https://doi.org/10.32604/cmc.2021.015430

IEEE Style

D. Lee et al., “Language Model Using Differentiable Neural Computer Based on Forget Gate-Based Memory Deallocation,” Comput. Mater. Contin., vol. 68, no. 1, pp. 537-551, 2021. https://doi.org/10.32604/cmc.2021.015430

Copyright © 2021 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
