Open Access
ARTICLE
Resource Allocation and Power Control Policy for Device-to-Device Communication Using Multi-Agent Reinforcement Learning
1 Beijing Key Laboratory of Work Safety Intelligent Monitoring, Beijing University of Posts and Telecommunications, Beijing, 100876, China.
2 Alibaba Cloud Computing, Hangzhou, 311121, China.
3 Department of Systems and Computer Engineering, Carleton University, Ottawa, K1S 5B6, Canada.
* Corresponding Author: Yifei Wei. Email: .