Exploiting Distributional Temporal Difference Learning to Deal with Tail Risk
Author: Bossaerts, P.; Huang, S.; Yadav, N.
Document Type: Journal Article
Citation: Bossaerts, P., Huang, S. & Yadav, N. (2020). Exploiting Distributional Temporal Difference Learning to Deal with Tail Risk. Risks, 8 (4), pp. 1-20. https://doi.org/10.3390/risks8040113.
Access Status: Open Access
Abstract: In traditional Reinforcement Learning (RL), agents learn to optimize actions in a dynamic context based on recursive estimation of expected values. We show that this form of machine learning fails when rewards (returns) are affected by tail risk, i.e., leptokurtosis. Here, we adapt a recent extension of RL, called distributional RL (disRL), and introduce estimation efficiency, while properly adjusting for the differential impact of outliers on the two terms of the RL prediction error in the updating equations. We show that the resulting "efficient distributional RL" (e-disRL) learns much faster, and is robust once it settles on a policy. Our paper also provides a brief, nontechnical overview of machine learning, focusing on RL.
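To illustrate the contrast the abstract draws, the sketch below compares a classical TD-style mean estimate with a generic quantile-based distributional update on a heavy-tailed (leptokurtic) reward stream. This is a minimal illustration of the distributional idea, not the paper's e-disRL estimator; the reward distribution, step size, and number of quantiles are all assumptions made for the example.

```python
import random

random.seed(0)

def heavy_tailed_reward():
    # Illustrative leptokurtic reward: Cauchy-like noise around 1.0,
    # built as a ratio of Gaussians (an assumption, not the paper's data).
    u = random.gauss(0.0, 1.0)
    v = abs(random.gauss(0.0, 1.0)) + 1e-8
    return 1.0 + 0.1 * (u / v)

alpha = 0.05        # step size (assumed)
v_mean = 0.0        # classical TD: recursive estimate of the expected value

N = 11              # distributional view: track N quantiles of the return
taus = [(i + 0.5) / N for i in range(N)]
q = [0.0] * N

for _ in range(5000):
    r = heavy_tailed_reward()
    # Mean update: the increment scales with (r - v_mean), so a single
    # extreme outlier can move the estimate arbitrarily far.
    v_mean += alpha * (r - v_mean)
    # Quantile-regression update: the increment depends only on the sign
    # of the error, so each outlier moves q[i] by at most alpha.
    for i, tau in enumerate(taus):
        q[i] += alpha * (tau - (1.0 if r < q[i] else 0.0))

robust_value = q[N // 2]  # median quantile as an outlier-robust statistic
```

Because the quantile update is bounded per step, the median quantile settles near the distribution's center even when outliers keep perturbing the mean estimate, which is the intuition behind using distributional estimates under tail risk.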