Our joint paper with Romuald Elie and Carl Remlinger, entitled Reinforcement Learning in Economics and Finance, just appeared in Computational Economics. Reinforcement learning algorithms describe how an agent can learn an optimal action policy in a sequential decision process, through repeated experience. In a given environment, the agent's policy yields running and terminal rewards. As in online learning, the agent learns sequentially. As in multi-armed bandit problems, when an agent picks an action, he cannot infer ex post the rewards induced by … <a href=“https://freakonomet