With Mario Ghossoub, Alexander Schied, and our PhD student Hongda Hu, we recently uploaded a paper entitled "Multiarmed Bandits Problem Under the Mean-Variance Setting" to arXiv. The classical multi-armed bandit (MAB) problem involves a learner and a collection of K independent arms, each with its own reward distribution, unknown ex ante. In each of a finite number of rounds, the learner selects one arm and receives new information. The learner often faces an exploration-exploitation dilemma: exploiting the current information by playing the … <a href=“https://freakonometric …
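To make the classical setting described above concrete, here is a minimal Python sketch of a learner facing K arms over a finite number of rounds. It is only an illustration, not the method of the paper: it uses the textbook epsilon-greedy rule with the usual expected-reward criterion rather than a mean-variance one, and the function name `epsilon_greedy`, the Gaussian unit-variance rewards, the arm means, and the value of `epsilon` are all assumptions chosen for the example.

```python
import random


def epsilon_greedy(means, rounds, epsilon=0.1, seed=0):
    """Play a K-armed bandit with Gaussian rewards (an assumption).

    Returns the number of pulls per arm and the total reward collected.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k        # how many times each arm has been played
    estimates = [0.0] * k   # running sample mean of each arm's reward
    total = 0.0
    for t in range(rounds):
        if t < k:
            arm = t  # play every arm once to initialise the estimates
        elif rng.random() < epsilon:
            arm = rng.randrange(k)  # explore: pick an arm uniformly at random
        else:
            # exploit: pick the arm with the best estimate so far
            arm = max(range(k), key=lambda a: estimates[a])
        # The learner only observes the reward of the arm it played.
        reward = rng.gauss(means[arm], 1.0)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return counts, total


# Hypothetical example: three arms whose true means are hidden from the learner.
counts, total = epsilon_greedy(means=[0.1, 0.5, 0.9], rounds=5000)
print("pulls per arm:", counts)
print("average reward:", round(total / 5000, 3))
```

The `epsilon` parameter controls the trade-off directly: a larger value spends more rounds gathering information about all arms, while a smaller value relies more heavily on the current estimates.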