For the fourth course, we will discuss Wasserstein distance and Optimal Transport. Last week, we mentioned distances, dissimilarity and divergences. But before talking about Wasserstein, we should mention Cramer distance. Cramer and Wasserstein distances The definition of Cramér distance, for , is while Wasserstein will be (also for ) If we consider cumulative distribution functions, for the first one (Cramer), we consider some sort of “vertical” distance, while for the second one (Wasserstein), we consider some “horizontal” one, Obviously, when , the two distances …