@@ -32,9 +32,9 @@ Four experiments will be conducted that are widely used as MFG domains wherethe
Consider a 2D grid world of dimension $11 \times 11$. The action set is $\mathcal{A} = \{\text{up}, \text{down}, \text{left}, \text{right}, \text{stay}\}$. The dynamics are:
$$
x_{n+1}=x_n+a_n+\epsilon_n
$$
where $\epsilon_n$ is environment noise that perturbs each agent's movement: $\epsilon_n = 0$ (no perturbation) with probability $0.9$, and $\epsilon_n$ is a unit step in each of the four directions with probability $0.025$ per direction.
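As a minimal sketch, one transition of these dynamics could be sampled as follows. The coordinate convention for the actions, the clipping of positions at the grid boundary, and the names `GRID`, `ACTIONS`, `NOISE`, `NOISE_PROBS`, and `step` are illustrative assumptions that the text does not specify.

```python
import random

GRID = 11  # side length of the 11 x 11 grid world

# Actions as (dx, dy) displacements; the axis orientation is an assumption.
ACTIONS = {
    "up": (0, 1),
    "down": (0, -1),
    "left": (-1, 0),
    "right": (1, 0),
    "stay": (0, 0),
}

# Noise outcomes: no perturbation (0.9) or one of four unit steps (0.025 each).
NOISE = [(0, 0), (0, 1), (0, -1), (-1, 0), (1, 0)]
NOISE_PROBS = [0.9, 0.025, 0.025, 0.025, 0.025]


def step(x, action, rng):
    """Sample x_{n+1} = x_n + a_n + eps_n for one agent.

    Positions are clipped to stay on the grid; this boundary rule is an
    assumption, since the text does not say how walls are handled.
    """
    a = ACTIONS[action]
    eps = rng.choices(NOISE, weights=NOISE_PROBS)[0]
    return tuple(min(max(x[i] + a[i] + eps[i], 0), GRID - 1) for i in range(2))
```

For example, `step((5, 5), "up", random.Random(0))` returns the agent's next cell, which differs from the intended cell `(5, 6)` only when the noise term is non-zero.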
The reward function will discourage agents from being in a crowded location: $$ r(x, a, \mu)=-\log (\mu(x))-\frac{1}{|X|}|a| $$
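The reward can be evaluated with a short sketch like the one below. It assumes that $|X|$ is the number of grid cells (121), that $|a|$ is the length of the move (0 for stay, 1 otherwise), and that $\mu$ is stored as a dictionary from cells to probabilities. The small constant `eps` guarding against $\log 0$ is also an addition, not part of the formula.

```python
import math


def reward(x, action, mu, n_states=121, eps=1e-12):
    """Evaluate r(x, a, mu) = -log(mu(x)) - |a| / |X|.

    Assumptions: |a| is 0 for "stay" and 1 for any other action, |X| is the
    number of grid cells, and eps avoids log(0) on empty cells.
    """
    move_cost = 0.0 if action == "stay" else 1.0
    return -math.log(mu[x] + eps) - move_cost / n_states
```

Under a uniform distribution every cell yields the same crowd term, so the only difference between actions is the small movement cost; a cell holding more mass than its neighbours yields a strictly lower reward.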