Computer Vision News - November 2021

Figure 2: Emotion and wellbeing prediction performance improvements achieved through personalized multi-task learning (MTL-people).

Multi-agent coordination and communication: Current RL systems require many samples to learn effectively, which makes learning from human data prohibitively expensive. Instead, we can use multi-agent reinforcement learning (MARL) to learn in simulation, as a way to pre-train agents to coordinate effectively with humans. Social influence [16] proposed a unified mechanism for achieving both coordination and communication in MARL. Agents use counterfactual reasoning over a model of other agents to compute and maximize the degree of causal influence they exert on other agents' actions (see Figure 3). This mechanism led agents to cooperate more effectively, because they learned to use their actions to communicate useful information to other agents in order to gain influence (for example, signaling the presence of resources outside the field of view of other agents). Our results demonstrate that influence, which can be computed in a fully decentralized manner, led agents to cooperate more effectively than existing state-of-the-art multi-agent methods that relied on centralized training or privileged access to other agents' rewards or parameters.

Figure 3: Model used to compute counterfactual causal influence of the agent's action on other agents; requires both RL and supervised components.
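The counterfactual influence computation described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes discrete actions and a learned model of the other agent's policy conditioned on our action, and measures influence as the divergence between that conditional policy and the counterfactual marginal (averaging over actions we could have taken instead). The function name and array shapes are hypothetical.

```python
import numpy as np

def influence_reward(p_b_given_a, p_a):
    """Counterfactual causal influence of agent A's action on agent B.

    p_b_given_a: (n_actions_A, n_actions_B) learned model of B's policy,
                 conditioned on each possible action of A (hypothetical input)
    p_a:         (n_actions_A,) A's own policy, used to weight counterfactuals

    Returns, for each action A could take, the KL divergence between
    B's conditional policy and B's counterfactual marginal policy --
    i.e. how much A's action choice shifts B's behavior.
    """
    # Counterfactual marginal: B's action distribution if A's action
    # were resampled from A's own policy
    p_b = p_a @ p_b_given_a  # shape (n_actions_B,)
    # KL(p(a_B | a_A) || p(a_B)) for each candidate action a_A
    return np.sum(p_b_given_a * np.log(p_b_given_a / p_b), axis=1)

# If B's behavior depends strongly on A's action, influence is high;
# if B ignores A entirely, influence is zero for every action of A.
dependent = influence_reward(np.array([[0.9, 0.1],
                                       [0.1, 0.9]]), np.array([0.5, 0.5]))
ignoring = influence_reward(np.array([[0.6, 0.4],
                                      [0.6, 0.4]]), np.array([0.5, 0.5]))
```

Because the agent only needs its own policy and a model of the other agent (which can be trained with supervised learning from observations), this quantity can be computed in a fully decentralized manner, consistent with the claim in the text.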
