Language models are bad at moral reasoning

Language models still struggle with moral reasoning, despite their impressive performance on many other tasks. In particular, the Moral Scenarios task in MMLU (Massive Multitask Language Understanding) is among the worst-performing tasks for many language models, including GPT-3. In this work, we propose a new prompting framework, Thought Experiments, which teaches language models to reason about morality using counterfactuals. Experimental results show that our framework elicits counterfactual questions an...
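
To make the idea of counterfactual prompting concrete, here is a minimal sketch of what a multi-step "thought experiment" loop could look like. The step structure, prompt wording, and the `complete()` helper are all illustrative assumptions, not the paper's actual prompts or pipeline.

```python
def complete(prompt: str) -> str:
    """Hypothetical stand-in for a call to a language model API.

    Plug in whichever LLM client you use; this stub only marks where
    the model would be queried.
    """
    raise NotImplementedError("Wire this up to a language model of your choice.")


def thought_experiment_answer(scenario: str, question: str) -> str:
    # Step 1 (assumed): ask the model to pose a counterfactual about the scenario.
    counterfactual_q = complete(
        f"Scenario: {scenario}\n"
        "Pose a counterfactual question: how would the moral character of this "
        "action change if a key detail were different?"
    )

    # Step 2 (assumed): have the model answer its own counterfactual question.
    counterfactual_a = complete(
        f"Scenario: {scenario}\n"
        f"Counterfactual question: {counterfactual_q}\n"
        "Answer the counterfactual question briefly."
    )

    # Step 3 (assumed): answer the original moral question, conditioning on the
    # counterfactual reasoning elicited above.
    return complete(
        f"Scenario: {scenario}\n"
        f"Thought experiment: {counterfactual_q}\n{counterfactual_a}\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

The design choice being illustrated is simply that the counterfactual question and its answer are generated first and then fed back into the final moral judgment prompt, rather than asking for a verdict in a single zero-shot call.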
