DeepMind turns its attention to AI safety.

The folks at DeepMind are seeking to contribute to AI safety. They have designed a 2D testing environment for algorithms. The environment does not purport to cover every possible AI safety problem: interpretability, multi-agent and formal verification problems, for example, are left out. But a decent number are covered, including safe interruptibility, absent supervisor and self-modification. It’s definitely a start. The Donald Rumsfeld problem, however, is always out there: what about the unknown unknowns?

Link to paper:

