Research finds AI training system filled with misogynistic, pornographic and racist data

‘Extremely problematic’ material was found in what claims to be the world’s largest openly available dataset

Abeba Birhane and Vinay Uday Prabhu have previously found concerning material in other datasets. Last year their findings led the Massachusetts Institute of Technology to withdraw an 80-million-image library and call for researchers to cease using it.

A UCD researcher has found a high level of misogynistic, pornographic and racist material in a newly released library designed to train artificial intelligence systems.

Abeba Birhane, a cognitive science PhD student in University College Dublin’s Complex Software Lab, worked with Vinay Uday Prabhu, an independent researcher, and Emmanuel Kahembwe, from the University of Edinburgh, on the project.

The LAION-400M dataset was released in August and claims to be the world’s largest openly ...