Meta’s oversight board to probe subjective policy on AI sex image removals

Meta continues to slowly adapt Facebook and Instagram policies to account for increasing AI harms, this week confronting how it handles explicit deepfakes spreading on its platforms.

On Tuesday, the Meta Oversight Board announced that it will review two cases involving AI-generated sexualized images of female celebrities, both of which Meta initially handled unevenly, to “assess whether Meta’s policies and its enforcement practices are effective at addressing explicit AI-generated imagery.”

The board declined to name the famous women whose deepfakes are under review, in hopes of mitigating “risks of furthering harassment.”

In one case, an Instagram user reported a nude AI image created to “resemble a public figure from India” that was posted to an account that “only shares AI-generated images of Indian women.” Instead of taking down the deepfake, Meta automatically closed the user’s report “because it was not reviewed within 48 hours.” The user’s attempt to appeal the report was also automatically closed.

Meta ultimately left the deepfake up until the user appealed to the oversight board. A Meta spokesperson declined Ars’ request for comment on the delay in removing the image before the board intervened. “As a result of the Board selecting this case, Meta determined that its decision to leave the content up was in error and removed the post for violating the Bullying and Harassment Community Standard,” the board’s blog said.

A Facebook user had much better success in reporting a deepfake created to “resemble an American public figure” who was depicted nude “with a man groping her breast.” That AI-generated image was posted in “a Facebook group for AI creations” using a caption that named the famous woman, the board said.

In that case, another “user had already posted this image,” which prompted an escalation to Meta’s safety team. The team removed the content “as a violation of the Bullying and Harassment policy,” which bans “derogatory sexualized photoshop or drawings.”

Because Meta banned the image, it was also added to “Meta’s automated enforcement system that automatically finds and removes images that have already been identified by human reviewers as breaking Meta’s rules.”

In this case, the board’s review was requested by a Facebook user who had tried to share the AI image and whose initial protest of the removal decision was automatically closed by Meta.

Meta’s oversight board is currently reviewing both cases and will spend the next two weeks seeking public comments to help Meta’s platforms get up to speed on mitigating AI harms. Facebook and Instagram users, as well as organizations that “can contribute valuable perspectives,” have until April 30 to submit comments.

Comments can be shared anonymously, and commenters are asked to “avoid naming or otherwise sharing private information about third parties or speculating on the identities of the people depicted in the content of these cases.”

The board is specifically asking for comments on the “nature and gravity of harms posed by deepfake pornography,” “especially” for “women who are public figures.” It also wants experts to weigh in on how prevalent the deepfake problem is in the US and India, where the celebrities in these cases reside.

In December, India’s IT minister Rajeev Chandrasekhar called out celebrity deepfakes spreading on social media as “dangerous” and “damaging” misinformation that needs “to be dealt with by platforms,” the BBC reported. Other countries, including the US and, most recently, the United Kingdom, have proposed laws to criminalize the creation and sharing of explicit AI deepfakes online.

Meta’s board has asked public commenters to offer insights into the “policies and enforcement processes that may be most effective” at combating deepfakes and to weigh in on how well Meta is enforcing its “derogatory sexualized photoshop or drawings” rule. That includes fielding comments on “challenges” that users face when reporting deepfakes while Meta relies “on automated systems that automatically close appeals in 48 hours if no review has taken place.”

Once the board submits its recommendations, Meta will have 60 days to respond. In a blog post, Meta has already confirmed that it “will implement the board’s decision.”