The Scramble to Save Twitter's Research From Elon Musk

Feb 2023

Two years ago, Twitter launched what is perhaps the tech industry's most ambitious attempt at algorithmic transparency. Its researchers wrote papers showing that Twitter's AI system for cropping images in tweets favored white faces and women, and that posts from the political right in several countries, including the US, UK, and France, received a bigger algorithmic boost than those from the left.

By early October last year, as Elon Musk faced a court deadline to complete his $44 billion acquisition of Twitter, the company's newest research was almost ready. It showed that a machine-learning program incorrectly demoted some tweets mentioning any of 350 terms related to identity, politics, or sexuality, including "gay," "Muslim," and "deaf," because a system intended to limit views of tweets slurring marginalized groups also impeded posts celebrating those communities. The finding--and a partial fix Twitter developed--could help other social platforms better use AI to police content. But would anyone ever get to read the research?

Musk had months earlier supported algorithmic transparency, saying he wanted to "open-source" Twitter's content recommendation code. On the other hand, Musk had said he would reinstate popular accounts permanently banned for rule-breaking tweets. He also had mocked some of the same communities that Twitter's researchers were seeking to protect and complained about an undefined "woke mind virus." Additionally disconcerting, Musk's AI scientists at Tesla generally have not published research.

Twitter's AI ethics researchers ultimately decided their prospects were too murky under Musk to wait to get their study into an academic journal or even to finish writing a company blog post. So less than three weeks before Musk finally assumed ownership on October 27, they rushed the moderation bias study onto the open-access service Arxiv, where scholars post research that has not yet been peer reviewed.


"We were rightfully worried about what this leadership change would entail," says Rumman Chowdhury, who was then engineering director on Twitter's Machine Learning Ethics, Transparency, and Accountability group, known as META. "There's a lot of ideology and misunderstanding about the kind of work ethics teams do as being part of some like, woke liberal agenda, versus actually being scientific work."

Concern about the Musk regime spurred researchers throughout Cortex, Twitter's machine-learning and research organization, to stealthily publish a flurry of studies much sooner than planned, according to Chowdhury and five other former employees. The results spanned topics including misinformation and recommendation algorithms. The frantic push and the published papers have not been previously reported.

The researchers wanted to preserve the knowledge discovered at Twitter for anyone to use and make other social networks better. "I feel very passionate that companies should talk more openly about the problems that they have and try to lead the charge, and show people that it's like a thing that is doable," says Kyra Yee, lead author of the moderation paper.

Twitter and Musk did not respond to a detailed emailed request for comment for this story.

The team behind another study worked through the night to make final edits before hitting Publish on Arxiv the day Musk took over Twitter, one researcher says, speaking anonymously out of fear of retaliation from Musk. "We knew the runway would shut down when the Elon jumbo jet landed," the source says. "We knew we needed to do this before the acquisition closed. We can stick a flag in the ground and say it exists."

The fear was not misplaced. Most of Twitter's researchers lost their jobs or resigned under Musk. On the META team, Musk laid off all but one person on November 4, and the remaining member, cofounder and research lead Luca Belli, quit later in the month.

Departing corporate researchers normally still have some collaborators at the company who can carry their work forward, submit to journals, and make edits. But the depth of cuts at Twitter has left science stranded. "I am not in position to make changes, which is really sad," says Belli on the prospects of journal acceptance.

But Belli posted a long-delayed team paper on Arxiv before he left. It was an inconclusive analysis that used county-level US Census race data to try to estimate whether Twitter's algorithms disproportionately displayed tweets from Black or White authors on home timelines. "I'm very grateful we were able to push the text out," Belli says. "The risk was potentially not having it all."

Also published on an accelerated schedule was a study on the significant effectiveness of Birdwatch, an experimental feature for crowdsourced fact-checking. The work was not necessarily secret. Some of the Birdwatch findings had appeared in an earlier company blog, and the race and language bias studies had been presented at academic conferences.

Not everything made it out on time. Lauren Fratamico, who was laid off in November, says she and her collaborator, who resigned later that month, had not received permission to publish from company attorneys by then. The researchers have continued emailing the attorneys, hoping they still may be able to publish the research, which evaluates different prompts to get users to reconsider posting potentially offensive tweets. "We've tried pinging lawyers," she says. "We've tried emailing Elon ourselves. We've had someone on the inside Slacking the lawyers."

Twitter has styled itself as the internet's public square, but to prevent total chaos it has deployed algorithms to help decide who gets the loudest metaphorical megaphone. The company's researchers investigated how to operate those systems more fairly, publishing their findings to help others inside and outside of Twitter understand potential improvements.

In that spirit, Yee and her colleagues on the META team a year ago had gone deep into the algorithm the company used to analyze every English-language tweet to flag any close to breaking Twitter's rules but not out of bounds--dubbed "marginally abusive" content. Tweets that score high are de-amplified, in the company's parlance, a process that blocks them from the feeds of users who do not follow the author and makes them less visible among the replies to a tweet. The author is also prompted to reconsider their message before it goes live.
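To make the de-amplification rule described above concrete, here is a minimal Python sketch. It is an illustration only, not Twitter's code: the threshold value, the `Visibility` type, and the function name `visibility_for` are all hypothetical, and the real system's cutoff and internals are not given in this article.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the real threshold is not public in this article.
DEAMPLIFY_THRESHOLD = 0.8


@dataclass
class Visibility:
    show_to_non_followers: bool  # may appear in feeds of users who do not follow the author
    downrank_in_replies: bool    # shown less prominently among replies
    prompt_author: bool          # author is asked to reconsider before posting


def visibility_for(score: float, threshold: float = DEAMPLIFY_THRESHOLD) -> Visibility:
    """Map a 'marginally abusive' score to the three effects the article describes."""
    flagged = score >= threshold
    return Visibility(
        show_to_non_followers=not flagged,
        downrank_in_replies=flagged,
        prompt_author=flagged,
    )
```

Under these assumptions, `visibility_for(0.9)` describes a tweet hidden from non-followers' feeds, demoted in replies, and subject to an author prompt, while `visibility_for(0.1)` is left untouched.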

One of Twitter's first ethics projects years before had found that the system inaccurately labeled some tweets as marginally abusive and applied a fix for only a small set of words, Yee says. The team now wanted to develop a more comprehensive patch. Otherwise, "over-penalization risks hurting the very communities that these tools are meant to protect," Yee said at a conference last year.

Abuse-detection algorithms at other companies, including Google and Facebook, had been shown to struggle with African American vernacular, as well as traditionally hateful speech that targeted populations had reclaimed for their own use. The speaker and context matter significantly in these cases, but those variables can be lost on algorithms that developers have not set up well.

At the time Yee studied it, Twitter's marginal abuse system scored tweets for content considered insulting or malicious, or that encouraged dangerous behavior. The company trained it on a sample of tweets it hired people to rate.

META's novel automated analysis of tweets determined the 350 terms most strongly associated with tweets that had been inaccurately flagged as marginally abusive. The team grouped them into several categories, including identity-related (Chinese, deaf), geographies (Palestine, Africa), political identity (feminist, Tories), and current events (cop, abortions). For 33 terms for which researchers considered errors particularly concerning, including "queer" and "Jewish," they re-trained the machine-learning system with nearly 50,000 additional samples of the terms being used within Twitter's rules.
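The article does not describe how META's analysis worked, so the following Python sketch is a simple stand-in rather than the team's method. It ranks terms by how often non-abusive tweets containing them were flagged anyway. The function name, the `(text, flagged, abusive)` tuple layout, and the whitespace tokenizer are all assumptions made for illustration.

```python
from collections import Counter


def false_positive_rate_by_term(tweets, min_count=1):
    """Rank terms by the share of benign tweets containing them that were flagged.

    `tweets` is an iterable of (text, flagged, abusive) tuples, where `flagged`
    is the model's decision and `abusive` is a human label (hypothetical layout).
    """
    benign = Counter()     # benign tweets containing each term
    false_pos = Counter()  # benign tweets containing each term that were flagged
    for text, flagged, abusive in tweets:
        if abusive:
            continue  # only benign tweets can be false positives
        for term in set(text.lower().split()):
            benign[term] += 1
            if flagged:
                false_pos[term] += 1
    rates = {t: false_pos[t] / n for t, n in benign.items() if n >= min_count}
    # Highest false-positive rate first; ties broken alphabetically.
    return sorted(rates.items(), key=lambda kv: (-kv[1], kv[0]))
```

A real analysis would need far more data, proper tokenization, and a statistical test for association; the `min_count` parameter here is only a crude guard against ranking terms seen once or twice.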

The adjusted model wrongly flagged tweets less often for some of the terms without substantially worsening the system's overall ability to predict whether something was actually problematic. Although the improvements were small and not universal, the team considered the adjusted model better and deployed it onto Twitter mid-2022. Yee says the update laid a foundation for future research and updates. "This should really be seen not as the ending point, with like a perfect bow on it, but rather the starting point," she says.

But Twitter has lost at least two-thirds of its workforce in its first quarter under Musk, and work on further changes to abuse detection has likely stopped--just one of many projects put to rest, which could contribute to Twitter becoming a worse place to be over time.

"There's nobody there to do this work. There's nobody there who cares. There's no one picking up the phone," Chowdhury says. The META team's projects had included testing a dashboard to identify in real time the amplification of different political parties on Twitter in hopes of catching bot networks seeking to manipulate discussions.

By some accounts, Twitter is already growing sour. A survey by Amnesty International USA and other groups published this month of 20 popular LGBTQ+ organizations and personalities found that 12 had experienced increased hate and abusive speech on Twitter since Musk became owner; the rest had not noticed a change.

Shoshana Goldberg, director of public education and research at the Human Rights Campaign Foundation, says much of the most-viewed hate speech comes from relatively few accounts, and even with few researchers left, Musk can help marginalized communities on Twitter. "I appreciate when companies are willing to take an internal look," Goldberg says. "But we also kind of know who is doing this, and not enough is being done to address this."
