Researchers at Carnegie Mellon University have developed a system that uses artificial intelligence (AI) to rapidly analyze hundreds of thousands of comments on social media and identify the fraction that defend or sympathize with disenfranchised minorities such as the Rohingya community.
The Rohingyas, who began fleeing Myanmar in 2017 to avoid ethnic cleansing, are ill-equipped to defend themselves from online attacks. Many of them have limited proficiency in global languages such as English, and they have little access to the Internet. Most are too busy trying to stay alive to spend much time posting their own content.
The technique from Carnegie Mellon University's Language Technologies Institute (LTI) could help counter the hate speech directed at them and other voiceless groups. Human social media moderators, who couldn't possibly manually sift through so many comments, would then have the option to highlight this "help speech" in comment sections, said the study.
"Even if there's lots of hateful content, we can still find positive comments," said Ashiqur R. KhudaBukhsh, a post-doctoral researcher who conducted the research with alumnus Shriphani Palakodety.
To find relevant help speech, the researchers used their technique to search more than a quarter of a million comments from YouTube in what they believe is the first AI-focused analysis of the Rohingya refugee crisis.
Similarly, in a study yet to be published, they used the technology to search for anti-war "hope speech" among almost a million YouTube comments surrounding the February 2019 Pulwama terror attack in Kashmir, which inflamed the longstanding India-Pakistan dispute over the region. The researchers developed a further innovation that made it possible to apply these models to short social media texts in South Asia.
"Short bits of text, often with spelling and grammar mistakes, are difficult for machines to interpret. It's even harder in South Asian countries, where people may speak several languages and tend to "code switch," combining bits of different languages and even different writing systems in the same statement," said the study. Samplings of the YouTube comments showed about 10 per cent of the comments were positive.
"When the researchers used their method to search for help speech in the larger dataset, the results were 88 per cent positive, indicating that the method could substantially reduce the manual effort necessary to find them," KhudaBukhsh said.
"No country is too small to take on refugees," said one text, while another argued "all the countries should take a stand for these people." But detecting pro-Rohingya texts can be a double-edged sword: some texts can contain language that could be considered hate speech against their alleged persecutors, the authors wrote.
However, finding and highlighting these positive comments might do as much to make the Internet a safer, healthier place as would detecting and eliminating hostile content or banning the trolls responsible. The team will present their findings at the Association for the Advancement of Artificial Intelligence annual conference in New York City next month.