
Policy briefing report

Using Artificial Intelligence and Machine Learning to Identify Terrorist Content Online

Stuart Macdonald, Ashley Mattheis, David Wells

Swansea University Authors: Stuart Macdonald, Ashley Mattheis, David Wells

Abstract

The focus of this report is the use of automated content-based tools – in particular those that use artificial intelligence (AI) and machine learning – to detect terrorist content online. In broad terms, such tools follow either a matching-based or a classification-based approach. Matching-based approaches rely on a technique known as hashing. The report explains the distinction between cryptographic hashing and perceptual hashing, noting that tech companies have tended to rely on the latter for the purposes of content moderation. Classification-based approaches typically involve using a large corpus of texts, which have been manually annotated by human reviewers, to train algorithms to predict whether a new item of content belongs to a particular category (e.g., terrorist content). This approach also raises important issues, including the difficulties of compiling a dataset to train the algorithms, the temporal, contextual and cultural limitations of machine learning algorithms, and the resultant danger of incorrect outcomes. In the light of this discussion, the report concludes that human input remains necessary and that oversight mechanisms are essential to correct errors and ensure accountability. It also considers capacity-building measures, including off-the-shelf content moderation solutions and collaborative initiatives, as well as potential future development of AI to address some of the challenges identified.
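
To make the hashing distinction concrete, the following is a minimal Python sketch, illustrative only and not drawn from the report: a cryptographic hash such as SHA-256 matches only byte-identical copies, while a perceptual hash produces similar outputs for visually similar images, so near-duplicates can be caught by comparing bit differences. The 8x8 "average hash", the helper names, and the use of Pillow are assumptions for illustration; they are not the specific algorithms any platform uses.

```python
# Illustrative sketch only: cryptographic vs. perceptual hashing for
# matching-based detection. Assumes Pillow (PIL) is installed; the 8x8
# "average hash" is a textbook perceptual hash, not any platform's method.
import hashlib

from PIL import Image


def cryptographic_hash(data: bytes) -> str:
    # SHA-256: flipping a single bit of the input changes the whole digest,
    # so matches occur only for byte-identical copies of known content.
    return hashlib.sha256(data).hexdigest()


def average_hash(path: str) -> int:
    # Perceptual hash: shrink to 8x8 greyscale, then emit one bit per pixel
    # (above or below the mean brightness). Re-encoded or resized copies of
    # the same image yield hashes that differ in only a few bits.
    img = Image.open(path).convert("L").resize((8, 8))
    pixels = list(img.getdata())
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    # Number of differing bits; a moderation pipeline would flag an upload
    # whose distance to the hash of a known item falls below a threshold.
    return bin(a ^ b).count("1")
```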

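The classification-based approach can be sketched in a similar spirit. The snippet below, again an illustration of this summary rather than code from the report, trains a scikit-learn text classifier on a tiny hypothetical annotated corpus and scores a new item; the placeholder documents and labels are invented, and a real system would need a large, carefully compiled dataset plus human review of borderline scores.

```python
# Illustrative sketch only: a classification-based detector trained on a
# manually annotated corpus. The four placeholder documents and labels are
# invented; the report stresses how hard real datasets are to compile.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical human-annotated training data: 1 = target category, 0 = not.
texts = [
    "placeholder document annotated as belonging to the target category",
    "placeholder document annotated as benign",
    "another placeholder in the flagged category",
    "another benign placeholder document",
]
labels = [1, 0, 1, 0]

# TF-IDF features feeding a logistic-regression classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Score a new item: a probability (rather than a hard label) lets borderline
# content be routed to human reviewers, the oversight the report concludes
# remains necessary.
new_item = "a new placeholder document to be moderated"
score = model.predict_proba([new_item])[0][1]
print(f"predicted probability of target category: {score:.2f}")
```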

Published: 15 January 2024
Online Access: https://tate.techagainstterrorism.org/news/tcoaireport
URI: https://cronfa.swan.ac.uk/Record/cronfa65450
Keywords: Terrorism, counterterrorism, AI, machine learning, content moderation, social media
Department: Hilary Rodham Clinton School of Law, Faculty of Humanities and Social Sciences
ORCID (Stuart Macdonald): 0000-0002-7483-9023
Funder: European Union (project reference: ISFP-2021-AG-TCO-101080101)