Journal article

Toward the Development of Data Governance Standards for Using Clinical Free-Text Data in Health Research: Position Paper

Kerina Jones, Elizabeth M Ford, Nathan Lea, Lucy Griffiths, Lamiece Hassan, Sharon Heys, Emma Squires, Goran Nenadic

Journal of Medical Internet Research, Volume: 22, Issue: 6, Start page: e16760

Swansea University Authors: Kerina Jones, Lucy Griffiths, Sharon Heys, Emma Squires

  • 53733VOR.pdf

    PDF | Version of Record

    Released under the terms of a Creative Commons Attribution License (CC-BY).

    Download (495.45KB)

DOI (Published version): 10.2196/16760

Abstract

Background: Clinical free-text data (eg, outpatient letters or nursing notes) represent a vast, untapped source of rich information that, if more accessible for research, would clarify and supplement information coded in structured data fields. Data usually need to be deidentified or anonymized befo...

Published in: Journal of Medical Internet Research
ISSN: 1438-8871
Published: JMIR Publications Inc. 2020

URI: https://cronfa.swan.ac.uk/Record/cronfa53733
first_indexed 2020-04-22T13:23:20Z
last_indexed 2025-04-08T03:56:03Z
id cronfa53733
recordtype SURis
fullrecord <?xml version="1.0"?><rfc1807><datestamp>2025-04-07T12:50:02.3132251</datestamp><bib-version>v2</bib-version><id>53733</id><entry>2020-03-04</entry><title>Toward the Development of Data Governance Standards for Using Clinical Free-Text Data in Health Research: Position Paper</title><swanseaauthors><author><sid>c13b3cd0a6f8cbac2e461b54b3cdd839</sid><firstname>Kerina</firstname><surname>Jones</surname><name>Kerina Jones</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>e35ea6ea4b429e812ef204b048131d93</sid><ORCID>0000-0001-9230-624X</ORCID><firstname>Lucy</firstname><surname>Griffiths</surname><name>Lucy Griffiths</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>61f095d8f6942db1b4fd65e2053091f5</sid><firstname>Sharon</firstname><surname>Heys</surname><name>Sharon Heys</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>0088b5b395a477d268ce487544ea4738</sid><ORCID/><firstname>Emma</firstname><surname>Squires</surname><name>Emma Squires</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2020-03-04</date><abstract>Background: Clinical free-text data (eg, outpatient letters or nursing notes) represent a vast, untapped source of rich information that, if more accessible for research, would clarify and supplement information coded in structured data fields. Data usually need to be deidentified or anonymized before they can be reused for research, but there is a lack of established guidelines to govern effective deidentification and use of free-text information and avoid damaging data utility as a by-product. Objective: This study aimed to develop recommendations for the creation of data governance standards to integrate with existing frameworks for personal data use, to enable free-text data to be used safely for research for patient and public benefit. 
Methods: We outlined data protection legislation and regulations relating to the United Kingdom for context and conducted a rapid literature review and UK-based case studies to explore data governance models used in working with free-text data. We also engaged with stakeholders, including text-mining researchers and the general public, to explore perceived barriers and solutions in working with clinical free-text. Results: We proposed a set of recommendations, including the need for authoritative guidance on data governance for the reuse of free-text data, to ensure public transparency in data flows and uses, to treat deidentified free-text data as potentially identifiable with use limited to accredited data safe havens, and to commit to a culture of continuous improvement to understand the relationships between the efficacy of deidentification and reidentification risks, so this can be communicated to all stakeholders. Conclusions: By drawing together the findings of a combination of activities, we present a position paper to contribute to the development of data governance standards for the reuse of clinical free-text data for secondary purposes. 
While working in accordance with existing data governance frameworks, there is a need for further work to take forward the recommendations we have proposed, with commitment and investment, to assure and expand the safe reuse of clinical free-text data for public benefit.</abstract><type>Journal Article</type><journal>Journal of Medical Internet Research</journal><volume>22</volume><journalNumber>6</journalNumber><paginationStart>e16760</paginationStart><paginationEnd/><publisher>JMIR Publications Inc.</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic>1438-8871</issnElectronic><keywords>ethical; legal; social implications; public engagement; free-text data; information governance</keywords><publishedDay>29</publishedDay><publishedMonth>6</publishedMonth><publishedYear>2020</publishedYear><publishedDate>2020-06-29</publishedDate><doi>10.2196/16760</doi><url/><notes/><college>COLLEGE NANME</college><CollegeCode>COLLEGE CODE</CollegeCode><institution>Swansea University</institution><apcterm>External research funder(s) paid the OA fee (includes OA grants disbursed by the Library)</apcterm><funders>TexGov was funded by the Engineering and Physical Sciences Research Council via Healtex, the UK health care text analytics research network (grant number EP/N027280/1).</funders><projectreference/><lastEdited>2025-04-07T12:50:02.3132251</lastEdited><Created>2020-03-04T14:01:36.4380897</Created><path><level id="1">Faculty of Medicine, Health and Life Sciences</level><level id="2">Swansea University Medical School - Health Data Science</level></path><authors><author><firstname>Kerina</firstname><surname>Jones</surname><order>1</order></author><author><firstname>Elizabeth 
M</firstname><surname>Ford</surname><order>2</order></author><author><firstname>Nathan</firstname><surname>Lea</surname><order>3</order></author><author><firstname>Lucy</firstname><surname>Griffiths</surname><orcid>0000-0001-9230-624X</orcid><order>4</order></author><author><firstname>Lamiece</firstname><surname>Hassan</surname><order>5</order></author><author><firstname>Sharon</firstname><surname>Heys</surname><order>6</order></author><author><firstname>Emma</firstname><surname>Squires</surname><orcid/><order>7</order></author><author><firstname>Goran</firstname><surname>Nenadic</surname><order>8</order></author></authors><documents><document><filename>53733__17639__ccfa1eb1b4a349d3bb0c8d484c2ffa7f.pdf</filename><originalFilename>53733VOR.pdf</originalFilename><uploaded>2020-07-05T12:15:11.1376650</uploaded><type>Output</type><contentLength>507343</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><documentNotes>Released under the terms of a Creative Commons Attribution License (CC-BY).</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language><licence>https://creativecommons.org/licenses/by/4.0/</licence></document></documents><OutputDurs/></rfc1807>
title Toward the Development of Data Governance Standards for Using Clinical Free-Text Data in Health Research: Position Paper
author_id_str_mv c13b3cd0a6f8cbac2e461b54b3cdd839
e35ea6ea4b429e812ef204b048131d93
61f095d8f6942db1b4fd65e2053091f5
0088b5b395a477d268ce487544ea4738
author_id_fullname_str_mv c13b3cd0a6f8cbac2e461b54b3cdd839_***_Kerina Jones
e35ea6ea4b429e812ef204b048131d93_***_Lucy Griffiths
61f095d8f6942db1b4fd65e2053091f5_***_Sharon Heys
0088b5b395a477d268ce487544ea4738_***_Emma Squires
author Kerina Jones
Lucy Griffiths
Sharon Heys
Emma Squires
author2 Kerina Jones
Elizabeth M Ford
Nathan Lea
Lucy Griffiths
Lamiece Hassan
Sharon Heys
Emma Squires
Goran Nenadic
format Journal article
container_title Journal of Medical Internet Research
container_volume 22
container_issue 6
container_start_page e16760
publishDate 2020
institution Swansea University
issn 1438-8871
doi_str_mv 10.2196/16760
publisher JMIR Publications Inc.
college_str Faculty of Medicine, Health and Life Sciences
hierarchytype
hierarchy_top_id facultyofmedicinehealthandlifesciences
hierarchy_top_title Faculty of Medicine, Health and Life Sciences
hierarchy_parent_id facultyofmedicinehealthandlifesciences
hierarchy_parent_title Faculty of Medicine, Health and Life Sciences
department_str Swansea University Medical School - Health Data Science{{{_:::_}}}Faculty of Medicine, Health and Life Sciences{{{_:::_}}}Swansea University Medical School - Health Data Science
document_store_str 1
active_str 0
description Background: Clinical free-text data (eg, outpatient letters or nursing notes) represent a vast, untapped source of rich information that, if more accessible for research, would clarify and supplement information coded in structured data fields. Data usually need to be deidentified or anonymized before they can be reused for research, but there is a lack of established guidelines to govern effective deidentification and use of free-text information and avoid damaging data utility as a by-product. Objective: This study aimed to develop recommendations for the creation of data governance standards to integrate with existing frameworks for personal data use, to enable free-text data to be used safely for research for patient and public benefit. Methods: We outlined data protection legislation and regulations relating to the United Kingdom for context and conducted a rapid literature review and UK-based case studies to explore data governance models used in working with free-text data. We also engaged with stakeholders, including text-mining researchers and the general public, to explore perceived barriers and solutions in working with clinical free-text. Results: We proposed a set of recommendations, including the need for authoritative guidance on data governance for the reuse of free-text data, to ensure public transparency in data flows and uses, to treat deidentified free-text data as potentially identifiable with use limited to accredited data safe havens, and to commit to a culture of continuous improvement to understand the relationships between the efficacy of deidentification and reidentification risks, so this can be communicated to all stakeholders. Conclusions: By drawing together the findings of a combination of activities, we present a position paper to contribute to the development of data governance standards for the reuse of clinical free-text data for secondary purposes. While working in accordance with existing data governance frameworks, there is a need for further work to take forward the recommendations we have proposed, with commitment and investment, to assure and expand the safe reuse of clinical free-text data for public benefit.
published_date 2020-06-29T04:42:25Z