A novel imputation based predictive algorithm for reducing common cause variation from small and mixed datasets with missing values

Batbooti, Raed S.; Ransing, Rajesh

doi:10.1016/j.cie.2023.109230

Journal article

Raed S. Batbooti, Rajesh Ransing

Computers and Industrial Engineering, Volume: 179, Start page: 109230

Swansea University Author: Rajesh Ransing

PDF | Version of Record

This is an open access article under the CC BY licence

DOI (Published version): 10.1016/j.cie.2023.109230

Abstract

Most process control algorithms need a predetermined target value as an input for a process variable so that the deviation is observed and minimized. In this paper, a novel machine learning algorithm is proposed that has an ability to not only suggest new target values for both categorical and conti...

Published in:	Computers and Industrial Engineering
ISSN:	0360-8352
Published:	Elsevier BV 2023
URI:	https://cronfa.swan.ac.uk/Record/cronfa63117

first_indexed	2023-04-12T09:22:37Z
last_indexed	2024-11-15T18:00:57Z
id	cronfa63117
recordtype	SURis
fullrecord	<?xml version="1.0"?><rfc1807><datestamp>2023-05-24T16:12:06.8918931</datestamp><bib-version>v2</bib-version><id>63117</id><entry>2023-04-12</entry><title>A novel imputation based predictive algorithm for reducing common cause variation from small and mixed datasets with missing values</title><swanseaauthors><author><sid>0136f9a20abec3819b54088d9647c39f</sid><ORCID>0000-0003-4848-4545</ORCID><firstname>Rajesh</firstname><surname>Ransing</surname><name>Rajesh Ransing</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2023-04-12</date><deptcode>ACEM</deptcode><abstract>Most process control algorithms need a predetermined target value as an input for a process variable so that the deviation is observed and minimized. In this paper, a novel machine learning algorithm is proposed that has an ability to not only suggest new target values for both categorical and continuous variables to minimize process output variation but also predict the extent to which the variation can be minimized. In foundry processes, an average rejection rate of 3%–5% within batches of castings produced is considered acceptable and is regarded as an effect of common cause variation. As a result, the operating range for process input values is often not changed during the root cause analysis. The relevant available historical process data is normally limited, has missing values, and combines both categorical and continuous variables (mixed dataset). However, technological advancements in manufacturing processes provide opportunities to further refine process inputs in order to minimize undesired variation in process outputs. A new linear regression based algorithm is proposed to achieve lower prediction error in comparison to the commonly used linear factor analysis for mixed data (FAMD) method. This algorithm is further coupled with a novel missing data algorithm to predict the process response values corresponding to a given set of values for process inputs. This enabled the novel imputation based predictive algorithm to quantify the effect of a confirmation trial based on the proposed changes in the operating ranges of one or more process inputs. A set of values for optimal process inputs is generated from operating ranges discovered by a recently proposed quality correlation algorithm (QCA) using a Bootstrap sampling method. The odds ratio, which represents a ratio between the probability of occurrence of desired and undesired process output values, is used to quantify the effect of a confirmation trial. The limitations of the underlying PCA based linear model have been discussed and the future research areas have been identified.</abstract><type>Journal Article</type><journal>Computers and Industrial Engineering</journal><volume>179</volume><journalNumber/><paginationStart>109230</paginationStart><paginationEnd/><publisher>Elsevier BV</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint>0360-8352</issnPrint><issnElectronic/><keywords>Common cause variation, Missing data, Predictive analytics, Quality improvement, Tolerance limit optimization, 7Epsilon</keywords><publishedDay>1</publishedDay><publishedMonth>5</publishedMonth><publishedYear>2023</publishedYear><publishedDate>2023-05-01</publishedDate><doi>10.1016/j.cie.2023.109230</doi><url>http://dx.doi.org/10.1016/j.cie.2023.109230</url><notes/><college>COLLEGE NANME</college><department>Aerospace, Civil, Electrical, and Mechanical Engineering</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>ACEM</DepartmentCode><institution>Swansea University</institution><apcterm>SU Library paid the OA fee (TA Institutional Deal)</apcterm><funders>Swansea University</funders><projectreference/><lastEdited>2023-05-24T16:12:06.8918931</lastEdited><Created>2023-04-12T10:16:48.5936114</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Aerospace, Civil, Electrical, General and Mechanical Engineering - Mechanical Engineering</level></path><authors><author><firstname>Raed S.</firstname><surname>Batbooti</surname><order>1</order></author><author><firstname>Rajesh</firstname><surname>Ransing</surname><orcid>0000-0003-4848-4545</orcid><order>2</order></author></authors><documents><document><filename>63117__27014__17a87651ae2f4d27a4a23ec6b2207866.pdf</filename><originalFilename>63117.pdf</originalFilename><uploaded>2023-04-12T10:21:45.1067772</uploaded><type>Output</type><contentLength>2341789</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><documentNotes>This is an open access article under the CC BY licence</documentNotes><copyrightCorrect>false</copyrightCorrect><language>eng</language><licence>http://creativecommons.org/licenses/by/4.0/</licence></document></documents><OutputDurs/></rfc1807>
title	A novel imputation based predictive algorithm for reducing common cause variation from small and mixed datasets with missing values
author_id_str_mv	0136f9a20abec3819b54088d9647c39f
author_id_fullname_str_mv	0136f9a20abec3819b54088d9647c39f_***_Rajesh Ransing
author	Rajesh Ransing
author2	Raed S. Batbooti Rajesh Ransing
format	Journal article
container_title	Computers and Industrial Engineering
container_volume	179
container_start_page	109230
publishDate	2023
institution	Swansea University
issn	0360-8352
doi_str_mv	10.1016/j.cie.2023.109230
publisher	Elsevier BV
college_str	Faculty of Science and Engineering
hierarchytype
hierarchy_top_id	facultyofscienceandengineering
hierarchy_top_title	Faculty of Science and Engineering
hierarchy_parent_id	facultyofscienceandengineering
hierarchy_parent_title	Faculty of Science and Engineering
department_str	School of Aerospace, Civil, Electrical, General and Mechanical Engineering - Mechanical Engineering{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Aerospace, Civil, Electrical, General and Mechanical Engineering - Mechanical Engineering
url	http://dx.doi.org/10.1016/j.cie.2023.109230
document_store_str	1
active_str	0
description	Most process control algorithms need a predetermined target value as an input for a process variable so that the deviation is observed and minimized. In this paper, a novel machine learning algorithm is proposed that has an ability to not only suggest new target values for both categorical and continuous variables to minimize process output variation but also predict the extent to which the variation can be minimized. In foundry processes, an average rejection rate of 3%–5% within batches of castings produced is considered acceptable and is regarded as an effect of common cause variation. As a result, the operating range for process input values is often not changed during the root cause analysis. The relevant available historical process data is normally limited, has missing values, and combines both categorical and continuous variables (mixed dataset). However, technological advancements in manufacturing processes provide opportunities to further refine process inputs in order to minimize undesired variation in process outputs. A new linear regression based algorithm is proposed to achieve lower prediction error in comparison to the commonly used linear factor analysis for mixed data (FAMD) method. This algorithm is further coupled with a novel missing data algorithm to predict the process response values corresponding to a given set of values for process inputs. This enabled the novel imputation based predictive algorithm to quantify the effect of a confirmation trial based on the proposed changes in the operating ranges of one or more process inputs. A set of values for optimal process inputs is generated from operating ranges discovered by a recently proposed quality correlation algorithm (QCA) using a Bootstrap sampling method. The odds ratio, which represents a ratio between the probability of occurrence of desired and undesired process output values, is used to quantify the effect of a confirmation trial. The limitations of the underlying PCA based linear model have been discussed and the future research areas have been identified.
published_date	2023-05-01T05:13:09Z
_version_	1857438706936315904
score	11.461431
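
The abstract describes the odds ratio as the ratio between the probability of desired and undesired process output values, used to quantify the effect of a confirmation trial. The following is only a minimal sketch of that ratio, not the authors' implementation: the function name `odds_ratio`, the boolean encoding of outcomes, and the batch figures are all hypothetical. The 5% baseline sits at the upper end of the 3%–5% rejection range quoted in the abstract, and the 2% trial figure is invented purely for illustration.

```python
def odds_ratio(outcomes):
    """Return P(desired) / P(undesired) for a list of booleans.

    True marks a desired output (e.g. an accepted casting) and
    False an undesired one (e.g. a rejected casting).
    Hypothetical helper; not taken from the paper.
    """
    desired = sum(outcomes)
    undesired = len(outcomes) - desired
    if undesired == 0:
        # No undesired outcomes observed: the ratio is unbounded.
        return float("inf")
    # The common factor 1/len(outcomes) cancels, so counts suffice.
    return desired / undesired


# Hypothetical batches of 100 castings (illustrative numbers only).
baseline = [True] * 95 + [False] * 5  # 5% rejection rate
trial = [True] * 98 + [False] * 2     # assumed 2% rejection after changes

baseline_odds = odds_ratio(baseline)
trial_odds = odds_ratio(trial)
print(baseline_odds, trial_odds)
```

Under this reading, a higher value after a proposed change in operating ranges would indicate that desired outputs have become more likely relative to undesired ones.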

Similar Items