Journal article

A novel imputation based predictive algorithm for reducing common cause variation from small and mixed datasets with missing values

Raed S. Batbooti, Rajesh Ransing

Computers and Industrial Engineering, Volume: 179, Start page: 109230

Swansea University Author: Rajesh Ransing

  • 63117.pdf

    PDF | Version of Record

    This is an open access article under the CC BY licence

    Download (2.23MB)

Abstract

Most process control algorithms need a predetermined target value as an input for a process variable so that the deviation is observed and minimized. In this paper, a novel machine learning algorithm is proposed that has an ability to not only suggest new target values for both categorical and conti...

Published in: Computers and Industrial Engineering
ISSN: 0360-8352
Published: Elsevier BV 2023
Online Access: Check full text

URI: https://cronfa.swan.ac.uk/Record/cronfa63117
first_indexed 2023-04-12T09:22:37Z
last_indexed 2023-04-14T03:23:52Z
id cronfa63117
recordtype SURis
fullrecord <?xml version="1.0" encoding="utf-8"?><rfc1807 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><bib-version>v2</bib-version><id>63117</id><entry>2023-04-12</entry><title>A novel imputation based predictive algorithm for reducing common cause variation from small and mixed datasets with missing values</title><swanseaauthors><author><sid>0136f9a20abec3819b54088d9647c39f</sid><ORCID>0000-0003-4848-4545</ORCID><firstname>Rajesh</firstname><surname>Ransing</surname><name>Rajesh Ransing</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2023-04-12</date><deptcode>MECH</deptcode><abstract>Most process control algorithms need a predetermined target value as an input for a process variable so that the deviation is observed and minimized. In this paper, a novel machine learning algorithm is proposed that can not only suggest new target values for both categorical and continuous variables to minimize process output variation but also predict the extent to which the variation can be minimized. In foundry processes, an average rejection rate of 3%–5% within batches of castings produced is considered acceptable and is treated as an effect of common cause variation. As a result, the operating range for process input values is often not changed during the root cause analysis. The relevant historical process data is normally limited, contains missing values, and combines both categorical and continuous variables (mixed dataset). However, technological advancements in manufacturing processes provide opportunities to further refine process inputs in order to minimize undesired variation in process outputs. A new linear regression based algorithm is proposed to achieve lower prediction error in comparison to the commonly used linear factor analysis for mixed data (FAMD) method. 
This algorithm is further coupled with a novel missing data algorithm to predict the process response values corresponding to a given set of values for process inputs. This enabled the novel imputation based predictive algorithm to quantify the effect of a confirmation trial based on the proposed changes in the operating ranges of one or more process inputs. A set of values for optimal process inputs is generated from operating ranges discovered by a recently proposed quality correlation algorithm (QCA) using a Bootstrap sampling method. The odds ratio, which represents a ratio between the probability of occurrence of desired and undesired process output values, is used to quantify the effect of a confirmation trial. The limitations of the underlying PCA based linear model have been discussed and the future research areas have been identified.</abstract><type>Journal Article</type><journal>Computers and Industrial Engineering</journal><volume>179</volume><journalNumber/><paginationStart>109230</paginationStart><paginationEnd/><publisher>Elsevier BV</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint>0360-8352</issnPrint><issnElectronic/><keywords>Common cause variation, Missing data, Predictive analytics, Quality improvement, Tolerance limit optimization, 7Epsilon</keywords><publishedDay>1</publishedDay><publishedMonth>5</publishedMonth><publishedYear>2023</publishedYear><publishedDate>2023-05-01</publishedDate><doi>10.1016/j.cie.2023.109230</doi><url>http://dx.doi.org/10.1016/j.cie.2023.109230</url><notes/><college>COLLEGE NANME</college><department>Mechanical Engineering</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>MECH</DepartmentCode><institution>Swansea University</institution><apcterm>SU Library paid the OA fee (TA Institutional Deal)</apcterm><funders>Swansea University</funders><projectreference/><lastEdited>2023-05-24T16:12:06.8918931</lastEdited><Created>2023-04-12T10:16:48.5936114</Created><path><level 
id="1">Faculty of Science and Engineering</level><level id="2">School of Aerospace, Civil, Electrical, General and Mechanical Engineering - Mechanical Engineering</level></path><authors><author><firstname>Raed S.</firstname><surname>Batbooti</surname><order>1</order></author><author><firstname>Rajesh</firstname><surname>Ransing</surname><orcid>0000-0003-4848-4545</orcid><order>2</order></author></authors><documents><document><filename>63117__27014__17a87651ae2f4d27a4a23ec6b2207866.pdf</filename><originalFilename>63117.pdf</originalFilename><uploaded>2023-04-12T10:21:45.1067772</uploaded><type>Output</type><contentLength>2341789</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><documentNotes>This is an open access article under the CC BY licence</documentNotes><copyrightCorrect>false</copyrightCorrect><language>eng</language><licence>http://creativecommons.org/licenses/by/4.0/</licence></document></documents><OutputDurs/></rfc1807>
title A novel imputation based predictive algorithm for reducing common cause variation from small and mixed datasets with missing values
author_id_str_mv 0136f9a20abec3819b54088d9647c39f
author_id_fullname_str_mv 0136f9a20abec3819b54088d9647c39f_***_Rajesh Ransing
author Rajesh Ransing
author2 Raed S. Batbooti
Rajesh Ransing
format Journal article
container_title Computers and Industrial Engineering
container_volume 179
container_start_page 109230
publishDate 2023
institution Swansea University
issn 0360-8352
doi_str_mv 10.1016/j.cie.2023.109230
publisher Elsevier BV
college_str Faculty of Science and Engineering
hierarchytype
hierarchy_top_id facultyofscienceandengineering
hierarchy_top_title Faculty of Science and Engineering
hierarchy_parent_id facultyofscienceandengineering
hierarchy_parent_title Faculty of Science and Engineering
department_str School of Aerospace, Civil, Electrical, General and Mechanical Engineering - Mechanical Engineering{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Aerospace, Civil, Electrical, General and Mechanical Engineering - Mechanical Engineering
url http://dx.doi.org/10.1016/j.cie.2023.109230
document_store_str 1
active_str 0
description Most process control algorithms need a predetermined target value as an input for a process variable so that the deviation is observed and minimized. In this paper, a novel machine learning algorithm is proposed that can not only suggest new target values for both categorical and continuous variables to minimize process output variation but also predict the extent to which the variation can be minimized. In foundry processes, an average rejection rate of 3%–5% within batches of castings produced is considered acceptable and is treated as an effect of common cause variation. As a result, the operating range for process input values is often not changed during the root cause analysis. The relevant historical process data is normally limited, contains missing values, and combines both categorical and continuous variables (mixed dataset). However, technological advancements in manufacturing processes provide opportunities to further refine process inputs in order to minimize undesired variation in process outputs. A new linear regression based algorithm is proposed to achieve lower prediction error in comparison to the commonly used linear factor analysis for mixed data (FAMD) method. This algorithm is further coupled with a novel missing data algorithm to predict the process response values corresponding to a given set of values for process inputs. This enabled the novel imputation based predictive algorithm to quantify the effect of a confirmation trial based on the proposed changes in the operating ranges of one or more process inputs. A set of values for optimal process inputs is generated from operating ranges discovered by a recently proposed quality correlation algorithm (QCA) using a Bootstrap sampling method. The odds ratio, which represents a ratio between the probability of occurrence of desired and undesired process output values, is used to quantify the effect of a confirmation trial. The limitations of the underlying PCA based linear model have been discussed and the future research areas have been identified.
published_date 2023-05-01T16:12:05Z