Journal article

A Scalable Random Forest (SRF) Approach for Non-linear Predictive Modelling using Small Manufacturing Datasets.

Meshari A. Al-Ebrahim, Rajesh Ransing

Journal of Intelligent Manufacturing

Swansea University Author: Rajesh Ransing

Published in: Journal of Intelligent Manufacturing
Published:
URI: https://cronfa.swan.ac.uk/Record/cronfa71470
Abstract: This paper presents an integrated, scalable Random Forest (SRF)–based predictive framework for estimating the effects of process interventions, including (i) adjusting operating ranges for continuous process parameters within specified tolerances, (ii) selecting specific categories for discrete process parameters, and (iii) combining adjustments to both continuous and discrete parameters. The framework moves beyond linear assumptions by employing a non-linear ensemble approach to identify critical process inputs and quantify their contributions to predicting the process response. These contributions are then leveraged to derive optimal operating ranges for continuous parameters and optimal categories for discrete parameters through a Decision Path Search (DPS) procedure based on tree decision paths. The proposed framework scales to a large number of process factors with complex non-linear dependencies and enables data-driven process improvement without requiring extensive domain expertise. Missing values in mixed-type datasets are addressed using an iterative Random Forest–based imputation scheme, while automatic forest-size optimisation enhances model stability. All preprocessing and modelling steps are embedded within a leakage-safe pipeline, supported by learning-curve analysis and leakage-sanity diagnostics to guard against overfitting. Across the evaluated case studies, SRF delivers accurate predictions together with transparent, practitioner-ready operating windows, translating complex mixed-type manufacturing data into actionable guidance.
Keywords: Random Forest, Common-Cause Variation, Predictive Analytics, Data Augmentation, Small Data, Quality Improvement
College: Faculty of Science and Engineering
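
Illustrative note: the abstract describes deriving operating ranges for continuous parameters from tree decision paths. The sketch below is not the paper's Decision Path Search procedure, which this record does not specify. It only shows the general idea on a single hand-built regression tree: each root-to-leaf path is a conjunction of threshold conditions, and intersecting those conditions yields an operating window for the leaf with the best predicted response. The tree, the parameter names (pour_temp, cooling_time), and the response values are all hypothetical.

```python
import math

# Hypothetical toy regression tree (assumption, not from the paper).
# Internal nodes split on "feature <= threshold"; leaves hold a predicted
# response, here imagined as a defect rate to be minimised.
TREE = {
    "feature": "pour_temp", "threshold": 1450.0,
    "left": {"value": 0.08},
    "right": {
        "feature": "cooling_time", "threshold": 30.0,
        "left": {"value": 0.02},
        "right": {"value": 0.05},
    },
}


def leaf_paths(node, bounds=None):
    """Yield (bounds, value) per leaf; bounds maps feature -> (low, high]."""
    bounds = bounds or {}
    if "value" in node:
        yield bounds, node["value"]
        return
    feat, thr = node["feature"], node["threshold"]
    low, high = bounds.get(feat, (-math.inf, math.inf))
    # Left branch: feature <= threshold tightens the upper bound.
    yield from leaf_paths(node["left"], {**bounds, feat: (low, min(high, thr))})
    # Right branch: feature > threshold tightens the lower bound.
    yield from leaf_paths(node["right"], {**bounds, feat: (max(low, thr), high)})


def best_window(tree):
    """Return the (bounds, value) pair of the leaf with the lowest response."""
    return min(leaf_paths(tree), key=lambda path: path[1])


window, predicted = best_window(TREE)
for feature, (low, high) in window.items():
    print(f"{feature}: ({low}, {high}]  predicted response {predicted}")
```

In a forest, a comparable search would have to aggregate such windows across many trees and respect the specified tolerances; how the paper does this is not stated in this record.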