Privacy-Preserving Feature Selection and Extraction for Federated Learning in Real World Applications

Rahul Kumar
Chin-Shiuh Shieh
Prasun Chakrabarti

Abstract

The growing volume of sensitive data in industries such as banking, healthcare, and e-commerce has created major challenges for data security, privacy, and regulatory compliance. In such privacy-conscious environments, centralized machine learning techniques that collect raw data and send it to a central server are no longer practical. This study proposes a privacy-preserving, cross-domain classification model based on Federated Learning (FL), a decentralized method in which data remains on the local client and only model updates are shared with a central server. Three distinct real-world tabular datasets are used to simulate a federated learning setup: banking (predicting term deposit subscription), e-commerce (predicting customer turnover), and healthcare (identifying high billing). Each dataset is treated as a distinct client, and a local Random Forest model is trained on each client's own data. To build a global model that generalizes knowledge across domains, only the model parameters are gathered at a central server; raw data is never sent. Although this setup is simulated, it reflects FL's operating characteristics. To add further layers of security during training and aggregation, the system is designed to incorporate Homomorphic Encryption (HE), Differential Privacy (DP), and Secure Multi-Party Computation (SMPC). A comprehensive comparison evaluates each local model's performance against the federated global model using common classification metrics: accuracy, precision, recall, F1-score, and confusion matrices. Charts such as performance plots and feature importance graphs enhance transparency and interpretability. The global model achieved an overall accuracy of 95.16%, outperforming some earlier research in these areas.
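
The aggregation step described above can be illustrated with a minimal sketch. The abstract does not specify how Random Forest parameters are combined or how DP noise is calibrated, so everything here is an assumption for illustration: each client is represented by a plain numeric parameter vector, the server uses sample-weighted averaging (FedAvg-style), and the function names, `max_norm`, `sigma`, and all numeric values are hypothetical placeholders rather than values from the study.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale an update so its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm == 0.0 or norm <= max_norm:
        return list(update)
    scale = max_norm / norm
    return [x * scale for x in update]


def add_gaussian_noise(update, sigma, rng):
    """Add zero-mean Gaussian noise to each coordinate (DP-style masking).

    sigma is an illustrative parameter; it is NOT calibrated here to any
    specific (epsilon, delta) privacy budget.
    """
    return [x + rng.gauss(0.0, sigma) for x in update]


def federated_average(updates, sample_counts):
    """Sample-weighted average of client updates (FedAvg-style).

    Only the update vectors and sample counts reach this function;
    raw client data never does.
    """
    total = sum(sample_counts)
    dim = len(updates[0])
    return [
        sum(u[i] * n for u, n in zip(updates, sample_counts)) / total
        for i in range(dim)
    ]


# Hypothetical clients mirroring the three domains; the vectors and
# sample counts are made-up placeholders, not results from the paper.
rng = random.Random(0)
client_updates = {
    "banking": ([0.20, -0.10, 0.40], 4000),
    "ecommerce": ([0.10, 0.30, -0.20], 3000),
    "healthcare": ([-0.30, 0.20, 0.10], 1000),
}

protected = []
counts = []
for name, (update, n_samples) in client_updates.items():
    clipped = clip_update(update, max_norm=1.0)             # bound each client's influence
    noisy = add_gaussian_noise(clipped, sigma=0.01, rng=rng)  # mask the individual update
    protected.append(noisy)
    counts.append(n_samples)

global_update = federated_average(protected, counts)
print("aggregated update:", global_update)
```

Clipping before adding noise bounds how much any single client can shift the average, which is what makes the added noise meaningful. In a fuller system, HE or SMPC would additionally hide the individual updates from the server itself; that part is not shown here.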

How to Cite

Kumar, R., Shieh, C.-S., & Chakrabarti, P. (2026). Privacy-Preserving Feature Selection and Extraction for Federated Learning in Real World Applications. International Journal of Aquatic Research and Environmental Studies, 6(S5), 228-240. https://doi.org/10.70102/531qch49
