A Stable and Adaptable Machine Learning Framework for Phishing Detection

Sheela S Maharajpet

Department of MCA, Acharya Institute of Technology, Bangalore – 560107, India.

Shivi Dixit

Department of Computer Applications, Acharya Institute of Graduate Studies, Bangalore – 560107, India.

Hrishikesh Sharma *

Department of MCA, Acharya Institute of Technology, Bangalore – 560107, India.

*Author to whom correspondence should be addressed.


Abstract

Background: Phishing contributes to over one-third of security incidents globally, highlighting the urgent need for robust detection systems.

Aims: This study aims to design and validate a phishing detection system that is accurate, adaptable, and suitable for real-time deployment. The system targets institutional and enterprise-level use and is intended to overcome the shortcomings of traditional rule-based and blacklist approaches.

Study Design: An experimental research study was conducted to evaluate multiple machine learning algorithms for phishing detection. The study adopted a comparative design to identify the most stable and efficient model.

Place and Duration of Study: The work was carried out at the Department of MCA, Acharya Institute of Technology, Bangalore, India, between July 2025 and September 2025.

Methodology: A dataset of 5,559 SMS messages, each labelled as phishing or legitimate, was pre-processed using tokenisation, stop-word removal, and vectorisation (TF-IDF and bag-of-words, BoW). Lexical, structural, statistical, and semantic features were engineered. Six classifiers, namely Multinomial Naïve Bayes (MNB), Support Vector Machine (SVM), Random Forest (RF), k-Nearest Neighbours (kNN), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM), were trained and evaluated using Accuracy, Precision, Recall, and F1-score. Cross-validation was applied to test stability. A Django-based web interface was implemented for real-time predictions.
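
The pre-processing steps named above (tokenisation, stop-word removal, BoW counting, and TF-IDF weighting) can be illustrated with a minimal sketch in plain Python. This is not the study's implementation: the stop-word list, the function names, and the weighting formula tf × (ln(N/df) + 1) are assumptions chosen for illustration, and the study's exact settings are not stated.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list for illustration only; the study's list is not given.
STOP_WORDS = {"a", "an", "the", "to", "is", "your", "you", "for", "and", "of", "at"}


def tokenise(message):
    """Lower-case the message, keep alphanumeric runs, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", message.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(tokens):
    """Bag-of-words (BoW) representation: raw term counts for one message."""
    return Counter(tokens)


def tfidf_vectors(messages):
    """Return one {term: weight} dict per message.

    Weight = relative term frequency * (ln(N / df) + 1), one common TF-IDF
    variant (assumed here, not taken from the study).
    """
    docs = [bag_of_words(tokenise(m)) for m in messages]
    n_docs = len(docs)

    # Document frequency: in how many messages each term appears.
    doc_freq = Counter()
    for counts in docs:
        doc_freq.update(counts.keys())

    vectors = []
    for counts in docs:
        total = sum(counts.values())
        vectors.append({
            term: (count / total) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors


# Two made-up messages, not drawn from the study's dataset.
example_vectors = tfidf_vectors(["free prize now", "lunch now"])
```

In this sketch, a term that appears in every message (such as "now") receives a lower weight than a term confined to one message (such as "free"), which is the property that makes TF-IDF useful as classifier input.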

Results: The proposed method combines several machine learning algorithms with feature engineering to detect phishing. The Support Vector Machine achieved the best stability, with 99.99% Accuracy, 98.99% Precision, 99.12% Recall, and 99.05% F1-score. MNB, kNN, and LSTM achieved near-perfect results, while CNN scored lower (Accuracy 91.02%). Real-time system testing showed an average response time of 0.05 seconds per message.
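
For reference, the four reported metrics follow from the counts of true and false positives and negatives. The sketch below shows the standard definitions, treating "phishing" as the positive class; the function name and the label strings are hypothetical, and the example labels are made up rather than taken from the study.

```python
def classification_metrics(y_true, y_pred, positive="phishing"):
    """Compute Accuracy, Precision, Recall, and F1-score for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration only.
truth = ["phishing", "phishing", "legitimate", "legitimate"]
predicted = ["phishing", "legitimate", "legitimate", "legitimate"]
metrics = classification_metrics(truth, predicted)
```

Because Accuracy counts correct predictions on both classes while Precision and Recall concern only the positive class, the four values can differ noticeably when the classes are imbalanced.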

Conclusion: The proposed phishing detection system demonstrates strong accuracy, efficiency, and adaptability. Its lightweight design and real-time performance make it suitable for deployment in institutional servers, email systems, and organisational networks, providing an effective defence against evolving phishing attacks.

Keywords: Phishing detection, machine learning, support vector machine, cyber threats, URL analysis


How to Cite

Maharajpet, S. S., Dixit, S., & Sharma, H. (2026). A Stable and Adaptable Machine Learning Framework for Phishing Detection. Machine Learning for the Real World: Applications and Insights, 41–51. https://doi.org/10.9734/bpi/mono/978-81-999106-5-2/CH3