Hybrid Explainable Phishing URL Detection Using Transformer-Based Embeddings
- DOI
- 10.2991/978-94-6239-678-4_26
- Keywords
- Phishing URL Detection; Threat Detection Systems; Semantic-Structural Fusion; Hybrid Machine Learning Model; Semantic Feature Representation; Cybersecurity Rule Engine; Trust Index Evaluation; XAI; SHAP
- Abstract
Phishing remains a prevalent cybersecurity threat that exploits human trust and vulnerabilities on the internet to acquire sensitive information. Standard machine learning and deep learning models have improved the accuracy of phishing URL detection. However, they still struggle to adapt to increasingly sophisticated attack patterns, integrate poorly with real-world security systems, and lack explainability. This paper introduces a hybrid framework for detecting phishing URLs that blends transformer-based semantic comprehension with rule-based cybersecurity intelligence to improve robustness and interpretability. Our methodology improves the BERT Phish Finder model by applying MiniLM embeddings for optimized semantic representation, along with lexical, structural, and heuristic URL characteristics. A Random Forest classifier, combined with a bespoke Trust Index, a rule engine, and a deep learning model, delivers multi-dimensional scoring that categorizes URLs as Safe, Suspicious, or Phishing. Additionally, Explainable AI (XAI) with SHapley Additive exPlanations (SHAP) improves transparency by visualizing the model's decision factors. The initial implementation, built with Streamlit, demonstrates real-time detection capabilities and interpretable outputs. By laying the groundwork for cross-domain integration across network monitoring, database systems, and big data security analytics, this research aims to narrow the gap between pure AI models and practical cybersecurity applications.
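The scoring idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the abstract does not give the feature set, the rule weights, the Trust Index formula, or the decision thresholds, so the names `lexical_features`, `rule_score`, `trust_index`, and `categorize`, along with all numeric values, are hypothetical. The classifier's phishing probability (which in the paper would come from the Random Forest over MiniLM embeddings and URL features) is passed in as a plain number to keep the example self-contained.

```python
import re
from urllib.parse import urlparse

# Hypothetical token list; the paper's actual rule set is not given in the abstract.
SUSPICIOUS_TOKENS = ("login", "verify", "secure", "update", "account")


def lexical_features(url: str) -> dict:
    """Extract a few simple lexical and structural URL features (illustrative only)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "length": len(url),
        "num_dots": host.count("."),
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        "suspicious_tokens": sum(tok in url.lower() for tok in SUSPICIOUS_TOKENS),
    }


def rule_score(features: dict) -> float:
    """Map rule hits to a risk score in [0, 1]. Weights are assumed, not from the paper."""
    score = 0.0
    if features["has_ip_host"]:
        score += 0.4
    if features["has_at_symbol"]:
        score += 0.2
    if not features["uses_https"]:
        score += 0.1
    score += 0.1 * min(features["suspicious_tokens"], 3)
    return min(score, 1.0)


def trust_index(model_phish_prob: float, features: dict, model_weight: float = 0.6) -> float:
    """Blend a classifier's phishing probability with the rule score.

    The linear blend and the 0.6 weight are assumptions made for illustration.
    """
    risk = model_weight * model_phish_prob + (1 - model_weight) * rule_score(features)
    return 1.0 - risk


def categorize(trust: float) -> str:
    """Three-way label using assumed thresholds."""
    if trust >= 0.7:
        return "Safe"
    if trust >= 0.4:
        return "Suspicious"
    return "Phishing"


if __name__ == "__main__":
    url = "http://192.168.0.1/login/verify"
    feats = lexical_features(url)
    # 0.9 stands in for a model's predicted phishing probability.
    print(url, "->", categorize(trust_index(0.9, feats)))
```

Separating the rule score from the model probability keeps each contribution to the final label inspectable, which is in the spirit of the explainability goal described above.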
- Copyright
- © 2026 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Pragati Priyadarshinee
AU - S. Aravind Chandra
AU - P. Varun Reddy
AU - Jyothika Thunam
PY - 2026
DA - 2026/05/28
TI - Hybrid Explainable Phishing URL Detection Using Transformer-Based Embeddings
BT - Proceedings of the 2nd International Conference on Recent Advancement and Modernization in Sustainable Intelligent Technologies & Applications (RAMSITA-2026)
PB - Atlantis Press
SP - 331
EP - 341
SN - 1951-6851
UR - https://doi.org/10.2991/978-94-6239-678-4_26
DO - 10.2991/978-94-6239-678-4_26
ID - Priyadarshinee2026
ER -