SLIIT, Sri Lanka info@nextgensoc.xyz

Research Methodology


Methodology (Combined Approach)
This research adopts a modular and integrated machine learning (ML) and deep learning (DL) framework tailored to detect and classify multiple types of cyberattacks. Each attack vector (SQLi/XSS, Spear-Phishing, Trojans, and DDoS) has a specific model pipeline, while all components are unified under a centralized Security Operations Center (SOC) system for reporting and visualization.

Step-by-Step Methodology
1. Data Collection
- Web Injection (SQLi, XSS): 60,120 samples collected from public datasets.
- Spear-Phishing Emails: extracted from phishing email corpora, including the Enron and Nazario datasets.
- Trojan Detection: network traffic datasets from Kaggle and CIC-IDS2017.
- DDoS Detection: the CIC-DDoS2019 dataset, for volumetric attack patterns.
2. Data Preprocessing
- Duplicate and null value removal.
- Tokenization: WordPiece for BERT, byte-level BPE for RoBERTa, and SentencePiece for XLNet.
- Normalization using StandardScaler for numerical data (Trojan/DDoS).
- Class balancing with RandomUnderSampler and custom weights.
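The cleaning, scaling, and balancing steps above can be sketched in plain Python. This is a minimal illustration, not the project's code: the function names and the tuple-based row layout are assumptions, `standardize` stands in for scikit-learn's StandardScaler, and `random_undersample` stands in for imbalanced-learn's RandomUnderSampler.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev


def drop_duplicates_and_nulls(rows):
    """Keep the first copy of each row and drop rows that contain None."""
    seen, clean = set(), []
    for row in rows:  # rows are tuples, so they can be stored in a set
        if any(value is None for value in row) or row in seen:
            continue
        seen.add(row)
        clean.append(row)
    return clean


def standardize(columns):
    """Scale each feature column to zero mean and unit variance.

    Uses the population standard deviation, as StandardScaler does.
    A constant column is mapped to zeros to avoid dividing by zero.
    """
    scaled = []
    for column in columns:
        mu, sigma = mean(column), pstdev(column)
        scaled.append([(x - mu) / sigma if sigma else 0.0 for x in column])
    return scaled


def random_undersample(rows, label_index=-1, seed=0):
    """Randomly shrink every class to the size of the smallest class."""
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_index]].append(row)
    target = min(len(group) for group in by_class.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    return balanced
```

In a real pipeline the scaler would be fitted on the training split only and reused on validation and test data, so that test statistics do not leak into training.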
3. Feature Extraction
- NLP-based models: embedding generation using Transformers.
- Network traffic: flow-based, timing, and packet-level features (more than 80 features).
- Domain analysis: Levenshtein distance and homoglyph detection.
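The domain analysis step can be illustrated with a short sketch that combines edit distance with homoglyph folding. The homoglyph table, the `looks_spoofed` helper, and the distance threshold are hypothetical choices for illustration; a production system would need a far larger confusables table.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Hypothetical, deliberately tiny table of look-alike characters.
HOMOGLYPHS = {
    "0": "o",
    "1": "l",
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic e
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic p
    "\u0441": "c",  # Cyrillic c
}


def normalize_homoglyphs(domain: str) -> str:
    """Fold look-alike characters onto their ASCII counterparts."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in domain.lower())


def looks_spoofed(domain: str, trusted: set, max_distance: int = 1) -> bool:
    """Flag a domain that is not trusted itself but nearly matches a trusted one.

    `trusted` is assumed to hold lowercase domain names.
    """
    if domain.lower() in trusted:
        return False
    folded = normalize_homoglyphs(domain)
    return any(levenshtein(folded, t) <= max_distance for t in trusted)
```

For example, `looks_spoofed("paypa1.com", {"paypal.com"})` returns `True`, because folding `1` to `l` yields an exact match with a trusted domain that the raw string does not equal.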
4. Model Architecture
- Web Injection: BERT, RoBERTa, XLNet, and a hybrid RoBERTa+XLNet transformer model.
- Spear-Phishing: Bi-LSTM + BERT with domain similarity and pattern analysis.
- Trojan Detection: Autoencoder (256→16 bottleneck) + Random Forest classifier.
- DDoS Detection: Logistic Regression + Autoencoder for anomaly-based classification.
5. Training and Optimization
- Loss function: CrossEntropyLoss for classification.
- Optimizer: AdamW with learning rate scheduling.
- Techniques used: gradient clipping, dropout regularization, and frozen transformer layers to reduce computation.
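Two of the techniques above, gradient clipping and learning rate scheduling, can be sketched without any framework. The source does not state which clipping variant or schedule was used, so both choices below (clipping by global L2 norm, and linear warmup followed by linear decay) are assumptions, as are the function names.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale gradients so their combined L2 norm does not exceed max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm or total_norm == 0.0:
        return list(grads), total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm


def linear_warmup_decay(step, warmup_steps, total_steps, base_lr):
    """Assumed schedule: ramp linearly up to base_lr, then decay linearly to zero."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    remaining = max(total_steps - step, 0)
    return base_lr * remaining / max(total_steps - warmup_steps, 1)
```

Clipping rescales the whole gradient vector rather than each component separately, which preserves the update direction while bounding its size.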
6. Evaluation
- Accuracy, Precision, Recall, and F1-Score.
- Confusion matrix.
- Feature importance analysis.
- Risk scoring for anomalous traffic.
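The classification metrics listed above all derive from the confusion matrix. The following sketch shows that relationship in plain Python; the function names and sample labels are illustrative and not taken from the project.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes and columns are predicted classes, in `labels` order."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_metrics(matrix, labels):
    """Precision, recall, and F1 for each class, treated one-vs-rest."""
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fp = sum(matrix[r][i] for r in range(len(labels))) - tp  # column minus TP
        fn = sum(matrix[i]) - tp                                 # row minus TP
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics
```

Reporting per-class values matters here because attack classes are often rare, and a high overall accuracy can hide poor recall on the minority class.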
7. Remediation & Reporting Module
Automatically generates:
- Attack type
- Payload/source
- Reason for flagging
- Suggested remediation steps

Dashboards are created using Seaborn/Matplotlib.
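One way to picture the generated report is as a small record with the four fields listed above. The sketch below is hypothetical: the `IncidentReport` class, the `build_report` helper, and the remediation table are illustrative stand-ins, not the module's actual design.

```python
from dataclasses import dataclass, field

# Hypothetical lookup of generic remediation advice per attack type.
REMEDIATION_STEPS = {
    "SQLi": ["Use parameterized queries.", "Validate and restrict user input."],
    "XSS": ["Encode output before rendering.", "Apply a Content Security Policy."],
}

DEFAULT_STEPS = ["Escalate to an analyst for manual review."]


@dataclass
class IncidentReport:
    attack_type: str
    payload_or_source: str
    reason: str
    remediation: list = field(default_factory=list)

    def render(self) -> str:
        """Format the report as plain text for an alert or ticket."""
        lines = [
            f"Attack Type: {self.attack_type}",
            f"Payload/Source: {self.payload_or_source}",
            f"Reason for flagging: {self.reason}",
            "Suggested remediation steps:",
        ]
        lines += [f"  {i}. {step}" for i, step in enumerate(self.remediation, 1)]
        return "\n".join(lines)


def build_report(attack_type, payload_or_source, reason):
    """Attach known remediation advice, or fall back to manual escalation."""
    steps = REMEDIATION_STEPS.get(attack_type, DEFAULT_STEPS)
    return IncidentReport(attack_type, payload_or_source, reason, list(steps))
```

A call such as `build_report("SQLi", "' OR 1=1 --", "Tautology pattern in query parameter").render()` yields a text block that a dashboard or ticketing view could display as-is.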
8. Integration into SOC Framework
- All components feed into a central UI.
- Real-time alert visualization, case management, and an analyst feedback loop.
- Modular deployment support (containerized services).