BravenNow
A Fusion of context-aware based BanglaBERT and Two-Layer Stacked LSTM Framework for Multi-Label Cyberbullying Detection


#Cyberbullying detection #BanglaBERT #LSTM #Multilabel classification #Natural language processing #Low-resource languages #Machine learning

📌 Key Takeaways

  • Researchers developed a fusion model combining BanglaBERT-Large with two-layer stacked LSTM for cyberbullying detection
  • The model addresses multilabel classification, recognizing that comments can contain multiple types of abuse simultaneously
  • The research specifically focuses on Bangla language, a low-resource language in this domain
  • The model was evaluated using multiple metrics and 5-fold cross-validation to ensure robustness
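The 5-fold cross-validation mentioned above can be sketched in a few lines. This is a minimal, library-free illustration of the general technique (the paper does not publish its splitting code); each fold serves once as the held-out evaluation set.

```python
# Minimal k-fold index splitter: partitions sample indices into k folds,
# holding each fold out once for evaluation while training on the rest.
def k_fold_indices(n_samples, k=5):
    fold_size, folds = n_samples // k, []
    idx = list(range(n_samples))
    for i in range(k):
        start = i * fold_size
        # The last fold absorbs any remainder so every index is used.
        end = start + fold_size if i < k - 1 else n_samples
        val = idx[start:end]
        train = idx[:start] + idx[end:]
        folds.append((train, val))
    return folds

for train_idx, val_idx in k_fold_indices(10, k=5):
    print(len(train_idx), len(val_idx))  # prints "8 2" for each of the 5 folds
```

In practice, metrics are averaged across the five validation folds, which is what makes the reported scores robust to any single lucky or unlucky split.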

📖 Full Retelling

Researchers Mirza Raquib, Asif Pervez Polok, Kedar Nath Biswas, Rahat Uddin Azad, Saydul Akbar Murad, and Nick Rahimi developed a fusion architecture that combines BanglaBERT-Large with a two-layer stacked LSTM for multi-label cyberbullying detection, in a paper submitted to arXiv on February 25, 2026. The work addresses the challenge of detecting multiple overlapping forms of cyberbullying in Bangla, a setting that has received limited attention. The paper highlights that cyberbullying has become a serious and growing concern in today's virtual world, with adverse consequences for social and mental health when left unnoticed. Most existing approaches use single-label classification, assuming each comment contains only one type of abuse, while in reality a single comment may include overlapping forms such as threats, hate speech, and harassment. The researchers note that developing a generalized model with moderate accuracy remains challenging in low-resource languages like Bangla, where robust pre-trained models are scarce. Their proposed solution combines transformers, which offer strong contextual understanding but may miss sequential dependencies, with LSTM models, which capture temporal flow but lack semantic depth.
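The fusion idea described above can be sketched as a small PyTorch module. This is an illustrative sketch under stated assumptions, not the authors' implementation: random tensors stand in for BanglaBERT-Large's contextual token outputs, and the layer sizes and the `FusionMultiLabel` name are arbitrary choices for the example.

```python
import torch
import torch.nn as nn

class FusionMultiLabel(nn.Module):
    """Illustrative fusion head: contextual token embeddings (as a BERT-style
    encoder would produce) pass through a two-layer stacked LSTM, and the
    final hidden state drives independent sigmoid outputs, one per label."""

    def __init__(self, hidden_size=128, lstm_size=64, num_labels=4):
        super().__init__()
        # Two stacked LSTM layers model sequential flow over the
        # transformer's context-aware token representations.
        self.lstm = nn.LSTM(hidden_size, lstm_size, num_layers=2,
                            batch_first=True)
        self.classifier = nn.Linear(lstm_size, num_labels)

    def forward(self, token_states):
        # token_states: (batch, seq_len, hidden_size),
        # e.g. the encoder's last hidden states.
        _, (h_n, _) = self.lstm(token_states)
        logits = self.classifier(h_n[-1])  # top layer's final hidden state
        return torch.sigmoid(logits)       # independent per-label scores

# Dummy batch standing in for encoder outputs: 2 comments, 16 tokens each.
scores = FusionMultiLabel()(torch.randn(2, 16, 128))
print(scores.shape)  # torch.Size([2, 4])
```

Sigmoid outputs (rather than a softmax) are what make the head multi-label: each of the four abuse categories is scored independently, so a single comment can activate several labels at once.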

🏷️ Themes

Cyberbullying detection, Natural language processing, Low-resource languages

📚 Related People & Topics

Natural language processing

Processing of natural language by a computer

Natural language processing (NLP) is the processing of natural language information by a computer. NLP is a subfield of computer science and is closely associated with artificial intelligence. NLP is also related to information retrieval, knowledge representation, computational linguistics, and ling...

Long short-term memory

Recurrent neural network architecture

Long short-term memory (LSTM) is a type of recurrent neural network (RNN) aimed at mitigating the vanishing gradient problem commonly encountered by traditional RNNs. Its relative insensitivity to gap length is its advantage over other RNNs, hidden Markov models, and other sequence learning methods....
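As a concrete illustration of the recurrence described above, a minimal PyTorch sketch (the dimensions are arbitrary and chosen only for the example):

```python
import torch
import torch.nn as nn

# A single-layer LSTM mapping 8-dim inputs to a 16-dim hidden state.
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

x = torch.randn(1, 5, 8)             # 1 sequence, 5 time steps, 8 features
output, (h_n, c_n) = lstm(x)

print(output.shape)  # torch.Size([1, 5, 16]) - hidden state at every step
print(h_n.shape)     # torch.Size([1, 1, 16]) - final hidden state
print(c_n.shape)     # torch.Size([1, 1, 16]) - final cell state
```

The cell state `c_n` is the gated memory that lets information persist across long gaps, which is the mechanism behind the gradient-stability advantage noted above.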


Original Source
Computer Science > Computation and Language
arXiv:2602.22449 [Submitted on 25 Feb 2026]

Title: A Fusion of context-aware based BanglaBERT and Two-Layer Stacked LSTM Framework for Multi-Label Cyberbullying Detection
Authors: Mirza Raquib, Asif Pervez Polok, Kedar Nath Biswas, Rahat Uddin Azad, Saydul Akbar Murad, Nick Rahimi

Abstract: Cyberbullying has become a serious and growing concern in today's virtual world. When left unnoticed, it can have adverse consequences for social and mental health. Researchers have explored various types of cyberbullying, but most approaches use single-label classification, assuming that each comment contains only one type of abuse. In reality, a single comment may include overlapping forms such as threats, hate speech, and harassment. Therefore, multilabel detection is both realistic and essential. However, multilabel cyberbullying detection has received limited attention, especially in low-resource languages like Bangla, where robust pre-trained models are scarce. Developing a generalized model with moderate accuracy remains challenging. Transformers offer strong contextual understanding but may miss sequential dependencies, while LSTM models capture temporal flow but lack semantic depth. To address these limitations, we propose a fusion architecture that combines BanglaBERT-Large with a two-layer stacked LSTM. We analyze their behavior to jointly model context and sequence. The model is fine-tuned and evaluated on a publicly available multilabel Bangla cyberbullying dataset covering cyberbully, sexual harassment, threat, and spam. We apply different sampling strategies to address class imbalance. Evaluation uses multiple metrics, including accuracy, precision, recall, F1-score, Hamming loss, Cohen's kappa, and AUC-ROC.
We emplo...
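Among the metrics the abstract lists, Hamming loss is the one most specific to the multi-label setting, and its definition is simple enough to sketch directly. A minimal, library-free version (the label vectors below are made-up examples, not data from the paper):

```python
# Hamming loss for multi-label predictions: the fraction of individual
# label slots predicted incorrectly, averaged over all samples and labels.
def hamming_loss(y_true, y_pred):
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for row_t, row_p in zip(y_true, y_pred)
                for t, p in zip(row_t, row_p))
    return wrong / total

# One binary vector per comment: [cyberbully, sexual harassment, threat, spam]
y_true = [[1, 0, 1, 0], [0, 0, 0, 1]]
y_pred = [[1, 0, 0, 0], [0, 1, 0, 1]]
print(hamming_loss(y_true, y_pred))  # 0.25: 2 wrong slots out of 8
```

Unlike subset accuracy, which scores a prediction as wrong unless every label matches, Hamming loss gives partial credit per label, which suits comments that carry several overlapping abuse types.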

Source

arxiv.org
