Filtering Anti-Female Joke on Social Media Space: Natural Language Processing Approach

James Idara; Ekong Anietie; Udoh Abigail; Udoeka Ifreke

James Idara, Computer Science Department, Akwa Ibom State University, Ikot Akpaden, Nigeria
Ekong Anietie, Computer Science Department, Akwa Ibom State University, Ikot Akpaden, Nigeria
Udoh Abigail, Computer Science Department, Akwa Ibom State University, Ikot Akpaden, Nigeria
Udoeka Ifreke, Computer Science Department, Akwa Ibom State University, Ikot Akpaden, Nigeria

Keywords: Anti-female, Filtering, Jokes, Natural Language Processing, Machine Learning Algorithms

Abstract

The growing threat of abuse from obscene jokes and other objectifying content, directed especially at women, has led to harassment and created a hostile environment for some social media users. To reduce this hostility, filtering becomes necessary to check the uncontrolled posting of obscene jokes. The primary objective of this paper is to develop an intelligent system for filtering anti-female jokes on social media using Natural Language Processing. A total of 1500 one-liner anti-female jokes were sourced from social media sites and characterized by attributes of human-centeredness and polarity orientation. These attributes were binned into six categories: human-centric vocabulary, negation, negative orientation, sexist terms, professional communities, and private parts. The dataset was partitioned using k-fold cross-validation for training. The filtering system was then built on the algorithm that achieved the highest accuracy. The model was developed in Python using various Natural Language Processing techniques, and its performance was assessed with precision, recall, and F1-score. Experimental results showed that the Random Forest algorithm achieved the highest accuracy, 95.3%. The model could therefore be adopted for intelligent filtering of anti-female jokes on social media.
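
The pipeline outlined in the abstract (attribute binning, k-fold partitioning, and precision/recall/F1 scoring) can be illustrated with a minimal, standard-library-only sketch. Everything named here is an assumption for illustration: the `LEXICONS` word lists are toy placeholders rather than the paper's vocabulary, and the helper functions `bin_features`, `k_fold_indices`, and `precision_recall_f1` are hypothetical. The classifier itself (Random Forest in the paper) is deliberately left out.

```python
import re

# Hypothetical toy lexicons, one per attribute bin named in the abstract.
# The paper's actual word lists are not reproduced here.
LEXICONS = {
    "human_centric": {"woman", "women", "wife", "girl", "she", "her", "man", "husband"},
    "negation": {"not", "no", "never", "cannot"},
    "negative_orientation": {"bad", "stupid", "useless", "wrong"},
    "sexist_terms": {"bimbo", "nag"},
    "professional_communities": {"doctor", "lawyer", "nurse", "secretary"},
    "private_parts": {"breast", "hips"},
}


def bin_features(text):
    """Count how many tokens of a one-liner fall into each attribute bin."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {name: sum(tok in words for tok in tokens)
            for name, words in LEXICONS.items()}


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds, without shuffling."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

In such a setup, each joke would be mapped to its six bin counts, each fold's held-out predictions would be scored with `precision_recall_f1`, and the per-fold scores averaged to compare candidate algorithms.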
