International Journal of Computer Science and Informatics IJCSI

ISSN: 2231-5292


IIMT Bhubaneswar


IMPROVING SPAM EMAIL FILTERING EFFICIENCY USING BAYESIAN BACKWARD APPROACH PROJECT


M. SHESHIKALA
SREC Engineering College, Warangal


Abstract

Unethical e-mail senders bear little or no cost for the mass distribution of messages, yet ordinary e-mail users are forced to spend time and effort reading undesirable messages in their mailboxes. With the rapid growth of electronic mail (e-mail), many people and companies have found it an easy way to distribute a massive number of undesired messages to a very large number of users at very low cost. These unwanted bulk messages, or junk e-mails, are called spam. Several machine learning approaches have been applied to this problem. In this paper, we explore a new approach based on Bayesian classification that automatically classifies e-mail messages as spam or legitimate. We study its performance on various datasets.
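As background for readers unfamiliar with Bayesian classification of e-mail, the following is a minimal sketch of a standard multinomial naive Bayes filter with add-one (Laplace) smoothing. It is not the approach proposed in the paper, whose details the abstract does not give. The function names (`tokenize`, `train`, `classify`), the whitespace tokenizer, and the toy training messages are all assumptions made for illustration.

```python
import math
from collections import Counter

LABELS = ("spam", "ham")  # "ham" stands for legitimate mail (illustrative label)

def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real filters do much more.
    return text.lower().split()

def train(messages):
    """Build per-class word counts from (text, label) pairs."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Return the label with the highest log posterior score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in LABELS:
        # Log prior; assumes every label occurs at least once in training.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # skip words never seen during training
            # Add-one smoothing avoids zero probabilities for unseen pairs.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy data, not taken from the paper's datasets.
training = [
    ("win cash prize now", "spam"),
    ("cheap pills win big", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with team", "ham"),
]
model = train(training)
```

With this toy model, `classify("win a cash prize", model)` scores higher for `"spam"` and `classify("team meeting monday", model)` scores higher for `"ham"`. Working in log space keeps the product of many small word probabilities from underflowing.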

