International Journal of Communication Networks and Security IJCNS
ISSN: 2231-1882
EXTRACTING ACCURATE DATA FROM MULTIPLE CONFLICTING INFORMATION ON WEB SOURCES
AKSHATA ANGADI
Computer Science and Engineering Department, K.L.E.I.T., India
KARUNA GULL
K.L.E.I.T. Hubli, India
PADMASHRI DESAI
Computer Science and Engineering Department, B.V.B.C.E.T. Hubli, India
Abstract
The World Wide Web has become the most important information source for most of us. Because different websites often provide conflicting information, there is no guarantee that the data is correct. Among multiple conflicting results, can we automatically identify which one is most likely to be the true fact? In this paper, our experiments show that Fact finder, a tool that helps the user resolve this problem, successfully finds true facts among conflicting information and identifies trustworthy websites better than popular search engines. We rate each fact on two criteria: the popularity (number of hits) of the website that provides it, and the number of occurrences of the same data across websites. Because popularity alone cannot be given preference, the second rating counts how often the same data appears on several other, less popular websites. The approach thus helps the user resolve conflicting facts from multiple websites on these two bases. By considering a few more relations, a search engine could be developed that truly helps the user resolve the veracity problem.
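The two ratings described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the ratings are combined, so the weighted sum below, the function name `rank_facts`, the parameter `popularity_weight`, and the example sites and hit counts are all assumptions made for illustration and are not the paper's Fact finder implementation.

```python
from collections import defaultdict


def rank_facts(claims, hits, popularity_weight=0.5):
    """Rank conflicting values by a blend of popularity and occurrence.

    claims: dict mapping website -> the value it reports for one fact.
    hits:   dict mapping website -> hit count (popularity proxy).
    popularity_weight: hypothetical mixing weight in [0, 1]; the paper's
        actual combination rule is not given in the abstract.
    Returns a list of (value, score) pairs, best first.
    """
    if not claims:
        return []

    total_hits = sum(hits.get(site, 0) for site in claims)
    n_sites = len(claims)

    hits_per_value = defaultdict(float)  # summed hits of supporting sites
    sites_per_value = defaultdict(int)   # number of supporting sites

    for site, value in claims.items():
        hits_per_value[value] += hits.get(site, 0)
        sites_per_value[value] += 1

    scores = {}
    for value, count in sites_per_value.items():
        # Rating 1: share of total hits held by sites reporting this value.
        pop_share = hits_per_value[value] / total_hits if total_hits else 0.0
        # Rating 2: share of sites that report this same value.
        occ_share = count / n_sites
        scores[value] = (popularity_weight * pop_share
                         + (1 - popularity_weight) * occ_share)

    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: one popular site reports "A", three less popular
# sites agree on "B".
claims = {"site1.example": "A", "site2.example": "B",
          "site3.example": "B", "site4.example": "B"}
hits = {"site1.example": 900, "site2.example": 50,
        "site3.example": 30, "site4.example": 20}

equal_mix = rank_facts(claims, hits, popularity_weight=0.5)
occurrence_heavy = rank_facts(claims, hits, popularity_weight=0.3)
print(equal_mix)
print(occurrence_heavy)
```

With equal weights the single popular site still wins in this made-up data; lowering the popularity weight lets the value repeated by several less popular sites rank first, which is the effect the second rating is meant to capture.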