Reducing Human Effort: Web Data Mining, Learning a New Characteristics from Big Data

Mr. M. Srinivasan, Priyadarshini Engineering College, Vaniyambadi, Vellore, India; Dr. S. Koteeswaran, Vel Tech University, Chennai, India

Big Data, DOM, Extraction Pattern, Wrapper Learning & Adaption

This paper presents an approach to reducing the human effort required to extract precise information from previously unseen Web sites. Our approach aims to automatically adapt the information extraction knowledge previously learned from a source Web site to a new, unseen site while, at the same time, discovering previously unseen attributes. Two kinds of text-related evidence from the source Web site are considered. The first kind is obtained from the extraction pattern contained in the previously learned wrapper. The second kind is derived from the previously extracted or collected items. To handle the uncertainty involved, we design a generative model for the site-independent content information and the site-dependent layout format of the text fragments related to attribute values contained in a Web page. We have conducted extensive experiments on more than 50 real-world Web sites in more than five different domains to demonstrate the effectiveness of our approach.
Paper ID: GRDJEV01I010011
Published in: Volume: 1, Issue: 1
Publication Date: 2016-01-01
Page(s): 13 - 19