Saturday, March 6, 2010

Screen Scrapers that help collect data

Screen scrapers are programs that collect character-based data from the output displayed by other programs. They are designed to extract specific data and to present it in a richer display format, such as tables or graphs. They can also simply collect data to be indexed for storage. Screen scrapers are growing in popularity and go by several other names, such as content miner, website ripper, automated data collector, web extractor, website scraper and HTML scraper.

When activated, a screen scraper searches through a website's code and filters out extraneous markup to give a cleaner presentation. A scraper looks only for useful data and ignores the code whose job is to render the page in its original layout. In other words, a web scraper collects the data and presents it without all the accessories that come with the original HTML code.
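To make the filtering step concrete, here is a minimal sketch in Python that uses only the standard library. The sample page and the names `TextExtractor` and `extract_text` are made up for illustration, and a real scraper would need to cope with far messier markup.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and drops tags, scripts and styles."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep only non-empty text that is not script or style content.
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text fragments of an HTML document, in order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical page used only for this example.
SAMPLE_PAGE = """
<html><head><style>p { color: red; }</style></head>
<body><h1>Prices</h1><p>Widget: <b>4.99</b></p>
<script>var tracking = 1;</script></body></html>
"""

print(extract_text(SAMPLE_PAGE))  # ['Prices', 'Widget:', '4.99']
```

The layout tags, the stylesheet and the script are all discarded, and only the data a reader would care about is left.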

Screen scrapers are used in a number of applications. A popular example is the way search engine spiders work. Search engine spiders crawl millions of websites and their pages, collecting data and indexing it. When a person conducts a search, the indexed data is presented as search engine results.
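The collect-then-index idea can be illustrated with a toy example. This is only a rough sketch and not how any real search engine works: the page texts, the example.com URLs and the function names `build_index` and `search` are all assumptions made up for the illustration.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, word):
    """Return the URLs containing the word, sorted for stable output."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical text that a spider has already collected from three pages.
PAGES = {
    "http://example.com/a": "Screen scraper basics",
    "http://example.com/b": "Web scraper tools and tips",
    "http://example.com/c": "Gardening tips",
}

index = build_index(PAGES)
print(search(index, "scraper"))
print(search(index, "tips"))
```

Once the index exists, answering a query is a simple lookup, which is why the spider does its crawling ahead of time rather than when someone searches.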

Most screen scrapers search through the HTML code of websites to collect data. Some can also work with pages produced by scripting languages such as PHP and JavaScript. The collected, or mined, data is then presented as HTML, which can be viewed in a web browser, or stored as text for offline access.

Screen scrapers save a lot of time and energy. People no longer need to hunt for appropriate sites and click through links to find and collect the data they need. The web miner automatically searches through websites based on relevant keywords and generates charts, spreadsheets, graphs and other material for comparison or for use in presentations and reports. Screen scrapers can also retrieve information stored on older systems that can no longer be accessed directly because of incompatibilities with new software or hardware.
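As a small illustration of turning scraped data into something a spreadsheet can open, the sketch below pulls the rows out of an HTML table and writes them as CSV. It uses only the Python standard library, assumes a simple table with no nested tables, and the names `TableExtractor` and `table_to_csv` and the sample table are invented for this example.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects the cell text of every table row (no nested tables)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return the rows of the HTML table(s) in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


# Hypothetical table used only for this example.
SAMPLE_TABLE = (
    "<table><tr><th>Item</th><th>Price</th></tr>"
    "<tr><td>Widget</td><td>4.99</td></tr></table>"
)

print(table_to_csv(SAMPLE_TABLE))
```

The resulting text can be saved to a `.csv` file and opened in a spreadsheet program, where charts and graphs can be built from it.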

While screen scrapers are very useful to legitimate businesses and website owners, they can also be used for illegal and harmful purposes. Legitimate businesses, website owners and search engines make good use of web miners to provide useful services and to collate needed data quickly and with relatively little effort. However, some individuals, companies and website owners misuse screen scrapers to harvest email addresses from websites for spam advertising.

The misuse of screen scrapers by some has led to an ongoing argument within the web community about the ethics and legality of using them. There is also some dispute over copyright, as a screen scraper can copy one person's hard work from a website and then present it in another format on another website. Since screen scrapers ignore content such as the adverts on a webpage, people who rely on adverts to generate revenue complain that their ads get left out. For these reasons, many website owners are taking measures to prevent their websites from being scraped. At the end of the day, even though some people do use screen scrapers for negative purposes, the screen scraper remains a very handy tool that can effectively and legitimately save you time and money.
