Comments on Web Data Harvesting: Web Data Miner

Jonathan Thralow, 2009-08-15 12:49 (UTC-07:00):
I am building a directory of all such programs so that you can pick the best fit for your needs; however, the only program I have used for this is Mozenda. Its prices are by far the lowest I have seen, but it is not free. If you are looking for a free option, Perl would be the best scripting language for parsing the HTML; you can do it with grep. Give me a week and http://www.theeasybee.com will have a good, open directory to help.

poorpaddy, 2009-08-10 21:32 (UTC-07:00):
I have a tool that crawls a site and lets you specify a content start tag and a footer begin tag, so that it pulls only the content and not the presentation/design. It also uses tidy HTML etc. to clean it up. My problem is that it exports to an XML file with a very specific layout for a proprietary product. My question is: are there any other cheap or free products out there that will do this with a more universal output?
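For anyone who wants to try the free route before picking a product, here is a rough sketch of the start-tag/footer-tag idea from poorpaddy's comment, written in Python using only the standard library. It is not how any of the tools mentioned above work internally; the function names (extract_content, to_json), the marker strings, and the choice of JSON as the "more universal" output are all assumptions made for illustration.

```python
import json
import re
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects visible text and skips the bodies of script/style tags."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def extract_content(html, start_marker, footer_marker):
    """Return the text between start_marker and footer_marker.

    Both markers are hypothetical literal strings chosen per site.
    Returns None if the start marker is missing; if the footer marker
    is missing, everything after the start marker is used.
    """
    start = html.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)
    end = html.find(footer_marker, start)
    if end == -1:
        end = len(html)

    collector = _TextCollector()
    collector.feed(html[start:end])
    collector.close()
    # Join text nodes with spaces, then collapse runs of whitespace.
    return re.sub(r"\s+", " ", " ".join(collector.parts)).strip()


def to_json(records):
    """Serialize a list of {"url": ..., "content": ...} dicts as JSON."""
    return json.dumps(records, indent=2)
```

A usage example on a made-up page:

```python
page = (
    '<html><body><div id="nav">Menu</div><!-- content -->'
    "<h1>Title</h1><p>Hello   world</p>"
    '<div id="footer">Footer</div></body></html>'
)
text = extract_content(page, "<!-- content -->", '<div id="footer">')
print(to_json([{"url": "http://example.com/", "content": text}]))
```

Literal string markers are brittle if the site changes its template, and joining text nodes with spaces can split words that are broken up by inline tags, so treat this as a starting point only.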