Harvesting the web is what OutWit is all about. For those unsure what harvesting has to do with the web, or how OutWit relates to it: OutWit is simply a collection engine that removes the need to scroll through page after page to gather the information you need.

In short, OutWit eases the task of fetching:

  • List of links
  • Images
  • Emails
  • Data
  • Text

That is just to name a few. It lists them all neatly and separately, letting you ‘harvest’ exactly what you need. ‘Easy’ hardly does justice to how simple it makes the whole task of gathering information.
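To give a rough idea of what ‘harvesting’ a page involves, here is a minimal Python sketch that pulls links, image sources and e-mail addresses out of a piece of HTML. It is only an illustration of the general idea, not how OutWit itself works: the `Harvester` class, the `harvest` function, the sample page and the simplified e-mail pattern are all assumptions made for this example.

```python
import re
from html.parser import HTMLParser

# Deliberately simple e-mail pattern (an assumption for this sketch);
# it approximates common addresses and is not a full standards-compliant matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class Harvester(HTMLParser):
    """Hypothetical helper: collects links, image sources and e-mails from one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []
        self.emails = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            href = attrs["href"]
            if href.startswith("mailto:"):
                # Drop the scheme and any "?subject=..." part.
                self._add_email(href[len("mailto:"):].split("?")[0])
            else:
                self.links.append(href)
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])

    def handle_data(self, data):
        # Look for addresses written out in the visible text.
        for match in EMAIL_RE.findall(data):
            self._add_email(match)

    def _add_email(self, address):
        # Keep each address once, in order of first appearance.
        if address and address not in self.emails:
            self.emails.append(address)


def harvest(html):
    """Return the harvested items grouped by kind."""
    parser = Harvester()
    parser.feed(html)
    parser.close()
    return {"links": parser.links, "images": parser.images, "emails": parser.emails}


# Made-up sample page, used only to exercise the sketch.
SAMPLE_PAGE = """
<html><body>
  <a href="https://example.com/page1">First</a>
  <a href="mailto:info@example.com">Mail us</a>
  <img src="/img/star.jpg" alt="star">
  <p>Contact: sales@example.com</p>
</body></html>
"""

if __name__ == "__main__":
    for kind, items in harvest(SAMPLE_PAGE).items():
        print(kind, items)
```

Each kind of item ends up in its own list, which mirrors the ‘neatly and separately’ idea above. A real tool would also have to fetch the pages, resolve relative URLs and follow result pages, none of which this sketch attempts.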

Currently in beta, the platform offers quite a lot. As OutWit says of itself:

The question, when looking for anything on the Internet, is two-fold: find the pertinent data and make it usable for your purposes. Both processes can prove extremely time-consuming and both can be vastly improved using OutWit Hub. Originally conceived for researchers and data managers, the program is bringing Web scraping tools to everyone for both business and personal use. Just browse the Web for pages that include the information you are looking for; OutWit will scan the pages to recognize the data structure and format it into tables, allowing you to rate it and easily export it to files, spreadsheets or databases for later use.

Pretty handy: there is no need to open multiple windows to retrieve information, since all the copy/cut/paste functions are combined into a single window. Initially I thought the idea rather absurd, since a single Google search may bring up a gazillion links on a single page, but I was impressed by the built-in filter, which lets you thin out selections by specifying what information you really want to grab.

  • In your search engine, search for any topic and click on the Next in Series arrow once or twice. Not amazed? Then click on the Browse button (double arrow): The program will automatically browse through all the result pages for your query.
  • Then type the name of your favorite movie star in the address bar (watch the gray text on the left of the bar as you do), then hit return.
  • Finally, ask your favorite search engine for large images of your star, click on the Slideshow button in the toolbar and sit back.


It appears we have a worker to plough and harvest our search queries. Compare this with traditional search, where one has to click through different pages and then scroll around each of them to find the right thing (let’s consider images for now). OutWit simply makes browsing across pages easier: you hit a button to move through the pages, and images (and other data) load automatically. Any result can then be clicked to view it in the browser (Download it here).

There is a whole lot more to discover. A few more features include browse and pick, which lets you do exactly the same as you can with images; search, fetch, sort and structure; and collecting lists and tables:

In a single click, OutWit scans Web pages containing lists and tries to recognize the structure of the data. It will then format the information for seamless transfer to spreadsheets, databases or text files. In many cases the automatic mode (‘tables’, ‘lists’ or ‘guess’) will give excellent results in a matter of seconds. If needed, a manual mode (‘scraper’) allows the user to specify the parameters for OutWit Hub to handle the data on a particular Web site or page.
You browse to Web pages containing lists of contact addresses, product catalogs or any other series of items of interest.
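As a rough illustration of turning a page's tabular data into something a spreadsheet can open, here is a minimal Python sketch that reads the rows of an HTML table and writes them out as CSV. It is not OutWit's own method: the `TableExtractor` class, the `table_to_csv` function and the sample product table are hypothetical, and the sketch assumes a single, well-formed table with no nesting.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Hypothetical helper: collects the cell text of every <tr> row in a page."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Made-up product catalog, used only to exercise the sketch.
SAMPLE_TABLE = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget, large</td><td>24.50</td></tr>
</table>
"""

if __name__ == "__main__":
    print(table_to_csv(SAMPLE_TABLE), end="")
```

The `csv` module quotes any cell that contains a comma, so a value such as "Gadget, large" stays in one column when the file is opened in a spreadsheet. Guessing the structure of arbitrary lists, as the automatic mode described above is said to do, is a much harder problem than this sketch covers.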


OutWit definitely makes things far easier and faster, letting you collect the information you need efficiently. It’s still in beta and has plenty of room to improve. I did, however, find one drawback: I couldn’t copy images directly from the list. Perhaps I am still getting the feel of it and may yet discover how to ‘copy and paste’, but for the time being I would like OutWit to shed some light on this.