Friday, July 21, 2006

 

Formatting Fiasco

One of PageScrape's stalwart users has reported a bug in the output formatting option (-f). The functionality works only for the first 9 regular expression buffers; it does not work for buffers 10 and higher. For example, consider the following:

pscrape -u"www.webscrape.com" -e"([^<]*)" -f"The Title is: \$1"

This will return the title of the webscrape web page as follows:

The Title is: PageScrape - A HTML Screen Scrape Utility for Web Pages

In the format string, \$1 refers to the first result buffer and causes the contents of that buffer to be inserted into the output string on a successful match. This works, and so will \$9 for the ninth buffer (if there is one in the corresponding regular expression). However, it will not work for buffers denoted by \$10 and higher.

The formatting algorithm only checks for one decimal digit!
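
To make the difference concrete, here is a minimal Python sketch of the two behaviours. It is not PageScrape's actual code: the function names, the list-of-strings representation of the buffers, and the choice to substitute an empty string for a missing buffer are all assumptions made for illustration. The first function reads a single digit after \$ (one plausible reading of the bug); the second reads every consecutive digit.

```python
import re

def _lookup(buffers, n):
    # buffers[0] holds buffer 1. Returning "" for a missing buffer is an
    # assumption of this sketch, not documented PageScrape behaviour.
    return buffers[n - 1] if 1 <= n <= len(buffers) else ""

def expand_one_digit(fmt, buffers):
    # Hypothetical buggy version: only one decimal digit after \$ is read,
    # so \$10 is treated as buffer 1 followed by a literal "0".
    return re.sub(r"\\\$(\d)", lambda m: _lookup(buffers, int(m.group(1))), fmt)

def expand_multi_digit(fmt, buffers):
    # Hypothetical fixed version: all consecutive digits are read,
    # so \$10 refers to buffer 10.
    return re.sub(r"\\\$(\d+)", lambda m: _lookup(buffers, int(m.group(1))), fmt)

buffers = ["one", "two", "three", "four", "five",
           "six", "seven", "eight", "nine", "ten"]

print(expand_one_digit(r"Buffer: \$10", buffers))
print(expand_multi_digit(r"Buffer: \$10", buffers))
```

One consequence of reading all the digits is that a format string can no longer place a literal digit directly after a buffer reference; a real fix would need some way to delimit the buffer number if that case matters.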

This bug should be corrected soon.
