

I get a CGI Timeout error when running a large PHP script that is pulling information from a website
CGI Timeout error
Using cURL and getting a timeout error

Mar 28th, 2008 17:22
ha mo, Brandon Kozak, http://php.net,


Server Info
Windows Server 2003 Enterprise Edition (clean install, no Windows 
Updates)
PHP 4.4.4 with the cURL library, using PHP ISAPI
MySQL 4.x
To start with the server configuration:
1.) Install IIS
2.) Install PHP 4.4.4
3.) Install the cURL library
4.) Change IIS to use php4isapi.dll, not php.exe, as the web server 
extension
5.) Install the iis60rkt.exe package from Microsoft for IIS Metabase 
management
6.) Add DWORD entries to the IIS Metabase: ConnectionTimeout = 1800 and 
CGITimeout = 1800
7.) In php.ini, set max_execution_time = 1800, max_input_time = 1800, 
memory_limit = 900M, post_max_size = 900M, upload_max_filesize = 900M
You don't have to use 900M; this is just an example.
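To confirm that the php.ini changes from step 7 were picked up, a small script can print the values PHP is actually running with. This is only a sketch: the helper name effective_settings() is made up for this example, while ini_get() is the standard PHP function for reading a directive.

```php
<?php
// Names of the php.ini directives changed in step 7.
$directives = array(
    'max_execution_time',
    'max_input_time',
    'memory_limit',
    'post_max_size',
    'upload_max_filesize',
);

// Hypothetical helper: collect the values PHP is currently using.
function effective_settings($names)
{
    $values = array();
    foreach ($names as $name) {
        // ini_get() returns false if the directive does not exist.
        $values[$name] = ini_get($name);
    }
    return $values;
}

foreach (effective_settings($directives) as $name => $value) {
    echo $name . ' = ' . var_export($value, true) . "\n";
}
```

Run it through the web server rather than the command line, since the command line may load a different php.ini and reports its own execution time limit.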
What just happened?
We changed the IIS CGI timeout so that it does not trigger until a 
request has run for more than 1800 seconds, and we changed PHP to allow 
longer execution times and larger files. We installed PHP using ISAPI 
and NOT CGI (php.exe); this is the big point. If we were to use 
php.exe, PHP would be processed as a CGI application and would not 
follow the timeout limits we specified.
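Since the ISAPI vs. CGI choice is the big point, it can help to check which one is really serving the page. The sketch below uses php_sapi_name(), which reports names such as "isapi", "cgi", "cgi-fcgi" or "cli"; the helper is_cgi_sapi() is a made-up name for this example.

```php
<?php
// Hypothetical helper: true when the SAPI name indicates a CGI-style handler.
// php_sapi_name() reports "cgi" or "cgi-fcgi" when PHP runs as a CGI binary.
function is_cgi_sapi($sapi)
{
    return strpos($sapi, 'cgi') === 0;
}

$sapi = php_sapi_name();
if (is_cgi_sapi($sapi)) {
    echo "Running as CGI ($sapi): IIS may end long requests with a CGI timeout.\n";
} else {
    echo "Running under SAPI '$sapi'.\n";
}
```

If this prints a CGI name after step 4, IIS is probably still mapped to php.exe rather than php4isapi.dll.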
NOW
Let's say we are trying to pull text-based information from a website 
using regex (regular expression matching). For instance:
Sample Code:
<a href="http://google.com">Google</a>
We want to get what is between href=" and the next occurrence of ", 
which for the example above would be:
           http://google.com
I'm not going to discuss regex; it is covered here -> 
http://www.regular-expressions.info.
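For the single-link example above, one possible way to get just the URL is a capture group, so nothing has to be stripped afterwards. This is a sketch of an alternative technique, not the approach used in the rest of this entry; the pattern and variable names are assumptions made for the example.

```php
<?php
$html = '<a href="http://google.com">Google</a>';

// The parentheses capture only the text between the quotes;
// the "i" modifier lets the pattern match both <a href= and <A HREF=.
$pattern = '/<a\s+href="([^"]*)"/i';

preg_match_all($pattern, $html, $matches);

// $matches[0] holds the full matches, $matches[1] the captured URLs.
$urls = $matches[1];
print_r($urls);
```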
So let's say the page we are trying to pull info from has a lot of 
hyperlink (<a href=""></a>) tags.
Using many arrays and code like:
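$row1=0;
// Each entry of $Filters[4] is a regex to run against the page.
foreach($Filters[4] as $key => $value)
{
    // Collect every match of this regex in $html1 into $price.
    preg_match_all($value, $html1, $price);
    // $row1 selects which sub-array of $price to read (0 = full matches).
    foreach($price[$row1] as $keys => $Valuess)
    {
        // Strip the unwanted parts from each match.
        foreach($filters_regex[4] as $key1 => $value1)
        {
            $Valuess = preg_replace($value1, '', $Valuess);
        }
        $Price[] = $Valuess;
    }
    $row1++;
}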
where each entry in $Filters[4] is a regex like 
'/\<A HREF\=\".+?\"\>/', which matches the whole tag up to the next 
occurrence of "> if it exists. (The pattern is case-sensitive; add the 
i modifier to also match a lowercase <a href=.)
preg_match_all(): what is this?
It takes $value, which is the regex from above, matches it against 
$html1 (the text you want to search in) and stores the output in an 
array called $price.
Next, we have foreach($price[$row1] as $keys => $Valuess),
which goes through all the items we matched in $html1 and uses
foreach($filters_regex[4] as $key1 => $value1)
where $filters_regex[4] = array('/\<A HREF\=\"/','/\"\>/')
to strip those parts and get the final outcome of just http://google.com.
That's basically how the code works.
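To try the match-then-strip idea on its own, here is a self-contained sketch with made-up sample data. The HTML string and the filter arrays are assumptions for illustration (the real script would fill $html1 from cURL), and this version always reads the full matches in $price[0] instead of indexing with $row1.

```php
<?php
// Hypothetical sample data; the real script would fetch $html1 with cURL.
$html1 = '<A HREF="http://google.com">Google</A> <A HREF="http://php.net">PHP</A>';

// One pattern that matches the whole opening tag.
$Filters = array(4 => array('/\<A HREF\=\".+?\"\>/'));
// Patterns whose matches are stripped from each hit.
$filters_regex = array(4 => array('/\<A HREF\=\"/', '/\"\>/'));

$Price = array();
foreach ($Filters[4] as $value) {
    preg_match_all($value, $html1, $price);
    // $price[0] lists the full matches of the pattern.
    foreach ($price[0] as $Valuess) {
        foreach ($filters_regex[4] as $value1) {
            $Valuess = preg_replace($value1, '', $Valuess);
        }
        $Price[] = $Valuess;
    }
}
print_r($Price);
```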
I kept getting CGI Timeout errors while running this code when pulling 
multiple items from the page at one time.
So I had about 7 of these
$row1=0;
foreach($Filters[4] as $key => $value)
{
    preg_match_all($value, $html1, $price);
    foreach($price[$row1] as $keys => $Valuess)
    {
        foreach($filters_regex[4] as $key1 => $value1)
        {
            $Valuess = preg_replace($value1, '', $Valuess);
        }
        $Price[] = $Valuess;
    }
    $row1++;
}
in the script.
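With around seven copies of the same loop, one option is to wrap it in a function and call it once per kind of item. The sketch below is only a suggestion: extract_field() is a made-up helper, and the sample page and patterns are assumptions for the example.

```php
<?php
// Hypothetical helper: match each pattern, then strip the unwanted parts.
function extract_field($html, $patterns, $strip)
{
    $found = array();
    foreach ($patterns as $pattern) {
        preg_match_all($pattern, $html, $hits);
        foreach ($hits[0] as $hit) {
            // preg_replace() accepts an array of patterns and applies each in turn.
            $found[] = preg_replace($strip, '', $hit);
        }
    }
    return $found;
}

// Hypothetical page with two kinds of data to pull out.
$html1 = '<A HREF="http://google.com">Google</A><B>9.99</B>';

$links  = extract_field($html1, array('/\<A HREF\=\".+?\"\>/'), array('/\<A HREF\=\"/', '/\"\>/'));
$prices = extract_field($html1, array('/\<B\>.+?\<\/B\>/'), array('/\<B\>/', '/\<\/B\>/'));
```

This does not make the script any faster, so the timeout settings still matter; it just keeps the seven blocks from drifting apart.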
How to fix it!
DO NOT USE php.exe as your extension in IIS; use php4isapi.dll. This 
makes PHP not run as a CGI application, and you can set the execution 
times in php.ini, which is what we did on installation. Then, when you 
add the Metabase info, the server keeps the connection alive while the 
PHP script is processing, so you WON'T time out.
And that's about it. If you have any more questions about this, email me 
[email protected]
Thanks,
Brandon Kozak