While testing a crawler script on GoDaddy shared hosting, I noticed that the script was quitting at random points without any notice. Both web and CLI execution modes were affected. The script had previously been tested on an XAMPP server, where it worked fine.
Later, I found that the script always quits after calling one of the PCRE regular expression functions, such as preg_replace, preg_match, or preg_match_all. The script called them hundreds of times, and one of the calls turned out to be fatal.
UPDATE: Actually, it appears to be a more general problem with operations on long strings. Still, switching to the multibyte string regular expression functions helped in most scenarios.
I wrote a simple proof-of-concept script:
    set_time_limit(0);
    $a = str_repeat('pattern', 100000); // 700,000-byte subject string
    $i = 0;
    while (1) {
        preg_replace('/pattern/is', 'new', $a);
        echo("{$i}<br />\n");
        flush();
        $i++;
    }
The script runs indefinitely on my local server but exits after 400-500 iterations on GoDaddy shared hosting. I have no idea what causes it.
Fortunately, it is possible to switch to the multibyte string functions such as mb_ereg_replace, mb_ereg_search_init, and mb_ereg_search_regs. After replacing every call to a PCRE function with its equivalent from the multibyte extension, the script works like a charm.
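For illustration, here is roughly what the replacement from the proof-of-concept looks like with mb_ereg_replace. This is only a sketch: the helper name replace_with_mbstring is made up for this example, the loop is capped at three iterations, and it assumes the mbstring extension is loaded with regex support. Note that the pattern has no delimiters and that the options go into a fourth argument, where 'm' is the mbstring option that lets the dot match newlines (what PCRE calls 's').

```php
<?php
// Hypothetical helper (not part of the original crawler): the same
// replacement as in the proof-of-concept, done with mbstring instead of PCRE.
function replace_with_mbstring($subject)
{
    // No delimiters around the pattern; options are a separate argument.
    // 'i' = case-insensitive, 'm' = dot also matches newlines.
    return mb_ereg_replace('pattern', 'new', $subject, 'im');
}

mb_regex_encoding('UTF-8'); // assumption: the input is UTF-8 (or plain ASCII)

$a = str_repeat('pattern', 100000);
for ($i = 0; $i < 3; $i++) { // bounded here; the original test looped forever
    $result = replace_with_mbstring($a);
    echo "{$i}: " . strlen($result) . " bytes\n";
}
```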
Unfortunately, the multibyte regular expression functions aren’t drop-in replacements for the PCRE functions (their patterns take no delimiters, for example, and options are passed as a separate argument), so editing every function call is a lot of work. I recommend defining wrapper functions that accept parameters just like the PCRE functions do.
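A minimal sketch of such a wrapper for preg_replace might look like the following. The name mb_preg_replace is hypothetical (it is not a built-in PHP function), and the sketch makes several simplifying assumptions: the pattern uses the same character as opening and closing delimiter, only the i, s and x modifiers are translated, and the pattern body is passed through unchanged, so PCRE-only syntax and "$1"-style references in the replacement are not handled.

```php
<?php
/**
 * Hypothetical wrapper: takes a PCRE-style delimited pattern such as
 * '/pattern/is' and forwards the call to mb_ereg_replace().
 */
function mb_preg_replace($pattern, $replacement, $subject)
{
    if ($pattern === '') {
        throw new InvalidArgumentException('Empty pattern');
    }

    // Assumption: the first character is the delimiter and the same
    // character closes the pattern (so no bracket-style delimiters).
    $delimiter = $pattern[0];
    $end = strrpos($pattern, $delimiter);
    if ($end === 0) {
        throw new InvalidArgumentException('No closing delimiter in: ' . $pattern);
    }

    $body = substr($pattern, 1, $end - 1);
    $modifiers = (string) substr($pattern, $end + 1);

    // PCRE modifier => mbstring option. PCRE 's' (dot matches newline)
    // corresponds to mbstring 'm'; modifiers not listed here are ignored.
    $map = array('i' => 'i', 's' => 'm', 'x' => 'x');
    $options = '';
    for ($k = 0; $k < strlen($modifiers); $k++) {
        if (isset($map[$modifiers[$k]])) {
            $options .= $map[$modifiers[$k]];
        }
    }

    return mb_ereg_replace($body, $replacement, $subject, $options);
}
```

With this in place, a call like preg_replace('/pattern/is', 'new', $a) only needs its function name changed to mb_preg_replace. Similar wrappers for preg_match and preg_match_all would have to be written separately around the mb_ereg and mb_ereg_search_* functions.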